SAGE
  • 00:10

    FRANCISCA GROMME: I'm Francisca Gromme. I work at Goldsmiths, University of London as a post-doctoral researcher at the Department of Sociology. I'm part of an ERC-funded project which is about how statistics and data science are at this moment meeting each other, and how that leads to characterizations

  • 00:35

    FRANCISCA GROMME [continued]: of national or European identities. And we study these in an ethnographic project--in a team--so we cover different countries. In my research I was interested in new technologies that

  • 00:55

    FRANCISCA GROMME [continued]: would be applied in crime control, and those new technologies were then applied to collect new types of data about people. So census, and in that case I'll be talking about, it was to use a data mining software package in a policy practice to learn about problem youth. So they were testing this out--this was in 2011--

  • 01:21

    FRANCISCA GROMME [continued]: in a Department of Community Safety in a Dutch city. And the idea was that you could combine marketing data--so the type of data that Experian collects. So this is about the newspapers people read, or where people go on holiday, what types of media

  • 01:42

    FRANCISCA GROMME [continued]: you would use. To combine this with police data on suspects--young suspects. And also data from the municipal register. So this is where people live, their ages, sort of basic personal information. And if you would combine that using this data mining

  • 02:03

    FRANCISCA GROMME [continued]: software package the idea was you would know in more detail why youth would commit offenses. So they wanted to make what they would call local theories about problem youth. And now I will use problem youth in talking about this, but it's, of course, a particular label

  • 02:23

    FRANCISCA GROMME [continued]: they gave to a group of people. A term that's also used is youth at risk. It wasn't about major crimes, so it was about smaller offenses like shoplifting, or types of nuisance. And so I started the research in 2011 when I met somebody from the data mining company that

  • 02:47

    FRANCISCA GROMME [continued]: would test their software package at this Department of Community Safety. And they agreed to let me sit in on their meetings. And I was very interested at the time in how data analysis would actually work in practice, because I'm an ethnographer, I'm a social scientist.

  • 03:08

    FRANCISCA GROMME [continued]: And in my research I studied how actually facts and knowledge were made using different types of knowledge in particular situations. And I was really interested in how they would construct and shape what they would actually mean by problem youth.

  • 03:29

    FRANCISCA GROMME [continued]: And I was also interested in actually knowing how all these promises that we were hearing about big data--data mining, data science--would work out in practice. And what the social consequences of that would be. And my approach was, well, the only way to know what type of consequences it would have is to look at it as a social practice.

  • 04:00

    FRANCISCA GROMME [continued]: So my aim in this video is to talk about how you can analyze data mining and data analysis as a social and situated practice. So it would be to add some more depth, and some more situated analysis to all these promises about data mining--

  • 04:21

    FRANCISCA GROMME [continued]: about digging up nuggets of gold, or zooming in, or connecting the dots. What is it actually that people do when they connect the dots? So I'll give one example of how I did that in my research. Of course, taking into account that this is a larger process. In other types of research we'll talk about how algorithms are made, and the type of interests

  • 04:45

    FRANCISCA GROMME [continued]: that feed into that. And I'll only discuss analysis here. So what happens when results are presented on the screen, when a few people are sitting in the room from different disciplines, different professions, and they make sense of what the algorithms present them. And so what I'd like to point out with regard to this case--

  • 05:07

    FRANCISCA GROMME [continued]: that problem youth wasn't something or a group waiting out there in advance just to be studied and clarified by this technology. It was something that was really actively constructed by people using their professional knowledges, but also their local policy interests.

  • 05:30

    FRANCISCA GROMME [continued]: And this doesn't--my point will not be that this makes the type of knowledge created worse or useless, but we do need to take into account that it can be improved in many ways, and there are different ways to do it that go further than just collecting more data. So I started the fieldwork by going to conferences

  • 05:58

    FRANCISCA GROMME [continued]: for people that were selling their technologies, and policy officers interested in acquiring them and testing them out. And this is how I met the company that I worked with. And to start out with I started talking to the director. I wanted to do an ethnographic case study, which means that you observe people, and you

  • 06:21

    FRANCISCA GROMME [continued]: sit in on their practices, and you try to understand what is happening as much as you can from their position. And in my case, from their community of expert practices. So I want to know the types of theories they use, the interests they have, also their tacit knowledge--sort

  • 06:41

    FRANCISCA GROMME [continued]: of the implicit understandings of what they were doing. And I wanted to do this for the technology company, but also for the policy officers. So my first contact was with the company. I had some initial meetings both at this--with these policy officers and at the company. I sat in on some project set-up meetings,

  • 07:04

    FRANCISCA GROMME [continued]: and slowly the ball got rolling. They got to know me, what interested me. We arrived at some common understandings of what would be interesting for us both. And in the end, we decided that I would do a type of internship at this policy office--at this Department of Community Safety.

  • 07:26

    FRANCISCA GROMME [continued]: And for an ethnographer, in doing participant observation, that was a very good option for me, because these policy officers--they're quite used to having interns. So they're very used to strangers practically sitting in at meetings. You don't have to explain much--you always have to explain who you are. And they were really very happy to get into conversation

  • 07:48

    FRANCISCA GROMME [continued]: with you because they were really figuring out--what type of knowledge is it that we need to solve our problems? If we want to use sort of preventive strategies how do we set them up? I think they were quite happy to just discuss these kinds of things openly. As for issues of access, it was less easy

  • 08:10

    FRANCISCA GROMME [continued]: for me to hang out at the company, obviously, because they have their livelihoods and their knowledge to protect. So I really respected that, and I did as many interviews as I could, and I did as many observations as I could. So those were the two major parts of the observations.

  • 08:33

    FRANCISCA GROMME [continued]: And sitting in on meetings, but also just getting to know everyday policy practices, how they use data, how they already use statistics, and what their problems were. But also how they understood what problem youth was. So in the end I had four key activities

  • 08:55

    FRANCISCA GROMME [continued]: that my fieldwork was made of. So I interviewed, then I did these observations as part of an internship. I followed meetings in and outside the projects I was following. I also quite often asked for demonstrations of the software they were using, which

  • 09:18

    FRANCISCA GROMME [continued]: is not a natural situation--they wouldn't do it anyway. But for me, it was quite useful. Because a lot of computer usage and interaction with software--it's not very visible. People don't talk to their computer so you don't know what they're doing, or why. So in a demonstration, if you just keep in mind that it's not a natural situation--you're really asking people to do

  • 09:40

    FRANCISCA GROMME [continued]: stuff--it can be a good way to make tacit knowledges or unspoken knowledges explicit--to clarify them. And to learn how people interact with the software they use. And of course, I did a document study. So I kept track of all the documents that were used in the project, but also

  • 10:00

    FRANCISCA GROMME [continued]: around in the policy office. And it also took quite some emailing and work, so you want to set some time aside for that. So along the way, as I was sitting in meetings and being interested in how this analysis worked, it caught my attention that everybody was always talking about zooming in as soon as something was presented on the screen.

  • 10:21

    FRANCISCA GROMME [continued]: They would also always ask, well, can you zoom in? Or, can we get more detail on that? Can you look at it for this particular district? So I got really interested in these sort of metaphors of vision. And that was also on the basis of work I had read by Donna Haraway and others, who said that metaphors we use and apply using technologies,

  • 10:45

    FRANCISCA GROMME [continued]: they often rely on these metaphors of vision to make it seem that when we gain knowledge in more detail it's an automated process that sort of occurs from outside of the situation without any interest or bias, unaffected by the situation.

  • 11:05

    FRANCISCA GROMME [continued]: Sort of the time and place you're in. So part of Donna Haraway's point was it obscures the way we actually make knowledge as humans, and why that is relevant. So I really hooked into that. I started to follow this zooming in. And to observe what actually happened

  • 11:28

    FRANCISCA GROMME [continued]: in the room, what people were doing when they said they were zooming in. So for example, they would gesture to a screen, and then gesture to a map to make a connection between two things. So this, in my idea, was what the practice of zooming in actually was. And to sort of focus my attention on this bit

  • 11:49

    FRANCISCA GROMME [continued]: I used a term from Charles Goodwin, who uses the term "situated improvisations" to say how the way that people produce knowledge is very time and place specific, and is actually an outcome of what he calls the interplay between screens, gestures, talk, and sort of artifacts--

  • 12:13

    FRANCISCA GROMME [continued]: local knowledges at a certain time and space. So that is how I really got to more sort of focus and structure in my fieldwork. Yeah, so by "situated improvisations" I do not mean to say that data analysis is random, or necessarily biased, or less true.

  • 12:35

    FRANCISCA GROMME [continued]: But the point is that if we take into account these sort of tacit knowledges--ways of agreeing on what good evidence is--they also feed into what in the end are the results, or the outcome of data analysis.

  • 12:56

    FRANCISCA GROMME [continued]: Another way in which saying "situated" doesn't really mean randomness, is to say that this situatedness is also part of longer histories of practice, and is rooted in professional knowledges that structure an occupation. So it doesn't mean random, but it means present at the time.

  • 13:27

    FRANCISCA GROMME [continued]: Of course, when you are observing people and interviewing you sort of have to make sure that they know who you are, what you're doing. I did an internship but I didn't present myself as a student or an intern. I was quite clear that I was doing PhD research--we just shaped the thing as an internship.

  • 13:50

    FRANCISCA GROMME [continued]: And usually when you go to meetings and there are people there who you haven't met yet you always start with an introduction round, in which I told them what I did. Before I started the research we had agreements about consent, about the types of data I was using, what I was looking for.

  • 14:11

    FRANCISCA GROMME [continued]: And yeah, this is sort of how we covered these issues of knowing what I was writing down and why. And then yeah, once you're in the field, and you are talking to people, and you're in meetings, you can't always take notes. And taking notes--that takes a lot of time,

  • 14:31

    FRANCISCA GROMME [continued]: and you need to do it or you'll forget. So during meetings I would take brief notes, which is what almost everybody does, so you don't really stand out. And then I also just spent a lot of time at the policy office with my laptop working, which is what you do when you're an intern.

  • 14:52

    FRANCISCA GROMME [continued]: So that was my time to type up notes, and that wasn't very strange. People also would just--as you do when you are at work--get up, walk by your desk, and see what you're typing. And that would be also an occasion to talk it over--what happened today.

  • 15:12

    FRANCISCA GROMME [continued]: My main finding was that algorithms do not work on their own. What you see is that in analysis it's really a matter of pointing at the screen, connecting it, for example, to a map. Connecting what you see on the map about a neighborhood,

  • 15:33

    FRANCISCA GROMME [continued]: in this case, to earlier findings, or early experience, or existing categories of policy knowledge, just to make sense of the data presented on the screen. Because a number, for example--how many youth from neighborhood A caused trouble or issues in neighborhood B.

  • 15:54

    FRANCISCA GROMME [continued]: Yet it doesn't lead to a theory of why problem youth are problem youth. You need to do more about this. And what I found is there's a lot of talking through, a lot of making narratives, a lot of connecting to other categories of practice. So I found that in this case to zoom in

  • 16:17

    FRANCISCA GROMME [continued]: required a lot of zooming out again. So you find something about youth in neighborhood A. To make sense of that they often had to connect it to what type of neighborhood is this. So to another higher up category. So I can give two examples of how data were actually

  • 16:38

    FRANCISCA GROMME [continued]: brought to light--or to life. And the first is that it is often a matter of zooming out--of applying an already existing label. So what they often did is once they found a result somebody would indeed point at the map, because this wasn't there.

  • 16:58

    FRANCISCA GROMME [continued]: The others would recognize the neighborhood, think about their previous experiences in that neighborhood, and also think about the socioeconomic status of this neighborhood. Because in the Dutch policy context neighborhoods are really a basic part of making policies.

  • 17:18

    FRANCISCA GROMME [continued]: And this is because they actually historically refer to a class of people. So policies are still based on assumptions about the class of people. So they connected these ideas about class, how some neighborhoods would be socioeconomically

  • 17:39

    FRANCISCA GROMME [continued]: worse off, or they would--for example--know, well, in that neighborhood there are a lot of older people living there. And we know that older people complain more, so this is why you might have more reported offenses in the neighborhood, or reported suspects. So these were all the ingredients

  • 18:01

    FRANCISCA GROMME [continued]: of how a particular piece of data was brought to light or made relevant. So a lot happened once something was presented on the screen. A second example was to think about the neighborhood in terms of the facilities that were there.

  • 18:22

    FRANCISCA GROMME [continued]: So you would think about why do youth actually do that? How do these data make sense? Because the differences were often small between neighborhoods, so how many offenders were there. So they would find out, OK, there is a swimming pool there, so that could explain it. Or there are no facilities at all in that neighborhood, so this is why they move from place to place.

  • 18:43

    FRANCISCA GROMME [continued]: Or to think in terms of these Experian categories. So there's a dart club over there. Might that be significant in a way? It wasn't actually guessing, but it was sort of making things kind of come more whole together in order to select what was relevant,

  • 19:03

    FRANCISCA GROMME [continued]: and in order to think about or profile problem youth in terms of their motivations, their backgrounds. But also to weave policy interests into that story. Whether you should build more facilities at a particular place. I think a crucial part of zooming in or making

  • 19:28

    FRANCISCA GROMME [continued]: results count was to decide what counted as good evidence. And in this case, there were some discussions around that. So I mentioned the result of a dart club. At some point they found out that in this neighborhood where there was a dart club there was a relatively high occurrence

  • 19:49

    FRANCISCA GROMME [continued]: of offenses, or a high number of suspects. So they found some results related to this interest they had in lifestyles. But actually, a neighborhood is quite a small entity. In the Netherlands neighborhood counts were 300 to 800 people. So out of that group of people--out of that population--

  • 20:14

    FRANCISCA GROMME [continued]: there might be 10 young suspects that they found. So the numbers were quite small, which led to discussions--what are the right numbers, or the right differences between numbers that you find between neighborhoods that can be evidence--that can be used for making policy? And the policy officers, they were

  • 20:36

    FRANCISCA GROMME [continued]: quite happy to go with the smaller numbers. And this was because they used different regimes of evidence. So they used what I call more of a detective logic, in which a lead was enough to start off a new initiative.
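
[Illustrative note, not part of the talk.] The small-numbers question raised above can be made concrete with a quick calculation. The sketch below assumes two hypothetical neighborhoods of 500 residents each, with 10 and 5 young suspects; these figures are invented for illustration and only chosen to sit within the 300 to 800 range mentioned. It uses a two-sided Fisher exact test, written with the Python standard library, as one conventional way of asking whether such a difference could plausibly be chance. It is not the method used in the project described here.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows are neighborhoods; columns are (suspects, non-suspects).
    Returns the probability, under the null hypothesis of no association,
    of a table at least as unlikely as the observed one.
    """
    row1, row2 = a + b, c + d      # residents per neighborhood
    col1 = a + c                   # total suspects across both
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x suspects in the first neighborhood.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts: 10 of 500 residents vs 5 of 500 residents.
    p = fisher_exact_two_sided(10, 490, 5, 495)
    print(f"two-sided p-value: {p:.3f}")
```

With counts this small, the p-value comes out well above the conventional 0.05 threshold, even though one neighborhood has twice as many suspects as the other. That is one way to see why a difference that works as a "lead" under a detective logic may not count as evidence under a statistical one.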

  • 20:57

    FRANCISCA GROMME [continued]: Or they would use a sort of marketing logic, that would be enough basis for them to go and be proactive, and preventive, and provide information to dart clubs or other organizations. And they had a difference of opinion with this--with the data mining company--because the company relied more on a social science

  • 21:17

    FRANCISCA GROMME [continued]: type of evidence, and relying on a larger set of data. And also really relying on statistical significance. So it was so--yeah, this discussion went on for a while. And in the end, sort of the professional logics of the data

  • 21:41

    FRANCISCA GROMME [continued]: miners prevailed. In the final reports it was decided that you cannot make policies on the basis of this. But it sort of illustrates that when you come together with a group of people in a data analysis practice, and you start doing data analysis in different types of domains, such as policy making,

  • 22:02

    FRANCISCA GROMME [continued]: you're going to have to think about how different types of professional regimes of evidence can clash, and how you can decide on what counts as good evidence. Another finding was that to zoom in--it actually includes a lot of normative work,

  • 22:22

    FRANCISCA GROMME [continued]: a lot of norms of practice. So I already mentioned this focus on neighborhoods as a key point of analysis, as sort of the focus of making policies. Well, you could also think of income as a basis of doing this analysis, but that wasn't the focus of it. So that's one idea of what counts as a good analysis that

  • 22:46

    FRANCISCA GROMME [continued]: was practiced here. And another example is that a result needs to be surprising. A good analysis is surprising. It was really a problem in this case as well. The policy officers were asking [INAUDIBLE] give us results that are surprising. But it's an assumption as well. This whole rhetoric of digging into data, zooming in,

  • 23:09

    FRANCISCA GROMME [continued]: it's about revealing. But I think data analysis can also deliver to you something that you already knew, maybe. So that's what I mean by saying data analysis is normative work, in the sense that all these norms of what counts as a good analysis come into play. And usually, in this case, were not questioned.

  • 23:40

    FRANCISCA GROMME [continued]: It is a challenge for doing an ethnography. Because usually when you do an ethnography, or very often, you follow a routine. So you would follow the everyday life of a teacher, or of a police officer. But in this case, and I think in a lot of data science cases because they often revolve around innovation--

  • 24:02

    FRANCISCA GROMME [continued]: so they're done in the form of a project. And a project has a beginning and an end, so as an ethnographer you have to be there at the right time and place. And also the policy officers, and the people working for this company, they weren't always doing this particular project. So how are you going to sort of divide your attention

  • 24:23

    FRANCISCA GROMME [continued]: over this time span? Because for me, it didn't make sense to always sit in their policy office, but I needed to be there when something was happening. So it wasn't only sort of being in a place physically anymore, but also following email lists, for example, regularly calling people to sort of remind them

  • 24:44

    FRANCISCA GROMME [continued]: that you are still interested, asking them what is happening, and all this kind of work. And then also with projects. They might be delayed, so that can influence your planning. If you rely on funding you need to make sure you still have funding for the duration of the project.

  • 25:07

    FRANCISCA GROMME [continued]: It might happen that people change course--exactly when it concerns innovation you learn that something isn't working. But that can have huge consequences for your research. Because if I decide to study problem youth but they decide we're going to extend our case to adults--

  • 25:28

    FRANCISCA GROMME [continued]: these are all things that can happen. So you need to be flexible in terms of your organization, your funding, and your planning, but also in terms of your research focus sometimes. So sometimes, yeah, you may have to shift along a little bit, but still acknowledge what is interesting about the case.

  • 25:48

    FRANCISCA GROMME [continued]: So in this type of analysis and in this type of ethnography it is also very relevant to keep in mind that you are responsible for people's professional status in a way. For making sure you make agreements about consent and confidentiality. So in my research, I chose not to mention the names

  • 26:10

    FRANCISCA GROMME [continued]: of the city and the company. And I did that also because, as I said, a lot of these data science projects they're quite innovative, they're experimental. So people also make themselves vulnerable and do things they normally wouldn't do. So I thought it was important in my position

  • 26:30

    FRANCISCA GROMME [continued]: there not to take advantage of it. Because it's easy to write about data breaches, and it's--not that they said that happened--but you can easily take advantage of people doing something experimental. So I made it very clear and made very sure that everything would be anonymous.

  • 26:51

    FRANCISCA GROMME [continued]: Sometimes people would ask me why I was protecting the name of the company so much. Because they would think that I was doing research on Google, or another large company. But in my case it was actually a small company--

  • 27:13

    FRANCISCA GROMME [continued]: a startup just making a name for itself. People were trying to sort of get enough funding to keep their company going, so this is why I thought anonymity was so important. And another important factor to do with confidentiality in these types of cases as well is

  • 27:34

    FRANCISCA GROMME [continued]: that while you are observing and interviewing you might come across personal data, because that's what often happens in data science. And so you need to sort of prepare for that as well, and I think take it up in confidentiality agreements. Because also the people who you are working with

  • 27:55

    FRANCISCA GROMME [continued]: may not always be aware of that, and they may not be aware of their responsibilities for not revealing any personal data to people from the outside. And often you can make use of agreements that such organizations already have in place. So having a background in qualitative studies,

  • 28:20

    FRANCISCA GROMME [continued]: in political science, and then anthropology. I wasn't of course a data scientist or a data miner coming into this study. So I think it is important that you go into the details of this profession. That doesn't mean that I'm now a data scientist or understand everything about it.

  • 28:40

    FRANCISCA GROMME [continued]: But you need to know the basic terms, and you need to know sort of the contextual literature. So I did a lot of reading up in terms of reading the background literature, sort of catching up with my statistical knowledge again. But also just--I had some people around me, as it happened,

  • 29:01

    FRANCISCA GROMME [continued]: who were trained data scientists, and I could have informal conversations about that. And then I also followed some of the publications that people in the organizations I was studying had, so that, I think, gave me a good basis to follow the case. And it doesn't mean I think if you study such expert

  • 29:22

    FRANCISCA GROMME [continued]: practices you always need to understand everything in depth. I think it's like--ethnography is where you go to a country you don't know. You need to maintain that element of surprise. So it shouldn't seem normal to you. Because otherwise I would have never been struck by the idea of zooming in, otherwise it would have been normal for me.

  • 29:43

    FRANCISCA GROMME [continued]: But still, you need to sort of have a basis to follow what is happening. So in this case study I highlighted one specific data mining practice, which revolves around this metaphor of zooming in, and going into more detail,

  • 30:05

    FRANCISCA GROMME [continued]: and working in small numbers. Now of course, this is specific to one practice. There are many types of data science. There are many reasons why you would do data mining. And in other cases, you would have big numbers. But I think some of the lessons of the case study still hold, because in all of these

  • 30:26

    FRANCISCA GROMME [continued]: they might--in most of these cases they might use other metaphors of practice, such as creating an overview, or connecting the dots. And I think the point is to sort of keep in mind that these are situated practices, and that we might go into discussion about the terms of evidence, and the types of local knowledges,

  • 30:46

    FRANCISCA GROMME [continued]: and the types of labels you use, they could all be open for improvement or discussion. Or at least be more elicited during the practice. Also, how different professions would approach them. So that leads me, I suppose, to my final point--

  • 31:08

    FRANCISCA GROMME [continued]: is that seeing better is not always--you do not always need more data to do it. Sometimes you need to think of other ways of doing analyses to see in a better way. What are other examples of metaphors used by data scientists, and how would you study them?

  • 31:31

    FRANCISCA GROMME [continued]: What do you think of problem youth as a category now? How should it and how should it not be used? [THEME MUSIC]

Abstract

Francisca Grommé, a post-doctoral Sociology researcher at Goldsmiths University of London, presents her ethnographic study of using data mining practices to develop Dutch policy for at-risk youth.


Studying Data Mining Practices: An Ethnography of Dutch Policy & Problem Youth

