  • 00:00

    [MUSIC PLAYING] [Studying Elite Knowledge Production Using Wikipedia Data]

  • 00:10

    SORIN ADAM MATEI: My name is Sorin Adam Matei. I'm a professor at Brian Lamb School of Communication at Purdue University in Indiana. [Sorin Adam Matei, Professor, Brian Lamb School of Communication, Purdue University] I study online groups and online organizations, especially those groups and organizations that come together through social media platforms, wikis, social networking to produce

  • 00:33

    SORIN ADAM MATEI [continued]: knowledge and new insights. Now with the advent of social media, for the first time in the history of humankind, we can have complete records or a complete record of what people did working together on a certain project. I'm particularly interested in Wikipedia, which is a little bit like the La Brea Tar Pits. Basically everything you ever wrote on Wikipedia

  • 00:56

    SORIN ADAM MATEI [continued]: is still there with very small exceptions. And will be there until the site will be brought down by some natural cataclysm. With that you can study in great detail who did what when, and to what effect. My particular interest is in understanding the role of functional elites.

  • 01:16

    SORIN ADAM MATEI [continued]: These are the most productive individuals. Where'd they come from? How long individuals survive in these production elites, how the production elites evolve over time. Is there a natural progression in the emergence of these elites? And so on and so forth. We took the Wikipedia edit database between 2001 and 2010,

  • 01:42

    SORIN ADAM MATEI [continued]: 250 million edits. Each edit is attributed to a person, either identified by a username or by an IP address. We put the IP addresses aside, which are a minority in terms of quantity of production. We look at logged in users. And on a weekly basis we identified

  • 02:03

    SORIN ADAM MATEI [continued]: who were the most productive individuals. We compare each week to the previous week to see what percent of the most productive individuals survive. Then we call that elite stickiness. At the same time, for each week we measured the amount of contribution unevenness or entropy. Which is just a measure of whether the work is even or uneven.
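
    [Editorial illustration, not part of the talk] The two weekly measures described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the study's actual code: the function names, the `top_fraction` cutoff used to define the "elite", and the choice to normalize Shannon entropy by the log of the number of contributors are all hypothetical, and the edit data is a toy example.

```python
import math
from collections import Counter


def top_contributors(edit_counts, top_fraction=0.01):
    """Users in the top `top_fraction` by edit count (always at least one user).

    The cutoff is an assumption; the talk does not say how the elite was defined.
    """
    ranked = sorted(edit_counts, key=lambda user: (-edit_counts[user], user))
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return set(ranked[:k])


def stickiness(previous_elite, current_elite):
    """Share of last week's elite that is still in this week's elite."""
    if not previous_elite:
        return 0.0
    return len(previous_elite & current_elite) / len(previous_elite)


def normalized_entropy(edit_counts):
    """Shannon entropy of contribution shares, scaled to the range 0..1.

    Values near 1 mean the work is spread evenly; values near 0 mean
    a few users do almost all of it. Returns 0.0 for the degenerate case
    of fewer than two contributors (a convention chosen for this sketch).
    """
    total = sum(edit_counts.values())
    n = len(edit_counts)
    if n < 2 or total == 0:
        return 0.0
    shares = [count / total for count in edit_counts.values() if count > 0]
    entropy = sum(-p * math.log(p) for p in shares)
    return entropy / math.log(n)


# Toy data: one list of usernames per week, one entry per edit by a logged-in user.
weeks = [
    ["ann", "ann", "ann", "bob", "bob", "cat"],
    ["ann", "ann", "ann", "dan", "dan", "bob"],
]

previous_elite = None
for week_number, edits in enumerate(weeks, start=1):
    counts = Counter(edits)
    elite = top_contributors(counts, top_fraction=0.5)
    evenness = normalized_entropy(counts)
    if previous_elite is not None:
        print(week_number, stickiness(previous_elite, elite), round(evenness, 3))
    previous_elite = elite
```

    With real data, the weekly stickiness and entropy series produced this way could then be compared over time to ask which one tends to move first.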

  • 02:28

    SORIN ADAM MATEI [continued]: And then we tried to understand if elite stickiness leads to unevenness or the opposite. If the former, you have something a little bit like a power grab. If it's the latter, then you have a much more organic natural process. And what we found is that indeed, you

  • 02:48

    SORIN ADAM MATEI [continued]: have a natural process, an organic process. Not only that, but we succeeded in segmenting these 10 years' worth of data into discrete phases. And we uncovered three or four phases that speak about the fact that Wikipedia is just like any other organization. It goes through a phase of inception, coalescence,

  • 03:12

    SORIN ADAM MATEI [continued]: maturation, and then stabilization. One of the most fascinating findings was that after the third or fourth year Wikipedia stabilized, both in terms of the unevenness of collaboration and stickiness of the elites. Meaning that it entered a very steady state, which is still there today. It's basically cruising. Many people identify or think that data science

  • 03:34

    SORIN ADAM MATEI [continued]: or computational social science is about complicated computer algorithms. We did use some of that. But not something that anybody with some understanding of computer science might not have. When needed we used Excel, or Google Spreadsheets, or whatever made sense.

  • 03:55

    SORIN ADAM MATEI [continued]: A very important part of doing good research is not only collecting data and storing it, but synthesizing it. And making it understandable to just about everybody. And for that, you need to make good charts, too. We worked a lot on our very nice charts. Working with very big data is like doing genetic research.

  • 04:15

    SORIN ADAM MATEI [continued]: I don't know how many people are aware of the fact that when we manipulate genetic material, it's not as if you have a pair of tweezers and just pluck a strand of DNA from another and all that looking at the microscope. It's all chemistry. It's all indirect work. You mix this ingredient with that and hope against hope at the end you'll get something. It's the same with big data.

  • 04:35

    SORIN ADAM MATEI [continued]: It's not as if you have a big spreadsheet at which you look all the time to see if the data is still there, if it works and all that. Just pieces of code and scripts that you run. And then at the end you hope that you got it right. You have to do a lot of checking, double checking, triple checking. So the amount of time to make sure that you got it right is something that I did not expect in the beginning

  • 04:57

    SORIN ADAM MATEI [continued]: and I had to learn to live with. Speaking about research in general with big datasets, with social media, with vast amounts of data, I want to forewarn the future practitioners that it's not just about having the right methods, or the right tools, or the right computers. It's also having the right theories. This data is a gift given to us.

  • 05:19

    SORIN ADAM MATEI [continued]: But it's a gift that we need to take good care of in the sense of we need to facet it, we need to interpret it, and we make sense out of it in view of some theory that is checked at the end rather than to take this data to be a fishing expedition that will tell us something at the end just because. That's just not what happens in reality.

  • 05:43

    SORIN ADAM MATEI [continued]: A good computational social scientist is by the end of the day a social scientist. And I don't know how many people remember that actually social science is a branch of philosophy. It deals with human motivations, and human goals, and human acts. And it's always grounded in understanding humans rather

  • 06:05

    SORIN ADAM MATEI [continued]: than understanding tools or methods. The world is changing. Wikipedia is here today, might not be here tomorrow. However new tools and new methods of investigating the world will come about. And whatever we do in the future, it's very, very important to think again, about people.

  • 06:27

    SORIN ADAM MATEI [continued]: And to focus on human motivations, human intentions, human acts. And to have a reasoned and grounded way to think about these motivations that does not transform these individuals into simple cogs in the machine. Humans are humans by the end of the day. We need to understand them as such. They need to train themselves very well in social theory,

  • 06:48

    SORIN ADAM MATEI [continued]: and social sciences, and sociology, and psychology depending on what they want to do. That is the foundation on which their career will build. At the same time, they really need to build up very solid computer science skills. Practical skills, not necessarily very theoretical

  • 07:09

    SORIN ADAM MATEI [continued]: skills. Because you might end up investing a lot in things that you are not going to use too much. And again the idea is to explain people. Not to build better algorithms just for the sake of doing so. Keep an open mind. The world is an open book, right? The world of social media is an open book. It's without answers yet.

  • 07:30

    SORIN ADAM MATEI [continued]: There's a lot to discover. Let's not try to prove that which has been proven already over and over again. Also be prepared for the unexpected. Again, 10 years ago AOL was all the rage. Where is AOL today? Just yesterday Time Warner, which bought AOL, was bought by another company.

  • 07:51

    SORIN ADAM MATEI [continued]: It's just gone. So things will happen in your lifetime two, three times. Be prepared for those changes. [MUSIC PLAYING]


Sorin Adam Matei, Professor at the Brian Lamb School of Communication, Purdue University, discusses his research on contributions to Wikipedia, specifically elite contributors and their productivity. Also discussed is how to work with and present big data, as well as practical advice for social scientists interested in working with big data.

