  • 00:11

    LOUISE CORTI: My name's Louise Corti. I am one of the associate directors at the UK Data Archive. There, I'm head of collections development and producer relations, which means we go out, and we look for good data. And we work with depositors. We negotiate data coming in here, and we also support them in terms of guidance and training on how to create good, shareable data.

  • 00:32

    LOUISE CORTI [continued]: So in this tutorial, I'm going to be talking about sharing and curating research data, and I'm going to be talking about some of the key points. So, first of all, why is it important to share data, and why is it important to reuse data as well? Secondly, what does a really good shareable data set look like? And, finally, what are some of the practical things we can do to make data shareable?

  • 00:55

    LOUISE CORTI [continued]: So thinking about the kinds of research data that are collected in social and economic science, first of all, governments will often carry out large-scale surveys to survey the population on attitudes and behavior. We also have a census every 10 years that helps us find out how many people live in the country and how many people live in each household.

  • 01:15

    LOUISE CORTI [continued]: And we do collect a lot of very high quality surveys in the UK. Examples of that are the Labour Force Survey, the Health Survey for England, the British Crime Survey. They are all very good surveys that are carried out every year, and, actually, they're made available to researchers who can use them. So for the Family Expenditure Survey, we have data going back to the 1960s.

  • 01:36

    LOUISE CORTI [continued]: So you can see that we can build up a really important profile over time. Also in the UK, we're very lucky that our survey collections, it's quite well funded, and it's done to a very high standard. So we have a whole set of data coming out of kind of policy research. The other area is many academics will do their own research as well, and they'll do a variety of things like qualitative studies.

  • 01:57

    LOUISE CORTI [continued]: They'll do field work and anthropology. Psychologists will tend to do experiments on people. And then we have a whole load of other kinds of methods, really, including smaller surveys, postal surveys. So researchers collect a whole load of data, and all of that really can be shared, but it all requires slightly different ways of dealing with that information.

  • 02:20

    LOUISE CORTI [continued]: All the research data that's collected, you can imagine, in the social-economic sciences, is a lot of data being collected, and in the past, it really wasn't shared. So what tended to happen in academic research is people would collect their information, use it, publish it, but it wasn't really routinely shared. And it's only really been since the 1990s where people have started to share data, and that's really come about through some of the funders of research insisting that data that's collected using public money is shared.

  • 02:47

    LOUISE CORTI [continued]: And there's quite a lot of data policies in place that mandate people to share data at the end of their research. So in some sense, it's researchers have to share data, depending on what domain they work in. The other reason why it's important to share is thinking about public accountability. Governments want transparency. They may commission research, but they want greater transparency in some of the methods.

  • 03:09

    LOUISE CORTI [continued]: So more and more data is being made available, and, actually, people have a right to obtain more information about themselves. There's been a Freedom of Information Act so we can get hold of more data. So all in all, there's a kind of trend towards people wanting more data and also people sharing more data. So in terms of data sharing and data archiving, it's actually quite a good time.

  • 03:31

    LOUISE CORTI [continued]: Another area why it's important to share data is, increasingly, there's been people kind of faking results. So in psychology, we've had some big cases in the Netherlands and Belgium where people have made up data and been discredited for it. So the whole academic community is worried about kind of trustworthiness of data and publications, and increasingly journals are mandating people to submit data along with their publication so that people can check the methods and the data and maybe rerun some of the analysis.

  • 04:02

    LOUISE CORTI [continued]: And finally, I guess, a more positive spin on sharing is that if you share your data, you're more likely to get collaborators. You're more likely to get more citations from that data, and it gets your original research project more visibility. So there are some of the really positive things to do with sharing. So there's kind of carrots, in that you can get more visibility, and maybe some sticks, that you're asked to do that by your funders.

  • 04:26

    LOUISE CORTI [continued]: What kind of counterbalances sharing is the fear of data loss. There's often reported various data losses where government departments leave data sticks on a train. So there is a fear of data loss by accident, but also maybe people revealing data or maybe there's disclosure risk in data that kind of wasn't checked out.

  • 04:46

    LOUISE CORTI [continued]: So there's a whole area to do with making sure that data's very well managed so that rather than just sharing anything-- and as you know, many people are sharing more information than they should, probably, on Facebook and Twitter. There's a lot of the culture of data sharing, particularly in younger generations, versus making sure we are keeping things safe. So it's important that places like data archives can teach people how to manage data well, to think about storage, and to think about data security and looking after data, particularly meeting the Data Protection Act.

  • 05:21

    LOUISE CORTI [continued]: As a researcher, we're collecting lots of data ourselves, and we know what our own research data look like. But when we come to present it to make it available for users who haven't collected our data but are going to be viewing something they're not familiar with, what does that look like? So we can really go back to the 1970s, when Hyman produced a book on the social survey and secondary analysis, and that was one of the kind of seminal books on how do you go about doing secondary analysis of data you didn't collect?

  • 05:48

    LOUISE CORTI [continued]: And what do you expect as a user? And then following on from that, Angela Dale's book in the '80s really documented in detail about how to do secondary analysis of survey data. And in that, it kind of set out some of the requirements for creating a really high quality data set. So for a survey, as well as the survey data file, you'd expect all the variables to be very clearly labeled.

  • 06:11

    LOUISE CORTI [continued]: So, for example, male and female, you'd have the proper labels in there so that people know what the data are. And as well as the data being very well labeled and understandable, you'd create some really detailed documentation about what the questionnaire looked like, what the instructions to interviewers were. If they were given show cards to pick from, the show cards should be available.

  • 06:34

    LOUISE CORTI [continued]: The sampling and the weighting and the kind of more technical details, all of that should be made available and packaged up as part of the data file so we can really begin to understand what the data mean. So shareable data looks like a really rich description of data, together with some documentation from the research process, together with the data file.

  • 06:56

    LOUISE CORTI [continued]: So thinking about some of the practical ways of making data shareable, if we think about the data collection process and actually the research process as a life cycle, we could think about we plan our research. We plan our data collection. We go and collect our data. We then collate our data and process it and get it ready for analysis. We then analyze our data. We then, if we like, once we've published our data, we can share it.

  • 07:19

    LOUISE CORTI [continued]: So there's a kind of life cycle because once you've shared it, someone else can reuse it. So within that life cycle or that production process, there are a number of different points we call intervention points where decisions you make at those points can have a dramatic effect on whether data can be shared or not. And there are five points that I'd really like to cover. So the first one is really when you're setting up a research project, and you're thinking about planning it, you're going to be limited by how much money you've got to spend on data collection, which will probably limit the kind of sample size you have.

  • 07:50

    LOUISE CORTI [continued]: But also how much you're going to spend on how many research officers you have, how much time you're going to spend in the field, all those things actually cost money, so really thinking about practically what you're going to do and how realistic that's going to be. In part of that planning, we need to think about if you're going to do interviews with someone, are you going to transcribe them?

  • 08:11

    LOUISE CORTI [continued]: Or are you just going to listen to the interview when you come to analyze them? If you don't transcribe them, it makes it much harder to share. So thinking about making that data shareable is going to be an added cost to get them transcribed or transcribe them yourself. So there's a number of different things to think about that you might do to data to make them more shareable, and that's always some extra costs on top of what you were going to do in your research itself.

  • 08:37

    LOUISE CORTI [continued]: So a second major point is at the point at which you're going to do the interviews, you need to think about your consent, the form of consent that you're going to use. Most people use a consent form to gain interviews or to conduct a survey. And if you think about when you're getting consent, the things that you say to people or things you promise to people really need to be upheld throughout the life cycle of the data.

  • 09:01

    LOUISE CORTI [continued]: So many people in the past have said to their participants, it's only me who's going to be seeing your information on what you tell me, which immediately presents a huge problem for sharing data later on. And if you've promised that to someone, you can't even share the data with your colleagues. So thinking about really the wording and types of wording you can put in a consent form that doesn't preclude sharing with other people, and at the data archive, we present quite a lot of actual sets of wording you could use to make sure data can be shared.

  • 09:31

    LOUISE CORTI [continued]: So that's really important. Another important intervention point in the data collection process is the point at which you're actually entering data. So if you think about you're making an observation, and you're writing down some of the codes in an Excel sheet, you need to make sure that the data you're capturing are valid. So if you're coding something from one to five, you need to make sure you're writing in one to five and not six.
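
The one-to-five check described here can be automated at the point of data entry. The following is a minimal sketch, not a method from the talk: the function name `find_invalid_codes` and the sample responses are hypothetical, and the 1 to 5 scale is taken from the example above.

```python
# Hypothetical helper: flag coded responses that fall outside the allowed scale.
VALID_CODES = range(1, 6)  # assumed scale: 1 to 5 inclusive


def find_invalid_codes(values, valid=VALID_CODES):
    """Return (row_index, value) pairs for entries outside the allowed codes."""
    return [(i, v) for i, v in enumerate(values) if v not in valid]


# Made-up responses for illustration; 6 and 0 are not valid codes.
responses = [1, 5, 3, 6, 2, 0]
print(find_invalid_codes(responses))
```

Reporting the row position along with the bad value makes it easier to go back to the original sheet and correct the entry rather than silently dropping it.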

  • 09:54

    LOUISE CORTI [continued]: So there's kind of data consistency to make sure you're capturing things that are meaningful. Another example of kind of consistency there is if you're doing transcriptions of interviews, to make sure you have a standardized template so that all your transcriptions are covering the same thing. You have the same information. And the way that you're maybe writing your speakers-- you know, I'm speaking, you're speaking-- they're quite consistent.
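
One way to keep transcriptions consistent is to generate every transcript from the same small template. This is only an illustrative sketch: the header fields (`Interview ID`, `Date`) and the upper-case speaker tags are assumptions, not a template recommended in the talk.

```python
def format_transcript(interview_id, date, turns):
    """Render an interview with a fixed header and uniform speaker tags.

    `turns` is a list of (speaker, text) pairs. The layout is a
    hypothetical template chosen for illustration.
    """
    lines = [f"Interview ID: {interview_id}", f"Date: {date}", ""]
    for speaker, text in turns:
        # Normalize the speaker tag so every transcript writes it the same way.
        lines.append(f"{speaker.strip().upper()}: {text.strip()}")
    return "\n".join(lines)


print(format_transcript(
    "INT001",
    "2015-03-02",
    [("interviewer ", "How long have you lived here?"),
     ("Respondent", " About ten years.")],
))
```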

  • 10:17

    LOUISE CORTI [continued]: So we're not saying that everything has to fit into one template, but it should be kind of more consistent across the kind of data that you're collecting. So it makes it easier to collate and analyze at the end. Quite often in the research process, we are collecting personal data, and there is a Data Protection Act I think many students are not particularly aware of. They kind of think that's, you know, formal data.

  • 10:39

    LOUISE CORTI [continued]: But if you're collecting information about people, about where they live, about characteristics about them and their household, that is personal data. So we need to think very carefully in research about kind of separating out the data, the information you're asking, versus their personal details. And there's a whole load of regulations with data protection about how you look after data, how you should only keep it for a certain amount of time.
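
Separating personal details from the answers can be sketched as splitting each record into two parts linked by a random identifier. The field names in `IDENTIFYING_FIELDS` and the function `split_record` are hypothetical, and this shows only the separation step; it is not a statement of what data protection law requires.

```python
import uuid

# Assumed names of the fields that identify a person directly.
IDENTIFYING_FIELDS = {"name", "address", "phone"}


def split_record(record, identifying=IDENTIFYING_FIELDS):
    """Split one record into (personal details, research answers).

    Both parts carry the same random participant_id so they can be
    re-linked by whoever holds the personal-details file.
    """
    participant_id = uuid.uuid4().hex
    personal = {k: v for k, v in record.items() if k in identifying}
    answers = {k: v for k, v in record.items() if k not in identifying}
    personal["participant_id"] = participant_id
    answers["participant_id"] = participant_id
    return personal, answers


personal, answers = split_record(
    {"name": "A. Example", "address": "1 Example Road", "q1": 3, "q2": 5}
)
print(sorted(personal))
print(sorted(answers))
```

The personal-details part would then be stored separately and more securely than the answers. Indirect identifiers left in the answers (for example, a rare occupation) may still need attention before sharing.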

  • 11:01

    LOUISE CORTI [continued]: So it's important that you do know what data protection means and that you're looking after personal data very carefully. It's also important that you're storing data safely, so you're not just having a load of transcripts sitting around on your desk that everyone can read if they do contain personal information. And you're not maybe transmitting data by email to people if it's got confidential information in it.

  • 11:24

    LOUISE CORTI [continued]: So thinking very carefully about transmitting data is important, also making sure that you have things backed up and in a safe place so that if your computer crashes, you've got another copy somewhere. So all those things about kind of security and storage are very, very important. You need to have the research process and the data explained in such a way that someone coming to the data can understand it.

  • 11:50

    LOUISE CORTI [continued]: This means describing the methods of data collection, how you've checked and processed the data. What it looks like at the end needs to be described very well. There are a number of different ways of presenting information. For a survey, you need to make sure all the variable labels are complete and that there's a whole, full description of what all the codes mean.
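
Checking that every code in a survey file has a label can be sketched as a comparison between the data and a codebook. The structures below (a dict of variables to values, and a dict of variables to code labels) and the function name `unlabelled_codes` are assumptions for illustration; real survey packages store labels in their own formats.

```python
def unlabelled_codes(data, codebook):
    """Return {variable: sorted codes} for values that have no label."""
    missing = {}
    for variable, values in data.items():
        labels = codebook.get(variable, {})
        gaps = sorted(set(values) - set(labels))
        if gaps:
            missing[variable] = gaps
    return missing


# Made-up codebook and data: code 9 and the whole "region" variable are unlabelled.
codebook = {"sex": {1: "Male", 2: "Female"}}
data = {"sex": [1, 2, 2, 9], "region": [3, 3]}
print(unlabelled_codes(data, codebook))
```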

  • 12:11

    LOUISE CORTI [continued]: You need to have the documentation associated with the survey made available. For example, the questionnaire, the show cards, the interviewer instructions, the guide on sampling, all those things need to be made available and need to be very well described. For a qualitative study, if you've done a number of interviews, we like a data list. That's just describing each interview, who the interview is with, maybe how long the interview was, maybe where it was carried out.
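
A data list of this kind can be kept as a simple CSV file with one row per interview. The sketch below uses Python's standard `csv` module; the column names in `FIELDS` and the sample row are hypothetical and follow the items mentioned above (who, how long, where).

```python
import csv
import io

# Assumed columns for the data list.
FIELDS = ["interview_id", "interviewee", "duration_minutes", "location"]


def write_data_list(interviews, out):
    """Write one row per interview to an open text file or buffer."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(interviews)


# Written to an in-memory buffer here; a real project would open a file instead.
buffer = io.StringIO()
write_data_list(
    [{"interview_id": "INT001",
      "interviewee": "Teacher, female, 40s",
      "duration_minutes": 55,
      "location": "Workplace"}],
    buffer,
)
print(buffer.getvalue())
```

The interviewee column here holds a short description rather than a name, on the assumption that the list will be shared along with the data.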

  • 12:35

    LOUISE CORTI [continued]: So you've got this listing of data so people can see really what is contained in the data collection. Also for a qualitative study, you'd probably need to give a copy of the topic guide, the kind of questions you were covering, and a little bit maybe about the interview settings, whether you interview people in their homes or in their workplaces. So some of that context needs to be described on paper.

  • 12:58

    LOUISE CORTI [continued]: And for qualitative data, you clearly can't kind of write down every single piece of context, but you can try and describe the interview setting and why you did the project. And users at the other end will find these pieces of information really important and actually really interesting as well, to get a better sense of what the data mean and what the researcher was trying to do.

  • 13:22

    LOUISE CORTI [continued]: Data management planning is a really useful device for capturing all this information. Many funders of research actually require you to submit a data management plan. That's really setting out in advance the kind of data collection tasks you're going to be doing, who holds the roles and responsibilities, who's going to do what. Who is taking responsibility for storing your data?

  • 13:44

    LOUISE CORTI [continued]: Is it your department, or is it you? So it really clarifies who's doing what in the research process so everyone is clear. We've seen why it's important to share data and also why it's important to reuse data. We've covered some of the issues about making data shareable and the very practical things you can do to make your data shareable.

  • 14:06

    LOUISE CORTI [continued]: Sharing data is a good thing to do. It does offer further visibility for your research beyond the published findings. And increasingly, people are wanting to do it, so it's a really good skill to know how to make data shareable. Planning for sharing is one of the most important things you can do to make your own data shareable. And that's, rather than waiting to the end to share data, it's thinking right at the beginning about the kind of things you need to do along the way to make data shareable, so writing down a data management plan, making sure you're clear and the people you're expecting to do things, maybe like look after data for you, making sure they're clear as well.

  • 14:43

    LOUISE CORTI [continued]: Setting out roles and responsibilities in a plan is a really, really important thing to do.
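
The roles and responsibilities in a data management plan can be written down in a structured form and checked for gaps. This is a minimal sketch under stated assumptions: the task list in `REQUIRED_TASKS`, the function `unassigned_tasks`, and the example plan are hypothetical, and funders' actual templates will differ.

```python
# Assumed list of tasks that a plan should assign to someone.
REQUIRED_TASKS = ["data collection", "storage", "backup", "documentation"]


def unassigned_tasks(plan, required=REQUIRED_TASKS):
    """Return the required tasks that have no named person or unit."""
    return [task for task in required if not plan.get(task)]


# Example plan: backup has an empty owner and documentation is missing entirely.
plan = {
    "data collection": "Research officer",
    "storage": "Department IT",
    "backup": "",
}
print(unassigned_tasks(plan))
```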


Louise Corti discusses data sharing and how to make sure your data is shareable. Funders of research increasingly require that data collected with public money be shared, and journals increasingly ask for data alongside publications after cases of faked results. By sharing data there is more transparency and more knowledge sharing.


An Introduction to Data Collection: Sharing And Reusing Data
