Data Collection

Encyclopedia
Published: 2007

The word data is the Latin plural of the word datum, which itself is the past participle of the verb dare (DAH-reh), meaning “to give.” So, it literally means “things given.” Data is often used as a singular noun in English (what we call an “uncountable,” or a “mass term,” like “water,” “energy,” “information,” and so on), although the Oxford Advanced Learner's Dictionary states that there is uncertainty with “data” as to whether it is singular or plural, and both are acceptable. Careful writers, however, use it only as a plural. Although data are useful to generate information, knowledge, and wisdom, they in themselves are not treated as information or knowledge. What, then, are data?

Although the dictionary meaning of data is facts or information, data need not be facts or information. Data are subjective and objective human experiences, feelings, attitudes, beliefs, values, perceptions, views, opinions, judgments, and so on. They are also objective facts in the universe, interactions between human beings and objective facts, and human subjective construction of objects and facts, irrespective of their object and factual reality. Thus, some data are readily available as “things given,” whereas some data need to be diligently discovered and collected with ethical considerations, depending upon the research problem, need, and the researcher.

Many of us naturally collect data, make sense of them, and use them for better living. In the research world, purposeful and systematic data collection is an essential activity. It is one of the significant phases within the research design: it is preceded by the research problem formulation (objectives, hypotheses, research questions, concepts, and variables) and the selection of research and sampling methods, and it is followed by the data analysis and interpretation and report-writing phases. Data collection occupies a crucial phase in the research design because no research can be conducted without data. To a significant extent, the quality and impact of research depend upon high-quality, accurate, and uncontaminated data. In view of the significance and relevance of the data collection process for researchers, it may be delineated and discussed by addressing the following questions: Why do researchers collect data? What are the types of data? What are the data collection methods? When should data be collected? What are the ethical issues in collecting data, and how should researchers deal with them? What factors are likely to affect the quality of data? How can researchers minimize the factors that are likely to negatively affect the quality of data?

The Necessity of Data

Because social needs, problems, and causes keep changing, new data need to be collected to understand and address these emerging changes. Toward this, some researchers collect data to explore and gain an in-depth understanding of the phenomenon, whereas others do so to answer their bold research questions, to test or formulate new hypotheses, or to validate or falsify existing theories by refining causal relationships or discovering new ones. At the extreme, new data are also useful to destroy existing paradigms and erect new ones. Data are also needed to formulate, implement, and evaluate appropriate policies, programs, and products of government and nongovernmental organizations and corporations. They also can be used effectively to inform or educate people and organizations about new trends that are relevant to them. From the postmodern perspective, data also play an important role in demonstrating multiple realities.

Types of Data

The universe is filled with a huge amount of data, so they need to be categorized broadly for the systematic conduct of research and synthesis of research outcomes. All of the available data may be classified broadly into primary and secondary data, and each of these in turn may be categorized into quantitative and qualitative data. Primary data are collected directly from the field by observing, interviewing, or administering a questionnaire. Secondary data are collected from already available sources (see examples in Table 1). Data that cannot be measured by assigning a value or by ordering them in ascending or descending order are generally considered qualitative data, and data that can be subjected to some kind of quantification or measurement are generally considered quantitative data. Furthermore, data may also be categorized as tangible and intangible (e.g., smell, air, unexpressed feelings or emotions). However, it may be noted that these categorizations or dichotomies are researchers' creations. In reality, these types exist together, and thus in research, all types of data need to be collected and diligently integrated if they enhance the understanding of reality.

Table 1 Types of Data

Data        Qualitative                      Quantitative
Primary     Field observations, narratives   Age, income, educational level
Secondary   Letters, diaries                 Census, annual reports

Data Collection Methods

Researchers often employ one method or several data collection methods to collect data, such as observation, case study, questionnaire, interview, focus groups, rapid rural appraisal, and secondary data. Some of these methods overlap with others, and some are more popular than others. Many of these data collection methods have variations within them; for example, the observation method has been further delineated into structured, unstructured, participant, and nonparticipant observation, and the case study method into intrinsic, instrumental, and collective case studies. There are different types of questionnaires and ways of administering them (one to one, in groups, or through mail, including e-mail). Interviews have been classified into structured, semistructured, and unstructured, and they may be conducted face-to-face, by telephone, or by any other electronic mode. The focus group also has several types, including group interviews, group discussion, nominal group, and so on. These methods matter because it is through them that data are collected.

Generally, research methodology books discuss these methods in detail. Nonetheless, it is crucial to note a few points about them. First, researchers need to carefully select a method or a combination of data collection methods in such a way that they capture reality appropriately and accurately in order to answer the research questions and achieve the research objectives. Inappropriate or incorrect selection of data collection methods results in incorrect and misleading outcomes that distort reality. Second, after having selected the most appropriate data collection method(s), researchers need to develop adequate knowledge and skills through training, practice, or some other relevant means to use the data collection methods effectively. These may include sharpening observation skills, memorizing, taking notes, constructing a questionnaire or an interview schedule, asking questions, listening, moderating, dealing with diversions and interruptions, and respecting respondents' privacy and self-determination. Third, it is important to be aware of the strengths and limitations of various data collection methods and where and when they can be best used. Fourth, researchers should be aware of, and use effectively and in moderation, the data collection means with which we are all gifted: our five senses, working through the eyes (seeing/observing), ears (hearing/listening), nose (smell), tongue (taste), and skin (touch). Just as some data collection methods are used more often than others (e.g., questionnaire or interview schedule), we may have become accustomed to using some senses more intensively than others (e.g., too much speaking, not enough listening; observing but not noting; or too much listening/being carried away with the field or respondent without observing and speaking). These senses need to be employed effectively to collect data rather than relying only on the data collection instruments. Finally, while collecting data through the chosen method(s), researchers should ponder the following questions to keep the data collection process on track.

  • What am I trying to discover?
  • Why have I chosen the methods (research, sampling, data collecting) I have chosen?
  • Do these methods help or hinder my efforts toward understanding reality?
  • Are there any alternative methods to understand the phenomenon I am trying to understand?
  • Do these categories of methods make any sense in understanding the reality?

Resources for and Timeliness in Data Collection

Data collection is a resource-intensive activity in terms of time, money, and other resources, more so in the case of primary data collection. Researchers need to liberally estimate time, budget, personnel, and other resources, and make arrangements for the same in advance to ensure a smooth data collection process. Timeliness is especially important in data collection. Researchers need to approach respondents, whether individuals, families, groups, communities, or organizations, at a time that is convenient to them and when they are available and willing to provide data. Another important aspect of timeliness is that researchers need to be in the field when the events occur so as to collect data in the natural setting, if the research issue/design requires such an approach. For example, field data on mass protests, mob behavior, village fairs, or indigenous methods of harvesting cannot be collected whenever researchers desire to collect them. Researchers have to be timely in collecting these types of data, just as a natural scientist can collect data on an eclipse only when it occurs.

Ethical Considerations

Researchers need to collect data according to the set ethical standards, which are often based on certain values and principles: honesty, truthfulness, privacy and confidentiality, self-determination and voluntary involvement, zero physical and psychological harm, dignity and worth of human beings, accountability, right to know on the part of respondents, fairness and impartiality on the part of researchers, and informed consent. On the other hand, researchers should avoid breach of confidence and agreements, absence of informed consent or self-determination/autonomy of respondents, deception, risk of harm or offense, acts involving conflict of interest, and any unethical act.

Many government and nongovernment organizations, universities, and research firms have well-developed research ethics committees and ethics clearance application forms. Before beginning the data collection process, researchers should adhere to these ethical requirements and collect data accordingly. Those researchers who do not belong to any organizations or whose organizations have not developed such ethical standards and requirements should also collect data by setting their own ethical standards based on the above stated values and principles. They should explain the nature and purpose of research, provide satisfactory answers to all questions, ensure that respondents are involved voluntarily and that no force is used, and allow the respondent to withdraw from the research at any time if he or she wishes to do so.

Impediments in Data Collection

Data collection is a planned, purposeful, and systematic activity. Despite choosing appropriate data collection methods; meticulously developing data collection instruments; planning adequate resources, including time; and meeting ethical standards, researchers may encounter several impediments in the data collection process. One probable reason for these impediments is that the nature of the setting, the research problem, the researcher, the researched, the time of research, and the prevailing social conditions vary every time. Thus, the data collection impediments may be analyzed by looking at three “R” factors: the researcher; the research problem; and the researched, or a combination of these factors.

Because the researcher is the main actor in the data collection process, he or she can contribute significantly to reducing or increasing field difficulties. Data collection experiences suggest that there are three main issues related to the researcher. First, researchers' state of mind affects the data collection process because they may sometimes feel nervous, anxious, incapacitated, irritated, uncomfortable, overwhelmed, frightened, frustrated, tired, and at times less confident. Several factors within and outside the researcher may contribute to such a state that might affect researchers' observation, interviewing, responding, and note-taking abilities. Second, researchers' negative attitudes, prejudices, and preconceived notions toward the research problem, the field, respondents, and communities may interfere with the data collection process and reduce the quality of data. Finally, researchers' action (i.e., how they actually behave in the field and with respondents) is also important and may obstruct the data collection process if not appropriate.

The second factor is the research problem. Some data collection difficulties are related to the nature of the research problem and the decision researchers make to enter particular settings. If research problems deal with sensitive issues such as drug addiction, bankruptcy, the accused awaiting trial in the criminal justice system, ethnicity, development of toddlers and children, and so on, researchers often experience several challenges while collecting data. Data collection experiences have demonstrated that some respondents or communities may feel threatened and insecure because of the sensitivity of the issue. In some cases, data are simply not available, accessible, or disclosable. For example, while tracing genealogies of families, information on women may not be available in some cultures. In some regions and towns, it may not be possible to locate the universe of the community. Census reports may not have a particular type of information. A complete, up-to-date, and accessible list of agencies, organizations, and companies may not be available. At times, researchers may not have access to needed data or organizations. Information may not be well recorded and kept. These are real problems in the field that are beyond the control of researchers, and they can affect the quality of the data collection process. The difficult nature of the setting and lack of information about the setting (e.g., widely dispersed respondents or communities in rural, remote, and hilly areas; unclear addresses and road maps, etc.) may also lead to exhaustion and thereby weaken the data collection process, including its pace.

The third source of data collection impediments is the researched (i.e., respondents and communities). Data collection experiences suggest that researchers have faced the most common problem of making an entry (into the community) and gaining acceptance. Every means or way of approaching the respondent and the community (e.g., through written letters; health officials; government officials; local leaders, political or otherwise; friends/relatives; or independently without anybody's introduction) has pros and cons and may affect the accuracy of data being collected. Equally important is gaining acceptance. If the respondent's suspicions and doubts are not cleared, and acceptance is not gained, the data collection process will be hampered significantly, and that, in turn, may lead to inconsistent and incomplete data.

Experiences of interviewing respondents have revealed that an unsuitable location for the interview, lack of functional trust, refusal to give an interview, difficulty in convincing the respondents, interference by friends or members of the family, respondents' keenness to complete the interview quickly, more talkative respondents, and not knowing the local language can pose several impediments to the data collection process. In terms of the questionnaire, faulty design of the questionnaire, low return rates, difficulties in collecting a group of respondents at one place, lack of organizations' support to employees in completing the questionnaire, and approaching busy professionals at their workplace have hampered the data collection process. Ethical issues in observation studies, planned or arranged observations, and lack of prompt recording of observations appear to affect the quality of collected data. Delays in obtaining permissions to collect data from organizations, particularly from the government, and lack of cooperation of staff members to give access to the available data also create problems in data collection. Other factors such as adverse weather conditions, high sample mortality rates, lack of adequate resources, isolation, and health issues of the researcher also may get in the way of data collection.

Strategies to Ensure High-Quality Data Collection

Although the above presented impediments can affect the data collection process and reduce the quality of data, researchers can consciously employ some systematic strategies to ensure the collection of accurate data. In regard to the impediments stemming from the researcher, first, researchers need to be aware of their state of mind and reflect on it by raising the following questions: Why do I feel this way? What am I doing here? What are my attitudes toward respondents and communities? How am I behaving with people in the field? To what extent does my state of mind affect my data collection process? Is it blocking my efforts to understand field realities? How can I overcome these contextual feelings (state of mind) and change my attitude and behavior, if necessary? Second, these reflections should result in enhancing the competence of researchers by acquiring needed knowledge, by developing practice skills and appropriate attitudes, and by taking right actions. It is important for researchers to feel comfortable and confident in the field, and enhanced competence will help achieve it. Finally, researchers' experiences suggest that additional reading, better information about the issue, adequate practice, acquaintance with the field, use of professional skills, anticipation of problems, and preparation of possible remedies will help. Regardless of respondents' background, status, communities and conditions, and cooperation or noncooperation, researchers should respect them. They also should be free from their own prejudices and preconceived notions about the field so as to develop conducive attitudes and behave appropriately in the field. In addition, researchers need to be assertive and flexible.

Several creative strategies need to be explored to prevent and to deal with data collection difficulties emanating from the research problem and setting. When the research issue is sensitive and respondents feel insecure and threatened, it is less likely that a good data collection process will begin. Strategies toward this issue will be discussed shortly. If the research problem and setting-related data collection difficulties are beyond the control of researchers, first, they should not get perturbed; second, they should study the problem; and third, they should look at possible alternatives. Once they analyze the possible alternatives, the most appropriate alternative can be chosen and changes can be introduced in the data collection strategies. A thorough pilot study should signal such potential problems. Researchers need to anticipate and plan well, including logistics, to cope with some of the realistic difficulties in the field. Careful use of local guides/volunteers and resources may reduce some of the problems. When research is undertaken in rural and remote communities and tribal areas, researchers must learn to live happily with limited facilities and without the luxuries of urban life. The pace of research work needs to be organized in such a way that it guards against physical exhaustion. If it is not possible to collect data on some issues and from some settings, it may be necessary to alter the whole research design.

With regard to respondent-based data collection difficulties, a few strategies may be recommended. Because making an appropriate entry is a critical issue and there is no foolproof strategy to address it, researchers need to be conscious of how they are going to make an entry and how they will access respondents, and they need to make an assessment about likely implications on the quality of data. An analysis of the consequences of each entry option on data to be collected may be undertaken, and an entry approach that has minimum consequences on the data may be followed. It is also important to develop systematic plans to overcome those consequences. Another approach is that when the researcher feels confident that initial data were inconsistent and unreliable, such data may be excluded once the reliable data pattern is established. To gain acceptance and to deal with sensitive issues, researchers need to build functional trust and rapport, and establish credibility. Toward this, researchers need to provide simple, straightforward, and honest information to respondents, communities, and organizations, and answer all questions so as to overcome their suspicions and doubts. Efforts to overcome this problem might include ensuring direct contact with the respondent, rather than using a second person or intermediary to approach the respondent; maintaining strict confidentiality; suppressing actual names; exploring the respondent's version of the events, opinions, and so on; and avoiding anything (e.g., tape recorder) that the respondent particularly finds threatening. Researchers should avoid defensive arguments with the respondents. They also must follow ethical guidelines that are appropriate to respondents' cultural practices. Most important, researchers should demonstrate warmth, empathy, friendliness, and pleasantness; show interest in what respondents say; and allow additional questions and discussion that may not be related to instruments and the research problem. These strategies are likely to facilitate a better data collection process to obtain rich, reliable, and valid data.
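One of the strategies named above, suppressing actual names, can be illustrated with a small sketch. The entry does not prescribe any technique; the keyed-hash approach below is an assumption, and the function names, the field name "name", and the "R-" prefix are all hypothetical. It replaces each name with a stable code so that records from the same respondent can still be linked. Note that this is pseudonymization rather than full anonymization: other fields may still identify a respondent, and the key must be stored separately and kept confidential.

```python
import hashlib
import hmac


def pseudonym(name: str, secret_key: bytes, length: int = 8) -> str:
    """Return a stable code for a name (hypothetical helper).

    The name is normalized (whitespace collapsed, case folded) so that
    spelling variants such as " asha  RAO " and "Asha Rao" map to the
    same code. A keyed hash (HMAC-SHA256) is used so the code cannot be
    recomputed from the name without the secret key.
    """
    normalized = " ".join(name.split()).casefold()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return "R-" + digest.hexdigest()[:length]


def suppress_names(records, secret_key: bytes, name_field: str = "name"):
    """Return copies of the records with the name field replaced by a code.

    The input records are left unchanged; the assumed field name "name"
    is removed from each copy and a "respondent_id" field is added.
    """
    cleaned = []
    for record in records:
        row = dict(record)  # copy, so the original field notes stay intact
        row["respondent_id"] = pseudonym(row.pop(name_field), secret_key)
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    # Illustrative data only; not taken from any study.
    notes = [
        {"name": "Asha Rao", "village": "A", "response": "agrees"},
        {"name": "asha  rao", "village": "A", "response": "follow-up visit"},
    ]
    for row in suppress_names(notes, secret_key=b"keep-this-key-offline"):
        print(row)
```

Shortening the code to eight hexadecimal characters keeps it readable in field notes; for very large samples a longer code would reduce the chance of two respondents sharing one.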

A mutually convenient location should be chosen for the data collection, whether it is an interview, administration of a questionnaire, or a focus group discussion. In the case of the respondent's refusal to provide data, researchers should politely thank him or her and withdraw from the process. It is also important to anticipate a range of interruptions from people other than respondents (e.g., relatives, friends, etc.) and prepare well to minimize them. Researchers need to prepare and plan well to work with the language difficulty, if they do not know the local language. They need to learn and develop local basic vocabulary. Most important, they need to identify, train, and employ neutral interpreters (who do not take the side of the researcher or the researched) who do not affect respondents and their responses. Long and exhausting data collection instruments should be avoided. By pretesting, the optimum length should be estimated. If an instrument takes a long time, breaks should be planned at appropriate stages of the data collection. In-depth or long interviews may be conducted in two to three separate sessions. If particular items of the interview/questionnaire do not work, the researcher should be flexible enough to consistently drop them from the schedule.

Recording of data, whether through handwritten notes or electronic devices, should be avoided if it is implicitly or explicitly resisted by respondents. An overreliance on electronic gadgets is not recommended because they may not work when researchers need them the most. If the data collection is based on the researcher's memory, the researcher must expand his or her notes and then write down his or her memories immediately after interviews. Delay causes memories to fade, and the collected data fade with them.

If questionnaire respondents are located in government, nongovernment, or business organizations, researchers may ask the organization head to issue a cover letter advising the respective employees to cooperate with the survey. This approach may facilitate the data collection process in organizations. Avoid contacting professionals during their busy hours, and approach them according to their availability and convenience.

In the case of a questionnaire, administering, completing, and collecting it in one session will yield better return rates than giving a questionnaire to respondents and asking them to return it later. Researchers must have some autonomy in observing so that they can get an adequate picture of the phenomenon being observed. Research experiences show that sometimes meaningful data may be collected through casual experiences, observations, and conversations. Researchers may not be able to capture such meaningful data when they approach respondents with a questionnaire/interview schedule in a formal way. If permission is required, it should be obtained well in advance. If the research topic is sensitive and securing permission is doubtful, the researcher should start work on the topic only after obtaining the permission. If high sample mortality is expected, researchers should plan for a larger sample size. They should also consciously plan opportunities to overcome the problem of isolation in the field. Modern communication technologies (e-mail, Internet chat, etc.) may also be used to achieve this purpose, if they are accessible. Finally, researchers need to take necessary steps to take care of themselves and to maintain good health.
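The advice above to plan a larger sample when high sample mortality is expected can be made concrete with simple arithmetic. The entry gives no formula; the sketch below assumes one common rule of thumb, dividing the number of completed cases needed by the expected retention rate (1 minus the expected dropout proportion) and rounding up. The function name and the figures in the usage example are hypothetical.

```python
import math


def inflate_sample_size(target_completed: int, expected_attrition: float) -> int:
    """Return how many respondents to recruit at the outset.

    target_completed   -- number of completed cases needed for analysis
    expected_attrition -- anticipated proportion lost to sample mortality,
                          as a fraction in the range [0, 1)

    Assumed rule of thumb: recruit = target / (1 - attrition), rounded up,
    so that about `target_completed` respondents remain if the expected
    proportion drops out.
    """
    if target_completed < 0:
        raise ValueError("target_completed must not be negative")
    if not 0 <= expected_attrition < 1:
        raise ValueError("expected_attrition must be in the range [0, 1)")
    return math.ceil(target_completed / (1 - expected_attrition))


if __name__ == "__main__":
    # Illustrative figures only: 300 completed cases needed, with a
    # quarter of the sample expected to drop out.
    print(inflate_sample_size(300, 0.25))
```

The attrition estimate itself has to come from a pilot study or from comparable earlier research, so the result is a planning figure rather than a guarantee.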

Conclusion

As stated in the introduction, data collection is a crucial aspect of the research design. This entry has discussed the necessity of data collection, types of data, several data collection methods, resources required for and timeliness in data collection, ethical considerations, impediments, and strategies to ensure the collection of high-quality and accurate data. It may be noted that this discussion is neither comprehensive nor conclusive. The suggested strategies may work for some and not for others. However, this entry may provide important leads to researchers to further explore data collection methods, impediments, and strategies.

Applying Ideas on Statistics and Measurement

The following abstract is adapted from Shields, C. M. (2003). Giving voice to students: Using the Internet for data collection. Qualitative Research, 3(3), 397–414.

Good data collection techniques are essential for a research project to run smoothly and for the data to be trusted. This article explores the use of a Web-based survey as a means of data collection with more than 450 adolescents in an American school district with approximately 50 percent visible ethnic minority students. After describing the context of the study, the author explores issues related to the ease of data collection, the potential challenges and promise of the Web-based format, and the quantity and quality of data collected. Carolyn Shields demonstrates that the data collected were extremely rich, and that students appeared to be more comfortable with the electronic data collection than with an in-person interview. Moreover, the inherent issues of power differential related to race, class, and position may be overcome using this strategy for data collection.

Manohar Pawar

Further Reading
Briggs, C. L. (1986). Learning how to ask: A sociolinguistic appraisal of the role of the interview in social science research. Cambridge, UK: Cambridge University Press.
Cowie, A. P. (1989). Oxford advanced learner's dictionary. Oxford, UK: Oxford University Press.
Kuhn, T. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Kuhn, T. (1974). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press.
Lee, R. M., & Renzetti, C. M. (1993). The problem of researching sensitive topics: An overview and introduction. In C. M. Renzetti & R. M. Lee (Eds.), Researching sensitive topics. Newbury Park, CA: Sage.
Pawar, M. (2004). Data collecting methods and experiences: A guide for social researchers. Chicago: New Dawn Press.
Pawar, M. (2004). Learning from data collecting methods and experiences: Moving closer to reality. In M. Pawar, Data collecting methods and experiences: A guide for social researchers. Chicago: New Dawn Press.
Popper, K. (1965). The logic of scientific discovery. New York: Harper & Row.
Smith, C. D., & Carolyn, D. (1996). In the field: Readings on the field research experience (2nd ed.). Westport, CT: Praeger.
