- By: | Edited by: Paul Atkinson, Sara Delamont, Alexandru Cernat, Joseph W. Sakshaug & Richard A. Williams
- Publisher: SAGE Publications Ltd
- Publication year: 2020
- Online pub date:
- Discipline: Anthropology, Business and Management, Communication and Media Studies, Computer Science, Counseling and Psychotherapy, Criminology and Criminal Justice, Economics, Education, Engineering, Geography, Health, History, Marketing, Mathematics, Medicine, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Science, Social Work, Sociology, Technology
- Methods: Data cleaning, Type I errors, Type II errors
- Length: 10k+ Words
Data cleaning is the process of quality checking quantitative data to ensure that a data set contains accurate information. It involves a number of practical steps, such as checking data coding, checking data entry, examining data distributions, and identifying issues such as extreme values. Data cleaning can be an important step in the research process, both to meet the statistical assumptions of analytic techniques and to reduce the impact of any errors made during data collection or imputation. This entry provides a detailed overview of data cleaning processes and techniques that support accurate and reliable data analysis, including screening data, dealing with extreme values, dealing with missing or incomplete data, and examining data distributions. Data cleaning is not an objective exercise, and subjective decisions may need to be made during the process. It is therefore imperative that researchers are transparent about their data cleaning processes and decisions.
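As a rough illustration of the screening steps listed above, the following minimal Python sketch flags missing values, out-of-range codes, and extreme values for a single variable. It is not taken from the entry: the function name `screen_variable`, the example ages, the missing-value code `-99`, the valid range of 0 to 120, and the 1.5 × IQR rule for extreme values are all assumptions chosen for illustration, and other rules may suit a given data set better.

```python
from statistics import quantiles


def screen_variable(values, valid_min, valid_max, missing_codes=(None,)):
    """Report potential problems in one numeric variable without changing it.

    All thresholds are assumptions supplied by the analyst, not fixed rules.
    """
    # Positions holding a missing-value code (e.g. None or -99).
    missing = [i for i, v in enumerate(values) if v in missing_codes]

    # Observed values, kept together with their original positions.
    observed = [(i, v) for i, v in enumerate(values) if v not in missing_codes]

    # Values outside the range the coding scheme allows (likely entry errors).
    out_of_range = [i for i, v in observed if not valid_min <= v <= valid_max]
    in_range = [(i, v) for i, v in observed if valid_min <= v <= valid_max]

    # Extreme values: beyond 1.5 * IQR from the quartiles (an assumed rule).
    q1, _, q3 = quantiles([v for _, v in in_range], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    extreme = [i for i, v in in_range if not low <= v <= high]

    return {"missing": missing, "out_of_range": out_of_range, "extreme": extreme}


# Hypothetical ages, with -99 and None standing in for missing responses.
ages = [34, 29, 41, -99, 38, 250, 33, None, 36, 95, 31, 40]
report = screen_variable(ages, valid_min=0, valid_max=120, missing_codes=(None, -99))
print(report)
```

The sketch only reports positions and leaves the data untouched, so that any decision to correct, recode, or exclude a value remains an explicit choice that can be documented.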