Research data: a practical guide
Types of data
Research generally produces these broad types of data:
- Quantitative data – numerical information that measures, counts, or compares variables.
Examples: age, income, population size, test scores.
Use: analysed statistically to test hypotheses or identify trends.
- Qualitative data – descriptive and interpretive information that explores experiences, meanings, and perspectives.
Examples: interview transcripts, observations, field notes, policy documents.
Use: analysed thematically or interpretively to understand context, values, and social processes.
Both quantitative and qualitative data can be big or small. Big data consists of extremely large or complex data sets generated through digital platforms, sensors, administrative systems, or many repeated rounds of data collection.
Examples: social media data, health records, mobile phone data, and satellite imagery.
Use: analysed using computational or mixed methods to identify large-scale trends and correlations.
Researchers normally deal with small data.
This post focuses on qualitative data. The objective is to introduce and simplify data nomenclature (language) and to make it easier for researchers to communicate when they write research proposals and create data collection, processing, and storage plans. I hope you will find this useful.
Phases of qualitative research data
a. Data collection phase
- Primary sources of data – original information collected directly from participants.
Examples: interviews, focus groups, participant observation, oral histories.
- Secondary sources of data – pre-existing materials created by others but relevant to the research.
Examples: books, reports, policy papers, archives, journal articles.
- Demographic data – background characteristics of participants such as age, gender, education, occupation, or residence.
Purpose: helps describe and contextualise the study population.
- Field data – information gathered in the natural research setting through observation, informal interaction, or documentation.
Purpose: captures context, setting, and lived realities.
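As a small illustration of how demographic data can describe a study population, the sketch below counts participants by attribute. The participant records, field names, and the `summarise` helper are all hypothetical and invented for this example; they do not come from any real study.

```python
from collections import Counter

# Hypothetical participant records (illustrative only, not real data).
participants = [
    {"id": "P01", "gender": "female", "occupation": "farmer"},
    {"id": "P02", "gender": "male", "occupation": "teacher"},
    {"id": "P03", "gender": "female", "occupation": "farmer"},
]


def summarise(participants, attribute):
    """Count how many participants share each value of one attribute."""
    return Counter(p[attribute] for p in participants)


gender_counts = summarise(participants, "gender")
occupation_counts = summarise(participants, "occupation")

print(dict(gender_counts))
print(dict(occupation_counts))
```

A summary like this is usually reported in a short table when describing the sample; it says nothing about meaning or experience, which is the job of the qualitative material itself.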
b. Data organisation and preparation phase
- Raw data – all unprocessed materials collected during research.
Examples: audio recordings, photographs, notes, videos, transcripts.
- Spoilt data – data that has been damaged or recorded incorrectly and cannot be used for analysis.
Examples: corrupted files, incomplete recordings, or notes destroyed by error.
- Missing data – portions of information that were not captured or are absent from the data set.
Examples: skipped survey questions, lost files, or unrecorded responses.
- Biased data – data that systematically overrepresents or underrepresents certain perspectives or outcomes.
Examples: selective sampling, leading questions, or researcher influence.
- Corrupted data – files or datasets that have been altered, distorted, or lost due to technical faults or mishandling.
Examples: data loss through system crashes, malware, or incompatible software.
- Retracted data – data that must be withdrawn or excluded after collection because of ethical violations or identified errors.
Examples: falsified responses, unauthorised material, or invalid consent.
- Unreliable data – data that lacks credibility due to inconsistencies, unverified sources, or questionable accuracy.
Examples: unverifiable documents, inconsistent transcripts, or untrustworthy reports.
- Data cleaning – the process of reviewing and correcting raw data to remove errors, inconsistencies, or duplicates.
Purpose: ensures the accuracy and quality of data before analysis.
- Data validation and checking – verifying the authenticity, completeness, and reliability of data collected from different sources.
Purpose: enhances credibility and trustworthiness.
- Data set – the complete and organised body of all cleaned data for a study.
Purpose: serves as the foundation for analysis.
- Database – the storage or management system where data are securely archived.
Examples: NVivo, Excel, Google Drive, institutional repositories.
- Data management – systematic organisation, storage, protection, and retrieval of data throughout the research process.
Purpose: ensures accessibility, confidentiality, and long-term preservation.
- Field notes – descriptive and reflective notes written by the researcher to document observations and impressions.
Purpose: add depth and context to raw data.
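To make the ideas of data cleaning and missing data more concrete, here is a minimal sketch that removes exact duplicate records and lists which fields are empty for each participant. The records, the field names in `FIELDS`, and both helper functions are assumptions made up for illustration; real cleaning of qualitative material also involves judgement that code cannot replace.

```python
# Hypothetical interview records; the field names are assumptions.
FIELDS = ("id", "age", "response")

records = [
    {"id": "P01", "age": 34, "response": "We rely on the clinic."},
    {"id": "P02", "age": None, "response": "The school is far away."},
    {"id": "P01", "age": 34, "response": "We rely on the clinic."},  # duplicate
    {"id": "P03", "age": 51, "response": ""},  # unrecorded response
]


def remove_duplicates(records):
    """Data cleaning: drop exact duplicates, keeping the first occurrence."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec.get(field) for field in FIELDS)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def find_missing(records):
    """Missing data: map each participant id to the fields left empty."""
    missing = {}
    for rec in records:
        gaps = [field for field in FIELDS if rec.get(field) in (None, "")]
        if gaps:
            missing[rec["id"]] = gaps
    return missing


data_set = remove_duplicates(records)
missing = find_missing(data_set)

print(len(data_set))  # 3
print(missing)        # {'P02': ['age'], 'P03': ['response']}
```

The sketch only reports missing fields and does not fill them in. Deciding whether to follow up with a participant, exclude a record, or note the gap in the write-up remains a decision for the researcher.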
c. Data analysis and interpretation phase
- Data coding – assigning labels or categories to pieces of data that represent key ideas, patterns, or meanings.
Purpose: organises qualitative data into manageable segments for interpretation.
- Extracted data – selected quotations or text segments illustrating specific ideas, meanings, or findings.
Purpose: provide evidence for analysis and interpretation.
- Memo data – researcher reflections and analytical thoughts developed during coding and interpretation.
Purpose: document the researcher’s evolving understanding and theoretical reasoning.
- Data analysis – systematic process of identifying patterns, relationships, and meanings within the data.
Purpose: answers research questions through interpretation and synthesis.
- Data visualisation – use of charts, diagrams, models, or maps to display patterns, relationships, or findings.
Examples: conceptual frameworks, thematic maps, flow diagrams, word clouds.
- Triangulated data – integrated findings that draw on multiple data sources to confirm patterns or reveal contrasts.
Purpose: strengthen credibility, consistency, and depth of interpretation.
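Data coding and extracted data can be illustrated with a rough sketch. The example below assigns codes to text segments by simple keyword matching and groups the matching quotations under each code. The codebook, the segments, and the `code_segments` function are hypothetical. Keyword matching is only a crude first pass and is not how interpretive coding is normally done, since researchers read and code segments by hand, often in tools such as NVivo.

```python
from collections import defaultdict

# Hypothetical codebook: code label -> keywords (assumed for illustration).
codebook = {
    "access_to_services": ["clinic", "school", "transport"],
    "livelihood": ["farm", "income", "job"],
}

# Hypothetical transcript segments as (participant id, text).
segments = [
    ("P01", "The clinic is a long walk from our farm."),
    ("P02", "I lost my job last year."),
    ("P03", "Children walk two hours to school."),
]


def code_segments(segments, codebook):
    """Attach every code whose keywords appear in a segment (substring match)."""
    coded = defaultdict(list)
    for participant, text in segments:
        lowered = text.lower()
        for code, keywords in codebook.items():
            if any(word in lowered for word in keywords):
                coded[code].append((participant, text))
    return dict(coded)


coded = code_segments(segments, codebook)

# The quotations grouped under each code are the "extracted data".
for code, extracts in coded.items():
    print(code, [participant for participant, _ in extracts])
```

A segment can carry more than one code, as the first segment does here. Grouping extracts by code in this way makes it easier to compare what different participants said about the same idea.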
The movement of data in qualitative research typically follows this path:
Fieldwork → raw data → cleaning → validation → data set → database → management → coding → memo data → analysis → visualisation → triangulated data → interpretation and reporting
