Data terminology

Questionnaire: A set of written questions used for collecting information from respondents.

Respondents: Individuals who respond to the questions in a questionnaire.

Dataset: The information collected from respondents, stored as the numbers to be analyzed.

Full wording of question: The exact text of a question as it appears in the questionnaire.

Variable name: A unique short name assigned to each question. Data analysis software refers to each question by its variable name.

Values: Numbers, such as 1, 2, and 3, that appear in the dataset and represent specific responses.

Labels: What the values (numbers) mean, e.g., 1: yes, 2: no.

Response set: The combination of values and their corresponding labels.

A conceptual diagram showing the research process. On the left is an icon labeled “questionnaire,” representing a set of survey questions. In the center is an icon labeled “respondents,” representing a group of people who answer the questions. On the right is an icon labeled “data,” representing recorded responses. A sentence below reads: “We ask a set of questions to a group of people and record their responses.”

A three-part diagram illustrating how survey data are created. On the left is a questionnaire with questions about unfair treatment, such as being unfairly fired, treated by police, or treated badly at restaurants. In the center is an icon representing respondents, labeled “Latino National Survey: 8,634 self-identified Latino or Hispanic residents of the United States.” On the right is a dataset shown as a table with columns labeled DFIRED, DBADPOLC, DHOUSING, and DRESTAUR, containing numeric response codes.

A diagram showing how survey questions become dataset variables. On the questionnaire side, the full wording of a question is shown, along with its response options. The variable name, such as DFIRED, is identified as the column name used in the dataset. Numeric values such as 1, 2, and 3 are shown as the coded responses stored in the dataset. Labels explain what each number means, for example, 1 equals Yes, 2 equals No, and 3 equals DK or NA. The response set is defined as the complete list of possible answers for the question. Arrows indicate how wording, response options, values, and labels correspond to dataset columns.
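
The relationship between a variable name, its values, its labels, and its response set can be sketched in a few lines of Python. This is only an illustration: the variable name DFIRED and the labels come from the diagram described above, while the coded responses and the helper name `decode` are invented for the example and are not real survey data.

```python
# Variable name: the column name used in the dataset.
VARIABLE_NAME = "DFIRED"

# Response set: every value paired with its label (taken from the diagram).
RESPONSE_SET = {1: "Yes", 2: "No", 3: "DK/NA"}

# Invented coded responses, one per respondent (NOT real survey data).
coded_responses = [2, 1, 2, 3, 2]


def decode(values, response_set):
    """Translate numeric values into their labels (hypothetical helper)."""
    return [response_set[value] for value in values]


labels = decode(coded_responses, RESPONSE_SET)
print(VARIABLE_NAME, labels)
```

The dataset stores only the numbers. The response set is what lets an analyst read those numbers as answers.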

[Variables in GSS]

A table titled “Discrimination and harassment at work” illustrating how survey variables are analyzed. The variable name “wkageism” is shown and described as measuring perceived discrimination at work because of age. The full wording of the question is displayed: “Do you feel in any way discriminated against on your job because of your age?” Numeric values are listed, where 1 means Yes and 2 means No. Labels explain the meaning of these values. Below, a frequency table shows counts and percentages for each response category, demonstrating how coded values and labels are used in statistical analysis.
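
A frequency table like the one described above could be produced along the following lines. This is a minimal sketch using only the Python standard library: the labels (1 = Yes, 2 = No) follow the description of wkageism, but the responses are invented for illustration and the function name `frequency_table` is hypothetical. Real GSS counts are not reproduced here.

```python
from collections import Counter

# Labels for the variable wkageism, as described above.
LABELS = {1: "Yes", 2: "No"}

# Invented coded responses (NOT real GSS data).
responses = [2, 2, 1, 2, 2, 1, 2, 2]


def frequency_table(values, labels):
    """Return (value, label, count, percent) rows, one per labeled value.

    Assumes `values` is non-empty; percentages are rounded to one decimal.
    """
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    for value in sorted(labels):
        n = counts.get(value, 0)
        rows.append((value, labels[value], n, round(100 * n / total, 1)))
    return rows


table = frequency_table(responses, LABELS)
for value, label, n, pct in table:
    print(f"{value}  {label:<4} {n:>3} {pct:>6.1f}%")
```

Each row pairs a coded value with its label, its count, and its share of all responses, which is the same layout a statistical package shows for a single variable.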
