Code templates

install and load multiple packages

while (dev.cur() > 1) dev.off()
packages <- c("corrplot", "tidyverse", "ggpubr",
              "Hmisc", "parameters", "performance",
              "psych", "see", "sjlabelled", "sjmisc", "sjPlot")
installed_packages <- rownames(installed.packages())
for (pkg in packages) {
  if (!(pkg %in% installed_packages)) {
    message(paste("Installing package:", pkg))
    install.packages(pkg, dependencies = TRUE)
  } else {
    message(paste("Package already installed:", pkg))
  }
  library(pkg, character.only = TRUE)
}

install and load a single package

if (!require("PackageNameHere")) install.packages("PackageNameHere", dependencies = TRUE); library("PackageNameHere")

load GSS

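# Download the zipped GSS data to a temporary file
temp <- tempfile()
download.file("https://drive.google.com/uc?export=download&id=1mF7gMY4aU9amTgYLSVOyVQaHT_opDUbj", temp, mode = "wb")
# Extract only the 2022 Stata file; the archive's internal path is recreated inside exdir
unzip(temp, files = "OrigData/2022/GSS2022.dta", exdir = "OrigData")
gss <- haven::read_dta("OrigData/OrigData/2022/GSS2022.dta")
# Table of variable labels; get_label() comes from the sjlabelled package loaded above
key <- as.data.frame(get_label(gss))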

frequency table (for categorical variables)
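A minimal base R sketch of a frequency table. The vector `marital` is hypothetical toy data standing in for a GSS variable, and the course templates may use a package function rather than `table()`.

```r
# Hypothetical toy data standing in for a categorical GSS variable
marital <- c("married", "single", "married", "divorced", "single", "married", NA)

# Counts per category; useNA = "ifany" keeps missing values visible
freq <- table(marital, useNA = "ifany")

# Percentages of all cases, rounded to one decimal place
pct <- round(100 * prop.table(freq), 1)

freq_table <- data.frame(value   = names(freq),
                         n       = as.vector(freq),
                         percent = as.vector(pct))
print(freq_table)
```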

descriptive table (for continuous variables)
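A minimal base R sketch of a descriptive table. The vector `age` is hypothetical toy data, and the choice of statistics is an assumption about what the table should report.

```r
# Hypothetical toy data standing in for a continuous GSS variable
age <- c(23, 35, 41, 58, 62, NA, 30)

# na.rm = TRUE drops missing values before each statistic is computed
desc <- data.frame(
  n      = sum(!is.na(age)),
  mean   = mean(age, na.rm = TRUE),
  sd     = sd(age, na.rm = TRUE),
  median = median(age, na.rm = TRUE),
  min    = min(age, na.rm = TRUE),
  max    = max(age, na.rm = TRUE)
)
print(round(desc, 2))
```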

recoding

(1) merging values (categorical to categorical)

1.2. recoding (merging values with 2 values)
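A minimal base R sketch of merging a four-category variable into two values. The variable `health` and its coding (1 = excellent, 2 = good, 3 = fair, 4 = poor) are hypothetical; the original template may use a recode function from a package instead.

```r
# Hypothetical coding: 1 = excellent, 2 = good, 3 = fair, 4 = poor
health <- c(1, 2, 3, 4, 2, NA, 1)

# Merge 1 and 2 into 1 (good health), 3 and 4 into 0 (poor health);
# anything else, including NA, stays missing
health2 <- ifelse(health %in% c(1, 2), 1,
           ifelse(health %in% c(3, 4), 0, NA))

# Cross-tabulate old against new values to check the recode
print(table(health, health2, useNA = "ifany"))
```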

1.3. recoding (merging values with 3 values)

1.4. recoding (merging values with 4 values)

1.5. recoding (merging values with 5 values)

1.6. recoding (merging values with 6 values)

1.7. recoding (merging values with 7 values)

(2) reversing values (categorical to categorical)

2.2. recoding (reversing values with 2 values)

2.3. recoding (reversing values with 3 values)

2.4. recoding (reversing values with 4 values)

2.5. recoding (reversing values with 5 values)
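A minimal sketch of reversing a five-point scale in base R, assuming a hypothetical item `agree` coded 1 to 5. Subtracting each value from (minimum + maximum) flips the scale, so for a 1 to 5 item the new value is 6 minus the old value.

```r
# Hypothetical item coded 1 (strongly agree) to 5 (strongly disagree)
agree <- c(1, 2, 3, 4, 5, NA, 2)

# Reverse: 1 becomes 5, 2 becomes 4, ..., 5 becomes 1; NA stays NA
agree_rev <- (1 + 5) - agree

# Cross-tabulate old against new values to check the recode
print(table(agree, agree_rev, useNA = "ifany"))
```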

2.6. recoding (reversing values with 6 values)

2.7. recoding (reversing values with 7 values)

(3) transforming continuous variables into groups (continuous to categorical)

3.2. recoding (transforming continuous variables into groups with 2 values)

3.3. recoding (transforming continuous variables into groups with 3 values)
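A minimal sketch of grouping a continuous variable into three categories with base R's `cut()`. The variable `age` and the cut points (29 and 64) are hypothetical choices for illustration.

```r
# Hypothetical toy data standing in for a continuous GSS variable
age <- c(18, 29, 30, 45, 64, 65, 89, NA)

# Intervals are closed on the right by default:
# (-Inf, 29], (29, 64], (64, Inf)
age3 <- cut(age,
            breaks = c(-Inf, 29, 64, Inf),
            labels = c("18-29", "30-64", "65+"))

# Cross-tabulate old against new values to check the grouping
print(table(age, age3, useNA = "ifany"))
```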

3.4. recoding (transforming continuous variables into groups with 4 values)

3.5. recoding (transforming continuous variables into groups with 5 values)

3.6. recoding (transforming continuous variables into groups with 6 values)

3.7. recoding (transforming continuous variables into groups with 7 values)

3.8. recoding (transforming continuous variables into groups with 8 values)

computing

computing 1
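A minimal base R sketch of computing a new variable from several items. The data frame `df` and the item names are hypothetical, and whether the original template sums or averages the items is an assumption.

```r
# Hypothetical toy data: three survey items
df <- data.frame(item1 = c(1, 2, 3, NA),
                 item2 = c(2, 2, 4, 1),
                 item3 = c(3, 1, 5, 2))

# Sum index: the result is NA whenever any item is missing
df$index_sum <- df$item1 + df$item2 + df$item3

# Mean index: averages the items that were answered
df$index_mean <- rowMeans(df[, c("item1", "item2", "item3")], na.rm = TRUE)

print(df)
```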

computing 2 (with recoding) - sample 1

computing 3 (with recoding) - sample 2

chi-square test
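A minimal base R sketch of a chi-square test of independence between two categorical variables. The variables `sex` and `vote` are hypothetical toy data. Note that for a 2 x 2 table `chisq.test()` applies the Yates continuity correction by default.

```r
# Hypothetical toy data: two categorical variables, 100 cases
sex  <- c(rep("male", 50), rep("female", 50))
vote <- c(rep("yes", 30), rep("no", 20), rep("yes", 20), rep("no", 30))

# Cross-tabulation, then the test on the table
tab <- table(sex, vote)
print(tab)

result <- chisq.test(tab)
print(result)

# Expected counts under independence, useful for checking assumptions
print(result$expected)
```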

sampling: data creation for subsamples

non-random (last 100 cases)
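A minimal base R sketch of taking the last 100 cases, using a hypothetical data frame `df` of 1,000 rows in place of the GSS data.

```r
# Hypothetical toy data: 1,000 cases
df <- data.frame(id = 1:1000, x = rep(c(1, 2), 500))

# Non-random subsample: the last 100 rows in file order
last100 <- tail(df, 100)

print(nrow(last100))
print(range(last100$id))
```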

25% simple random sample
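A minimal base R sketch of a 25% simple random sample, using a hypothetical data frame `df`. The seed value is arbitrary and only makes the draw reproducible.

```r
# Hypothetical toy data: 1,000 cases
df <- data.frame(id = 1:1000)

set.seed(123)                          # arbitrary seed for reproducibility
n_sample <- round(0.25 * nrow(df))     # 25% of the cases

# Draw row numbers without replacement, then keep those rows
srs <- df[sample(nrow(df), n_sample), , drop = FALSE]

print(nrow(srs))
```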

10% systematic random sample
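A minimal base R sketch of a 10% systematic random sample, using a hypothetical data frame `df`. A 10% sample corresponds to a sampling interval of 10: pick a random start between 1 and 10, then take every 10th case.

```r
# Hypothetical toy data: 1,000 cases
df <- data.frame(id = 1:1000)

set.seed(123)            # arbitrary seed for reproducibility
k <- 10                  # sampling interval for a 10% sample
start <- sample(k, 1)    # random starting case between 1 and k

# Every k-th row, beginning at the random start
sys <- df[seq(from = start, to = nrow(df), by = k), , drop = FALSE]

print(nrow(sys))
```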

t-test
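A minimal base R sketch of an independent-samples t-test comparing the mean of a continuous variable between two groups. The data frame `df` is hypothetical toy data. By default `t.test()` runs the Welch version, which does not assume equal variances.

```r
# Hypothetical toy data: a continuous score in two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 5),
  score = c(10, 12, 11, 13, 14, 15, 17, 16, 18, 19)
)

# Formula interface: continuous variable ~ grouping variable
result <- t.test(score ~ group, data = df)
print(result)

# Group means, for interpreting the direction of the difference
print(tapply(df$score, df$group, mean))
```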

visualization

bar graph (for categorical variables)

histogram (for continuous variables)

stacked bar graphs for multiple variables

stacked bar graphs for multiple variables (flipped coordinates)

stacked bar graphs by different groups

bar graphs between groups (margin=row)

scatterplot with two continuous variables

scatterplot with two continuous variables by groups

correlation analysis

correlation analysis structure

Correlation analysis examines the linear relationship between two continuous variables.

If the p-value is statistically significant (p < 0.05), judge the strength of the correlation by the absolute value of r:

  • | r | < 0.3 … weak correlation

  • 0.3 < | r | < 0.5 … moderate correlation

  • | r | > 0.5 … strong correlation

The order of the variables does not matter.
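The rules above can be sketched in base R as follows. The vectors `x` and `y` are hypothetical toy data, `describe_r()` is a hypothetical helper, and assigning exactly 0.3 to "moderate" and exactly 0.5 to "strong" is an assumption, since the thresholds above leave those boundary values open.

```r
# Hypothetical toy data: two continuous variables
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2, 1, 4, 3, 6, 5, 8, 7)

result <- cor.test(x, y)          # Pearson correlation by default
r <- unname(result$estimate)
p <- result$p.value

# Hypothetical helper applying the strength rules
# (boundary handling at 0.3 and 0.5 is an assumption)
describe_r <- function(r, p) {
  if (p >= 0.05) return("not statistically significant")
  if (abs(r) < 0.3) return("weak")
  if (abs(r) < 0.5) return("moderate")
  "strong"
}
strength <- describe_r(r, p)
print(strength)

# The order of the variables does not matter
print(all.equal(cor(x, y), cor(y, x)))
```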

(1) correlation analysis table

(2) correlation scatterplot graph

xlab: the "what it measures" column entry for variable 1 (x)

ylab: the "what it measures" column entry for variable 2 (y)

(3) correlation matrix

(4) scatterplot matrix

(5) correlogram

linear regression

(1) linear regression with 1 independent variable
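A minimal base R sketch of a linear regression with one independent variable. The data frame `df` and the variable names are hypothetical toy data.

```r
# Hypothetical toy data: years of education and income
df <- data.frame(
  educ   = c(8, 10, 12, 12, 14, 16, 16, 18),
  income = c(20, 24, 30, 28, 35, 40, 42, 47)
)

# Formula: dependent variable ~ independent variable
model1 <- lm(income ~ educ, data = df)

# Coefficients, standard errors, p-values, and R-squared
print(summary(model1))
```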

(2) linear regression with 2 independent variables

(3) linear regression with 3 independent variables

(4) linear regression with 4 independent variables

add more independent variables with a plus (+)
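A minimal sketch of extending the formula with a plus sign, using hypothetical toy data. Each additional independent variable is appended to the right-hand side of the formula.

```r
# Hypothetical toy data: one dependent and three independent variables
df <- data.frame(
  income = c(20, 24, 30, 28, 35, 40, 42, 47, 33, 38),
  educ   = c(8, 10, 12, 12, 14, 16, 16, 18, 13, 15),
  age    = c(25, 40, 31, 52, 45, 38, 60, 50, 29, 47),
  female = c(0, 1, 0, 1, 1, 0, 0, 1, 1, 0)
)

# Independent variables are joined with +
model3 <- lm(income ~ educ + age + female, data = df)
print(summary(model3))
```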

logistic regression

(1) logistic regression with 1 independent variable
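A minimal base R sketch of a logistic regression with one independent variable. The data frame `df` is hypothetical toy data with a 0/1 dependent variable. Reporting odds ratios via `exp(coef())` is a common convention, not necessarily what the original template does.

```r
# Hypothetical toy data: binary outcome and years of education
df <- data.frame(
  voted = c(0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1),
  educ  = c(8, 9, 10, 11, 12, 12, 13, 14, 14, 15, 16, 18)
)

# family = binomial requests a logistic (logit) model
model <- glm(voted ~ educ, data = df, family = binomial)
print(summary(model))

# Exponentiated coefficients are odds ratios
print(exp(coef(model)))
```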

(2) logistic regression with 2 independent variables

(3) logistic regression with 3 independent variables

(4) logistic regression with 4 independent variables

add more independent variables with a plus (+)

dummy variables

DUMMY EXAMPLE

First step: Check the frequency distribution of the original variable to see what the values (1, 2, 3, etc.) mean.

Code:

Second step: Create dummy variables for each category.

Codes:

Third step: Omit one of the dummy variables from your model. The omitted dummy variable is the "comparison category" (also called the reference category): interpret the coefficient of each included dummy relative to it.
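The three steps above can be sketched in base R as follows. The data frame `df`, the coding of `marital` (1 = married, 2 = divorced, 3 = never married), and the choice of "married" as the comparison category are all hypothetical.

```r
# Hypothetical toy data: marital coded 1 = married, 2 = divorced, 3 = never married
df <- data.frame(
  marital = c(1, 2, 3, 1, 3, 2, 1, 3, 2, 1),
  happy   = c(8, 5, 6, 9, 7, 4, 8, 6, 5, 9)
)

# First step: frequency distribution of the original variable
print(table(df$marital))

# Second step: one dummy variable per category
df$married  <- ifelse(df$marital == 1, 1, 0)
df$divorced <- ifelse(df$marital == 2, 1, 0)
df$never    <- ifelse(df$marital == 3, 1, 0)

# Third step: leave "married" out, so it becomes the comparison category
model <- lm(happy ~ divorced + never, data = df)
print(coef(model))
```

In this sketch the intercept is the mean of `happy` among married respondents, and each coefficient is the difference between that group and the married group.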

dummy variable: categorical (binary)

dummy variable: nominal/ordinal

dummy variable: nominal/ordinal 1 (merging categories)

dummy variable: nominal/ordinal 2 (merging categories)

dummy variable: nominal/ordinal 3 (merging categories)

dummy variable: nominal/ordinal 4 (merging categories)

dummy variable: continuous

scientific notation (e.g., 2e-16)
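A value such as 2e-16 means 2 × 10^-16, a very small number. The following is a minimal base R sketch of two ways to show such values in fixed notation; which one the original template uses is an assumption.

```r
p <- 2e-16

# Default printing uses scientific notation for very small numbers
print(p)

# Option 1: format a single value in fixed notation (returns a string)
fixed <- format(p, scientific = FALSE)
print(fixed)

# Option 2: discourage scientific notation for the whole session;
# the previous setting is saved and restored afterwards
old <- options(scipen = 999)
print(p)
options(old)
```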

mean centering
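A minimal base R sketch of mean centering, using a hypothetical vector `age`. Subtracting the mean gives a variable whose mean is zero, so a value of 0 represents an average case.

```r
# Hypothetical toy data
age <- c(20, 30, 40, 50, 60)

# Subtract the mean from every value
age_c <- age - mean(age, na.rm = TRUE)

print(age_c)
print(mean(age_c))
```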

delete the environment

remove categories from a variable

remove label

rename variables
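A minimal base R sketch of renaming a variable, using a hypothetical data frame `df` with placeholder names `v1` and `v2`.

```r
# Hypothetical toy data with uninformative variable names
df <- data.frame(v1 = 1:3, v2 = 4:6)

# Rename v1 to age by matching on the old name
names(df)[names(df) == "v1"] <- "age"

print(names(df))
```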

change the variable from continuous to categorical
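A minimal base R sketch of converting a numeric variable to a categorical one (a factor). The variable `sex` and its value labels are hypothetical.

```r
# Hypothetical numeric codes: 1 = male, 2 = female
sex <- c(1, 2, 2, 1)

# factor() turns the codes into categories with readable labels
sex_f <- factor(sex, levels = c(1, 2), labels = c("male", "female"))

print(class(sex_f))
print(table(sex_f))
```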

change the variable from categorical to continuous
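A minimal base R sketch of converting a factor back to numbers, using a hypothetical factor `f`. Calling `as.numeric()` directly on a factor returns the internal level codes, so the values are converted to character first.

```r
# Hypothetical factor whose labels are numbers
f <- factor(c("10", "20", "20", "30"))

# Level codes (position of each level), usually not what is wanted
codes <- as.numeric(f)

# Original numeric values: convert to character first
values <- as.numeric(as.character(f))

print(codes)
print(values)
```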

show codebook

remove packages

assigning labels
