YEAR THREE DATA

Load the data needed to process the second year survey results.

Required R Packages

This chapter will utilize the following packages:

library( haven )
library( dplyr )
library( tidyr )
library( epoxy )
library( memisc )
library( labelled )

Load the Data

# LOAD DATA DICTIONARY 
dd <- 
  readxl::read_xlsx( 
   "../data-dictionaries/dd-nptrends-year-03.xlsx", 
   sheet = "data dictionary" )

# LOAD QUALTRICS SURVEY DATA 
fpath <- "DATA-PREP/03-year-three/01-data-raw/"
fname <- "year-03-qualtrics-download-combined.csv"
survey_df  <- readr::read_csv( paste0( fpath, fname ) )

# DROP INCOMPLETE CASES
survey_df <- 
  survey_df %>% 
  dplyr::filter( Completion_Status %in% c("Complete","Partial_keep") )

Note the raw data is challenging because it contains qualtrics encodings and missing values need context to be used correctly (e.g. were the questions skipped by skip logic, or by the respondent?).

survey_df[ 1:6, 51:55 ] %>% pander::pander()   # data peek
Q3.1_14_TEXT Q4.1#1_1_1 Q4.1#1_2_1 Q4.1#2_1_1 Q4.1#2_2_1
NA 489 0 NA NA
NA 70000 NA NA 1
NA 2000 NA NA 1
NA 1500 NA NA 1
NA NA NA 1 1
NA 5000 NA NA NA

Data Workflow

The following chapters describe the workflow used to import qualtrics data and apply cleaning and transformation steps to prepare the restricted use file and public use file for subsequent analysis:

  1. Renaming columns
  2. Drop nuisance columns (survey deployment attributes)
  3. Add meaningful labels to response values
  4. Dropping duplicates, incomplete responses and test responses

Renaming Columns

Columns referencing survey questions are renamed with the help of a data dictionary to improve readability.

torename <- 
  dd %>% 
  dplyr::select( vname, vname_raw ) %>% 
  tidyr::drop_na()

torename <- dplyr::filter( torename, vname_raw != "Q10.2_4" )

# torename$vname_raw %>% setdiff( names(survey_df) )

Examples:

[1] "Q4.1#1_1_1" "Q4.1#1_2_1" "Q4.1#2_1_1" "Q4.1#2_2_1"

Give the data meaningful names so that it is easier to work with.

# USE RAW VNAME IF VNAME IS EMPTY:
dd$vname[ is.na(dd$vname) ] <- dd$vname_raw[ is.na(dd$vname) ]

survey_df <- 
  survey_df %>% 
  dplyr::rename_at(
     vars( torename$vname_raw ), 
     ~ torename$vname )

Examples:

[1] "PplSrv_NumServed"      "PplSrv_NumWait"        "PplSrv_NumServed_NA_X"
[4] "PplSrv_NumWait_NA_X"  

Add survey questions as an attribute in the data frame:

temp          <- readr::read_csv( paste0( fpath, fname ) )
question.txt  <- as.character( temp[ 1, ] )
rm( temp )

names( question.txt ) <- names( survey_df )
attr( survey_df, "question_txt" ) <- question.txt

Drop Nuisance Fields

Many of the exported qualtrics fields contain non-useful metadata or are empty. These have been labeled as “DROP” in the group field. Remove these for convenience.

# SELECT COLUMNS TO DROP: 
DROP_THESE <- dd$vname[ dd$group == "DROP" ] |> na.omit()

survey_df <- 
  survey_df %>% 
  dplyr::select( -any_of( DROP_THESE ) )

Add Survey Weights

# ADD SURVEY WEIGHTS 
fpath <- "DATA-PREP/00-sample-framework/"
fname <- "year3wt.csv"
wt3  <- readr::read_csv( paste0( fpath, fname ) )
wt3  <- unique( wt3)

survey_df <- merge( survey_df, wt3, by.x="ExternalReference", by.y="ein", all.x=TRUE )

Groups of Variables

Each group of survey questions comes with its own set of valid inputs that must be recoded separately. For example, “N/A”’s are options for some survey questions and not for others, and some survey questions allow for manual text inputs.

The below code chunk separates all survey questions into their respective categories before further separating each category into numeric, text or NA inputs.

NA questions here refer to “Check here if not applicable to your organization” questions in the survey, where a “C” indicates that the respondent has checked the N/A box.

  • 15 questions about CHANGES TO PROGRAMS AND SERVICES
  • 4 questions about the NUMBER OF PEOPLE EACH ORGANIZATION SERVES
  • 1 question about OVERALL PROGRAM DEMAND
  • 27 questions about STAFF NUMBERS
  • 2 questions about DONOR AND VOLUNTEER IMPORTANCE
  • 11 questions about CHANGES TO LEADERSHIP
  • 26 questions about THE RACE AND GENDER OF CEOS AND BOARD CHAIRS
  • 8 questions about CHANGES TO ORGANIZATIONAL FINANCES
  • 2 questions about CARES FUNDING
  • 2 questions about FINANCIAL RESERVES
  • 8 questions about REVENUE SOURCES
  • 26 questions about FUNDRAISING SOURCES
  • 2 questions about DONOR TYPES IN FUNDRAISING
  • 7 questions about FUNDRAISING YIELDS
  • 11 questions about FUNDRAISING STRATEGY CHANGES
  • 1 questions about MAJOR GIFT AMOUNTS
  • 13 questions about EXTERNAL AFFAIRS
  • 1 questions about FUTURE CONCERNS
  • 2 questions about FISCAL YEAR END
  • 12 questions about REGULATION
  • 10 questions about STAFFING PLANS