YEAR TWO DATA
Load the data needed to process the second year survey results.
Required R Packages
This chapter uses the following packages:
library( haven )
library( dplyr )
library( tidyr )
library( epoxy )
library( memisc )
library( labelled )
Load the Data
# LOAD DATA DICTIONARY
dd <-
  readxl::read_xlsx(
    "../data-dictionaries/dd-nptrends-wave-02.xlsx",
    sheet = "data dictionary" )

# USE RAW VNAME IF VNAME IS EMPTY:
dd$vname[ is.na(dd$vname) ] <-
  dd$vname_raw[ is.na(dd$vname) ]
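The fallback from `vname` to `vname_raw` can also be written with `dplyr::coalesce()`, which returns the first non-missing value at each position. This is only a sketch on a made-up dictionary (`dd_toy`); the raw name `Q99` is hypothetical and not taken from the survey.

```r
library( dplyr )

# Toy data dictionary: the second entry has no cleaned name (hypothetical).
dd_toy <- data.frame(
  vname     = c( "Dmnd_NxtYear", NA, "Staff_Fulltime_2021" ),
  vname_raw = c( "DemandNextYear", "Q99", "Staff#1_1_1" ),
  stringsAsFactors = FALSE )

# Use the raw name wherever the cleaned name is missing.
dd_toy$vname <- dplyr::coalesce( dd_toy$vname, dd_toy$vname_raw )

dd_toy$vname
```

Both forms leave existing `vname` entries untouched and fill only the missing ones.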
# LOAD QUALTRICS SURVEY DATA
fpath     <- "DATA-PREP/02-year-two/01-data-raw/"
fname     <- "wave-02-qualtrics-download-29mar23.csv"
survey_df <- readr::read_csv( paste0( fpath, fname ) )
survey_df <- survey_df[ -(1:2), ]    # drop qualtrics headers

# KEEP ONLY CASES CODED AS COMPLETE OR PARTIAL_KEEP
fname     <- "YEAR-02-COMPLETE-CASE-CODES.csv"
cases     <- read.csv( paste0(fpath,fname) )
cases     <- dplyr::select( cases, EIN, Completion_Status )
survey_df <- merge( survey_df, cases, by="EIN", all.x=TRUE )

survey_df <-
  survey_df %>%
  dplyr::filter( Completion_Status %in% c("Complete","Partial_keep") ) %>%
  dplyr::select( - Completion_Status )
Note that the raw data are difficult to work with: they contain Qualtrics encodings, and missing values need context to be interpreted correctly (e.g., was a question skipped by skip logic or by the respondent?).
survey_df[ 1:6, 51:55 ] %>% pander::pander()   # data peek
PeopleServed#2_2_1 | DemandNextYear | Staff#1_1_1 | Staff#1_2_1 | Staff#1_3_1 |
---|---|---|---|---|
-99 | Increase | 60 | 30 | -99 |
-99 | Increase | 3 | 4 | 9 |
-99 | Increase | 3 | 1 | 10 |
-99 | Increase | 13 | 2 | 11 |
N/A | Increase | 1 | -99 | 14 |
N/A | Increase | 1 | 8 | 11 |
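Before recoding, it can help to tally how often each placeholder visible in the preview (`-99` and `N/A`) occurs in every column. The sketch below runs on a small made-up data frame (`toy`), not the survey file, and assumes both placeholders are stored as character strings. The helper `count_code()` is hypothetical, and what each placeholder means for a given question still has to come from the survey design.

```r
# Toy data mimicking the preview above; values are illustrative only.
toy <- data.frame(
  PeopleServed = c( "-99", "-99", "N/A", "12" ),
  Staff        = c( "60", "3", "-99", "1" ),
  stringsAsFactors = FALSE )

# Hypothetical helper: count occurrences of one placeholder per column.
count_code <- function( df, code ) {
  sapply( df, function(x) sum( x == code, na.rm = TRUE ) )
}

n_minus99 <- count_code( toy, "-99" )
n_na_text <- count_code( toy, "N/A" )

# One row per placeholder, one column per variable.
rbind( n_minus99, n_na_text )
```

Columns in which a placeholder never appears are candidates for simpler recoding rules.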
Data Workflow
The following chapters describe the workflow used to import Qualtrics data and apply the cleaning and transformation steps that prepare the restricted-use file and public-use file for subsequent analysis:
- Rename columns
- Drop nuisance columns (survey deployment attributes)
- Add meaningful labels to response values
- Drop duplicates, incomplete responses, and test responses
Renaming Columns
Columns referencing survey questions are renamed with the help of a data dictionary to improve readability.
torename <-
  dd %>%
  dplyr::select( vname, vname_raw ) %>%
  tidyr::drop_na()
Examples:
[1] "DemandNextYear" "Staff#1_1_1" "Staff#1_2_1" "Staff#1_3_1"
Give the data meaningful names so that it is easier to work with.
survey_df <-
  survey_df %>%
  dplyr::rename_at(
    vars( torename$vname_raw ),
    ~ torename$vname )
Examples:
[1] "Dmnd_NxtYear" "Staff_Fulltime_2021" "Staff_Parttime_2021"
[4] "Staff_Boardmmbr_2021"
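The same dictionary-driven renaming can be sketched with a named lookup vector passed to `dplyr::rename()` through `any_of()`, which skips dictionary entries that have no matching column. The example uses a made-up one-row data frame (`toy`); the `Not_In_Data` entry is hypothetical and is included only to show that unmatched names are ignored.

```r
library( dplyr )

# Toy data with two raw Qualtrics-style column names.
toy <- data.frame(
  "DemandNextYear" = "Increase",
  "Staff#1_1_1"    = "60",
  check.names = FALSE,
  stringsAsFactors = FALSE )

# Named vector in the form: new_name = "old_name".
lookup <- c(
  Dmnd_NxtYear        = "DemandNextYear",
  Staff_Fulltime_2021 = "Staff#1_1_1",
  Not_In_Data         = "Missing#col" )   # hypothetical, absent from toy

# any_of() renames the columns that exist and ignores the rest.
toy2 <- dplyr::rename( toy, dplyr::any_of( lookup ) )

names( toy2 )
```

With the real dictionary, a lookup of this shape could be built as `setNames( torename$vname_raw, torename$vname )`.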
Drop Nuisance Fields
Many of the exported Qualtrics fields are empty or contain metadata that is not useful for analysis. These are labeled “DROP” in the group field of the data dictionary. Remove them for convenience.
# SELECT COLUMNS TO DROP:
DROP_THESE <- dd$vname[ dd$group == "DROP" ] |> na.omit()

survey_df <-
  survey_df %>%
  dplyr::select( -any_of( DROP_THESE ) )
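Because `any_of()` silently ignores names that are not present, it can be useful to check which flagged columns are actually removed. A minimal sketch on made-up data follows; the column names `StartDate` and `IPAddress` are assumptions chosen for illustration, not the contents of the real drop list.

```r
library( dplyr )

# Toy survey with one metadata column and two columns to keep.
toy <- data.frame(
  EIN          = "EIN-01",
  StartDate    = "2023-03-01",
  Dmnd_NxtYear = "Increase",
  stringsAsFactors = FALSE )

# Hypothetical drop list; IPAddress is not present in toy.
drop_these <- c( "StartDate", "IPAddress" )

dropped   <- intersect( drop_these, names( toy ) )  # will be removed
not_found <- setdiff( drop_these, names( toy ) )    # silently ignored

toy2 <- dplyr::select( toy, -dplyr::any_of( drop_these ) )

names( toy2 )
```

A non-empty `not_found` may indicate a dictionary entry that no longer matches the export.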
Add Survey Weights
# ADD SURVEY WEIGHTS
fpath <- "DATA-PREP/00-sample-framework/"
fname <- "year2wt.csv"
wt2   <- readr::read_csv( paste0( fpath, fname ) )

survey_df <- merge( survey_df, wt2, by.x="EIN", by.y="ein", all.x=TRUE )
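A left merge like this can add rows if the weights file repeats a key, and it leaves missing weights for respondents without a match. The sketch below shows three quick checks on made-up data; the column name `weight` and the EIN values are assumptions, since the layout of the weights file is not shown here.

```r
# Toy survey and weights tables (illustrative values only).
survey_toy <- data.frame(
  EIN          = c( "EIN-01", "EIN-02", "EIN-03" ),
  Dmnd_NxtYear = c( "Increase", "Decrease", "Increase" ),
  stringsAsFactors = FALSE )

wt_toy <- data.frame(
  ein    = c( "EIN-01", "EIN-02" ),
  weight = c( 1.2, 0.8 ),          # hypothetical weight column
  stringsAsFactors = FALSE )

merged <- merge( survey_toy, wt_toy, by.x = "EIN", by.y = "ein", all.x = TRUE )

# 1. The merge should not change the number of survey rows.
same_rows <- nrow( merged ) == nrow( survey_toy )

# 2. Respondents with no matching weight.
unmatched <- merged$EIN[ is.na( merged$weight ) ]

# 3. Duplicate keys in the weights table would multiply rows.
dup_keys <- anyDuplicated( wt_toy$ein ) > 0
```

It is also worth confirming that both key columns have the same type, since one file is read with `readr::read_csv()` and the keys may be parsed as numbers in one table and text in the other.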
Groups of Variables
Each group of survey questions has its own set of valid inputs and must be recoded separately. For example, “N/A” is an option for some survey questions but not for others, and some survey questions allow free-text entries.
The code chunk below separates the survey questions into their respective categories, then further separates each category into numeric, text, or NA inputs.
NA questions here are the “Check here if not applicable to your organization” questions in the survey, where a “C” indicates that the respondent checked the N/A box.
- 15 questions about CHANGES TO PROGRAMS AND SERVICES
- 4 questions about the NUMBER OF PEOPLE EACH ORGANIZATION SERVES
- 1 question about OVERALL PROGRAM DEMAND
- 27 questions about STAFF NUMBERS
- 2 questions about DONOR AND VOLUNTEER IMPORTANCE
- 11 questions about CHANGES TO LEADERSHIP
- 26 questions about THE RACE AND GENDER OF CEOS AND BOARD CHAIRS
- 8 questions about CHANGES TO ORGANIZATIONAL FINANCES
- 2 questions about CARES FUNDING
- 2 questions about FINANCIAL RESERVES
- 9 questions about REVENUE SOURCES
- 26 questions about FUNDRAISING SOURCES
- 2 questions about DONOR TYPES IN FUNDRAISING
- 7 questions about FUNDRAISING YIELDS
- 11 questions about FUNDRAISING STRATEGY CHANGES
- 1 question about MAJOR GIFT AMOUNTS
- 13 questions about EXTERNAL AFFAIRS
- 1 question about FUTURE CONCERNS
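Counts like those listed above can be obtained by splitting the dictionary's variable names on its group field. The sketch below uses a made-up three-row dictionary (`dd_toy`); the group labels `Demand` and `Staff` are assumptions, as the only group value confirmed in this chapter is “DROP”.

```r
# Toy data dictionary with hypothetical group labels.
dd_toy <- data.frame(
  vname = c( "Dmnd_NxtYear", "Staff_Fulltime_2021", "Staff_Parttime_2021" ),
  group = c( "Demand", "Staff", "Staff" ),
  stringsAsFactors = FALSE )

# Named list: one character vector of variable names per group.
vars_by_group <- split( dd_toy$vname, dd_toy$group )

# Number of questions in each group.
lengths( vars_by_group )
```

Each element of `vars_by_group` can then be passed to `dplyr::select( all_of( ... ) )` to pull one block of questions at a time.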