Function to summarize NCCS data. — preview

Computes summary statistics for data pulled with get_data().

Usage

preview_sample(
  data,
  group_by,
  var,
  stats,
  ntee = NULL,
  ntee.group = NULL,
  ntee.code = NULL,
  ntee.orgtype = NULL,
  geo.state = NULL,
  geo.city = NULL,
  geo.region = NULL,
  geo.county = NULL
)

Arguments

data: data.frame or data.table. In-memory dataset to summarize.
group_by: character vector. Columns to group by, passed to dplyr::group_by().
var: character scalar. Column on which the summary statistics are computed.
stats: character vector. Summary statistics to compute with dplyr::summarise(). Available options are "count", "min", "max", "median" and "mean".
ntee: character vector. NTEE inputs supplied by the user. The inputs are progressively filtered until groups, codes and organization types are sorted into separate vectors.
ntee.group: character vector. Specific Industry Group codes submitted by user.
ntee.code: character vector. Specific level 2-4 codes (Industry, Division, Subdivision) submitted by user.
ntee.orgtype: character vector. Specific level 5 codes (Organization Type) submitted by user.
geo.state: character vector. State abbreviations to filter by, e.g. "NY", "CA". The default, NULL, includes all states.
geo.city: character vector. City names for filtering, e.g. "Chicago", "montgomery". Case insensitive.
geo.region: character vector. Regions for filtering, e.g. "South", "Midwest", based on Census region classifications.
geo.county: character vector. County names for filtering e.g. "cullman", "dale". Case insensitive.
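
The geographic filters above can be pictured with a small base R sketch. This is an illustration under stated assumptions, not the package's implementation: the toy data frame, the column names STATE and CITY, and the helper variables geo_state, geo_city and keep are hypothetical. It shows a NULL filter being skipped and city names being matched without regard to case.

```r
# Hypothetical toy data; the column names STATE and CITY are assumptions
toy <- data.frame(
  STATE = c("NY", "AL", "AL", "IL"),
  CITY  = c("NEW YORK", "MONTGOMERY", "Cullman", "Chicago"),
  stringsAsFactors = FALSE
)

geo_state <- NULL                        # NULL means "do not filter on state"
geo_city  <- c("Chicago", "montgomery")  # matched case-insensitively

# Start by keeping every row, then narrow with each non-NULL filter
keep <- rep(TRUE, nrow(toy))
if (!is.null(geo_state)) {
  keep <- keep & toy$STATE %in% toupper(geo_state)
}
if (!is.null(geo_city)) {
  keep <- keep & tolower(toy$CITY) %in% tolower(geo_city)
}

filtered <- toy[keep, ]
```

Here filtered keeps the Montgomery and Chicago rows, since both city names match once case is ignored and no state filter is applied.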

Value

A data.frame with the summary statistics computed for each group.

Examples

if (FALSE) {
# Pull the 2005 core dataset
core <- get_data(dsname = "core",
                 time = "2005")
# Count, mean and max of total revenue by NTEE code and state
preview_sample(data = core,
               group_by = c("NTEECC", "STATE"),
               var = "TOTREV",
               stats = c("count", "mean", "max"))
}
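
For readers who want to see the kind of grouped summary involved, the following is a minimal dplyr sketch on a made-up data frame. It assumes that "count", "mean" and "max" correspond to n(), mean() and max(); the toy values, the helper names group_cols and value_col, and the output column names are hypothetical and may differ from what preview_sample() returns.

```r
library(dplyr)

# Toy stand-in for a get_data() result; values are invented for illustration
toy <- data.frame(
  NTEECC = c("A20", "A20", "B40", "B40", "B40"),
  STATE  = c("NY", "NY", "CA", "CA", "NY"),
  TOTREV = c(100, 300, 50, 150, 400)
)

group_cols <- c("NTEECC", "STATE")  # plays the role of group_by
value_col  <- "TOTREV"              # plays the role of var

summary_tbl <- toy %>%
  # Group by columns given as a character vector
  group_by(across(all_of(group_cols))) %>%
  # Compute one row of statistics per group; output names are arbitrary
  summarise(
    n_rows     = n(),
    mean_value = mean(.data[[value_col]]),
    max_value  = max(.data[[value_col]]),
    .groups    = "drop"
  )
```

The result has one row per NTEECC and STATE combination present in the data, three rows for this toy input.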