
Computes summary statistics for data pulled with get_data().

Usage

preview_sample(
  data,
  group_by,
  var,
  stats,
  ntee = NULL,
  ntee.group = NULL,
  ntee.code = NULL,
  ntee.orgtype = NULL,
  geo.state = NULL,
  geo.city = NULL,
  geo.region = NULL,
  geo.county = NULL
)

Arguments

data

data.frame or data.table. In-memory dataset to summarize.

group_by

character vector. Columns to group by, passed to dplyr::group_by().

var

character scalar. Column for which the summary statistics are calculated.

stats

character vector. Summary statistics to compute with dplyr::summarise(). Available options are "count", "min", "max", "median" and "mean".

ntee

character vector. NTEE inputs supplied by the user. The inputs are progressively filtered until group, code and organization type values are sorted into separate vectors.

ntee.group

character vector. Specific Industry Group codes submitted by the user.

ntee.code

character vector. Specific level 2-4 codes (Industry, Division, Subdivision) submitted by the user.

ntee.orgtype

character vector. Specific level 5 codes (Organization Type) submitted by the user.

geo.state

character vector. State abbreviations to filter by, e.g. "NY", "CA". The default, NULL, includes all states.

geo.city

character vector. City names for filtering, e.g. "Chicago", "montgomery". Case insensitive.

geo.region

character vector. Regions for filtering, e.g. "South", "Midwest", based on Census region classifications.

geo.county

character vector. County names for filtering, e.g. "cullman", "dale". Case insensitive.
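
The city and county filters are described as case insensitive, and a NULL filter as including everything. The base R sketch below shows one way such matching could work on a toy data frame. The helper filter_ci() and the column names CITY and STATE are hypothetical and used only for illustration; this is not necessarily how preview_sample() applies its filters internally.

```r
# Toy data; the column names CITY and STATE are hypothetical stand-ins.
toy <- data.frame(
  CITY   = c("Chicago", "MONTGOMERY", "Albany", "chicago"),
  STATE  = c("IL", "AL", "NY", "IL"),
  TOTREV = c(100, 250, 75, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose `column` value matches any of
# `values`, ignoring case. A NULL filter keeps every row.
filter_ci <- function(df, column, values = NULL) {
  if (is.null(values)) {
    return(df)
  }
  keep <- tolower(df[[column]]) %in% tolower(values)
  df[keep, , drop = FALSE]
}

chicago_rows <- filter_ci(toy, "CITY", c("chicago"))  # matches both spellings
all_rows     <- filter_ci(toy, "STATE", NULL)         # NULL keeps all rows
```

Lower-casing both sides before comparing is what lets "Chicago" and "chicago" match the same input.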

Value

data.frame with summary statistics computed for each group.
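
To give a feel for the shape of the result, here is a minimal dplyr sketch of a grouped summary on made-up data. It assumes the returned table has one row per combination of the grouping columns and one column per requested statistic. The output column names (count, mean, max) and the toy values are assumptions for illustration, not the documented output of preview_sample().

```r
library(dplyr)

# Toy data standing in for the output of get_data(); values are made up.
toy <- data.frame(
  NTEECC = c("A20", "A20", "B40", "B40", "B40"),
  STATE  = c("NY", "NY", "NY", "CA", "CA"),
  TOTREV = c(10, 30, 5, 20, 40),
  stringsAsFactors = FALSE
)

group_cols <- c("NTEECC", "STATE")  # plays the role of `group_by`
value_col  <- "TOTREV"              # plays the role of `var`

toy_summary <- toy %>%
  # Group by columns named in a character vector.
  group_by(across(all_of(group_cols))) %>%
  # One summary column per statistic; output names are assumed.
  summarise(
    count = n(),
    mean  = mean(.data[[value_col]]),
    max   = max(.data[[value_col]]),
    .groups = "drop"
  )
```

With this toy input, toy_summary has three rows, one for each NTEECC and STATE pair present in the data.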

Examples

if (FALSE) {
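# Pull the 2005 core dataset into memory
core <- get_data(dsname = "core",
                 time = "2005")
# Count, mean and max of total revenue by NTEE code and state
preview_sample(data = core,
               group_by = c("NTEECC", "STATE"),
               var = "TOTREV",
               stats = c("count", "mean", "max"))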
}