Overview

After data has been retrieved with get_data(), preview_sample() computes summary statistics for subgroups of that data and prints the results as a table.

This vignette walks through an example of preview_sample().

Pulling Data and Computing Summary Statistics

core <- get_data(dsname = "core",
                 time = "2015")
#> Valid inputs detected. Retrieving data.
#> Downloading core data
#> Requested files have a total size of 115 MB. Proceed
#>                       with download? Enter Y/N (Yes/no/cancel)
#> Core data downloaded
preview_sample(data = core,
               group_by = c("NTEECC", "STATE"),
               var = c("TOTREV"),
               stats = c("count", "mean", "max"))
#> Valid summary fields entered.
#> # A tibble: 13,091 × 5
#> # Groups:   NTEECC [937]
#>    NTEECC STATE count    mean       max
#>    <chr>  <chr> <int> <int64>   <int64>
#>  1 ""     ""      406 1769225 375740413
#>  2 "A01"  "AZ"      2   41647     73295
#>  3 "A01"  "CA"     13 1052177   9241479
#>  4 "A01"  "CO"      2  268455    319830
#>  5 "A01"  "CT"      2  228350    415503
#>  6 "A01"  "DC"      5  446664   1117827
#>  7 "A01"  "DE"      1  268308    268308
#>  8 "A01"  "FL"      2 1181261   1713932
#>  9 "A01"  "GA"      3   64731    109254
#> 10 "A01"  "HI"      3   15371     29528
#> # ℹ 13,081 more rows

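preview_sample() groups the data set by the columns named in group_by and computes the statistics requested in stats for the column named in var. In the output above, each row is one NTEECC and STATE combination; the first row, with empty strings in both columns, collects records that have no value for either field. The available summary statistics are: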

  • min: minimum value
  • median: median value
  • max: maximum value
  • mean: mean value
  • count: count of rows belonging to group
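
To make the grouping step concrete, the sketch below reproduces the same kind of summary on a small made-up data frame using base R only. It is an illustration of the idea, not the package's implementation: the toy data frame, its values, and the summary_tbl name are assumptions introduced here, and preview_sample() may compute its results differently internally.

```r
# Toy data standing in for the core data set; all values are invented.
toy <- data.frame(
  NTEECC = c("A01", "A01", "A01", "B20", "B20"),
  STATE  = c("CA", "CA", "AZ", "CA", "CA"),
  TOTREV = c(100, 300, 50, 10, 30),
  stringsAsFactors = FALSE
)

# Compute count, mean, and max of TOTREV for every NTEECC/STATE group.
summary_tbl <- aggregate(
  TOTREV ~ NTEECC + STATE,
  data = toy,
  FUN = function(x) c(count = length(x), mean = mean(x), max = max(x))
)

# aggregate() stores the three statistics in one matrix column;
# flatten it so each statistic becomes its own column.
summary_tbl <- do.call(data.frame, summary_tbl)
names(summary_tbl) <- c("NTEECC", "STATE", "count", "mean", "max")

# Sort by the grouping columns, as in the preview_sample() output.
summary_tbl <- summary_tbl[order(summary_tbl$NTEECC, summary_tbl$STATE), ]
print(summary_tbl)
```

Swapping the function body for min(x) or median(x) gives the remaining statistics in the list above.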