Overview

After retrieving data with get_data(), use preview_sample() to compute summary statistics for subgroups of that data and print the results as a table.

In this vignette, we will provide examples of preview_sample().

Pulling Data and Computing Summary Statistics

core <- get_data(dsname = "core",
                 time = "2015")
#> Valid inputs detected. Retrieving data.
#> Downloading core data
#> Requested files have a total size of 115 MB. Proceed
#>                       with download? Enter Y/N (Yes/no/cancel)
#> Core data downloaded
preview_sample(data = core,
               group_by = c("NTEECC", "STATE"),
               var = c("TOTREV"),
               stats = c("count", "mean", "max"))
#> Valid summary fields entered.
#> # A tibble: 13,091 × 5
#> # Groups:   NTEECC [937]
#>    NTEECC STATE count     mean     max
#>    <chr>  <chr> <int>    <dbl>   <dbl>
#>  1 A01    AZ        2   41648.   73295
#>  2 A01    CA       13 1052178. 9241479
#>  3 A01    CO        2  268456.  319830
#>  4 A01    CT        2  228350.  415503
#>  5 A01    DC        5  446665. 1117827
#>  6 A01    DE        1  268308   268308
#>  7 A01    FL        2 1181261  1713932
#>  8 A01    GA        3   64731   109254
#>  9 A01    HI        3   15371.   29528
#> 10 A01    IA        2   47986.   78500
#> # ℹ 13,081 more rows

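preview_sample() groups the data set by the columns named in group_by and computes the requested summary statistics for the column named in var. In the example above, each row of the output is one NTEECC and STATE combination, with the count, mean, and max of TOTREV for that group. The stats argument accepts the following values: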

  • min: minimum value
  • median: median value
  • max: maximum value
  • mean: mean value
  • count: count of rows belonging to group
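
To make the grouping logic concrete, the sketch below computes the same kind of summary in base R on a small made-up data frame. It is not the implementation of preview_sample(); the toy data frame, its values, and the use of aggregate() are assumptions chosen only for illustration.

```r
# Toy data standing in for the core data set; all values are made up.
toy <- data.frame(
  NTEECC = c("A01", "A01", "A01", "B02"),
  STATE  = c("AZ", "AZ", "CA", "CA"),
  TOTREV = c(100, 300, 50, 10)
)

# Group by NTEECC and STATE, then compute count, mean, and max of TOTREV.
summary_tbl <- aggregate(
  TOTREV ~ NTEECC + STATE,
  data = toy,
  FUN = function(x) c(count = length(x), mean = mean(x), max = max(x))
)

# aggregate() stores the three statistics in a single matrix column;
# do.call(data.frame, ...) spreads them into ordinary columns.
summary_tbl <- do.call(data.frame, summary_tbl)
names(summary_tbl) <- c("NTEECC", "STATE", "count", "mean", "max")

print(summary_tbl)
```

Each row of summary_tbl corresponds to one NTEECC and STATE combination, mirroring the layout of the preview_sample() output shown earlier.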