15  CBSA Crosswalk

The CBSA crosswalk maps each county to its Core-Based Statistical Area — the OMB-defined metropolitan or micropolitan statistical area it belongs to. It is derived from the County FIPS Crosswalk: CBSAs are unions of whole counties, so a county GEOID is the natural bridge between a geocoded nonprofit and its metro area.

15.1 The two-hop chain

raw geocoder label  ──county_fips_crosswalk──▶  county_fips
                    ──cbsa_crosswalk──────────▶  CBSA (code, title, metro/micro)

Each crosswalk does one job and consumers compose them (ADR 0016, 12). Nothing in the Master BMF carries CBSA columns; this is an optional join layer.

15.2 Source & vintage

Membership comes from the authoritative OMB July-2023 delineation (Census “List 1”), not a spatial guess. It gives the official CBSA code, title, metropolitan/micropolitan designation, central/outlying status, and the Combined Statistical Area (CSA), all keyed by county FIPS. Crucially, the July-2023 delineation uses the same geography vintage as the TIGER-2023 county crosswalk, including the Connecticut planning regions (09110–09190), so every county GEOID joins cleanly.

15.3 Schema

One row per county GEOID in the county crosswalk’s resolved universe. CBSA columns are NA for rural counties that belong to no CBSA.

| Column | Type | Description |
| --- | --- | --- |
| county_fips | chr | 5-char county GEOID; join key to the county crosswalk’s geo_county_fips |
| county_name | chr | Census NAMELSAD, carried from the county crosswalk |
| cbsa_code | chr | 5-char CBSA code (NA if rural) |
| cbsa_title | chr | e.g. Hartford-West Hartford-East Hartford, CT |
| cbsa_type | chr | Metropolitan Statistical Area or Micropolitan Statistical Area |
| central_outlying | chr | Central or Outlying county within the CBSA |
| csa_code | chr | 3-char Combined Statistical Area code (NA if none) |
| csa_title | chr | CSA title (NA if none) |
| delineation_year | int | OMB delineation vintage (2023) |

county_fips, cbsa_code, and csa_code are strings — leading zeros are significant (09110 CT, 01001 AL).
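If a key column has already lost its zeros, for example because a CSV reader guessed a numeric type, the join to the crosswalk will quietly miss those rows. A minimal base-R sketch for restoring the padding is below; `pad_code` is a hypothetical helper for illustration, not part of the published scripts.

```r
# Hypothetical helper (not part of the published scripts): rebuild a
# zero-padded character key from values that were read as numbers.
pad_code <- function(x, width) {
  out <- formatC(as.integer(x), width = width, flag = "0", format = "d")
  out[is.na(x)] <- NA_character_  # keep missing keys missing, not "NA" text
  out
}

county_fips <- pad_code(c(9110, 1001, NA), width = 5)
county_fips  # "09110" "01001" NA
```

When reading the published CSV it is usually simpler to declare the key columns as character up front (for instance via `colClasses` in `read.csv`), so the padding is never lost in the first place.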

15.4 How it is built

scripts/build_cbsa_crosswalk.R
  1. Take the distinct resolved geo_county_fips from county_fips_crosswalk.parquet (the county universe + canonical names).
  2. Download the OMB List 1 delineation xlsx, parse county→CBSA membership (county FIPS = 2-char state + 3-char county, leading zeros preserved).
  3. Left-join membership onto the county universe; counties absent from the delineation are rural (NA CBSA), made explicit rather than dropped.
Rscript scripts/build_cbsa_crosswalk.R   # DELINEATION_YEAR=2023 by default
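The three build steps can be sketched in base R on toy data. This is an illustration under stated assumptions, not the contents of scripts/build_cbsa_crosswalk.R: the rows are illustrative rather than read from the published files, the column names on the delineation side (`state_code`, `county_code`) are hypothetical, and the download and xlsx parsing are replaced by an in-memory data frame.

```r
# Step 1: county universe (toy rows standing in for the distinct
# geo_county_fips and names from county_fips_crosswalk.parquet).
county_universe <- data.frame(
  county_fips = c("01001", "09110", "30069"),
  county_name = c("Autauga County", "Capitol Planning Region", "Petroleum County"),
  stringsAsFactors = FALSE
)

# Step 2: parsed delineation rows. Column names are assumptions; the codes
# are numeric here to mimic a reader that dropped the leading zeros.
delineation <- data.frame(
  state_code  = c(1, 9),
  county_code = c(1, 110),
  cbsa_code   = c("33860", "25540"),
  cbsa_title  = c("Montgomery, AL", "Hartford-West Hartford-East Hartford, CT"),
  stringsAsFactors = FALSE
)
# County FIPS = 2-char state + 3-char county, zeros restored.
delineation$county_fips <- paste0(
  formatC(as.integer(delineation$state_code),  width = 2, flag = "0", format = "d"),
  formatC(as.integer(delineation$county_code), width = 3, flag = "0", format = "d")
)

# Step 3: left join; counties absent from the delineation keep NA CBSA columns.
cbsa_xwalk <- merge(
  county_universe,
  delineation[, c("county_fips", "cbsa_code", "cbsa_title")],
  by = "county_fips", all.x = TRUE
)
cbsa_xwalk$delineation_year <- 2023L

# Audit: delineation counties that the universe does not contain.
missing_from_universe <- setdiff(delineation$county_fips, county_universe$county_fips)
```

The left join is what keeps rural counties as explicit NA rows; an inner join would drop them and make "rural" indistinguishable from "not in the universe".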

15.5 Coverage (OMB 2023)

Of the 3,228 county GEOIDs in the universe (resolved county-crosswalk GEOIDs plus the nine CT planning regions folded in from the CT planning-region companion):

| Bucket | Counties | CBSAs |
| --- | --- | --- |
| Metropolitan Statistical Area | 1,252 | |
| Micropolitan Statistical Area | 663 | |
| In a CBSA | 1,915 | 935 |
| Rural (no CBSA) | 1,313 | |

The universe includes the nine CT planning regions because their old-county labels are deferred_ct_planning_region in the county crosswalk (they resolve by coordinate, not by name — see the CT companion). Folding the companion’s GEOIDs in here is what lets the full chain raw label → coordinate → planning region → CBSA complete; without it the four CT metros/micros (Bridgeport-Stamford-Danbury, Putnam, Torrington, Waterbury-Shelton) would dead-end. As a result cbsa_crosswalk_audit.csv now records zero delineation counties absent from the BMF universe — only the rural tally remains.
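A tally of this kind can be reproduced from any crosswalk-shaped data frame. The sketch below uses base R and a toy input with made-up codes; `coverage_summary` is a hypothetical helper, and the only columns it assumes are cbsa_code and cbsa_type from the schema.

```r
# Sketch: tally coverage buckets from a crosswalk-shaped data frame.
coverage_summary <- function(xwalk) {
  in_cbsa <- !is.na(xwalk$cbsa_code)
  bucket  <- ifelse(in_cbsa, xwalk$cbsa_type, "Rural (no CBSA)")
  list(
    counties_by_bucket = table(bucket),
    counties_in_cbsa   = sum(in_cbsa),
    distinct_cbsas     = length(unique(xwalk$cbsa_code[in_cbsa]))
  )
}

# Toy input with made-up codes, not the published file.
toy <- data.frame(
  county_fips = c("00001", "00002", "00003", "00004"),
  cbsa_code   = c("10000", "10000", "20000", NA),
  cbsa_type   = c("Metropolitan Statistical Area", "Metropolitan Statistical Area",
                  "Micropolitan Statistical Area", NA),
  stringsAsFactors = FALSE
)
tally <- coverage_summary(toy)

# The published counts are consistent under the same arithmetic.
stopifnot(1252 + 663 == 1915, 1915 + 1313 == 3228)
```

Counties are counted per row, while CBSAs are counted as distinct codes, which is why the table reports 1,915 counties but 935 CBSAs.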

15.6 How to use it

Published to s3://nccsdata/crosswalks/cbsa/ (parquet + csv + _manifest.json). Chain it after the county crosswalk:

library(dplyr); library(arrow)
county_xwalk <- read_parquet("county_fips_crosswalk.parquet")  # chr keys
cbsa_xwalk   <- read_parquet("cbsa_crosswalk.parquet")

# Keep the joined table: the filters further down reuse bmf_joined.
bmf_joined <- bmf_geo |>
  left_join(county_xwalk, by = c("geo_state_abbr", "geo_county" = "geo_county_raw")) |>
  left_join(cbsa_xwalk,   by = c("geo_county_fips" = "county_fips"))
# cbsa_code / cbsa_title / cbsa_type now attached; NA = rural or unresolved county

bmf_joined |> count(cbsa_title, sort = TRUE)

Filter to metro areas only, or roll nonprofits up to a CSA:

metro <- bmf_joined |> filter(cbsa_type == "Metropolitan Statistical Area")
by_csa <- bmf_joined |> filter(!is.na(csa_code)) |> count(csa_title, sort = TRUE)

An NA cbsa_code means either that the county is rural or that the county label was unresolved upstream. Treat the two cases together (no CBSA) unless you need to distinguish them via the county crosswalk’s resolution.
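When the distinction does matter, one way to recover it is to look at geo_county_fips after the joins: a missing GEOID points to an unresolved label, while a present GEOID with a missing cbsa_code points to a rural county. The sketch below assumes exactly that reading; `cbsa_status` and its labels are hypothetical.

```r
# Hypothetical helper: classify joined rows by why cbsa_code is (or is not) NA.
cbsa_status <- function(geo_county_fips, cbsa_code) {
  ifelse(
    is.na(geo_county_fips), "unresolved county",
    ifelse(is.na(cbsa_code), "rural (no CBSA)", "in a CBSA")
  )
}

# Toy values for illustration only.
status <- cbsa_status(
  geo_county_fips = c(NA, "30069", "09110"),
  cbsa_code       = c(NA, NA, "25540")
)
status  # "unresolved county" "rural (no CBSA)" "in a CBSA"
```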

Note: Connecticut routes through the companion

The county-label join above yields NA for every CT <name> County row — those labels are deferred_ct_planning_region, because a retired CT county splits across several planning regions. To attach a CBSA to CT nonprofits, recover the planning-region geo_county_fips by coordinate first, via the CT planning-region companion, then join that to cbsa_crosswalk exactly as above.
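A rough sketch of that routing, in base R on toy data. The companion's actual interface is not described here, so everything about it is an assumption: `record_id` is a hypothetical row identifier, and `ct_region_fips` is a hypothetical column holding the planning-region GEOID the companion recovered from coordinates.

```r
# Toy records after the county-label join: the CT row has no geo_county_fips.
bmf_joined <- data.frame(
  record_id       = c("A", "B"),        # hypothetical identifiers
  geo_county_fips = c("01001", NA),
  stringsAsFactors = FALSE
)

# Hypothetical companion output: one planning-region GEOID per CT record.
ct_regions <- data.frame(
  record_id = "B", ct_region_fips = "09110", stringsAsFactors = FALSE
)

# Fill the missing GEOID from the companion, then drop the helper column.
with_ct <- merge(bmf_joined, ct_regions, by = "record_id", all.x = TRUE)
with_ct$geo_county_fips <- ifelse(
  is.na(with_ct$geo_county_fips), with_ct$ct_region_fips, with_ct$geo_county_fips
)
with_ct$ct_region_fips <- NULL

# Toy crosswalk rows; the CBSA join is then the same as for any other county.
cbsa_xwalk <- data.frame(
  county_fips = c("01001", "09110"),
  cbsa_code   = c("33860", "25540"),
  stringsAsFactors = FALSE
)
result <- merge(with_ct, cbsa_xwalk,
                by.x = "geo_county_fips", by.y = "county_fips", all.x = TRUE)
```

Filling geo_county_fips before the CBSA join keeps a single join path for every state, so downstream code does not need a Connecticut special case.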

15.7 Maintenance

Keyed to the OMB delineation vintage (delineation_year). Rebuild and re-publish (R/publish_cbsa_crosswalk.R, idempotent on sha256) when OMB issues a new delineation or when the county crosswalk’s resolved universe changes. Keep the two crosswalks on the same geography vintage (TIGER year ↔︎ OMB delineation year) so county GEOIDs continue to match.
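Before re-publishing, a quick alignment check can catch a vintage mismatch early. The sketch below is an assumption about what such a check might look like, not a description of R/publish_cbsa_crosswalk.R; `check_alignment` and its arguments are hypothetical.

```r
# Hypothetical pre-publish check: same vintage, and every resolved county
# GEOID has a row in the CBSA crosswalk.
check_alignment <- function(county_geoids, cbsa_xwalk, tiger_year) {
  list(
    same_vintage     = all(cbsa_xwalk$delineation_year == tiger_year),
    counties_missing = setdiff(county_geoids, cbsa_xwalk$county_fips)
  )
}

# Toy inputs for illustration only.
toy_xwalk <- data.frame(
  county_fips      = c("01001", "09110"),
  delineation_year = 2023L,
  stringsAsFactors = FALSE
)
ok  <- check_alignment(c("01001", "09110"), toy_xwalk, tiger_year = 2023L)
bad <- check_alignment(c("01001", "09001"), toy_xwalk, tiger_year = 2022L)
```

In the second call the retired-county GEOID 09001 has no crosswalk row and the years differ, which is the kind of drift the same-vintage rule is meant to prevent.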