15 CBSA Crosswalk
The CBSA crosswalk maps each county to its Core-Based Statistical Area — the OMB-defined metropolitan or micropolitan statistical area it belongs to. It is derived from the County FIPS Crosswalk: CBSAs are unions of whole counties, so a county GEOID is the natural bridge between a geocoded nonprofit and its metro area.
15.1 The two-hop chain
raw geocoder label ──county_fips_crosswalk──▶ county_fips
──cbsa_crosswalk──────────▶ CBSA (code, title, metro/micro)
Each crosswalk does one job and consumers compose them (ADR 0016, 12). Nothing in the Master BMF carries CBSA columns; this is an optional join layer.
15.2 Source & vintage
Membership comes from the authoritative OMB July-2023 delineation (Census “List 1”), not a spatial guess — it gives the official CBSA code, title, metropolitan/micropolitan designation, central/outlying status, and the Combined Statistical Area (CSA), all keyed by county FIPS. Crucially, the July-2023 delineation uses the same geography vintage as the TIGER-2023 county crosswalk — including the Connecticut planning regions (09110–09190) — so every county GEOID joins cleanly.
15.3 Schema
One row per county GEOID in the county crosswalk’s resolved universe. CBSA columns are NA for rural counties that belong to no CBSA.
| Column | Type | Description |
|---|---|---|
| county_fips | chr | 5-char county GEOID; join key to the county crosswalk’s geo_county_fips |
| county_name | chr | Census NAMELSAD, carried from the county crosswalk |
| cbsa_code | chr | 5-char CBSA code (NA if rural) |
| cbsa_title | chr | e.g. Hartford-West Hartford-East Hartford, CT |
| cbsa_type | chr | Metropolitan Statistical Area or Micropolitan Statistical Area |
| central_outlying | chr | Central or Outlying county within the CBSA |
| csa_code | chr | 3-char Combined Statistical Area code (NA if none) |
| csa_title | chr | CSA title (NA if none) |
| delineation_year | int | OMB delineation vintage (2023) |
county_fips, cbsa_code, and csa_code are strings; leading zeros are significant (09110 CT, 01001 AL).
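Because the keys are strings, a source that delivers FIPS parts as numbers has to be re-padded before it can join. The following is a minimal base-R sketch of that step; the helper name make_county_geoid is hypothetical and is not part of the crosswalk scripts.

```r
# Hypothetical helper: rebuild a 5-char county GEOID from numeric parts.
make_county_geoid <- function(state, county) {
  # %02d / %03d left-pad with zeros: 2-char state + 3-char county
  sprintf("%02d%03d", as.integer(state), as.integer(county))
}

al_key <- make_county_geoid(1, 1)    # the AL example key, 01001
ct_key <- make_county_geoid(9, 110)  # the CT example key, 09110

# Round-tripping a key through a numeric type drops the leading zero,
# which is why the crosswalk keeps these columns as character.
broken <- as.character(as.integer("09110"))
```

Here broken has only four characters and would no longer match county_fips.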
15.4 How it is built
scripts/build_cbsa_crosswalk.R
- Take the distinct resolved geo_county_fips from county_fips_crosswalk.parquet (the county universe + canonical names).
- Download the OMB List 1 delineation xlsx and parse county→CBSA membership (county FIPS = 2-char state + 3-char county, leading zeros preserved).
- Left-join membership onto the county universe; counties absent from the delineation are rural (NA CBSA), made explicit rather than dropped.
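The join step above can be sketched on toy data. This is an illustration under stated assumptions, not the contents of scripts/build_cbsa_crosswalk.R: the data frames, the 99999 county key, and the CBSA codes are placeholders invented for the example.

```r
library(dplyr)

# Toy county universe (stand-in for the distinct geo_county_fips rows).
# "99999" is a placeholder for a county that belongs to no CBSA.
universe <- data.frame(
  county_fips = c("01001", "09110", "99999")
)

# Toy membership (stand-in for the parsed OMB List 1 rows).
# The CBSA codes are placeholders, not real codes.
membership <- data.frame(
  county_fips = c("01001", "09110"),
  cbsa_code   = c("11111", "22222"),
  cbsa_type   = "Metropolitan Statistical Area"
)

# Left join keeps every county in the universe; unmatched counties
# get NA CBSA columns instead of being dropped.
cbsa_xwalk_toy <- universe |>
  left_join(membership, by = "county_fips") |>
  mutate(delineation_year = 2023L)
```

An inner join would silently drop the rural row, which is why the left join matters here.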
Rscript scripts/build_cbsa_crosswalk.R # DELINEATION_YEAR=2023 by default
15.5 Coverage (OMB 2023)
Of the 3,228 county GEOIDs in the universe (resolved county-crosswalk GEOIDs plus the nine CT planning regions folded in from the CT planning-region companion):
| Bucket | Counties | CBSAs |
|---|---|---|
| Metropolitan Statistical Area | 1,252 | |
| Micropolitan Statistical Area | 663 | |
| In a CBSA | 1,915 | 935 |
| Rural (no CBSA) | 1,313 | — |
The universe includes the nine CT planning regions because their old-county labels are deferred_ct_planning_region in the county crosswalk (they resolve by coordinate, not by name — see the CT companion). Folding the companion’s GEOIDs in here is what lets the full chain raw label → coordinate → planning region → CBSA complete; without it the four CT metros/micros (Bridgeport-Stamford-Danbury, Putnam, Torrington, Waterbury-Shelton) would dead-end. As a result cbsa_crosswalk_audit.csv now records zero delineation counties absent from the BMF universe — only the rural tally remains.
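A tally like the coverage table can be recomputed from the crosswalk itself. The sketch below runs on toy rows with placeholder keys (not real FIPS or CBSA codes); in practice xw would be the contents of cbsa_crosswalk.parquet.

```r
library(dplyr)

# Toy crosswalk rows with placeholder keys.
xw <- data.frame(
  county_fips = c("00001", "00002", "00003", "00004"),
  cbsa_code   = c("C1", "C1", "C2", NA),
  cbsa_type   = c("Metropolitan Statistical Area",
                  "Metropolitan Statistical Area",
                  "Micropolitan Statistical Area",
                  NA)
)

coverage <- xw |>
  # NA cbsa_type marks a county outside any CBSA
  mutate(bucket = coalesce(cbsa_type, "Rural (no CBSA)")) |>
  group_by(bucket) |>
  summarise(
    counties = n(),
    cbsas    = n_distinct(cbsa_code, na.rm = TRUE)
  )
```

The county counts across buckets should sum to the size of the universe, which is a cheap consistency check after each rebuild.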
15.6 How to use it
Published to s3://nccsdata/crosswalks/cbsa/ (parquet + csv + _manifest.json). Chain it after the county crosswalk:
library(dplyr); library(arrow)
county_xwalk <- read_parquet("county_fips_crosswalk.parquet") # chr keys
cbsa_xwalk <- read_parquet("cbsa_crosswalk.parquet")
# bmf_geo: the geocoded BMF rows, carrying geo_state_abbr and geo_county
bmf_joined <- bmf_geo |>
  left_join(county_xwalk, by = c("geo_state_abbr", "geo_county" = "geo_county_raw")) |>
  left_join(cbsa_xwalk, by = c("geo_county_fips" = "county_fips"))
# cbsa_code / cbsa_title / cbsa_type now attached; NA = rural or unresolved county
bmf_joined |> count(cbsa_title, sort = TRUE)
Filter to metro areas only, or roll nonprofits up to a CSA:
# bmf_joined: the BMF rows after both crosswalk joins
metro <- bmf_joined |> filter(cbsa_type == "Metropolitan Statistical Area")
by_csa <- bmf_joined |> filter(!is.na(csa_code)) |> count(csa_title, sort = TRUE)
An NA cbsa_code means either that the county is rural or that the county label was unresolved upstream. Handle the two together (no metro) unless you need to distinguish them via the county crosswalk’s resolution.
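If the two NA cases do need to be separated, one option is to look at the county key produced by the first join. The sketch below assumes that an unresolved label leaves geo_county_fips as NA after the county-crosswalk join; the toy data frame, the ein column, and the cbsa_status labels are hypothetical.

```r
library(dplyr)

# Toy stand-in for the joined BMF rows; CBSA code is a placeholder.
bmf_joined_toy <- data.frame(
  ein             = c("1", "2", "3"),
  geo_county_fips = c("01001", "99999", NA),
  cbsa_code       = c("11111", NA, NA)
)

bmf_flagged <- bmf_joined_toy |>
  mutate(cbsa_status = case_when(
    !is.na(cbsa_code)      ~ "in_cbsa",
    is.na(geo_county_fips) ~ "unresolved_county",  # first join found no county
    TRUE                   ~ "rural"               # county resolved, no CBSA
  ))
```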
The county-label join above yields NA for every CT <name> County row — those labels are deferred_ct_planning_region, because a retired CT county splits across several planning regions. To attach a CBSA to CT nonprofits, recover the planning-region geo_county_fips by coordinate first, via the CT planning-region companion, then join that to cbsa_crosswalk exactly as above.
15.7 Maintenance
Keyed to the OMB delineation vintage (delineation_year). Rebuild and re-publish (R/publish_cbsa_crosswalk.R, idempotent on sha256) when OMB issues a new delineation or when the county crosswalk’s resolved universe changes. Keep the two crosswalks on the same geography vintage (TIGER year ↔︎ OMB delineation year) so county GEOIDs continue to match.
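A quick way to confirm the two crosswalks still share a geography vintage is to look for keys that appear in one but not the other. This is a sketch on toy key tables, not part of R/publish_cbsa_crosswalk.R; in practice the two data frames would come from the published parquet files.

```r
library(dplyr)

# Toy key tables standing in for the two crosswalks.
county_keys <- data.frame(geo_county_fips = c("01001", "09110"))
cbsa_keys   <- data.frame(county_fips     = c("01001", "09110"))

# CBSA-crosswalk counties with no match in the county crosswalk.
orphans <- anti_join(
  cbsa_keys, county_keys,
  by = c("county_fips" = "geo_county_fips")
)

# Every key should be a 5-char string and every key should match.
stopifnot(all(nchar(cbsa_keys$county_fips) == 5))
stopifnot(nrow(orphans) == 0)
```

A non-empty orphans table after a rebuild would suggest the TIGER year and the OMB delineation year have drifted apart.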