CORE Data Pipeline Guide

Author

Thiyaghessan

Published

May 15, 2026

1 CORE Data Pipeline Guide

The nccs-data-core pipeline produces NCCS’s CORE Series: harmonized panels of select Form 990 / 990-EZ / 990-PF fields, built from IRS Statistics of Income (SOI) extracts (2012-present) and raw legacy NCCS files (1989-2011).

Note

TODO: Overview, audience, and how to navigate this book.

1.1 Outputs

The pipeline writes one CSV per (tax_year, form) pair, plus a data dictionary and a quality report for each output. The forms are:

  • 990: full 990 schedule, 990 filers only.
  • 990ez: full 990-EZ schedule, current-only (there is no pre-2012 source for this form).
  • 990pf: full 990-PF schedule, private foundations.
  • 990combined: 990 and 990-EZ rows stacked on the 53 shared harmonized columns.
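
To make "stacked on the shared columns" concrete, here is a minimal sketch of that operation in plain Python. It is an illustration under stated assumptions, not the pipeline's actual code: the column names (ein, tax_year, total_revenue, and the two form-specific fields), the source_form column, and the function name are all hypothetical, and the toy rows have 3 shared columns rather than 53.

```python
# Illustrative sketch only: column names and values below are made up.

def stack_on_shared_columns(rows_990, rows_990ez):
    """Stack two non-empty lists of row dicts, keeping only columns found in both."""
    cols_990 = list(rows_990[0].keys())
    cols_ez = set(rows_990ez[0].keys())
    # Keep the 990 column order for the columns both forms share.
    shared = [c for c in cols_990 if c in cols_ez]

    stacked = []
    for form, rows in (("990", rows_990), ("990ez", rows_990ez)):
        for row in rows:
            out = {c: row[c] for c in shared}
            out["source_form"] = form  # hypothetical provenance column
            stacked.append(out)
    return shared, stacked


# Toy inputs: each form has one column the other lacks.
rows_990 = [
    {"ein": "000000001", "tax_year": 2019, "total_revenue": 500000, "only_in_990": 7},
    {"ein": "000000002", "tax_year": 2019, "total_revenue": 1200000, "only_in_990": 11},
]
rows_990ez = [
    {"ein": "000000003", "tax_year": 2019, "total_revenue": 90000, "only_in_990ez": "x"},
]

shared, stacked = stack_on_shared_columns(rows_990, rows_990ez)
print(shared)
for row in stacked:
    print(row)
```

The form-specific columns are dropped rather than filled with missing values, so every row in the stacked result has the same fields regardless of which form it came from.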

1.2 Where to start