BMF Pipeline Quality Report

Post-Transformation Validation Results

Published: May 6, 2026

Executive Summary

Overall Status: PASSED

Pipeline Timestamp: 2026-05-06 16:33:33

Metric                Value
Total Records         1,788,300
Output Columns        110
Overall Completeness  59.2%
Row Preservation      Passed
Unique EINs           1,786,293
Duplicate EINs        2,007
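
The two EIN counts partition the record total (1,786,293 + 2,007 = 1,788,300), which suggests that "Duplicate EINs" counts rows beyond the first occurrence of each EIN. The Python sketch below illustrates that reading only; the function name `ein_uniqueness` and the sample values are hypothetical, and the pipeline's actual duplicate rule is not stated in this report.

```python
def ein_uniqueness(eins):
    """Return (total, unique, duplicate) counts for a sequence of EINs.

    Assumption: a duplicate is any row beyond the first occurrence of an
    EIN, so total == unique + duplicate.
    """
    total = len(eins)
    unique = len(set(eins))
    return total, unique, total - unique


# Hypothetical sample: "12-3456789" appears twice.
sample = ["12-3456789", "98-7654321", "12-3456789"]
print(ein_uniqueness(sample))  # (3, 2, 1)

# The report's own figures satisfy the same identity.
assert 1_786_293 + 2_007 == 1_788_300
```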

Field Completeness by Category

This section reports the completeness of each output field, grouped by category. Each field is listed with the source column(s) in the raw BMF file from which it is derived.
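
The completeness percentages in this section are consistent with the share of non-null values out of the 1,788,300 total records, rounded to one decimal place. The sketch below reproduces two reported figures from their null counts; the function name `completeness_pct` is hypothetical and the rounding rule is an assumption inferred from the reported values.

```python
def completeness_pct(null_count, total_records):
    """Percentage of non-null values, rounded to one decimal place."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return round(100.0 * (total_records - null_count) / total_records, 1)


TOTAL_RECORDS = 1_788_300  # from the Executive Summary

# Null counts taken from the tables in this report.
print(completeness_pct(1_127_932, TOTAL_RECORDS))  # org_legal_suffix -> 36.9
print(completeness_pct(148_455, TOTAL_RECORDS))    # org_addr_zip5    -> 91.7
```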

Fields with Incomplete Data

Identity

Average Completeness: 100.0% | Columns: 2

Field Source Completeness Status Null Count
ein EIN 100.0% Good 0
ein_raw EIN 100.0% Good 0

Column Descriptions:

  • ein: Employer Identification Number formatted as XX-XXXXXXX
  • ein_raw: Original 9-digit EIN value from source file without formatting
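
As an illustration of the relationship between `ein_raw` and `ein`, the following sketch formats a 9-digit value as XX-XXXXXXX. The function name `format_ein` is hypothetical, and rejecting any input that is not exactly nine digits is an assumption; the pipeline may pad or repair such values instead.

```python
import re


def format_ein(ein_raw):
    """Format a 9-digit EIN string as XX-XXXXXXX.

    Assumption: anything other than exactly nine digits is rejected
    rather than padded or repaired.
    """
    if not re.fullmatch(r"\d{9}", ein_raw):
        raise ValueError(f"expected 9 digits, got {ein_raw!r}")
    return f"{ein_raw[:2]}-{ein_raw[2:]}"


print(format_ein("123456789"))  # 12-3456789
```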

Organization Name

Average Completeness: 67.6% | Columns: 5

Field Source Completeness Status Null Count
org_name_raw NAME 100.0% Good 0
org_name_join NAME 100.0% Good 0
org_name_display NAME 100.0% Good 0
org_legal_suffix NAME 36.9% Low 1,127,932
org_parent_name NAME 1.1% Low 1,768,406
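
The 67.6% average reported for this category matches the unweighted mean of the five per-field completeness values in the table. The short check below shows the arithmetic; treating the category average as an unweighted mean is an inference from the reported numbers, not a documented rule.

```python
from statistics import mean

# Per-field completeness (%) for the Organization Name category, as reported.
org_name_completeness = {
    "org_name_raw": 100.0,
    "org_name_join": 100.0,
    "org_name_display": 100.0,
    "org_legal_suffix": 36.9,
    "org_parent_name": 1.1,
}

category_average = round(mean(list(org_name_completeness.values())), 1)
print(category_average)  # 67.6
```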

Column Descriptions:

  • org_name_raw: Original organization name exactly as it appears in the source file
  • org_name_join: Standardized name for matching and joining (uppercase, punctuation removed)
  • org_name_display: Title-cased organization name suitable for display purposes
  • org_legal_suffix: Legal entity suffix extracted from name (Inc, Corp, LLC, Foundation, etc.)
  • org_parent_name: Parent organization name if this is a subordinate/chapter organization
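
The descriptions of `org_name_join` and `org_name_display` can be sketched as follows. The function names and the sample value are hypothetical, and details such as whitespace collapsing are assumptions that this report does not specify.

```python
import re
import string


def to_join_name(name):
    """Uppercase, drop punctuation, and collapse whitespace for matching."""
    no_punct = re.sub(r"[^\w\s]", "", name.upper())
    return re.sub(r"\s+", " ", no_punct).strip()


def to_display_name(name):
    """Capitalize each whitespace-separated word for display."""
    return string.capwords(name)


raw = "ST. MARY'S  CHURCH, INC."  # hypothetical source value
print(to_join_name(raw))     # ST MARYS CHURCH INC
print(to_display_name(raw))  # St. Mary's Church, Inc.
```

`string.capwords` is used here instead of `str.title` because `str.title` would capitalize the letter after the apostrophe ("Mary'S").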

DBA Name

Average Completeness: 24.2% | Columns: 2

Field Source Completeness Status Null Count
dba_name SORT_NAME 24.2% Low 1,354,909
dba_name_raw SORT_NAME 24.2% Low 1,354,909

Column Descriptions:

  • dba_name: Cleaned ‘Doing Business As’ name
  • dba_name_raw: Original secondary/DBA name from source file

In Care Of

Average Completeness: 33.3% | Columns: 3

Field Source Completeness Status Null Count
in_care_of_name_raw ICO 0.0% Low 1,788,300
in_care_of_name_clean ICO 0.0% Low 1,788,300
in_care_of_name_provided ICO 100.0% Good 0

Column Descriptions:

  • in_care_of_name_raw: Original ‘In Care Of’ field from source file
  • in_care_of_name_clean: Cleaned ICO name with standardized formatting
  • in_care_of_name_provided: Boolean indicating whether an ICO name was provided
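
A boolean flag such as `in_care_of_name_provided` is defined for every row, which would explain why it is 100.0% complete while both name fields are 0.0% complete, giving the category average of (0 + 0 + 100) / 3 = 33.3%. A minimal sketch, assuming blank strings count as "not provided"; the function name `ico_provided` and the sample values are hypothetical.

```python
def ico_provided(in_care_of_raw):
    """True when a non-blank 'In Care Of' value exists; never null itself."""
    return in_care_of_raw is not None and in_care_of_raw.strip() != ""


values = [None, "", "  ", "% JANE DOE"]  # hypothetical raw values
print([ico_provided(v) for v in values])  # [False, False, False, True]
```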

Group Exemption

Average Completeness: 33.3% | Columns: 3

Field Source Completeness Status Null Count
group_exemption_number_raw GROUP 0.0% Low 1,788,300
group_exemption_number GROUP 0.0% Low 1,788,300
group_exemption_is_member GROUP 100.0% Good 0

Column Descriptions:

  • group_exemption_number_raw: Original group exemption number from source file
  • group_exemption_number: Cleaned group exemption number (GEN)
  • group_exemption_is_member: Boolean indicating if organization is a member of a group exemption

Address (Raw)

Average Completeness: 75.0% | Columns: 4

Field Source Completeness Status Null Count
org_addr_street_raw STREET 0.0% Low 1,788,300
org_addr_city_raw CITY 100.0% Good 0
org_addr_state_raw STATE 99.9% Good 1,050
org_addr_zip_raw ZIP 100.0% Good 0

Column Descriptions:

  • org_addr_street_raw: Original street address from source file
  • org_addr_city_raw: Original city name from source file
  • org_addr_state_raw: Original state code from source file
  • org_addr_zip_raw: Original ZIP code from source file

Address (Cleaned)

Average Completeness: 69.0% | Columns: 7

Field Source Completeness Status Null Count
org_addr_street STREET 0.0% Low 1,788,300
org_addr_city CITY 100.0% Good 0
org_addr_state STATE 99.9% Good 1,050
org_addr_zip5 ZIP 91.7% Fair 148,455
org_addr_zip4 ZIP 0.0% Low 1,788,300
org_addr_zip ZIP 91.7% Fair 148,455
org_addr_full STREET, CITY, STATE, ZIP 100.0% Good 0

Column Descriptions:

  • org_addr_street: Standardized street address with USPS abbreviations
  • org_addr_city: Cleaned city name
  • org_addr_state: Two-letter state abbreviation
  • org_addr_zip5: 5-digit ZIP code
  • org_addr_zip4: 4-digit ZIP code extension (if available)
  • org_addr_zip: Full ZIP code (5 or 9 digits)
  • org_addr_full: Complete formatted address string
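
The split of a raw ZIP value into `org_addr_zip5` and `org_addr_zip4` might look like the sketch below. The function name `split_zip` and the validation rules (digits only, exactly 5 or 9 digits) are assumptions; a rule of this kind is one possible reason a raw ZIP can be present while the cleaned ZIP5 is null.

```python
import re


def split_zip(zip_raw):
    """Split a raw ZIP value into (zip5, zip4); either part may be None.

    Assumptions: separators are stripped, 5 digits give a ZIP5 only,
    9 digits give ZIP5 plus ZIP+4, and anything else is treated as invalid.
    """
    digits = re.sub(r"\D", "", zip_raw or "")
    if len(digits) == 5:
        return digits, None
    if len(digits) == 9:
        return digits[:5], digits[5:]
    return None, None


print(split_zip("20001"))       # ('20001', None)
print(split_zip("20001-1234"))  # ('20001', '1234')
print(split_zip("2000"))        # (None, None)
```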

Address Quality Flags

Average Completeness: 100.0% | Columns: 6

Field Source Completeness Status Null Count
org_addr_is_missing STREET, CITY, STATE, ZIP 100.0% Good 0
org_addr_is_po_box STREET 100.0% Good 0
org_addr_is_rural_route STREET 100.0% Good 0
org_addr_has_special_chars STREET 100.0% Good 0
org_addr_missing_number STREET 100.0% Good 0
org_addr_state_invalid STATE 100.0% Good 0

Column Descriptions:

  • org_addr_is_missing: TRUE if street address is missing or empty
  • org_addr_is_po_box: TRUE if address is a P.O. Box
  • org_addr_is_rural_route: TRUE if address is a rural route
  • org_addr_has_special_chars: TRUE if address contains unusual special characters
  • org_addr_missing_number: TRUE if street address lacks a street number
  • org_addr_state_invalid: TRUE if state code is not a valid US state/territory
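
Flags of this kind are typically derived from pattern checks on the street field. The sketch below covers three of them; the regular expressions and the function name `address_flags` are assumptions for illustration and are not the pipeline's actual rules.

```python
import re

# Assumed patterns; real-world address data usually needs broader rules.
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*BOX\b", re.IGNORECASE)
RURAL_ROUTE = re.compile(r"\b(RR|RURAL\s+(ROUTE|RTE))\b", re.IGNORECASE)


def address_flags(street):
    """Derive boolean quality flags from a street value (None allowed)."""
    text = (street or "").strip()
    return {
        "is_missing": text == "",
        "is_po_box": bool(PO_BOX.search(text)),
        "is_rural_route": bool(RURAL_ROUTE.search(text)),
    }


print(address_flags("P.O. Box 5"))
# {'is_missing': False, 'is_po_box': True, 'is_rural_route': False}
print(address_flags(None))
# {'is_missing': True, 'is_po_box': False, 'is_rural_route': False}
```

Under rules like these, a blank street yields `is_missing` as TRUE and every pattern-based flag as FALSE, so the flags themselves are never null.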

Classification

Average Completeness: 50.0% | Columns: 4

Field Source Completeness Status Null Count
subsection_code SUBSECTION 100.0% Good 0
classification_code CLASSIFICATION 0.0% Low 1,788,300
exempt_organization_type SUBSECTION 100.0% Good 30
all_classifications_string CLASSIFICATION 0.0% Low 1,788,300

Column Descriptions:

  • subsection_code: IRS subsection code (e.g., 03 for 501(c)(3), 04 for 501(c)(4))
  • classification_code: IRS classification code indicating organization type within subsection
  • exempt_organization_type: Human-readable exempt organization type based on subsection
  • all_classifications_string: Semicolon-separated list of all classification descriptions
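
The mapping from `subsection_code` to `exempt_organization_type` can be sketched for the 501(c) range as follows. The function name `subsection_label` is hypothetical, and the fallback of returning an unrecognized code unchanged is an assumption; labels outside the 501(c) range would need an explicit lookup table that this report does not include.

```python
def subsection_label(subsection_code):
    """Map a two-digit IRS subsection code to a readable label.

    Assumption: codes 01-29 map to 501(c)(N); any other code is returned
    unchanged because its label would need an explicit lookup table.
    """
    code = subsection_code.strip()
    if code.isdecimal() and 1 <= int(code) <= 29:
        return f"501(c)({int(code)})"
    return code


print(subsection_label("03"))  # 501(c)(3)
print(subsection_label("04"))  # 501(c)(4)
print(subsection_label("82"))  # 82
```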

Organization Codes

Average Completeness: 23.3% | Columns: 11

Field Source Completeness Status Null Count
affiliation_code AFFILIATION 0.0% Low 1,788,300
affiliation_code_definition AFFILIATION 0.0% Low 1,788,300
deductibility_code DEDUCTIBILITY 0.0% Low 1,788,300
deductibility_code_definition DEDUCTIBILITY 0.0% Low 1,788,300
foundation_code FOUNDATION 78.1% Fair 391,943
foundation_code_definition FOUNDATION 78.1% Fair 391,944
organization_code ORGANIZATION 0.0% Low 1,788,300
organization_code_definition ORGANIZATION 0.0% Low 1,788,300
status_code STATUS 0.0% Low 1,788,300
status_code_definition STATUS 0.0% Low 1,788,300
naics_code NTEE_CD 100.0% Good 0

Column Descriptions:

  • affiliation_code: Code indicating relationship to parent organization (1-9)
  • affiliation_code_definition: Description of affiliation relationship
  • deductibility_code: Code indicating deductibility status of contributions (1-4)
  • deductibility_code_definition: Description of contribution deductibility
  • foundation_code: Foundation status code (00-99) per IRS determination
  • foundation_code_definition: Description of foundation/public charity status
  • organization_code: Code for type of organization (corporation, trust, etc.)
  • organization_code_definition: Description of organization type
  • status_code: IRS determination status code (01-99)
  • status_code_definition: Description of exempt status
  • naics_code: North American Industry Classification System code derived from NTEE

Dates

Average Completeness: 94.1% | Columns: 7

Field Source Completeness Status Null Count
ruling_date_ym_str RULING 100.0% Good 0
ruling_date RULING 100.0% Good 0
ruling_date_is_missing RULING 100.0% Good 0
tax_period_ym_str TAX_PERIOD 79.4% Fair 368,989
tax_period_ymd TAX_PERIOD 100.0% Good 0
tax_period_is_missing TAX_PERIOD 100.0% Good 0
accounting_period ACCT_PD 79.4% Fair 368,989

Column Descriptions:

  • ruling_date_ym_str: Ruling date as YYYYMM string
  • ruling_date: Date of IRS ruling granting exempt status
  • ruling_date_is_missing: TRUE if ruling date is missing or invalid
  • tax_period_ym_str: Tax period end date as YYYYMM string
  • tax_period_ymd: Tax period end date in YYYY-MM-DD format
  • tax_period_is_missing: TRUE if tax period is missing
  • accounting_period: Month when organization’s accounting period ends (01-12)
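
Converting a YYYYMM tax period string into a full YYYY-MM-DD end date requires choosing a day. The sketch below assumes the period ends on the last calendar day of the stated month; that choice and the function name `tax_period_end` are assumptions, not documented pipeline behavior.

```python
import calendar
from datetime import date


def tax_period_end(ym_str):
    """Convert a YYYYMM string to the last calendar day of that month.

    Assumption: the tax period ends on the final day of the stated month.
    """
    if len(ym_str) != 6 or not ym_str.isdecimal():
        raise ValueError(f"expected YYYYMM, got {ym_str!r}")
    year, month = int(ym_str[:4]), int(ym_str[4:])
    last_day = calendar.monthrange(year, month)[1]  # handles leap years
    return date(year, month, last_day)


print(tax_period_end("202012").isoformat())  # 2020-12-31
print(tax_period_end("202002").isoformat())  # 2020-02-29
```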

Financial Codes

Average Completeness: 0.0% | Columns: 4

Field Source Completeness Status Null Count
asset_code ASSET_CD 0.0% Low 1,788,300
asset_code_definition ASSET_CD 0.0% Low 1,788,300
income_code INCOME_CD 0.0% Low 1,788,300
income_code_definition INCOME_CD 0.0% Low 1,788,300

Column Descriptions:

  • asset_code: Asset amount range code (0-9)
  • asset_code_definition: Description of asset range (e.g., ‘$100,000 to $499,999’)
  • income_code: Income amount range code (0-9)
  • income_code_definition: Description of income range

Financial Amounts

Average Completeness: 62.4% | Columns: 3

Field Source Completeness Status Null Count
asset_amount ASSET_AMT 78.3% Fair 387,948
income_amount INCOME_AMT 78.3% Fair 387,948
revenue_amount REVENUE_AMT 30.7% Low 1,239,370

Column Descriptions:

  • asset_amount: Total assets in dollars (most recent return)
  • income_amount: Total income in dollars (can be negative)
  • revenue_amount: Total revenue in dollars (can be negative)

Activity

Average Completeness: 0.0% | Columns: 3

Field Source Completeness Status Null Count
activity_code ACTIVITY 0.0% Low 1,788,300
activity_code_definitions ACTIVITY 0.0% Low 1,788,300
activity_code_categories ACTIVITY 0.0% Low 1,788,300

Column Descriptions:

  • activity_code: Three 3-digit activity codes concatenated (9 characters total)
  • activity_code_definitions: Semicolon-separated descriptions of activity codes
  • activity_code_categories: Semicolon-separated activity categories
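
Because `activity_code` stores three 3-digit codes in one 9-character string, a consumer has to split it before looking up definitions. A minimal sketch; the function name `split_activity_codes` and the sample value are hypothetical.

```python
def split_activity_codes(activity_code):
    """Split a 9-character activity string into three 3-digit codes."""
    if len(activity_code) != 9 or not activity_code.isdecimal():
        raise ValueError(f"expected 9 digits, got {activity_code!r}")
    return [activity_code[i:i + 3] for i in range(0, 9, 3)]


print(split_activity_codes("001029000"))  # ['001', '029', '000']
```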

Filing Requirements

Average Completeness: 26.9% | Columns: 4

Field Source Completeness Status Null Count
filing_requirement_code FILING_REQ_CD 100.0% Good 0
filing_requirement_code_definition FILING_REQ_CD 7.6% Low 1,653,023
pf_filing_requirement_code PF_FILING_REQ_CD 0.0% Low 1,788,300
pf_filing_requirement_code_definition PF_FILING_REQ_CD 0.0% Low 1,788,300

Column Descriptions:

  • filing_requirement_code: Code indicating required annual return form (0-6)
  • filing_requirement_code_definition: Description of filing requirement (990, 990-EZ, 990-N, etc.)
  • pf_filing_requirement_code: Private foundation filing requirement code
  • pf_filing_requirement_code_definition: Description of private foundation filing requirement

NTEE Codes

Average Completeness: 99.6% | Columns: 6

Field Source Completeness Status Null Count
ntee_code_raw NTEE_CD 97.7% Good 41,768
ntee_code_clean NTEE_CD 100.0% Good 0
ntee_code_definition NTEE_CD 100.0% Good 0
ntee_code_major_group NTEE_CD 100.0% Good 0
ntee_common_code NTEE_CD 100.0% Good 0
ntee_common_code_definition NTEE_CD 100.0% Good 0

Column Descriptions:

  • ntee_code_raw: Original NTEE code from source file (1-4 characters)
  • ntee_code_clean: Standardized 3-character NTEE code
  • ntee_code_definition: Full description of NTEE classification
  • ntee_code_major_group: NTEE major group letter (A-Z) indicating broad category
  • ntee_common_code: Common code suffix for 4-character NTEE codes (e.g., 01-99)
  • ntee_common_code_definition: Description of common code suffix

NTEE V2 Codes

Average Completeness: 100.0% | Columns: 5

Field Source Completeness Status Null Count
nteev2 NTEE_CD 100.0% Good 0
nteev2_code NTEE_CD 100.0% Good 0
nteev2_subsector NTEE_CD 100.0% Good 0
nteev2_subsector_definition NTEE_CD 100.0% Good 0
nteev2_org_type NTEE_CD 100.0% Good 0

Column Descriptions:

  • nteev2: Full NTEEv2 code in SUBSECTOR-CODE-TYPE format
  • nteev2_code: NTEEv2 code portion (3 characters)
  • nteev2_subsector: NTEEv2 subsector code (e.g., UNI, HOS, ART, ENV)
  • nteev2_subsector_definition: Human-readable name of the NTEEv2 subsector (e.g., ‘Human Services’, ‘Public, Societal Benefit’)
  • nteev2_org_type: NTEEv2 organization type (RG=Regular, AA=Alliance, etc.)
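
Given the SUBSECTOR-CODE-TYPE layout described for `nteev2`, the component fields can be recovered by splitting on the hyphen. The sketch below assumes exactly three hyphen-separated parts; the names `NteeV2` and `parse_nteev2` and the sample value "ART-A20-RG" are hypothetical.

```python
from typing import NamedTuple


class NteeV2(NamedTuple):
    subsector: str  # e.g. ART
    code: str       # 3-character NTEEv2 code
    org_type: str   # e.g. RG


def parse_nteev2(value):
    """Split a SUBSECTOR-CODE-TYPE string into its three components."""
    parts = value.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected SUBSECTOR-CODE-TYPE, got {value!r}")
    return NteeV2(*parts)


print(parse_nteev2("ART-A20-RG"))
# NteeV2(subsector='ART', code='A20', org_type='RG')
```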

Organization Distribution

Exempt Organization Type

Distribution of organizations by exempt organization type (based on IRS subsection code).

Organizations by Exempt Type
Exempt Organization Type Count Percentage
501(c)(3) 1,438,655 80.45%
501(c)(4) 76,559 4.28%
501(c)(6) 61,617 3.45%
501(c)(7) 48,350 2.70%
501(c)(5) 45,192 2.53%
501(c)(8) 39,646 2.22%
501(c)(19) 27,393 1.53%
501(c)(10) 15,505 0.87%
501(c)(13) 9,546 0.53%
4947(a)(1) 6,515 0.36%
501(c)(9) 5,713 0.32%
501(c)(12) 5,403 0.30%
501(c)(2) 4,339 0.24%
501(c)(14) 1,566 0.09%
501(c)(1) 680 0.04%
501(c)(15) 629 0.04%
501(c)(25) 580 0.03%
501(d) 217 0.01%
501(c)(17) 85 0.00%
NA 30 0.00%
501(c)(29) 17 0.00%
501(c)(27) 14 0.00%
82 12 0.00%
501(c)(16) 11 0.00%
501(c)(11) 6 0.00%
501(c)(26) 6 0.00%
501(e) 5 0.00%
501(c)(18) 3 0.00%
501(c)(23) 2 0.00%
529 1 0.00%
501(n) 1 0.00%
501(c)(20) 1 0.00%
501(c)(21) 1 0.00%
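
The percentages in this table are consistent with each count divided by the 1,788,300 total records, rounded to two decimal places, which is why categories with fewer than about 90 organizations display as 0.00%. A short check; the function name `share_pct` is hypothetical.

```python
def share_pct(count, total):
    """Share of total as a percentage, rounded to two decimal places."""
    return round(100.0 * count / total, 2)


TOTAL_RECORDS = 1_788_300  # from the Executive Summary

print(share_pct(1_438_655, TOTAL_RECORDS))  # 501(c)(3)  -> 80.45
print(share_pct(76_559, TOTAL_RECORDS))     # 501(c)(4)  -> 4.28
print(share_pct(85, TOTAL_RECORDS))         # 501(c)(17) -> 0.0
```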

NTEE Major Group Distribution

Distribution of organizations by NTEE major group code.

Organizations by NTEE Major Group
NTEE Major Group Count Percentage
Religion-Related 308,491 17.25%
Education 228,618 12.78%
Recreation and Sports 136,037 7.61%
Arts, Culture and Humanities 133,185 7.45%
Human Services 130,823 7.32%
Community Improvement and Capacity Building 128,156 7.17%
Philanthropy, Voluntarism and Grantmaking Foundations 119,574 6.69%
Public and Societal Benefit 72,159 4.04%
Mutual and Membership Benefit 60,326 3.37%
Health Care 50,589 2.83%
UNDEFINED 45,968 2.57%
Youth Development 45,773 2.56%
Environment 38,671 2.16%
Animal-Related 36,557 2.04%
Housing and Shelter 36,266 2.03%
Employment 32,374 1.81%
Voluntary Health Associations and Medical Disciplines 30,008 1.68%
Public Safety, Disaster Preparedness and Relief 25,748 1.44%
International, Foreign Affairs and National Security 23,729 1.33%
Mental Health and Crisis Intervention 23,035 1.29%
Crime and Legal-Related 22,949 1.28%
Food, Agriculture and Nutrition 22,158 1.24%
Civil Rights, Societal Action and Advocacy 12,547 0.70%
Science and Technology 10,798 0.60%
Unknown 5,820 0.33%
Medical Research 5,227 0.29%
Social Science 2,714 0.15%

Financial Summary

Metric                            Value
Total Assets (all organizations)  $8,028,615,501,101
Median Assets                     $0
Organizations with Asset Data     1,400,352
Organizations with Zero Assets    750,916
Total Income                      $5,197,799,362,274
Median Income                     $0
Total Revenue                     $2,841,332,155,904
Median Revenue                    $134,982
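
The $0 median for assets follows from the counts above: when more than half of the values are exactly zero, the median must be zero. The check below assumes the median is taken over the 1,400,352 organizations with asset data; the helper name `median_is_zero` and the toy list are hypothetical.

```python
from statistics import median


def median_is_zero(zero_count, n):
    """True when strictly more than half of n values are exactly zero."""
    return 2 * zero_count > n


with_asset_data = 1_400_352  # Organizations with Asset Data
zero_assets = 750_916        # Organizations with Zero Assets

print(f"{zero_assets / with_asset_data:.1%}")        # 53.6%
print(median_is_zero(zero_assets, with_asset_data))  # True

# Toy illustration: three of five values are zero, so the median is zero.
print(median([0, 0, 0, 5, 100]))  # 0
```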

Address Quality

Address Issue          Count      % of Total
Missing Address        1,788,300  100.00%
P.O. Box Addresses     0          0.00%
Rural Route Addresses  0          0.00%
Invalid State Code     0          0.00%

Date Coverage

Ruling Date Range

Metric                          Value
Earliest Ruling Date            1900-02-01
Latest Ruling Date              2021-11-01
Organizations with Ruling Date  1,775,586

Tax Period Year Distribution

Organizations by Tax Period Year
Tax Year Count Percentage
2020 979,517 69.01%
2021 222,408 15.67%
2019 140,817 9.92%
2018 54,886 3.87%
2013 5,320 0.37%
2017 3,124 0.22%
2016 2,448 0.17%
2012 1,762 0.12%
2014 1,597 0.11%
2015 1,224 0.09%
2011 1,214 0.09%
2010 1,212 0.09%
2006 664 0.05%
2009 377 0.03%
2007 354 0.02%
2008 342 0.02%
2005 270 0.02%
2001 238 0.02%
2004 205 0.01%
2002 188 0.01%
1999 187 0.01%
2000 180 0.01%
2003 180 0.01%
1998 177 0.01%
1997 159 0.01%
1996 143 0.01%
1995 58 0.00%
1994 25 0.00%
1993 11 0.00%
1992 4 0.00%
1988 4 0.00%
1989 3 0.00%
1991 3 0.00%
1990 3 0.00%
1983 3 0.00%
1981 1 0.00%
1987 1 0.00%
1984 1 0.00%
1979 1 0.00%
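
Unlike the organization distributions, the percentages in this table appear to use only records with a tax period as the denominator: 1,788,300 total records minus the 368,989 null `tax_period_ym_str` values gives 1,419,311, which also equals the sum of the counts above. A short check; the function name `tax_year_share` is hypothetical.

```python
TOTAL_RECORDS = 1_788_300   # from the Executive Summary
NULL_TAX_PERIODS = 368_989  # null count reported for tax_period_ym_str

with_tax_period = TOTAL_RECORDS - NULL_TAX_PERIODS


def tax_year_share(count):
    """Share of records that have a tax period, as a percentage."""
    return round(100.0 * count / with_tax_period, 2)


print(with_tax_period)          # 1419311
print(tax_year_share(979_517))  # 2020 -> 69.01
print(tax_year_share(222_408))  # 2021 -> 15.67
```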

Data Issues

No Data Issues Detected

All expected columns are present and no critical field issues were found.


Report Metadata

Property                Value
Report Generated        2026-05-06 16:34:26
Pipeline Timestamp      2026-05-06 16:33:33
Row Preservation Check  Passed
Overall Completeness    59.2%
Overall Result          PASSED

Generated by BMF Pipeline Quality System