BMF Pipeline Quality Report
Post-Transformation Validation Results
Executive Summary
Pipeline Timestamp: 2026-05-06 16:34:23
| Metric | Value |
|---|---|
| Total Records | 1,801,152 |
| Output Columns | 110 |
| Overall Completeness | 59.1% |
| Row Preservation | Passed |
| Unique EINs | 1,799,217 |
| Duplicate EINs | 1,935 |
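The duplicate count in the executive summary equals total records minus unique EINs (1,801,152 - 1,799,217 = 1,935). A minimal sketch of that check follows. The function name `summarize_eins` is hypothetical, and it assumes a "duplicate" is any record beyond the first occurrence of an EIN; the report does not state the definition the pipeline uses.

```python
from collections import Counter


def summarize_eins(eins):
    """Return (total, unique, duplicate) counts for a list of EIN strings.

    Assumption: a duplicate is any record beyond the first occurrence
    of a given EIN, so duplicate = total - unique.
    """
    counts = Counter(eins)
    total = len(eins)
    unique = len(counts)
    return total, unique, total - unique


# Figures taken from the executive summary table.
assert 1_801_152 - 1_799_217 == 1_935

# Toy input: "12-3456789" appears twice, so one record is a duplicate.
print(summarize_eins(["12-3456789", "12-3456789", "98-7654321"]))
```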
Field Completeness by Category
This section reports the completeness of each output field, grouped by category. Each row lists the field's source column(s) in the raw BMF file.
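The percentages in the tables below are consistent with completeness defined as the share of non-null values, and with each category average being the unweighted mean of its fields. The sketch below reproduces two reported figures under that assumption; `completeness` and `category_average` are illustrative names, not the pipeline's own functions.

```python
def completeness(total_rows, null_count):
    """Percent of non-null values, rounded to one decimal place."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return round(100 * (total_rows - null_count) / total_rows, 1)


def category_average(field_percentages):
    """Unweighted mean of per-field completeness, rounded to one decimal."""
    return round(sum(field_percentages) / len(field_percentages), 1)


TOTAL_RECORDS = 1_801_152  # from the executive summary

# org_legal_suffix has 1,135,362 nulls and is reported as 37.0%.
print(completeness(TOTAL_RECORDS, 1_135_362))

# Organization Name fields (100.0, 100.0, 100.0, 37.0, 1.0) are reported as 67.6%.
print(category_average([100.0, 100.0, 100.0, 37.0, 1.0]))
```

The Status labels (Good, Fair, Low) are not reproduced here because the report does not state their thresholds.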
Identity
Average Completeness: 100.0% | Columns: 2
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| ein | EIN | 100.0% | Good | 0 |
| ein_raw | EIN | 100.0% | Good | 0 |
Column Descriptions:
- ein: Employer Identification Number formatted as XX-XXXXXXX
- ein_raw: Original 9-digit EIN value from source file without formatting
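As an illustration of the XX-XXXXXXX format described above, the following sketch derives `ein` from a raw value. The helper `format_ein` is hypothetical. Left-padding short values with zeros is an assumption made for sources that store EINs as integers; the pipeline's actual handling is not documented here.

```python
import re


def format_ein(raw):
    """Format a raw EIN as XX-XXXXXXX, or return None if it cannot be parsed.

    Assumption: values shorter than 9 digits lost leading zeros and are
    left-padded; values longer than 9 digits are treated as invalid.
    """
    digits = re.sub(r"\D", "", str(raw))  # keep digits only
    if not digits or len(digits) > 9:
        return None
    digits = digits.zfill(9)
    return f"{digits[:2]}-{digits[2:]}"


print(format_ein("123456789"))
print(format_ein(10123456))  # integer input that lost a leading zero
```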
Organization Name
Average Completeness: 67.6% | Columns: 5
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| org_name_raw | NAME | 100.0% | Good | 0 |
| org_name_join | NAME | 100.0% | Good | 0 |
| org_name_display | NAME | 100.0% | Good | 0 |
| org_legal_suffix | NAME | 37.0% | Low | 1,135,362 |
| org_parent_name | NAME | 1.0% | Low | 1,782,499 |
Column Descriptions:
- org_name_raw: Original organization name exactly as it appears in the source file
- org_name_join: Standardized name for matching and joining (uppercase, punctuation removed)
- org_name_display: Title-cased organization name suitable for display purposes
- org_legal_suffix: Legal entity suffix extracted from name (Inc, Corp, LLC, Foundation, etc.)
- org_parent_name: Parent organization name if this is a subordinate/chapter organization
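The join and display variants described above can be sketched as follows. Both helpers are hypothetical. The sketch assumes that "punctuation removed" means deleting every character other than letters, digits, and whitespace, and that title-casing is applied word by word; the pipeline may treat hyphens, ampersands, or acronyms differently.

```python
import re
import string


def name_join(raw):
    """Uppercase the name, delete punctuation, and collapse whitespace."""
    upper = raw.upper()
    no_punct = re.sub(r"[^A-Z0-9\s]", "", upper)  # assumption: delete, do not replace with a space
    return re.sub(r"\s+", " ", no_punct).strip()


def name_display(raw):
    """Title-case each whitespace-separated word (keeps "Mary's" intact)."""
    return string.capwords(raw)


print(name_join("St. Mary's Church, Inc."))
print(name_display("ST. MARY'S CHURCH"))
```

`string.capwords` is used instead of `str.title` because `str.title` capitalizes the letter after an apostrophe.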
DBA Name
Average Completeness: 23.7% | Columns: 2
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| dba_name | SORT_NAME | 23.7% | Low | 1,374,346 |
| dba_name_raw | SORT_NAME | 23.7% | Low | 1,374,346 |
Column Descriptions:
- dba_name: Cleaned ‘Doing Business As’ name
- dba_name_raw: Original secondary/DBA name from source file
In Care Of
Average Completeness: 33.3% | Columns: 3
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| in_care_of_name_raw | ICO | 0.0% | Low | 1,801,152 |
| in_care_of_name_clean | ICO | 0.0% | Low | 1,801,152 |
| in_care_of_name_provided | ICO | 100.0% | Good | 0 |
Column Descriptions:
- in_care_of_name_raw: Original ‘In Care Of’ field from source file
- in_care_of_name_clean: Cleaned ICO name with standardized formatting
- in_care_of_name_provided: Boolean indicating whether an ICO name was provided
Group Exemption
Average Completeness: 33.3% | Columns: 3
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| group_exemption_number_raw | GROUP | 0.0% | Low | 1,801,152 |
| group_exemption_number | GROUP | 0.0% | Low | 1,801,152 |
| group_exemption_is_member | GROUP | 100.0% | Good | 0 |
Column Descriptions:
- group_exemption_number_raw: Original group exemption number from source file
- group_exemption_number: Cleaned group exemption number (GEN)
- group_exemption_is_member: Boolean indicating if organization is a member of a group exemption
Address (Raw)
Average Completeness: 75.0% | Columns: 4
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| org_addr_street_raw | STREET | 0.0% | Low | 1,801,152 |
| org_addr_city_raw | CITY | 100.0% | Good | 0 |
| org_addr_state_raw | STATE | 99.9% | Good | 1,070 |
| org_addr_zip_raw | ZIP | 100.0% | Good | 0 |
Column Descriptions:
- org_addr_street_raw: Original street address from source file
- org_addr_city_raw: Original city name from source file
- org_addr_state_raw: Original state code from source file
- org_addr_zip_raw: Original ZIP code from source file
Address (Cleaned)
Average Completeness: 69.1% | Columns: 7
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| org_addr_street | STREET | 0.0% | Low | 1,801,152 |
| org_addr_city | CITY | 100.0% | Good | 0 |
| org_addr_state | STATE | 99.9% | Good | 1,070 |
| org_addr_zip5 | ZIP | 91.8% | Fair | 148,075 |
| org_addr_zip4 | ZIP | 0.0% | Low | 1,801,152 |
| org_addr_zip | ZIP | 91.8% | Fair | 148,075 |
| org_addr_full | STREET, CITY, STATE, ZIP | 100.0% | Good | 0 |
Column Descriptions:
- org_addr_street: Standardized street address with USPS abbreviations
- org_addr_city: Cleaned city name
- org_addr_state: Two-letter state abbreviation
- org_addr_zip5: 5-digit ZIP code
- org_addr_zip4: 4-digit ZIP code extension (if available)
- org_addr_zip: Full ZIP code (5 or 9 digits)
- org_addr_full: Complete formatted address string
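The relationship between the raw ZIP value and the three cleaned ZIP fields can be sketched as below. The helper `split_zip` and its validation rule (exactly 5 digits, optionally followed by 4 more with or without a hyphen) are assumptions for illustration; values that do not match yield nulls, which is one way a raw field can be fully populated while a cleaned field is not.

```python
import re

# Assumed pattern: 5 digits, then an optional hyphen and 4-digit extension.
ZIP_RE = re.compile(r"^(\d{5})(?:-?(\d{4}))?$")


def split_zip(raw):
    """Return (zip5, zip4, zip_full); all None if the value does not parse."""
    if raw is None:
        return None, None, None
    match = ZIP_RE.match(str(raw).strip())
    if not match:
        return None, None, None
    zip5, zip4 = match.group(1), match.group(2)
    zip_full = f"{zip5}-{zip4}" if zip4 else zip5
    return zip5, zip4, zip_full


print(split_zip("200011234"))
print(split_zip("20001"))
print(split_zip("2000"))  # too short, treated as invalid
```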
Address Quality Flags
Average Completeness: 100.0% | Columns: 6
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| org_addr_is_missing | STREET, CITY, STATE, ZIP | 100.0% | Good | 0 |
| org_addr_is_po_box | STREET | 100.0% | Good | 0 |
| org_addr_is_rural_route | STREET | 100.0% | Good | 0 |
| org_addr_has_special_chars | STREET | 100.0% | Good | 0 |
| org_addr_missing_number | STREET | 100.0% | Good | 0 |
| org_addr_state_invalid | STATE | 100.0% | Good | 0 |
Column Descriptions:
- org_addr_is_missing: TRUE if street address is missing or empty
- org_addr_is_po_box: TRUE if address is a P.O. Box
- org_addr_is_rural_route: TRUE if address is a rural route
- org_addr_has_special_chars: TRUE if address contains unusual special characters
- org_addr_missing_number: TRUE if street address lacks a street number
- org_addr_state_invalid: TRUE if state code is not a valid US state/territory
Classification
Average Completeness: 50.0% | Columns: 4
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| subsection_code | SUBSECTION | 100.0% | Good | 0 |
| classification_code | CLASSIFICATION | 0.0% | Low | 1,801,152 |
| exempt_organization_type | SUBSECTION | 100.0% | Good | 29 |
| all_classifications_string | CLASSIFICATION | 0.0% | Low | 1,801,152 |
Column Descriptions:
- subsection_code: IRS subsection code (e.g., 03 for 501(c)(3), 04 for 501(c)(4))
- classification_code: IRS classification code indicating organization type within subsection
- exempt_organization_type: Human-readable exempt organization type based on subsection
- all_classifications_string: Semicolon-separated list of all classification descriptions
Organization Codes
Average Completeness: 22.8% | Columns: 11
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| affiliation_code | AFFILIATION | 0.0% | Low | 1,801,152 |
| affiliation_code_definition | AFFILIATION | 0.0% | Low | 1,801,152 |
| deductibility_code | DEDUCTIBILITY | 0.0% | Low | 1,801,152 |
| deductibility_code_definition | DEDUCTIBILITY | 0.0% | Low | 1,801,152 |
| foundation_code | FOUNDATION | 75.6% | Fair | 438,961 |
| foundation_code_definition | FOUNDATION | 75.6% | Fair | 438,962 |
| organization_code | ORGANIZATION | 0.0% | Low | 1,801,152 |
| organization_code_definition | ORGANIZATION | 0.0% | Low | 1,801,152 |
| status_code | STATUS | 0.0% | Low | 1,801,152 |
| status_code_definition | STATUS | 0.0% | Low | 1,801,152 |
| naics_code | NTEE_CD | 100.0% | Good | 0 |
Column Descriptions:
- affiliation_code: Code indicating relationship to parent organization (1-9)
- affiliation_code_definition: Description of affiliation relationship
- deductibility_code: Code indicating deductibility status of contributions (1-4)
- deductibility_code_definition: Description of contribution deductibility
- foundation_code: Foundation status code (00-99) per IRS determination
- foundation_code_definition: Description of foundation/public charity status
- organization_code: Code for type of organization (corporation, trust, etc.)
- organization_code_definition: Description of organization type
- status_code: IRS determination status code (01-99)
- status_code_definition: Description of exempt status
- naics_code: North American Industry Classification System code derived from NTEE
Dates
Average Completeness: 94.1% | Columns: 7
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| ruling_date_ym_str | RULING | 100.0% | Good | 0 |
| ruling_date | RULING | 100.0% | Good | 0 |
| ruling_date_is_missing | RULING | 100.0% | Good | 0 |
| tax_period_ym_str | TAX_PERIOD | 79.2% | Fair | 373,850 |
| tax_period_ymd | TAX_PERIOD | 100.0% | Good | 0 |
| tax_period_is_missing | TAX_PERIOD | 100.0% | Good | 0 |
| accounting_period | ACCT_PD | 79.2% | Fair | 373,850 |
Column Descriptions:
- ruling_date_ym_str: Ruling date as YYYYMM string
- ruling_date: Date of IRS ruling granting exempt status
- ruling_date_is_missing: TRUE if ruling date is missing or invalid
- tax_period_ym_str: Tax period end date as YYYYMM string
- tax_period_ymd: Tax period end date in YYYY-MM-DD format
- tax_period_is_missing: TRUE if tax period is missing
- accounting_period: Month when organization’s accounting period ends (01-12)
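A YYYYMM string can be turned into a YYYY-MM-DD period end date as sketched below. The helper `period_end` is hypothetical, and choosing the last calendar day of the month is an assumption based on the "period end date" wording; the pipeline may pick a different day or handle missing values differently.

```python
import calendar
from datetime import date


def period_end(ym):
    """Convert 'YYYYMM' to the ISO date of that month's last day, or None."""
    if ym is None or len(ym) != 6 or not ym.isdigit():
        return None
    year, month = int(ym[:4]), int(ym[4:])
    if year < 1 or not 1 <= month <= 12:
        return None
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    return date(year, month, last_day).isoformat()


print(period_end("202112"))
print(period_end("202002"))  # leap-year February
print(period_end("202113"))  # invalid month
```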
Financial Codes
Average Completeness: 0.0% | Columns: 4
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| asset_code | ASSET_CD | 0.0% | Low | 1,801,152 |
| asset_code_definition | ASSET_CD | 0.0% | Low | 1,801,152 |
| income_code | INCOME_CD | 0.0% | Low | 1,801,152 |
| income_code_definition | INCOME_CD | 0.0% | Low | 1,801,152 |
Column Descriptions:
- asset_code: Asset amount range code (0-9)
- asset_code_definition: Description of asset range (e.g., ‘$100,000 to $499,999’)
- income_code: Income amount range code (0-9)
- income_code_definition: Description of income range
Financial Amounts
Average Completeness: 62.1% | Columns: 3
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| asset_amount | ASSET_AMT | 78.2% | Fair | 392,825 |
| income_amount | INCOME_AMT | 78.2% | Fair | 392,825 |
| revenue_amount | REVENUE_AMT | 29.8% | Low | 1,264,630 |
Column Descriptions:
- asset_amount: Total assets in dollars (most recent return)
- income_amount: Total income in dollars (can be negative)
- revenue_amount: Total revenue in dollars (can be negative)
Activity
Average Completeness: 0.0% | Columns: 3
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| activity_code | ACTIVITY | 0.0% | Low | 1,801,152 |
| activity_code_definitions | ACTIVITY | 0.0% | Low | 1,801,152 |
| activity_code_categories | ACTIVITY | 0.0% | Low | 1,801,152 |
Column Descriptions:
- activity_code: Three 3-digit activity codes concatenated (9 characters total)
- activity_code_definitions: Semicolon-separated descriptions of activity codes
- activity_code_categories: Semicolon-separated activity categories
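Because `activity_code` packs three 3-digit codes into one 9-character string, downstream lookups need to split it first. A minimal sketch, with a hypothetical helper name and the assumption that a "000" slot means no code was assigned:

```python
def split_activity_codes(code):
    """Split a 9-character activity code into its 3-digit parts.

    Assumption: '000' marks an unused slot and is dropped.
    Returns an empty list for missing or malformed input.
    """
    if code is None or len(code) != 9 or not code.isdigit():
        return []
    parts = [code[i:i + 3] for i in range(0, 9, 3)]
    return [part for part in parts if part != "000"]


print(split_activity_codes("001029000"))
```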
Filing Requirements
Average Completeness: 26.9% | Columns: 4
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| filing_requirement_code | FILING_REQ_CD | 100.0% | Good | 0 |
| filing_requirement_code_definition | FILING_REQ_CD | 7.7% | Low | 1,662,300 |
| pf_filing_requirement_code | PF_FILING_REQ_CD | 0.0% | Low | 1,801,152 |
| pf_filing_requirement_code_definition | PF_FILING_REQ_CD | 0.0% | Low | 1,801,152 |
Column Descriptions:
- filing_requirement_code: Code indicating required annual return form (0-6)
- filing_requirement_code_definition: Description of filing requirement (990, 990-EZ, 990-N, etc.)
- pf_filing_requirement_code: Private foundation filing requirement code
- pf_filing_requirement_code_definition: Description of private foundation filing requirement
NTEE Codes
Average Completeness: 99.6% | Columns: 6
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| ntee_code_raw | NTEE_CD | 97.6% | Good | 42,648 |
| ntee_code_clean | NTEE_CD | 100.0% | Good | 0 |
| ntee_code_definition | NTEE_CD | 100.0% | Good | 0 |
| ntee_code_major_group | NTEE_CD | 100.0% | Good | 0 |
| ntee_common_code | NTEE_CD | 100.0% | Good | 0 |
| ntee_common_code_definition | NTEE_CD | 100.0% | Good | 0 |
Column Descriptions:
- ntee_code_raw: Original NTEE code from source file (1-4 characters)
- ntee_code_clean: Standardized 3-character NTEE code
- ntee_code_definition: Full description of NTEE classification
- ntee_code_major_group: NTEE major group letter (A-Z) indicating broad category
- ntee_common_code: Common code suffix for 4-character NTEE codes (e.g., 01-99)
- ntee_common_code_definition: Description of common code suffix
NTEE V2 Codes
Average Completeness: 100.0% | Columns: 5
| Field | Source | Completeness | Status | Null Count |
|---|---|---|---|---|
| nteev2 | NTEE_CD | 100.0% | Good | 0 |
| nteev2_code | NTEE_CD | 100.0% | Good | 0 |
| nteev2_subsector | NTEE_CD | 100.0% | Good | 0 |
| nteev2_subsector_definition | NTEE_CD | 100.0% | Good | 0 |
| nteev2_org_type | NTEE_CD | 100.0% | Good | 0 |
Column Descriptions:
- nteev2: Full NTEEv2 code in SUBSECTOR-CODE-TYPE format
- nteev2_code: NTEEv2 code portion (3 characters)
- nteev2_subsector: NTEEv2 subsector code (e.g., UNI, HOS, ART, ENV)
- nteev2_subsector_definition: Human-readable name of the NTEEv2 subsector (e.g., ‘Human Services’, ‘Public, Societal Benefit’)
- nteev2_org_type: NTEEv2 organization type (RG=Regular, AA=Alliance, etc.)
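The SUBSECTOR-CODE-TYPE layout means the other NTEEv2 columns can be recovered from `nteev2` by splitting on hyphens. The sketch below assumes exactly three hyphen-separated parts; `parse_nteev2` is a hypothetical helper and the value "ART-A20-RG" is an illustrative input, not a record from this file.

```python
def parse_nteev2(value):
    """Split a full NTEEv2 string into subsector, code, and org type."""
    parts = value.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected SUBSECTOR-CODE-TYPE, got {value!r}")
    subsector, code, org_type = parts
    return {
        "nteev2_subsector": subsector,
        "nteev2_code": code,
        "nteev2_org_type": org_type,
    }


print(parse_nteev2("ART-A20-RG"))
```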
Organization Distribution
Exempt Organization Type
Distribution of organizations by exempt organization type (based on IRS subsection code).
| Exempt Organization Type | Count | Percentage |
|---|---|---|
| 501(c)(3) | 1,458,803 | 80.99% |
| 501(c)(4) | 73,745 | 4.09% |
| 501(c)(6) | 60,498 | 3.36% |
| 501(c)(7) | 47,584 | 2.64% |
| 501(c)(5) | 44,593 | 2.48% |
| 501(c)(8) | 38,986 | 2.16% |
| 501(c)(19) | 26,723 | 1.48% |
| 501(c)(10) | 15,084 | 0.84% |
| 501(c)(13) | 9,505 | 0.53% |
| 4947(a)(1) | 6,515 | 0.36% |
| 501(c)(9) | 5,611 | 0.31% |
| 501(c)(12) | 5,384 | 0.30% |
| 501(c)(2) | 4,278 | 0.24% |
| 501(c)(14) | 1,539 | 0.09% |
| 501(c)(1) | 694 | 0.04% |
| 501(c)(15) | 624 | 0.03% |
| 501(c)(25) | 574 | 0.03% |
| 501(d) | 218 | 0.01% |
| 501(c)(17) | 86 | 0.00% |
| NA | 29 | 0.00% |
| 501(c)(29) | 17 | 0.00% |
| 501(c)(27) | 14 | 0.00% |
| 82 | 11 | 0.00% |
| 501(c)(16) | 11 | 0.00% |
| 501(c)(11) | 6 | 0.00% |
| 501(c)(26) | 6 | 0.00% |
| 501(e) | 5 | 0.00% |
| 501(c)(18) | 3 | 0.00% |
| 501(c)(23) | 2 | 0.00% |
| 529 | 1 | 0.00% |
| 501(n) | 1 | 0.00% |
| 501(c)(20) | 1 | 0.00% |
| 501(c)(21) | 1 | 0.00% |
NTEE Major Group Distribution
Distribution of organizations by NTEE major group code.
| NTEE Major Group | Count | Percentage |
|---|---|---|
| Religion-Related | 312,781 | 17.37% |
| Education | 228,832 | 12.70% |
| Recreation and Sports | 135,706 | 7.53% |
| Human Services | 135,116 | 7.50% |
| Arts, Culture and Humanities | 133,667 | 7.42% |
| Community Improvement and Capacity Building | 127,686 | 7.09% |
| Philanthropy, Voluntarism and Grantmaking Foundations | 121,254 | 6.73% |
| Public and Societal Benefit | 72,114 | 4.00% |
| Mutual and Membership Benefit | 59,418 | 3.30% |
| Health Care | 50,728 | 2.82% |
| UNDEFINED | 46,963 | 2.61% |
| Youth Development | 46,709 | 2.59% |
| Environment | 39,246 | 2.18% |
| Animal-Related | 37,225 | 2.07% |
| Housing and Shelter | 36,460 | 2.02% |
| Employment | 32,185 | 1.79% |
| Voluntary Health Associations and Medical Disciplines | 29,302 | 1.63% |
| Public Safety, Disaster Preparedness and Relief | 25,738 | 1.43% |
| Mental Health and Crisis Intervention | 23,892 | 1.33% |
| International, Foreign Affairs and National Security | 23,657 | 1.31% |
| Crime and Legal-Related | 22,863 | 1.27% |
| Food, Agriculture and Nutrition | 22,847 | 1.27% |
| Civil Rights, Societal Action and Advocacy | 11,906 | 0.66% |
| Science and Technology | 10,746 | 0.60% |
| Unknown | 6,162 | 0.34% |
| Medical Research | 5,246 | 0.29% |
| Social Science | 2,703 | 0.15% |
Financial Summary
| Metric | Value |
|---|---|
| Total Assets (all organizations) | $8,772,943,152,611 |
| Median Assets | $0 |
| Organizations with Asset Data | 1,408,327 |
| Organizations with Zero Assets | 761,888 |
| Total Income | $5,423,394,940,673 |
| Median Income | $0 |
| Total Revenue | $2,834,431,699,952 |
| Median Revenue | $138,610 |
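The $0 median for assets follows from the counts in the table above: more than half of the organizations with asset data report zero assets, so the middle value of the sorted list must be zero. The sketch below checks this arithmetic, assuming the median is computed over non-null asset values only (the report does not say). The count of organizations with asset data also matches total records minus the `asset_amount` null count.

```python
import statistics

TOTAL_RECORDS = 1_801_152
ASSET_NULLS = 392_825        # asset_amount null count from the completeness table
WITH_ASSET_DATA = 1_408_327  # "Organizations with Asset Data"
ZERO_ASSETS = 761_888        # "Organizations with Zero Assets"

# Non-null count is consistent with the completeness table.
assert TOTAL_RECORDS - ASSET_NULLS == WITH_ASSET_DATA

zero_share = ZERO_ASSETS / WITH_ASSET_DATA
print(f"{zero_share:.1%}")  # share of non-null asset values equal to zero

# Toy data: when zeros are the majority, the median is zero
# regardless of the other values.
toy_assets = [-5, 0, 0, 0, 10]
print(statistics.median(toy_assets))
```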
Address Quality
| Address Issue | Count | % of Total |
|---|---|---|
| Missing Address | 1,801,152 | 100.00% |
| P.O. Box Addresses | 0 | 0.00% |
| Rural Route Addresses | 0 | 0.00% |
| Invalid State Code | 0 | 0.00% |
Date Coverage
Ruling Date Range
| Metric | Value |
|---|---|
| Earliest Ruling Date | 1900-02-01 |
| Latest Ruling Date | 2022-07-01 |
| Organizations with Ruling Date | 1,788,548 |
Tax Period Year Distribution
| Tax Year | Count | Percentage |
|---|---|---|
| 2021 | 913,305 | 63.99% |
| 2020 | 373,623 | 26.18% |
| 2019 | 59,526 | 4.17% |
| 2022 | 57,186 | 4.01% |
| 2013 | 5,295 | 0.37% |
| 2018 | 4,968 | 0.35% |
| 2017 | 2,285 | 0.16% |
| 2012 | 1,747 | 0.12% |
| 2014 | 1,543 | 0.11% |
| 2010 | 1,212 | 0.08% |
| 2011 | 1,211 | 0.08% |
| 2016 | 901 | 0.06% |
| 2015 | 743 | 0.05% |
| 2006 | 654 | 0.05% |
| 2009 | 369 | 0.03% |
| 2007 | 358 | 0.03% |
| 2008 | 336 | 0.02% |
| 2005 | 263 | 0.02% |
| 2001 | 234 | 0.02% |
| 2004 | 202 | 0.01% |
| 2002 | 192 | 0.01% |
| 1999 | 188 | 0.01% |
| 1998 | 179 | 0.01% |
| 2000 | 178 | 0.01% |
| 2003 | 176 | 0.01% |
| 1997 | 160 | 0.01% |
| 1996 | 147 | 0.01% |
| 1995 | 58 | 0.00% |
| 1994 | 27 | 0.00% |
| 1993 | 11 | 0.00% |
| 1992 | 5 | 0.00% |
| 1988 | 4 | 0.00% |
| 1989 | 3 | 0.00% |
| 1991 | 3 | 0.00% |
| 1990 | 3 | 0.00% |
| 1983 | 3 | 0.00% |
| 1981 | 1 | 0.00% |
| 1987 | 1 | 0.00% |
| 1984 | 1 | 0.00% |
| 1979 | 1 | 0.00% |
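The percentages in the Tax Year table above appear to use records with a non-null tax period as the denominator, not total records. The sketch below compares both denominators for the 2021 row; `share_pct` is an illustrative helper, and the choice of denominator is inferred from the figures, not stated in the report.

```python
def share_pct(count, denominator):
    """Percentage share rounded to two decimals, as in the distribution tables."""
    return round(100 * count / denominator, 2)


TOTAL_RECORDS = 1_801_152
TAX_PERIOD_NULLS = 373_850  # tax_period_ym_str null count
with_tax_period = TOTAL_RECORDS - TAX_PERIOD_NULLS

count_2021 = 913_305
print(share_pct(count_2021, with_tax_period))  # agrees with the reported 63.99%
print(share_pct(count_2021, TOTAL_RECORDS))    # does not agree with the report
```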
Data Issues
All expected columns are present and no critical field issues were found.
Report Metadata
| Property | Value |
|---|---|
| Report Generated | 2026-05-06 16:35:15 |
| Pipeline Timestamp | 2026-05-06 16:34:23 |
| Row Preservation Check | Passed |
| Overall Completeness | 59.1% |
| Overall Result | PASSED |
Generated by BMF Pipeline Quality System