Digitized Legal Compendium

Authors

Jesse Lecy, ASU

Teresa Harrison, Drexel

Published

December 19, 2024




Combine State Files

Stacked Files

Rename Variables

Original name (V1)                    New name (V2)
Unnamed: 0.1                          id1
Unnamed: 0                            id2
ID-State                              id3
ID                                    id4
STATE                                 state
Regulation Indicator                  reg_indicator
Regulatory Type                       reg_rule_label
Regulatory Type Abbr                  reg_rule_abbr
Regulatory Type Full                  type_body_combined
Regulatory Body                       reg_body_label
State                                 code_state
division                              code_division
Section Code                          code_section1
section                               code_section2
text                                  code_text
Year legislation originally enacted   code_year_enacted
Year legislation changed              code_change_year
Type of Change                        code_change_type
Notes                                 notes
section original                      section_original
0                                     v0
1                                     v1
Section Code Re                       section_code_re
section1                              section1
first_element                         first_element
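
The renaming step can be sketched with a lookup vector. This is a minimal illustration, not the code used to build the dataset: `rename_map`, `d_raw`, and `rename_columns()` are hypothetical names, the toy values are made up, and only a few of the pairs from the table are shown.

```r
# Lookup table: names = original column names, values = new names.
# Only a subset of the pairs listed in the table is included here.
rename_map <- c(
  "STATE"                = "state",
  "Regulation Indicator" = "reg_indicator",
  "Regulatory Body"      = "reg_body_label",
  "Notes"                = "notes"
)

# Toy data frame standing in for one stacked state file (hypothetical values).
d_raw <- data.frame(
  "STATE"                = "alaska",
  "Regulation Indicator" = "YES",
  "Notes"                = "",
  "extra"                = 1,
  check.names      = FALSE,
  stringsAsFactors = FALSE
)

# Rename only the columns found in the lookup; leave all others untouched.
rename_columns <- function(df, map) {
  hit <- names(df) %in% names(map)
  names(df)[hit] <- unname(map[names(df)[hit]])
  df
}

d_renamed <- rename_columns(d_raw, rename_map)
names(d_renamed)
```

Columns missing from the lookup keep their original names, so a rename table that lags behind the raw files does not drop data.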


Fix Encodings

When data is copied from websites or PDFs, saved in Excel, and later exported to CSV, the text encodings are often corrupted. The input text had “double mojibake” problems, meaning the encodings were mangled at least twice: usually once when copying from HTML and a second time by the way Excel stores text. This was especially common with the section symbol (§) used to denote statute numbers in legal citations.

These steps ensure that the text in the final dataset has a simple, standardized encoding, so that text fields will not be mangled when read into statistics programs.

Problematic cases look like this:

"\"(4) �\u0080\u009cpaid solicitor�\u0080\u009d means a person who is r
equired to be registered under AS 45.68.010(b), and includes a person who 
is employed, procured, or engaged, directly or indirectly, by a paid 
solicitor to solicit, if the person is compensated; �\u0080\u009cpaid 
solicitor�\u0080\u009d does not include\n          (A) an attorney l
icensed to practice law in this or another state, an investment counselor, 
an insurance company, or a supervised financial institution, to the extent 
the attorney, investment counselor, insurance company, or supervised 
financial institution advises the person on whether to make a contribution; 
or\n\n          (B) a bona fide salaried officer, employee, or volunteer of 
a charitable organization;\" AS 45.68.900"

Fixed:

"(4) "paid solicitor" means a person who is required to be registered under AS 45.68.010(b), and includes a person who is employed, procured, or engaged, directly or indirectly, by a paid solicitor to solicit, if the person is compensated; "paid solicitor" does not include (A) an attorney licensed to practice law in this or another state, an investment counselor, an insurance company, or a supervised financial institution, to the extent the attorney, investment counselor, insurance company, or supervised financial institution advises the person on whether to make a contribution; or (B) a bona fide salaried officer, employee, or volunteer of a charitable organization;" AS 45.68.900
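
One round of this kind of damage can be reversed in base R. The sketch below is an illustration rather than the pipeline's actual fix. It assumes the mangling was "UTF-8 bytes read as Latin-1" and will not recover characters that were already lost (the replacement character in the example above). `fix_mojibake_once()` is a hypothetical helper name.

```r
# Undo ONE round of "UTF-8 bytes read as Latin-1" mangling.
fix_mojibake_once <- function(x) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    cp <- utf8ToInt(enc2utf8(s))
    # Only strings whose characters all fit in one byte can be this kind
    # of mojibake; anything else is returned unchanged.
    if (length(cp) == 0 || anyNA(cp) || any(cp > 255)) return(s)
    candidate <- rawToChar(as.raw(cp))
    Encoding(candidate) <- "UTF-8"
    # Keep the repair only if the recovered bytes form valid UTF-8.
    if (validUTF8(candidate)) candidate else s
  }, character(1), USE.NAMES = FALSE)
}

# A left curly quote (U+201C) is stored as the bytes E2 80 9C in UTF-8.
# Mangled once, each byte becomes its own character; mangled twice, the
# process repeats on the already-mangled text.
once  <- intToUtf8(c(0xE2, 0x80, 0x9C))
twice <- intToUtf8(c(0xC3, 0xA2, 0xC2, 0x80, 0xC2, 0x9C))

fix_mojibake_once(once)                      # one pass is enough
fix_mojibake_once(fix_mojibake_once(twice))  # double mojibake needs two
```

The validity check keeps the function conservative: text that is already clean, or that contains genuine accented characters, is returned as-is.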

Standardize State Names

Fix IDs

The base of the new ID format includes the state, regulation class, regulation type, and regulatory body. The regulatory body codes are:

  • AG: Attorney General
  • OT: Other
  • NS: Not Specified in the LC Field

Example IDs:
 [1] "AL-BIFURCD-BIFURC-NS" "AL-BIFURCD-REGIOF-NS" "AL-REPORTS-ASSETS-AG"
 [4] "AL-REPORTS-ASSETS-OT" "AL-REPORTS-MERGER-AG" "AL-REPORTS-MERGER-OT"
 [7] "AL-REPORTS-AMMEND-AG" "AL-REPORTS-AMMEND-OT" "AL-DISSOLV-VOLUNT-AG"
[10] "AL-DISSOLV-VOLUNT-OT"
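
An ID of this shape can be assembled with `paste()`. This is a sketch under assumptions: the column names (`state_abbr`, `reg_class`, `reg_type`, `reg_body`) and the rule that maps a body label to AG / OT / NS are hypothetical and may differ from the actual pipeline.

```r
# Map a free-text regulatory body label to its two-letter code.
# Hypothetical rule: missing or empty -> NS, mentions the AG -> AG, else OT.
body_code <- function(label) {
  ifelse(is.na(label) | label == "", "NS",
         ifelse(grepl("attorney general", label, ignore.case = TRUE),
                "AG", "OT"))
}

# Toy rows (made-up body labels, codes taken from the examples above).
regs <- data.frame(
  state_abbr = c("AL", "AL", "AL"),
  reg_class  = c("REPORTS", "REPORTS", "BIFURCD"),
  reg_type   = c("ASSETS", "ASSETS", "BIFURC"),
  reg_body   = c("Attorney General", "Secretary of State", NA),
  stringsAsFactors = FALSE
)

regs$id <- paste(regs$state_abbr, regs$reg_class, regs$reg_type,
                 body_code(regs$reg_body), sep = "-")
regs$id
```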

The new ID format comes with some added counts for data quality inspection:

  • n_id: number of statutes per reg (rule class + rule + agency)
  • n_rule: number of statutes per reg (rule class + rule)
  • n_type: number of statutes per reg group (rule class)
  • n_state: number of nonprofit statutes within the state (not unique)
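
These counts can be computed with grouped tallies. The sketch below assumes one row per statute and that every count is taken within a state; the helper `count_by()` and the toy IDs are illustrative only.

```r
# Toy IDs in the STATE-CLASS-TYPE-BODY format, one row per statute.
ids <- c("AL-REPORTS-ASSETS-AG", "AL-REPORTS-ASSETS-AG",
         "AL-REPORTS-ASSETS-OT", "AL-REPORTS-MERGER-AG",
         "AL-DISSOLV-VOLUNT-AG")

# Split each ID into its four components.
parts  <- do.call(rbind, strsplit(ids, "-", fixed = TRUE))
counts <- data.frame(id = ids, state = parts[, 1],
                     class = parts[, 2], rule = parts[, 3],
                     stringsAsFactors = FALSE)

# Number of rows sharing the same combination of the supplied keys.
count_by <- function(...) {
  key <- paste(..., sep = "-")
  as.integer(ave(seq_along(key), key, FUN = length))
}

counts$n_id    <- count_by(counts$id)                                # class + rule + agency
counts$n_rule  <- count_by(counts$state, counts$class, counts$rule)  # class + rule
counts$n_type  <- count_by(counts$state, counts$class)               # class
counts$n_state <- count_by(counts$state)                             # all rows in the state
counts
```

Comparing `n_id` against `n_rule` is a quick way to spot rules that are split across more than one regulatory body.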



Note that we are currently only counting rows that contain any data. It appears that this version includes rows as placeholders (there is one row for each regulatory type, even if there is no statute within the state pertaining to the issue).

The reg_x field is a count of those that had some info on the statutes in the notes (notes seemed to be more complete than the statute text field).

#  reg_x = 1 if there were notes
#  in that row; 0 otherwise

d$reg_x <- ifelse( is.na(d$notes), 0, 1 )
d$reg_x[ d$notes == "" ] <- 0

Any case with no statute or notes listed is coded as a count of zero here. That’s different than states where the indicator is reported as “NO” in the spreadsheet (many rows did not include a YES/NO value for the indicator). We need to double-check this.




Fix Data Encoding Problems

Example Problem Cases

Section Numbers Converting to Dates:

§ 7-6-55  -->  7/6/1955
§ 10-3810 -->  10/1/3810

Inconsistent Input in Raw Files:

Section Code        State                 division   section
552.1               oklahoma              NA         18-552-1
552.6               oklahoma              NA         18-552-6
50-22-01(2)(a)      NA                    NA         NA
50-22-01(2)(b)(6)   NA                    NA         NA
50-22-01(2)(b)(4))  NA                    NA         NA
82.356              nevada                NA         82.35600000000001
617.1420            florida               NA         617.1420000000001
21-19               nebraska              NA         21-19-141
21-1977             nebraska              NA         21-1977
14-3-1041           georgia               part-4     14-3-1041
14-3-1005           georgia               part-1     14-3-1005
29-410.03           district-of-columbia  29-410-03  index.html
29-410.01           district-of-columbia  29-410-01  index.html

Current Fixes

To preserve statute sections as text and prevent spreadsheet software from reading them as dates, a {SS} marker is added in front of each section value:

  • {SS}: 10A-3-5.04
  • {SS}: 13A-9-71(e)
  • {SS}: 10A-1-4.02
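
A minimal sketch of adding and later stripping the marker is shown below. It assumes the stored value is the literal prefix `{SS}: ` followed by the section; the helper names are hypothetical. The last lines show a related safeguard: reading a column as character keeps values such as 617.1420 from being turned into floating-point numbers.

```r
# Prepend the marker so the value can no longer be parsed as a date or number.
protect_section <- function(x) {
  ifelse(is.na(x) | x == "", x, paste0("{SS}: ", x))
}

# Remove the marker again when the plain section number is needed.
unprotect_section <- function(x) {
  sub("^\\{SS\\}: *", "", x)
}

sections  <- c("10A-3-5.04", "7-6-55", NA)
protected <- protect_section(sections)
restored  <- unprotect_section(protected)
protected

# Reading section columns as character avoids artifacts such as
# 82.35600000000001 and keeps trailing zeros intact.
raw <- read.csv(text = "section\n82.356\n617.1420", colClasses = "character")
raw$section
```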

A universal citation field is created from:

  • state_abbr
  • code_label: “Code” or alternative name for state statutes
  • code_annotated: is the cited text the annotated version?
  • code_section: section cited

Save Combined Dataset

readr::write_excel_csv( d, "02-data-inter/ALL-STATES-FORMATTED.CSV", na="" )



Summary Stats

Regulation Coverage

State Environments




Standardizing Citation Formats

Statutes are published in books called codes, which present laws for a particular jurisdiction arranged by subject.

Statute citations have a volume (federal) or state/municipality (local), the name of the cited code, a section, and a date:

Example: 42 U.S.C. § 7706(a) (1994).

  • 42 = Volume that contains the statute
  • U.S.C. = Abbreviation for the code
  • 7706(a) = Section of the code being cited
  • 1994 = Year the code volume was published
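
The same breakdown can be expressed as a small parser. This is a sketch that only handles citations shaped like the example (volume, abbreviation, section, year); `parse_federal_citation()` is a hypothetical helper and the pattern is not a general citation grammar.

```r
# Split a citation of the form "<volume> <code> § <section> (<year>)".
parse_federal_citation <- function(x) {
  # The section symbol is matched loosely as "any run of non-digits"
  # so the pattern does not depend on how the symbol is encoded.
  pattern <- "^([0-9]+) +([A-Za-z.]+) +[^0-9]+([^ ]+) +\\(([0-9]{4})\\)\\.?$"
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) return(NULL)   # not in the expected shape
  list(volume  = m[2],
       code    = m[3],
       section = m[4],
       year    = as.integer(m[5]))
}

cit <- parse_federal_citation("42 U.S.C. \u00a7 7706(a) (1994).")
str(cit)
```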

Universal Citation

A universal citation is a “media-neutral” or “vendor-neutral” citation. In general, a citation to a legal source lets a reader locate it more efficiently. The citation we provide here is media-neutral: it does not depend on the source being located in a particular print edition.

This citation is based on the second edition of the American Association of Law Libraries “Universal Citation Guide”.

  • Code State
  • Code Label - name of the code publication (usually just “Code”)
  • Are we citing the annotated version?
  • Year
  • Section1
  • Section2 - end section if multiple sections listed
  • Note (source, condition, exception)

For example:

  • IA Code § 602.1614
  • IA Code § 602.1614 - 602.1615
  • IA Code § 602.1614 (Westlaw current through P.L. 116-193)
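
The fields above can be combined with a small formatter. This is a sketch, not the project's implementation: `make_citation()` and its argument names are hypothetical, it works on one citation at a time, and it places the year before the section symbol, which is one of the layouts under discussion in this document.

```r
# Assemble a universal citation from its component fields (scalar inputs).
make_citation <- function(state, section1, label = "Code", annotated = FALSE,
                          year = NA, section2 = NA, note = NA) {
  out <- paste(state, label)
  if (annotated)      out <- paste(out, "Annotated")
  if (!is.na(year))   out <- paste0(out, " (", year, ")")
  out <- paste0(out, " \u00a7 ", section1)          # \u00a7 is the section symbol
  if (!is.na(section2)) out <- paste0(out, " - ", section2)
  if (!is.na(note))     out <- paste0(out, " (", note, ")")
  out
}

make_citation("IA", "602.1614")
make_citation("IA", "602.1614", section2 = "602.1615")
make_citation("IA", "602.1614", note = "Westlaw current through P.L. 116-193")
make_citation("GA", "14-3-1430", annotated = TRUE)
```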

EXAMPLES

Abbreviations

In most cases there is one official “code book” (catalog of the current laws) that contains all of the individual statutes. Thus “statute” and “code” might be used interchangeably in citations or references (although technically the statute is a section within the code). Typically the statute_ref should be labeled as CODE.

  • G.C. § 43-17-5 (georgia code)
  • Ga ST § 14-3-1430 (georgia statutes?)
  • GA Code § 14-3-1430 {universal}

The exception is when states have specialized codes for specific agencies, or when multiple jurisdictions or municipalities might enforce a law. In those cases the specific code must be noted:

  • Cal. Prob. Code § 141. [PROBATE CODE]
  • CA Corp Code (2023) § 15903.01 [CORPORATE CODE]
  • Cincinnati, Ohio, Municipal Code § 302-5.
  • Des Moines, Iowa, Municipal Code § 6.3.

Code Names

In some states major divisions of the code are designated by name rather than by number. California, for example, has 28 different codes corresponding to functional areas of the law.

  • Business and Professions Code - BPC
  • Civil Code - CIV
  • Code of Civil Procedure - CCP
  • Commercial Code - COM
  • Corporations Code - CORP
  • etc.

Thus “CA Code § 5830” is underspecified because it could mean either:

  • CA WIC Code § 5830 (Welfare and Institutions Code)
  • CA PRC Code § 5830 (Public Resources Code)

Kentucky uses “Revised Statutes” instead of “Code”:

  • Ky. Rev. Stat. § 367.652
  • KRS § 273.320

Annotated Versions

Many codes are published in two editions – the official edition and an annotated edition with notes about related cases and articles.

  • O.C.G.A. § 43-17-3 (official code of georgia annotated)
  • Ga. Code Ann., § 14-3-1420
  • GA Code Annotated § 14-3-1430 {universal}

Notes and Conditions

If the citation is taken from a source that must itself be credited, include a source field:

  • IA Code (YEAR) § 602.1614 (Westlaw current through P.L. 116-193)

Should we cite the year before the section symbol instead of at the end, to differentiate years from notes and sources? Otherwise the citation would have double parentheses:

  • IA Code § 602.1614 (YEAR) (Westlaw current through P.L. 116-193)

Do we need to add a condition or notes field as well?

  • Ga. Code Ann., § 14-3-1041: only required if amending to operate for profit
  • GA Code Annotated § 14-3-1041 (only required if amending to operate for profit)

Multiple Sections Cited

If the referenced sections are adjacent then add an ending section:

AND:

  • F.S.A. § 496.407(1)(b) and (c)
  • FL Statutes Annotated § 496.407(1)(b) - § 496.407(1)(c) {universal}

OR:

  • F.S.A. § 496.409; F.S.A. § 496.410
  • FL Code Annotated § 496.409 - § 496.410 {universal}

If not immediately adjacent they should be recorded as two separate rows in the data:

  • Ga. Code Ann. § 14-3-1005; GA ST § 14-3-1007
  • GA Code Annotated § 14-3-1005 {universal}
  • GA Code Annotated § 14-3-1007 {universal}
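
Splitting a multi-citation entry into separate rows might look like the sketch below. It assumes citations are separated by semicolons and that the section is the last token of each citation; `to_universal_rows()` is a hypothetical helper, and the universal prefix is supplied by hand rather than inferred.

```r
# One input string with several citations -> one row per citation.
to_universal_rows <- function(x, prefix) {
  parts <- trimws(unlist(strsplit(x, ";", fixed = TRUE)))
  # Keep the trailing section token (digits, letters, dots, hyphens, parens).
  sections <- sub(".*[^0-9A-Za-z().-]([0-9][0-9A-Za-z().-]*)$", "\\1", parts)
  data.frame(original  = parts,
             universal = paste(prefix, "\u00a7", sections),
             stringsAsFactors = FALSE)
}

rows <- to_universal_rows(
  "Ga. Code Ann. \u00a7 14-3-1005; GA ST \u00a7 14-3-1007",
  prefix = "GA Code Annotated"
)
rows
```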

Years

If the provision being cited is currently in effect and has not been the subject of recent change, no date element need be included. However, if the provision being cited has, by the time of writing, been repealed or amended or if it has only recently been enacted or revised, the date of a compilation that contains the language cited should be provided in parentheses.

  • Iowa Code (2020) § 1606(1)(a)
  • Iowa Code (2012) § 1606(1)(a) (prior to 2013 amendment)
  • Iowa Code (2012) § 1606(1)(a) (eff. 7/1/2013)

Unless the citation’s context furnishes the information, a parenthetical note identifying the amending legislation and clarifying whether the citation refers to the version in effect before or after the change may be called for. The precise form this information takes will be governed by the form in which the compilation relied upon presents its “as of” date.

  • GA ST § 14-3-1430 (Amended by 2023 Ga. Laws 260,§ 1-1, eff. 7/1/2023)
  • GA Code (2023) § 14-3-1430 {universal}

Do we need to add a field to capture the bill that changed the law?

  • change field: Amended by 2023 Ga. Laws 260,§ 1-1

Time and Revision Fields

Example

GA Code § 33-24-47.1 (2023)

Note that “article” is not a part of the citation.

Justia dot com Representation:

2023 CODE OF GEORGIA:

Title 33         - INSURANCE (§§ 33-1-1 — 33-66-8)
Chapter 24       - INSURANCE GENERALLY (§§ 33-24-1 — 33-24-98)
Article 1        - GENERAL PROVISIONS (§§ 33-24-1 — 33-24-59.33)
Section 33-24-47 - Notice required of termination or nonrenewal, increase in premium rates, or change restricting or reducing coverage; failure of insurer to comply

The URL is structured similarly:

https://law.justia.com/codes/georgia/title-33/chapter-24/article-1/section-33-24-47-1/
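
A URL of this shape can be rebuilt from the hierarchy levels. The pattern below is inferred from this single example only and may not hold for other states or code structures; `justia_url()` is a hypothetical helper. Note that the article number has to be supplied separately because it is not part of the citation.

```r
# Build a URL following the pattern of the Georgia example above.
justia_url <- function(state, title, chapter, article, section) {
  # Dots and other punctuation in the section become hyphens in the slug.
  slug <- gsub("[^0-9A-Za-z]+", "-", section)
  paste0("https://law.justia.com/codes/", state,
         "/title-", title,
         "/chapter-", chapter,
         "/article-", article,
         "/section-", slug, "/")
}

url <- justia_url("georgia", title = 33, chapter = 24, article = 1,
                  section = "33-24-47.1")
url
```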

How do we determine which of these updates impacted this specific law?

Also, is “2019 Ga. Laws 140,§ 52” a legislative reference (ID and section of the bill) rather than a legal reference (ID and section of the code that the bill changed)?

Justia dot com:

GA Code § 33-24-47.1 (2023)

Amended by 2019 Ga. Laws 140,§ 52, eff. 7/1/2019.

Amended by 2015 Ga. Laws 9,§ 33, eff. 3/13/2015.
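
Amendment notes in this format could be split into separate time and revision fields. The sketch below assumes every note follows the exact layout of the two lines above; `parse_amendment()` and the field names are hypothetical, not part of the dataset.

```r
# Split "Amended by <year> <session laws> <act>,§ <section>, eff. <m/d/yyyy>."
parse_amendment <- function(x) {
  pattern <- paste0("^Amended by ([0-9]{4}) (.+) ([0-9]+),",
                    "[^0-9]*([0-9A-Za-z.-]+), eff\\. ([0-9/]+)\\.?$")
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) return(NULL)   # note is not in the expected layout
  list(law_year    = as.integer(m[2]),
       session_law = m[3],
       act_number  = as.integer(m[4]),
       act_section = m[5],
       effective   = as.Date(m[6], format = "%m/%d/%Y"))
}

amend <- parse_amendment("Amended by 2019 Ga. Laws 140,\u00a7 52, eff. 7/1/2019.")
str(amend)
```

Keeping the effective date as a `Date` makes it possible to sort amendments and to compare them with the year in the citation.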

Time Fields

Types of Change

  • new law
  • major change (substantive meaning change)
  • minor change (minor edits to clarify intent)
  • interpretive change (section not changed, but other changes impact)
  • repeal of a law

Note that a repeal would appear as a new row if it is not replaced by another law:

  • GA Code (2015) § 33-24-47
  • GA Code (2022) § 33-24-47 (repealed)

The second entry records the repeal of the 2015 law; the repeal passed in 2022.
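
With repeals stored as their own rows, the status of a section in a given year can be read from its most recent entry. This is a sketch with hypothetical column names (`section`, `year`, `change_type`) and a hypothetical helper `is_in_effect()`; the change type labels follow the list above.

```r
# Toy change history: one row per event, repeal recorded as a new row.
history <- data.frame(
  section     = c("33-24-47", "33-24-47"),
  year        = c(2015, 2022),
  change_type = c("new law", "repeal of a law"),
  stringsAsFactors = FALSE
)

# A section is in effect if it has an entry on or before the given year
# and the latest such entry is not a repeal.
is_in_effect <- function(history, section, as_of) {
  rows <- history[history$section == section & history$year <= as_of, ]
  if (nrow(rows) == 0) return(FALSE)
  latest <- rows[which.max(rows$year), ]
  latest$change_type != "repeal of a law"
}

is_in_effect(history, "33-24-47", as_of = 2016)
is_in_effect(history, "33-24-47", as_of = 2022)
```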