Geographic gazetteer

Describe, retrieve and prepare dataframes with geographic boundaries of various geographic units of the USA.

Source files

Source data files are downloaded from the web and cached locally.

Census Bureau:

  • TIGER Data Products Guide: Which Product Should I Use?
  • Cartographic Boundary Files (2018 and before, 2019 and after). Simplified representations of selected geographic areas from the Census Bureau’s MAF/TIGER geographic database. Small-scale (limited detail) spatial files clipped to shoreline.
  • TIGER/Line shapefiles. The most comprehensive geographic dataset, in full detail.
  • Relationship files. These text files describe geographic relationships. There are two types: those that show the relationship between the same type of geography over time (comparability), and those that show the relationship between two types of geography for the same time period.
  • LSAD codes. Legal/Statistical Area Description codes and definitions.
  • FIPS codes
  • Gazetteer reference files
  • Character encoding. Files from 2014 and earlier use “ISO-8859-1”; files from 2015 and after use “UTF-8”.
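The character encoding rule noted above can be captured in a small helper. This is a minimal sketch; the function name `census_file_encoding` is hypothetical and is not part of this module.

```python
def census_file_encoding(year: int) -> str:
    """Encoding of Census text files by vintage, per the rule stated above."""
    return "ISO-8859-1" if year <= 2014 else "UTF-8"


# Round trip of a name with a non-ASCII character, as it might appear in a 2010 file.
raw = "Doña Ana".encode(census_file_encoding(2010))
decoded = raw.decode(census_file_encoding(2010))
print(decoded)
```

Passing the returned value as the `encoding` argument when opening a relationship or gazetteer file should avoid garbled place names.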

Shapefile format

Wikipedia

Census Bureau shapefiles come as zipped folders and can be read directly with geopandas.

XML metadata

Most zipped shapefile folders contain XML documents with metadata. Helper functions here parse these files for inspection. In later years files ending with .iso.xml adhere to ISO standards and can be more easily parsed for feature descriptions.
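As an illustration of inspecting such metadata, the sketch below lists the XML documents in a zip archive (preferring `.iso.xml` files) and collects the text of leaf elements. It uses only the standard library. The helper names and the toy archive contents are made up for the example; real Census metadata is far richer and uses XML namespaces.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def xml_members(zf: zipfile.ZipFile) -> list:
    """Names of XML documents in an open zip archive, .iso.xml files first."""
    names = [n for n in zf.namelist() if n.lower().endswith(".xml")]
    return sorted(names, key=lambda n: (not n.lower().endswith(".iso.xml"), n))


def leaf_texts(zf: zipfile.ZipFile, member: str) -> dict:
    """Map each leaf element tag to its stripped text (last occurrence wins)."""
    root = ET.fromstring(zf.read(member))
    return {
        el.tag: el.text.strip()
        for el in root.iter()
        if len(el) == 0 and el.text and el.text.strip()
    }


# Toy archive standing in for a downloaded shapefile zip (contents are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("demo.shp", b"")
    zf.writestr("demo.shp.xml", "<metadata><title>Demo</title></metadata>")
    zf.writestr("demo.shp.iso.xml", "<record><title>Demo ISO</title></record>")

with zipfile.ZipFile(buf) as zf:
    members = xml_members(zf)
    info = leaf_texts(zf, members[0])

print(members, info)
```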

Scale

Shapefiles are available at different scales. TIGER is the most precise, followed by 1:500,000, then 1:5,000,000; 1:20,000,000 has the lowest resolution.

Shapefiles are revised from year to year. Between-year differences are clearly visible at all scales except TIGER.

The table below compares boundaries of Tolland County, Connecticut, taken from shapefiles of different years and scales. The “length” column is the boundary length in shape units (degrees), and “points” is the total number of points in the polygon.

scale year length points
tiger 2010 1.877432 2152
tiger 2020 1.877094 2097
500k 2010 1.852025 201
500k 2020 1.854092 236
5m 2010 1.767873 50
20m 2010 1.623553 27
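The two measures in the table can be illustrated on a toy polygon. The sketch below computes the length of a closed ring and counts its coordinates in plain Python; it is not the code that produced the table, and whether the table counts the repeated closing point is not stated, so the count here is only indicative. With shapely geometries the `length` attribute gives the comparable boundary length.

```python
import math


def ring_stats(coords):
    """Return (length, points) for a closed ring of (x, y) coordinates."""
    length = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    return length, len(coords)


# A "detailed" ring with a stair-step corner, and a generalized version
# of the same shape with that corner cut diagonally.
detailed = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (0, 2), (0, 0)]
generalized = [(0, 0), (1, 0), (2, 1), (2, 2), (0, 2), (0, 0)]

detailed_stats = ring_stats(detailed)
generalized_stats = ring_stats(generalized)
print(detailed_stats, generalized_stats)
```

As in the table, the generalized ring has both fewer points and a shorter boundary.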

Map below visualizes boundary differences.


States

CODE ABBR NAME ALAND AWATER
01 AL Alabama 131174048583 4593327154
02 AK Alaska 1478839695958 245481577452
04 AZ Arizona 294198551143 1027337603
05 AR Arkansas 134768872727 2962859592
06 CA California 403503931312 20463871877
08 CO Colorado 268422891711 1181621593
09 CT Connecticut 12542497068 1815617571
10 DE Delaware 5045925646 1399985648
11 DC District of Columbia 158340391 18687198
12 FL Florida 138949136250 31361101223
13 GA Georgia 149482048342 4422936154
15 HI Hawaii 16633990195 11777809026
16 ID Idaho 214049787659 2391722557
17 IL Illinois 143780567633 6214824948
18 IN Indiana 92789302676 1538002829
19 IA Iowa 144661267977 1084180812
20 KS Kansas 211755344060 1344141205
21 KY Kentucky 102279490672 2375337755
22 LA Louisiana 111897594374 23753621895
23 ME Maine 79887426037 11746549764
24 MD Maryland 25151100280 6979966958
25 MA Massachusetts 20205125364 7129925486
26 MI Michigan 146600952990 103885855702
27 MN Minnesota 206228939448 18945217189
28 MS Mississippi 121533519481 3926919758
29 MO Missouri 178050802184 2489425460
30 MT Montana 376962738765 3869208832
31 NE Nebraska 198956658395 1371829134
32 NV Nevada 284329506470 2047206072
33 NH New Hampshire 23189413166 1026675248
34 NJ New Jersey 19047825980 3544860246
35 NM New Mexico 314196306401 728776523
36 NY New York 122049149763 19246994695
37 NC North Carolina 125923656064 13466071395
38 ND North Dakota 178707534813 4403267548
39 OH Ohio 105828882568 10268850702
40 OK Oklahoma 177662925723 3374587997
41 OR Oregon 248606993270 6192386935
42 PA Pennsylvania 115884442321 3394589990
72 PR Puerto Rico 8868896030 4922382562
44 RI Rhode Island 2677779902 1323670487
45 SC South Carolina 77864918488 5075218778
46 SD South Dakota 196346981786 3382720225
47 TN Tennessee 106802728188 2350123465
48 TX Texas 676653171537 19006305260
49 UT Utah 212886221680 6998824394
50 VT Vermont 23874175944 1030416650
51 VA Virginia 102257717110 8528531774
53 WA Washington 172112588220 12559278850
54 WV West Virginia 62266474513 489028543
55 WI Wisconsin 140290039723 29344951758
56 WY Wyoming 251458544898 1867670745
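ALAND and AWATER are land and water areas, which the Census Bureau reports in square meters. A short sketch of typical arithmetic on these columns follows; the helper name `water_share` is hypothetical, and the two rows are copied from the table above.

```python
SQ_M_PER_SQ_KM = 1_000_000


def water_share(aland: int, awater: int) -> float:
    """Fraction of total area (land + water) that is water."""
    return awater / (aland + awater)


# (ALAND, AWATER) in square meters, taken from the table above.
states = {
    "CT": (12542497068, 1815617571),
    "MI": (146600952990, 103885855702),
}

ct_land_km2 = states["CT"][0] / SQ_M_PER_SQ_KM
mi_water = water_share(*states["MI"])
print(round(ct_land_km2), round(mi_water, 3))
```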

Counties

Source data

Cartographic Boundary Files are available for 1990, 2000, 2010, 2013 and every year after that.

TIGER/Line Shapefiles in shapefile format are available for 2000, 2007 and every year after that. 1992 and 2006 are available in legacy format.
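The availability rules above can be encoded as a simple lookup. This is a sketch only: the function name and product labels are hypothetical, and "every year after that" is treated as open-ended, so future years are reported as available even before they are released.

```python
def available_products(year: int) -> list:
    """County boundary products available for a given year, per the text above."""
    products = []
    if year in (1990, 2000, 2010) or year >= 2013:
        products.append("cartographic boundary")
    if year == 2000 or year >= 2007:
        products.append("TIGER/Line shapefile")
    if year in (1992, 2006):
        products.append("TIGER/Line legacy format")
    return products


for y in (1992, 2001, 2010):
    print(y, available_products(y))
```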

Changes

County changes happen whenever local authorities decide on them. Annually released boundary files reflect boundaries effective January 1 of the reference year. List of changes here.

Substantial county boundary changes are those affecting an estimated population of 200 or more; changes of at least one square mile where an estimated population number was not available, but research indicated that 200 or more people may have been affected; and annexations of unpopulated territory of at least 10 square miles.
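The three-part definition above can be read as a decision rule. The sketch below is one possible encoding of it; the function and parameter names are hypothetical, and `est_population=None` stands for "estimate not available".

```python
def is_substantial_change(
    est_population=None,
    area_sq_mi=0.0,
    research_suggests_200_plus=False,
    unpopulated_annexation=False,
):
    """Apply the three criteria for a substantial county boundary change."""
    # 1. An estimated population of 200 or more is affected.
    if est_population is not None and est_population >= 200:
        return True
    # 2. No estimate, at least one square mile, and research suggests 200+ people.
    if est_population is None and area_sq_mi >= 1 and research_suggests_200_plus:
        return True
    # 3. Annexation of unpopulated territory of at least 10 square miles.
    if unpopulated_annexation and area_sq_mi >= 10:
        return True
    return False


print(is_substantial_change(est_population=250))
print(is_substantial_change(area_sq_mi=12, unpopulated_annexation=True))
```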

The CRS of the 1990 and 2000 files is unknown, so the resulting dataframes have “naive geometries”.


Census Tracts

Code is 11 digits: 2 state, 3 county, 4+2 tract.
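A minimal sketch of splitting a tract code into its parts, assuming the standard layout of 2 state digits, 3 county digits and 6 tract digits (a 4-digit base plus a 2-digit suffix). The function name and the example code are illustrative only.

```python
def split_tract_geoid(geoid: str) -> dict:
    """Split an 11-digit tract code into state, county and tract parts."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected 11 digits, got {geoid!r}")
    return {
        "state": geoid[:2],
        "county": geoid[2:5],
        "tract": geoid[5:],
        "base": geoid[5:9],     # 4-digit base number
        "suffix": geoid[9:],    # 2-digit suffix
    }


# Made-up tract code within state 09, county 013.
parts = split_tract_geoid("09013100001")
print(parts)
```

Codes should be kept as strings throughout, since converting them to integers drops leading zeros.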

Reference


Changes over time

Major changes to tract codes and shapes occur after decennial censuses, with smaller changes in the years between.

The first four digits of the tract code are “permanent.” When tracts get large (more than 8,000 residents), they are split and a 2-digit suffix is added (the same applies when a split tract is split again):

1990 2000 2010
1000 1000.01 1000.03
1000 1000.01 1000.04
1000 1000.02 1000.05
1000 1000.02 1000.06

The naming conventions for merges (population falls below 1,200) and boundary revisions are less clear-cut.

When changes (splits, merges, redefinitions) occur, the relationship of new tracts to old tracts is crosswalked.

There is a master file, as well as two files that provide the identifiers of tracts that were “substantially changed” between decennial censuses. These two files consist only of a list of census tracts that exhibited a change of 2.5 percent or greater. Tract relationships may be one-to-one, many-to-one, one-to-many, or many-to-many.
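The four relationship types can be derived from a list of (old tract, new tract) pairs. The sketch below is illustrative and does not read the actual relationship file format; the function name and tract identifiers are made up.

```python
from collections import defaultdict


def classify_relationships(pairs):
    """Label each (old, new) pair as one-to-one, one-to-many, etc."""
    new_by_old = defaultdict(set)
    old_by_new = defaultdict(set)
    for old, new in pairs:
        new_by_old[old].add(new)
        old_by_new[new].add(old)

    labels = {}
    for old, new in pairs:
        left = "one" if len(old_by_new[new]) == 1 else "many"
        right = "one" if len(new_by_old[old]) == 1 else "many"
        labels[(old, new)] = f"{left}-to-{right}"
    return labels


pairs = [
    ("1000", "1000.01"), ("1000", "1000.02"),   # a split
    ("2000", "2000"),                           # unchanged
    ("3001", "3000"), ("3002", "3000"),         # a merge
]
labels = classify_relationships(pairs)
for pair, label in labels.items():
    print(pair, label)
```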

Relationship files are currently available for 2010 (relative to 2000) and 2000 (relative to 1990).

ZIP Code Tabulation Area (ZCTA)

Homepage

ZIP Code Tabulation Areas (ZCTAs) are generalized areal representations of United States Postal Service (USPS) ZIP Code service areas. The USPS ZIP Codes identify the individual post office or metropolitan area delivery station associated with mailing addresses. USPS ZIP Codes are not areal features but a collection of mail delivery routes.

ZCTAs are built from census blocks, so blocks can be used as a crosswalk to other geographies that partition into blocks. Relationship files are available for blocks, counties, county subdivisions, places, tracts, and for ZCTA changes over time.
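A block-based crosswalk can be sketched as follows: given block-to-ZCTA and block-to-tract assignments, count the blocks shared by each (ZCTA, tract) pair. All identifiers and the function name below are made up for illustration. Counting blocks is a crude weight; population or area per block would likely be a better one.

```python
from collections import Counter


def crosswalk_via_blocks(block_to_zcta, block_to_other):
    """Count shared blocks for each (ZCTA, other unit) pair."""
    shared = Counter()
    for block, zcta in block_to_zcta.items():
        other = block_to_other.get(block)
        if other is not None:
            shared[(zcta, other)] += 1
    return dict(shared)


# Hypothetical assignments of three blocks.
block_to_zcta = {"b1": "06001", "b2": "06001", "b3": "06002"}
block_to_tract = {"b1": "t1", "b2": "t2", "b3": "t2"}

xwalk = crosswalk_via_blocks(block_to_zcta, block_to_tract)
print(xwalk)
```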

Cartographic boundary and TIGER shapefiles are available. The 2000 files contain only 3-digit codes. For an unknown reason, the 2010 CB file is almost 10 times larger than the files of other years (527 MB).

ZCTA shapefile columns over time
column      2000 2010 2013 2014 2015 2016 2017 2018 2019 2020
AREA        X
PERIMETER   X
Z399_D00_   X
Z399_D00_I  X
ZCTA3       X
NAME        X    X
LSAD        X    X
LSAD_TRANS  X
geometry    X    X    X    X    X    X    X    X    X    X
GEO_ID           X
ZCTA5            X
CENSUSAREA       X
ZCTA5CE10             X    X    X    X    X    X    X
AFFGEOID10            X    X    X    X    X    X    X
GEOID10               X    X    X    X    X    X    X
ALAND10               X    X    X    X    X    X    X
AWATER10              X    X    X    X    X    X    X
ZCTA5CE20                                                X
AFFGEOID20                                               X
GEOID20                                                  X
NAME20                                                   X
LSAD20                                                   X
ALAND20                                                  X
AWATER20                                                 X

Congressional Districts

A geographical and political division in which voters elect representatives to the U.S. House of Representatives. Each state establishes its congressional districts based on population counts, with the goal of having districts as equal in population as possible. (ESRI dictionary)

About Congressional Districts (Census):

  • Congressional districts within a state are supposed to have equal populations, so that residents are equally represented.
  • They do not cross state lines, but may cross all other geographic units, such as census tracts.
  • They do cross county boundaries.
  • Map of CT for reference
  • Closer breakdown of District 1 in CT
  • States are required to redraw district lines every 10 years, after Census results are released (except single-district states).

In 33 states, state legislatures play the dominant role in congressional redistricting. In eight states, commissions draw congressional district lines. In two states, hybrid systems are used, in which the legislatures share redistricting authority with commissions. The remaining seven states comprise one congressional district each, rendering redistricting unnecessary (AK, DE, MT, ND, SD, VT, WY); DC likewise forms a single at-large district. Link

Gerrymandering can and often does occur when congressional district lines are drawn to help the party in power stay in power. Examples

School Districts

The U.S. has more than 13,000 geographically defined public school districts. These include districts that are administratively and fiscally independent of any other government, as well as public school systems that lack sufficient autonomy to be counted as separate governments and are classified as a dependent agency of some other government—a county, municipal, township, or state. Most public school systems are Unified districts that operate regular, special, and/or vocational programs for children in Prekindergarten through 12th grade.

  • School districts are complex and have almost no consistency from state to state, because in most cases public school districts are defined by local governments.
  • Boundary files
  • Because they are defined by local governments, changes happen every year in many places throughout the US.

Native American Reservations

About | Definitions | Data

Different breakdowns are available, going as small as tracts and block groups (link).

Build this module

Converted notebook "nbs/geography.ipynb" to module "rurec/geography.py".