Since 1880, the Census Bureau has applied formal rules to categorize geographic areas as urban for the purposes of statistical reporting. The most recent version of this taxonomic schema classifies areas as urban areas (UA, densely settled with a population greater than 50,000) and urban clusters (UC, densely settled with populations greater than 2,500 and less than 50,000). For as long as the Census Bureau has been defining urbanicity, rural has ALWAYS been the residual category: from the 2010 Census Urban-Rural Classification Criterion, “‘rural’ encompasses all population, housing, and territory not included within an urban area.”
The construction of urban places follows a complex set of rules accounting for the attributes of urban land use and residential development patterns, but the intuition is fairly straight-forward. First, identify contiguous Census blocks (sometimes, tracts) above a minimum population density thereby defining a potential urban core. Then, continue to agglomerate adjacent-ish (again, there are complex rules allowing for hops, jumps, exclaves, enclaves, indentations, etc.) blocks/tracts meeting other density and population thresholds until the boundaries of the place are established. The population of this agglomeration then determines whether it is a UA, UC, or rural. The RUCA codes are a further refinement constructed by ERS to account for 1) the size of urban agglomerations and 2) resident commuting patterns, while applying the urban-rural typology employed by OMB. Thus, every UA is termed a Metropolitan Area Core. UCs are delimited by population into Micropolitan Area Core (10,000 to 49,999) and Small Town Core (2,500 to 9,999). Remaining blocks/tracts are then classified according to whether a high (more than 30%) or low (10% to 30%) percentage of residents commute to an urban core, as well as the type of core of the primary flow. Secondary codes are also assigned based upon the destination type of the secondary commuter flow.
We propose to use core units as the basis for building Rural-Urban Economic Catchment Areas codes from InfoGroup and the Census of Agriculture in very much the same vein as ERS constructs the RUCA codes, but with a notable design distinction: we will classify tracts from low density to high density, rather than defining rural as a catch-all, non-urban residual. Specifically:
We will define a primary economic catchment area (ECA) centered on each small UC core (RUCA 7) that includes associated commuting Census tracts (RUCA 8 and 9) and nearby rural tracts that include large employers in InfoGroup. The latter condition accommodates economic activities that may employ relatively large numbers of residents but are purposefully located far from residential areas, e.g., paper milling and government military installations. We will explore the sensitivity of our coding to various definitions of nearby, e.g. 5, 10, and 20 mile radii.
We will use the Census of Agriculture to identify the primary crop and livestock production activities and the County Business Patterns to identify the presence of forestry and mining activity in rural Zip codes surrounding the primary ECA (the smallest published unit for both programs is the Zip code). We anticipate that for the overwhelming majority of the country, using the county as the basic unit will suffice for identifying agricultural, forestry, and extraction, but western counties are sufficiently large to benefit from a more disaggregated reporting unit.
We will then use the most recent Use Tables published by BEA as part of its Input-Output Accounts program (2018 for 71 two- and three-digit NAICS industries and 2012 for 415 fourdigit NAICS industries) to calculate absolute and relative rural use coefficients for each primary ECA with respect to the agricultural, forestry, and extraction (AFE) activity in its vicinity.
where \(s^{m}_{i}\) is the share of employment/sales for industry \(m\) in business catchment area \(i\) calculated from InfoGroup; \(\rho^{k}_{m}\) is the use parameter for AFE activity \(k\) in industry \(m\) from the BEA Use Tables; \(s^{k}_{i}\) is the share of agricultural or forestry activity \(k\) of total AFE activities; and \(M\) and \(K\) are the set of non-AFE and AFE industries, respectively.
To illustrate, a simplified five-sector economy based on the BEA 2012 Use Table might be:
Use Factor
1111B0
311210
311810
445000
448000
Grain farming
6440
14398
1924
684
0
Flour milling and malt manufacturing
0
303
2983
1064
0
Bread and bakery product manufacturing
0
0
63
123
0
Food and beverage stores
0
0
0
0
0
Clothing and clothing accessories stores
0
0
0
0
0
We will then assign remaining rural tracts to primary ECAs using a gravity model based on the use coefficients defined above. Gravity models have been applied as far back as the 1850s (Carey, 1858) to describe social phenomena and continue to be used across disciplines (Erlander, 1980; Haynes and Fotheringham, 1984; Sen and Smith, 1995; Wilson 2000), including contemporary analyses of international trade (USITC, 2020). To illustrate, suppose that there were two primary ECAs with the following industry sales (in $1000s) and calculated rural use coefficients:
Industry Sales
1111B0
311210
311810
445000
448000
alpha
rho
A
0
500
0
1000
500
336.2
0.168
B
0
100
200
500
200
92.4
0.092
To simplify, suppose Area A were at 0 on the unit interval and Area B at 1 with all intervening space representing rural tracts. A gravity approach based on the use coefficient would assign the intervening rural tracts on (0,0.784) to Area A and tracts (0.784,1) to Area B. In contrast, a gravity approach using the adjusted use coefficient would assign the intervening rural tracts on (0,0.645) to Area A and tracts (0.645,1) to Area B. More generally, for either use coefficient, we will treat the United States as a plane over the unit simplex and apply a gravity approach to assign rural tracts to any primary ECA.
A notable feature of this approach is that primary ECAs may exhibit sufficiently low gravity that they are effectively economic islands. For example, a small-town anchored by car manufacturing plant surrounded by cropland. Of course, that small-town may be associated with larger, discontiguous economic areas through the manufacturing supply chain, which will reveal itself at higher economic geographies described subsequently, but creating measures to make meaningful economic distinctions between population centers currently lumped as “small towns” is a valuable project output that will benefit policy-makers.
We will then move to assigning the remaining tracts, beginning with Micropolitan low commuting (RUCA 6) tracts. For each tract, we will calculate the following total use coefficients:
where \(i\) denotes the tract, \(j\) denotes the adjacent primary ECA, and \(k\) denotes the Micropolitan core (RUCA 4) [NB: There may be multiple adjacent primary ECA, so that for the general case of Z adjacent primary ECA, 2Z+2 total use coefficients are calculated]. These coefficients summarize the economic flow to and from the tract. Thus, we will assign the tract to either the Micropolitan Core or the adjacent primary ECA according to the magnitude of the largest total use coefficients, or leave the tract as an economic island if flows are insufficiently strong, i.e., gravity is sufficiently weak.
After assigning all Micropolitan and Metropolitan commuting tracts (RUCA 2, 3, and 5) following a similar approach to the one above, we will define secondary ECAs. To do so, we will again treat the United States as a plane with nodes defined by Micropolitan cores and primary ECAs attracted to those nodes by total use coefficients. Primary ECAs are assigned to secondary ECAs centered on Micropolitan cores when gravity is sufficiently strong. Next, contiguous groups of yet-unassigned primary ECAs with sufficiently strong gravity will be aggregated into secondary ECAs independent of a Micropolitan core; those without are classified as isolated.
Step 6 is repeated with Metropolitan cores to define tertiary ECAs.
The approach described above is based on the three categories of urbanized places from the RUCA system, but it generalizes broadly to an N-level hierarchy of nodes where the nth ECA is defined by the parameter pair (\(\rho_{n}\), \(\theta_{n}\)), the upper limit on the population of the core and the gravity threshold that defines economic islands, respectively. Refinements based on the former would permit the definition of small and large metropolitan cores, or even Megapolitan cores. The latter parameter can be varied according to moments of the use coefficient distribution (the lowest quintile or decile) or absolute values with a priori relevance, e.g., a parameter from a specified decay function in a formal model. We will implement the gravity model in Python, as it is open-source and with spatial modeling packages, e.g., SpInt(Oshan 2016) and GME(USITC 2019), that are already being used to model international trade, migration flows, and transportation networks.
The static Input-Output analysis, à la Wassily Leontief, poses the question: What level of output should each industry produce, such that it will satisfy the total demand of that industries output across all industries in an economy?
Simplifying assumptions:
Each industry produces only one homogeneous commodity.
Each industry uses a fixed input ratio.
Each industry exhibits constant returns to scale.
Suppose the National level economy is composed of 2 industries (e.g., Extraction and Farming). Let the interindustry sales be given by,
Given the vector of total output \(\mathbf{x}' = \begin{bmatrix} 1000 & 2000 \end{bmatrix}\), the Direct Requirements matrix (input requirements), constructed as \(\mathbf{A} = \mathbf{Z\hat{x}^{-1}}\), is given by,
Given total output \(\mathbf{x}\) and interindustry sales \(\mathbf{Z}\), the exogenous final demand is simply the difference in total output and the row sums of interindustry transactions given by, \(\mathbf{f} = \mathbf{x} - \mathbf{Zi}\) where \(\mathbf{f}' = \begin{bmatrix} 350 & 1700 \end{bmatrix}\)
Suppose, final demand for extraction were to increase to $600 next year and decrease to $1500 for farming, such that the change in final demand is given by, \(\Delta\mathbf{f}' = \begin{bmatrix} 250 & 200 \end{bmatrix}\). Using the Leontief inverse (total requirements) matrix, the expected change in total output of the economy is given by \(\Delta\mathbf{x} = \mathbf{L}\Delta\mathbf{f}\) such that \(\Delta\mathbf{x}' = \begin{bmatrix} 247.5248 & -158.4158 \end{bmatrix}\)
Code
#A simple I/O analysis "by hand" presuming given national accounts are in the form of raw interindustry flows where one must generate a direct requirements matrices (Technical Coefficients) and the total requirements matrix (Leontief inverse).# Note the national IO table at present is a domestic flow specification as such the interindustry transactions table is no longer representative of a technological matrix. It rather represents the intra-national interindustry transactions, which are determined not only by technological factors, but also by trade factors. As a consequence in the domestic-flow table the balance between supply and demand is made considering only domestic production.#Toy Transaction matrixToy_Z_mat <-matrix(c(150, 500, 200, 100), nrow =2, byrow =TRUE)rownames(Toy_Z_mat) <-c("Extract", "Farm")colnames(Toy_Z_mat) <-c("Extract", "Farm")print("Transaction matrix")
[1] "Transaction matrix"
Code
Toy_Z_mat
Extract Farm
Extract 150 500
Farm 200 100
Code
#Toy Total Output vectorToy_Total_Output <-c(1000, 2000)print("Total Output vector")
#Toy change in Final Demand vectorToy_Delta_Demand <-c(250, -200)print("Final Demand vector change")
[1] "Final Demand vector change"
Code
Toy_Delta_Demand
[1] 250 -200
Code
#Toy change in Total Output vectorToy_Delta_Output <- Toy_L_mat %*% Toy_Delta_Demandprint("Total Output vector change")
[1] "Total Output vector change"
Code
Toy_Delta_Output
[,1]
Extract 247.5248
Farm -158.4158
Code
# Note, to allow for the presence of final demand and primary inputs, in an open model, the sum of each column in $A$ must be less than 1. # Simple sufficient, but not necessary condition of sustainability. All must be TRUE.all(colSums(Toy_A_mat) <1) %>% {paste0("Technology matrix structure passes sufficient test of sustainability: ", . )}
[1] "Technology matrix structure passes sufficient test of sustainability: TRUE"
Code
#Hawkins – Simon condition check. All must be TRUE.local({PM =NULLfor(n in2:nrow(Toy_A_mat)){ PM[1:n] = (det((diag(nrow(Toy_A_mat)) - Toy_A_mat)[1:n,1:n]) >0)}all(all(PM) & (diag(diag(nrow(Toy_A_mat)) - Toy_A_mat) >0) & (det(diag(nrow(Toy_A_mat)) - Toy_A_mat) >0)) %>% {paste0("Leontief matrix structure passes necessary and sufficient test of practibility and viability: ", . )}})
[1] "Leontief matrix structure passes necessary and sufficient test of practibility and viability: TRUE"
Retrospective
One objective of the project narative is to quantify and match the economic connectedness of Micropolitian and Urban clusters to large Metropolitan economies. This is done by analyzing how well the commodity output mix of Micropolitian and Urban clusters to the commodity uses of industries in large Metropolitan cores.
In practice, for two primary ECAs\(A\) and \(B\), we construct the absolute rural use coefficients as:
where \(u_{cj}\) the the value of purchases of a single commodity \(c\) by industry \(j\) from the BEA’s the Use matrix, and \(s_{Aj}\) and \(s_{Bj}\) are the annual sales of industry \(j\) in primary ECA regions \(A\) and \(B\), for \(j = 1, \dots, n\) industries. Similarly, we construct the relative rural use coefficients as:
In the absolute specification, the numerator is a uniform modification of the elements in a row of the national Use matrix. The national value of purchases of commodity \(c\) by each industry \(j\) weighted by the regional value of “sales” by each industry \(j\) (CBP will offer values of county-level industry employment, Q1 payroll, and annual payroll). This is divided by the total intermediate value of commodity \(c\) at the national level.
The relative specification then takes the absolute coefficient divided by the sum of the regional “sales” by each industry \(j\).
Toy Model
Suppose there is 1 National level economy composed of 2 industries (e.g., Hydrocarbon Extraction and Farming), 3 commodities (e.g., Gasoline, Hogs, and Ice cream), and 4 regions (e.g., Dane county (Madison , WI), Cook county (Chicago, IL), Brown County (Green Bay, WI), and the All the rest as a collective residual).
Available data consists of a \(3 \times 2\)National level commodity by industry Use table (\(\underset{(3 \times 2)}{\mathbf{U}} = [u_{ij}]\)) that depicts the value of purchases of commodity \(i\) by industry \(j\). (In conjunction with a column vector of total industry outputs (\(\underset{(2 \times 1)}{\mathbf{x}} = [x_{j}]\)), this gives rise to the standard Use coefficients\(\mathbf{B} = [b_{ij}] = u_{ij}/x_{j}\) in which column \(j\) represents the value of inputs of each commodity per dollar’s worth \(\mathbf{B}\) are therefore commodities-by-industries.) Similarly, other available data consists of a \(2 \times 3\)National level commodity by industry Make table (\(\underset{(2 \times 3)}{\mathbf{V}} = [v_{ij}]\)) that depicts the value of output of commodity \(j\) that is produced by industry \(i\). (Note, the row sums of \(\mathbf{V}\) are total industry output \(\mathbf{x}\) and column sums of \(\mathbf{V}\) are total [commodity] output \(\mathbf{q}'\).)
Available data of at the region level consists of sector specific industry activity (\(\underset{(2 \times 4)}{\mathbf{x}^r} = [x^{r}_{j}]\)) in the form of either employment, payroll, or simply number of establishments.
What are the interpretations of these coefficients explicitly?
From here we can ask the question: Is Madison more economically connected to Chicago or Green Bay or is it an economic island? More specifically, is the commodity(?) output (mix?) of Madison more congruent with the commodity(?) output (mix?) of Chicago or Green Bay? (Note: there is no claim/evidence of trade and commodity flows across space using this specification, instead we are measuring possible (one direction) sector compatibility.)
Given the commodity by geography, relative use coefficient matrix \(\mathbf{\rho}\), we can now construct a geography by geography (from-to) score for each commodity given by:
\[
\mathbf{\phi}_{Gasoline} = \hat{\rho}_{G}\mathbf{i}\mathbf{i}'[\hat{\rho}_{G}\mathbf{i}\mathbf{i}'+\hat{\rho}'_{G}\mathbf{i}\mathbf{i}']^{-1} =
\begin{bmatrix}
\frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{A}} & \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{B}} & \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{C}} & \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{D}} \\
\frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{A}} & \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{B}} & \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{C}} & \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{D}} \\
\frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{A}} & \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{B}} & \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{C}} & \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{D}} \\
\frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{A}} & \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{B}} & \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{C}} & \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{D}} \\
\end{bmatrix}
\] The maximum value by row of \(\mathbf{\phi}_{G}\) determines which typology each region is assigned on a per commodity basis.
The maximum value by row of \(\mathbf{\phi}\) determines which typology each region is assigned on an overall commodity basis.
(Alternative assignment procedures may incorporate a weighted mix of commodities or alternatively measure the “gravity” of regions by commodities to determine a regions commodity center of gravity.)
“Quantify the economic connectedness of places through production supply chains.”
Simply put at the core of our analysis lies the question: How similar are the inputs of Chicago to the outputs of Madison?
Given only national level Input-Output tables (consisting of Make/Use tables, industry-by-industry, industry-by-commodity, and commodity-by-commodity total requirements tables, and commodity-by-industry direct requirements tables) and regional level industry activity (in the form of employment, payroll, and number of establishments), we are left with a handful of possible methods to derive an answer.
Possible Methodologies
Seemingly, our first task is to “regionalize” the national level I/O accounts down to the desired regional scale.
The most comprehensive approach is to execute an interregional input–output (IRIO) model. However, this approach really only serves as a platonic abstraction given the vast and detailed necessary exogenous inputs.
One possible approach is to implement parts or all of a multiregional input–output model. The regional tables of an MRIO employ a regional technical coefficients matrix, \(\mathbf{A^{r}}\). Under this paradigm, information regarding the region of origin is ignored. These transactions are denoted \(z^{\cdot r}_{ij}\), with the coefficients estimated as \(a^{r}_{ij} = \frac{z^{\cdot r}_{ij}}{x^{r}_{j}}\) where \(\mathbf{A^{r}} = \left[ a^{r}_{ij} \right]\). In practice, estimates are can be made using a product-mix approach. This technique essentially uses detailed level national technical coefficients and detailed level regional outputs to produce regional coefficients at a less detailed summary level (from weighted averages of the national detailed coefficients). This approach may be feasible given the available data, however, the necessary agglomeration of industry classes is not ideal. Furthermore, extending this approach into the realm of interregional tables, requires data on dollar flow of goods between regions, \(z^{rs}_{i}\), irrespective of the sector of destination in the receiving region. Again, however, this extension is beyond the scope of the available data.
Another possible approach is to employ the RAS technique, however, instead of the typical deployment for updating coefficients across time, we may update input coefficients across space. This approach requires \(3n\) sets of information for the location of interest. These are, total gross outputs, \(x^{r}_{j}\), total interindustry sales, by sector \(u_{i} = \sum^{n}_{j=1} z^{r}_{ij}\), and total interindustry purchases, by sector \(v_{i} = \sum^{n}_{i=1} z^{r}_{ij}\). But even this slimmed down approach is still beyond the scope of the data.
Yet another possible alternative approach might be to estimate regional supply percentages, \(p^{r}_{i}\). This approach requires knowledge of total regional output of each sector, \(x^{r}_{i}\), and the exports of the product of each sector from the region, \(e^{r}_{i}\), and the imports of all goods into the region, \(m^{r}_{i}\). With knowledge of regional supply percentages one can scale the national level direct/input requirements matrix to estimate a regional input requirements matrix, \(\mathbf{A^{rr}} = \mathbf{\hat{p}^{r}A}\). Clearly, this too is beyond the scope of the available data.
However in this manner, when trying to estimate \(a^{rr}_{ij}\) from national data, one can generalize the problem as a two step process of: (1) estimate a regional technical coefficient, \(a^{r}_{ij}\), from the corresponding national coefficient, \(a^{n}_{ij}\), and then (2) estimate the regional input coefficient, \(a^{rr}_{ij}\), as some proportion of the regional technical coefficient; that is, \(a^{rr}_{ij} = p^{r}_{ij}a^{r}_{ij}\) (where \(0 \leq p^{r}_{ij} \leq 1\)).
The procedure for estimating \(a^{rr}_{ij}\) from \(a^{n}_{ij}\) is therefore:
find \(\alpha^{r}_{ij} \geq 0\) such that \(a^{r}_{ij} = (\alpha^{r}_{ij}) (a^{n}_{ij})\), and
find \(\beta^{r}_{ij}\) (\(0 \leq \beta^{r}_{ij} \leq 1\)) such that \(a^{rr}_{ij} = (\beta^{r}_{ij}) (a^{r}_{ij})\)
If we assume regional production recipes are identical then \(\alpha^{r}_{ij} = 1\) and \(a^{r}_{ij} = a^{n}_{ij}\) for all \(i\) and \(j\). Or if we assume each regional purchaser buys the same proportion of inputs from within the region, then \(\beta^{r}_{ij} = p^{r}_{i}\) for all \(i\).
Given this structure, and assuming \(a^{r}_{ij} = a^{n}_{ij}\), we may propose various functional estimates of \(\beta^{r}_{ij}\) to derive \(a^{rr}_{ij}\) from \(a^{n}_{ij}\), i.e., Location Quotients. The \(LQ\)s only require knowledge of gross output of sector \(i\) in region \(r\) and total output of all sectors in the region, \(x^{r}_{i}\) and \(x^{r}\), along with these values at the national level: \(LQ^{r}_{i} = \left( \frac{x^{r}_{i} / x^{r}} {x^{n}_{i} / x^{n}} \right)\). Furthermore, many regional analysts will use employment rather than output as an alternative to the relevant measures of regional and national activity. This is a feasible option given the available data.
Another possible approach is to employ Regional Purchase Coefficients. The \(RPC\) is defined as the proportion of regional demand for that sector’s output that is fulfilled from regional production. Much in the same way that location quotients operate, we may designate \(\beta^{r}_{ij} = p^{r}_{i} = RPC^{r}_{i}\), where \(RPC^{r}_{i} = z^{rr}_{i} / (z^{rr}_{i} + z^{sr}_{i}) = 1/[1 + 1/(z^{rr}_{i}/z^{sr}_{i})]\). The relative shipment term \(z^{rr}_{i}/z^{sr}_{i}\), may itself be estimated from published sources such as Census of Transportation, unfortunately these are beyond the current scope and available data.
The final standard approach is to implement a Gravity Model formulation to estimate commodity flows between regions. The Gravity Model estimates the flow of goods between regions as a function of (1) some measure of the total output of \(i\) in \(r\), \(x^{r}_{i}\), (2) some measure of the total purchases of \(i\) in \(s\), \(x^{s}_{i}\), and (3) the distance (as a measure of “impedance”) between the two regions, \(d^{rs}\). The relatively simplified gravity model specification is given by: \(z^{rs}_{i} = \frac{x^{r \cdot}_{i}x^{\cdot s}_{i}} {x^{\cdot \cdot}_{i}} Q^{rs}_{i}\) where \(x^{r \cdot}_{i}\) is the “supply pool” of good \(i\) in region \(r\), \(x^{\cdot s}_{i}\) is the “demand pool” of good \(i\) in region \(s\), \(x^{\cdot \cdot}_{i}\) is the total production of commodity \(i\) in the system and \(Q^{rs}_{i}\) is a parameter to be estimated. In practice the interregional flows per industry \(i\) are given by \([\mathbf{{x'}}_{i}\mathbf{x}_{i}](\mathbf{x}_{i}\mathbf{i})^{-1} \odot \mathbf{Q}\), where \(\mathbf{Q} = (\mathbf{d \odot d})^{-1}\) and \(\mathbf{d}\) is the great-circle distance matrix derived from longitude and latitude vector pairs from region centroid. This approach may be feasible given the available data, too, (again with harsh simplifying assumptions).
Assumptions and Caveats
Note that strictly speaking, the \(x^{r}_{j}\) terms throughout are specified as gross outputs of each sector in the region. In practice, however, these outputs are approximated through the use of employment, payroll, or number of establishments data, and they are not technically equivalent substitutes. Furthermore, the “supply pool” and “demand pool” total outputs called for in in the simple Gravity specification are even greater abstractions. First we assume \(\alpha^{r}_{ij} = 1\) such that \(a^{r}_{ij} = a^{n}_{ij}\), then assuming regional industry activity is equivalent to regional gross output \(x^{r}_{j}\) and regional final demand is approximated by a single region model (i.e., \(\mathbf{f^{r}} = \mathbf{(I} - \mathbf{A^{rr})x^{r}}\)), which restricts by definition regional supply for good \(i\) to be equivalent to regional demand for good \(i\) (i.e., \(x^{r \cdot}_{i} = x^{\cdot r}_{i}\)). Note in fact, no national accounting data are needed for this gravity approach, one only needs regional industry activity and the impedance function, \(Q^{rs}_{i}\), which is given by \(e^{-d^{rs}}\) (or \(1/(d^{rs})^2\)) where \(d^{rs}\) is the great-circle distance between each region.
Note that given an industry-by-industry BEA total requirements matrix, \(\mathbf{L}\), we can derive an industry based technology, industry-by-industry direct requirements matrix, \(\mathbf{A}\) from the accounting identity \(\mathbf{L} = (\mathbf{I} - \mathbf{A})^{-1}\). And total industry output, \(\mathbf{x}\), is also exogenously obtainable from the BEA Make/Use tables, allowing the further derivation of the transactions matrix, \(\mathbf{Z}\), from the accounting identity \(\mathbf{A} = \mathbf{Z\hat{x}^{-1}}\).
############ Cross-Industry Location Quotient#This allows for differing modifiers within a given row of the national matrix; that is, it allows for differing cell-by-cell adjustments within An rather than uniform adjustments along each row.Toy_CLQ <-c()for (i in1:Region_Count){ Toy_CLQ[[i]] <- (Toy_LQ[,i] %*%t(1/Toy_LQ[,i])) %>%round(4)rownames(Toy_CLQ[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info") }names(Toy_CLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Cross-Industry Location Quotients")
Each Location Quotient is a different estimate of the regional supply proportion of good \(i\). Using a Location Quotient in conjunction with the national-level Direct Requirements matrix, one can estimate an intraregional Direct Requirements (input/ technical coefficients) matrix.
Reprise
However, to get back to the central question: How similar are the inputs of Chicago to the outputs of Madison? we may only need to look as far as the Location Quotients themselves. We have established that if \(LQ_{i}^{r} > 1\), then industry \(i\) is concentrated in region \(r\) relative to the nation. As such we can expect all demand of good \(i\) to be satisfied regionally and regional surplus will be exported to other less concentrated regions. Similarly, if \(LQ_{i}^{r} < 1\), then industry \(i\) is less concentrated in region \(r\) relative to the nation. Thus if we look for industries in Madison where \(LQ_{i}^{r} > 1\), and in Chicago where \(LQ_{i}^{r} < 1\), we may use this matching to address the central question.
For example, the Fiber and Grains simple LQ for Dane County (i.e., Madison) are both greater than 1. Similarly, only the Fiber simple LQ for Brown County (i.e., Green Bay), and only the Grain simple LQ for All Else is less than 1, making either only a partial partner. However, the Fiber and Grains simple LQ for Cook County (i.e., Chicago) are both less than 1, implying it is the strongest sink and best matched for using goods for which Madison is concentrated.
Code
Toy_export <- Toy_CLQ$DaneToy_import <- Toy_CLQ$Cookfor (i in1:Industry_Count){for (j in1:Industry_Count){if (Toy_export[i,j] <1){ Toy_export[i,j] <-0 }else{ Toy_export[i,j] <- Toy_export[i,j] } }}for (i in1:Industry_Count){for (j in1:Industry_Count){if (Toy_import[i,j] >1){ Toy_import[i,j] <-0 }else{ Toy_import[i,j] <- Toy_import[i,j] } }}print("Sub-industry CLQ Import/Export Compatibility - Madison as Exporter, Chicago as Importer")
[1] "Sub-industry CLQ Import/Export Compatibility - Madison as Exporter, Chicago as Importer"
Code
(Toy_export/Toy_import) %>%round(4)
Extract Fiber Grains Hogs Info
Extract 1.0000 NaN NaN NaN 50.0041
Fiber 5.8671 1.0000 NaN 5.4997 293.4885
Grains 12.3011 2.0967 1 11.5326 615.6513
Hogs 1.0667 NaN NaN 1.0000 53.3437
Info NaN NaN NaN NaN 1.0000
Similarly, for the other sub-industry location quotients, if sector \(i\) at the region-level is relatively smaller than sector \(j\) at the region-level (i.e, \(CLQ^{r}_{ij} <1\)), then some of \(j\)’s need for \(i\) in \(r\) will need to be imported.
One can divide two sub-industry location quotients, an export region by an import region, and derive the following relationships:
NaN -> Exp < 1 & Imp > 1
Inf -> Exp >1 & Imp > 1
0.0 -> Exp < 1 & Imp < 1
‘##’ -> Exp > 1 & Imp < 1
In this example, Chicago will need to import: Extract, Fiber, Grains, and Hog commodities for the Info sector, Fiber for the Extract sector, Fiber and Grains for the Hogs sector, Nothing for the Grains sector, Grains for the Fiber sector, and finally Fiber, Grains, and Hogs for the Extract sector. While Madison will export its excess in the exact same categories!
MAPE
Code
#MAPElocal({ MAT <-bdiag(Toy_Dir_LQ) n <-4 Toy_MAPE_mat <<-c()for (i inseq(from =1, to =length(Toy_LQ), by =5 )){for (j inseq(from =1, to =length(Toy_LQ), by =5 )){ Toy_MAPE_mat <<-c(Toy_MAPE_mat, mean((abs(c(MAT[j:j+n,j:j+n]) -c(MAT[i:i+n,i:i+n]))/c(MAT[i:i+n,i:i+n]))*100)) } } Toy_MAPE_mat <<-matrix(Toy_MAPE_mat, nrow = n, byrow =FALSE)}) colnames(Toy_MAPE_mat) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_MAPE_mat) <-c("AllElse", "Brown", "Cook", "Dane")Toy_MAPE_mat
AllElse Brown Cook Dane
AllElse 0.00000 151.48515 63.58423 408
Brown 60.23622 0.00000 85.51971 102
Cook 174.60630 590.59406 0.00000 1295
Dane 80.31496 50.49505 92.83154 0
Alternatively, we can ask the question: Are the inter-industry flows of Chicago similar to the inter-industry flows of Madison? Or in other words do they have congruent intraregional economic activity? To answer this we can compare the estimated regional level Direct Requirements (technical coefficients) through the application of Mean Absolute Percent Error. Let \(\text{MAPE} = \left(\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{|a_{ij} - \tilde{a}_{ij}|}{a_{ij}}\right) \times 100\).
For example, using the regional Direct Requirements matrix derived from the simple location quotient procedure, we can see that among the possible matches, the inter-industry flows of Madison are most similar to Green Bay, with Chicago and Madison displaying the most incongruent intraregional economic activity. In this case more incongruity points toward more gains from trade.
[1] "Gravity Model Formulations - Interreginal Flows by Industry"
Code
Toy_Gravity_mat
$Extract
AllElse Brown Cook Dane
AllElse 377.073906 7.5843547 67.096967 34.4544477
Brown 7.584355 0.2549020 1.459629 0.8090985
Cook 67.096967 1.4596292 15.082956 6.2312113
Dane 34.454448 0.8090985 6.231211 3.7707391
$Fiber
AllElse Brown Cook Dane
AllElse 289.85507 19.059813 19.341430 58.266965
Brown 19.05981 2.094203 1.375542 4.473259
Cook 19.34143 1.375542 1.630435 3.951672
Dane 58.26697 4.473259 3.951672 14.028986
$Grains
AllElse Brown Cook Dane
AllElse 159.857904 49.466697 9.4817668 59.887583
Brown 49.466697 25.577265 3.1733325 21.636127
Cook 9.481767 3.173332 0.7104796 3.610295
Dane 59.887583 21.636127 3.6102952 26.872114
$Hogs
AllElse Brown Cook Dane
AllElse 77.66990 60.08576 25.913761 14.193894
Brown 60.08576 77.66990 21.681871 12.819891
Cook 25.91376 21.68187 10.922330 4.813157
Dane 14.19389 12.81989 4.813157 3.106796
$Info
AllElse Brown Cook Dane
AllElse 19.4552529 1.50506649 69.237804 0.71107545
Brown 1.5050665 0.19455253 5.793081 0.06422416
Cook 69.2378044 5.79308097 311.284047 2.57201127
Dane 0.7110755 0.06422416 2.572011 0.03112840
Similarly, the Gravity approach represents interregional interindustry transaction estimates. In this example, the “All Else” category dominates the analysis, and rightly so given that it is an aggregate residual category. Furthermore the “impedance” (distance) measure is undefinable for an aggregate “All Else” category. For illustration an arbitrary proxy location is used.
Ignoring “AllElse”, we may employ the Gravity approach to answer the question: Does Madison have more interregional interindustry transaction with Chicago or Green Bay?
In this example, Chicago is a stronger trading partner for Extraction and Info, whereas Green Bay is a stronger trading partner for Fiber, Grains, and Hogs.
Taking one step further and comparing the aggregate magnitudes of interregional interindustry transactions reveals that Green Bay has nearly twice as much trading value than Chicago does with Madison. Note, however, that when we employ an exponential distance decay impediance function, such that intraregional interindustry transactions are definable, then Madison will actually have more “pull” with itself than Greenbay for Fiber and Grains. In this case, we would estimate Madison to have interregional interindustry transactions with Chicago for Extraction and Info commodities, and only Hogs with Green Bay. And as for Grains and Fiber, interindustry transactions are mostly dominated by intraregional exchange.
A nested and hierarchical approach
We have shown that it is feasible to construct regional-level technical coefficients matrix from a national-level Leontief matrix and regional industry employment. This then brings about some possible extensions. One is the possibility of performing a multistage regionalization, whereby first or second-level administrative divisions (e.g., states) are used to establish a coarse regionalization. Then either place specific technical coefficients can be derived using the administrative division technical coefficients or the administrative division regionalization can be used as the basis for constructing interregional compatibility.
To derive an administrative technical coefficients matrix, we aggregate the place specific regional industry employment to the administrative level. Let \(\mathbf{X^{r}}\) be the industry-by-region measures economic activity. If all \(j\) regions make up the desired administrative division, then the aggregate administrative measure of economic activity is given by \(\mathbf{x^{R}} = \mathbf{X^{r}i}\), i.e., the row sums of the industry-by-region economic activity matrix. The administrative division Direct Requirements (technical coefficients) matrix can then be derived using a Location Quotient approach.
The simple administrative division location quotient is given by:
\[LQ^{R}_{i} = \left( \frac{x^{R}_{i} / x^{R}} {x^{n}_{i} / x^{n}} \right) = \left( \frac{(\mathbf{X^{r}i})_{i} / \mathbf{iX^{r}i}} {x^{n}_{i} / x^{n}} \right) \] And the administrative Direct Requirements (technical coefficients) matrix is generated by: \[
a^{R}_{ij} = \left\{
\begin{align*}
(LQ^{R}_{i})a^{n}_{ij} \space\space \text{if} \space\space LQ^{R}_{i} < 1 \\
a^{n}_{ij} \space\space \text{if} \space\space LQ^{R}_{i} \geq 1
\end{align*}
\right\}
\] From here we may iterate and derive regional technical coefficients from the administrative technical coefficients matrix using a location quotient approach.
The second stage simple regional division location quotient is given by:
And the administrative Direct Requirements (technical coefficients) matrix is generated by: \[
\widetilde{a}^{rr}_{ij} = \left\{
\begin{align*}
(\widetilde{LQ}^{r}_{i})a^{R}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} < 1 \\
a^{R}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} \geq 1
\end{align*}
\right\}
\]
We can then examine what, if any, functional differences exist between a deriving regional coefficients directly from the national level or from a two stage, national to administrative division to regional approach. Note, under the simple LQ approach, whenever \(LQ \geq 1 \vdash a^{n}_{ij} = a^{rr}_{ij} = a^{R}_{ij} = \widetilde{a}^{rr}_{ij}\).
Suppose \(R = \{\text{Brown, Cook, Dane}\}\), such that administrative-level Direct Requirements matrix, \(a^{R}_{ij}\), using the simple LQ approach is is given by:
Brown Cook Dane
Extract 0.2765 1.2829 1.3145
Fiber 0.7755 0.4127 2.4809
Grains 1.5818 0.1590 2.0041
Hogs 2.2012 0.4979 0.5442
Info 0.0837 2.0205 0.0414
Remember that under the original national paradigm Madison was an exporter (i.e., \(LQ^{r}_{i} > 1\)) for only Fiber and Grains and Chicago was the only region that was an importer (i.e., \(LQ^{r}_{i} < 1\)) of both Fiber and Grains. However, under the nested hierarchical paradigm, Madison is an exporter of Fiber and Grains and Extraction (i.e., \(\widetilde{LQ}^{r}_{i} \geq 1\)). Within the administrative division cluster regime, there is now no longer a region that is an importer of all three of Madison’s exports. Chicago is still an importer of Fiber and Grains, but not Extraction. Whereas Green Bay is an importer of Extract and Fiber, but not Grains.
Project Narrative Models
In addition to the methods established in the literature deployed to answer the question: How similar are the inputs of Chicago to the outputs of Madison? we may also adapt the Project Narrative methods, too (within the data restrictions).
Let the “Absolute Total Use Coefficient” be given by \(\mathbf{\alpha} = (\widehat{\mathbf{Zi}})^{-1}\mathbf{Z}\mathbf{X}^{r}\) and the “Relative Total Use Coefficient” be given by \({\mathbf{\rho}} = (\widehat{\mathbf{Zi}})^{-1}\mathbf{Z}\mathbf{X}^{r}(\widehat{\mathbf{iX}^{r}})^{-1}\).
We may also specify “Absolute Regional Typology Coefficients” and “Relative Regional Typology Coefficients” for each commodity \(i\) as \(\mathbf{\phi}_{i} =\hat{\alpha}_{i}\mathbf{i}\mathbf{i}'[\hat{\alpha}_{i}\mathbf{i}\mathbf{i}'+\hat{\alpha}'_{i}\mathbf{i}\mathbf{i}']^{-1}\) and \(\mathbf{\varphi}_{i} =\hat{\rho}_{i}\mathbf{i}\mathbf{i}'[\hat{\rho}_{i}\mathbf{i}\mathbf{i}'+\hat{\rho}'_{i}\mathbf{i}\mathbf{i}']^{-1}\).
In addition, we may also specify “Gravity of Absolute Total Use Coefficients” and “Gravity of Relative Total Use Coefficients” for each commodity \(i\) as \(\theta_{i} = [\mathbf{{\alpha'}}_{i}\mathbf{\alpha}_{i}](\mathbf{\alpha}_{i}\mathbf{i})^{-1} \odot \mathbf{Q}\) and \(\vartheta_{i} = [\mathbf{{\rho'}}_{i}\mathbf{\rho}_{i}](\mathbf{\rho}_{i}\mathbf{i})^{-1} \odot \mathbf{Q}\), where \(\mathbf{Q} = e^{-\mathbf{d}}\) the exponential distance decay function.
$Extract
AllElse Brown Cook Dane
AllElse 69.5921 3.2241 103.5431 4.9242
Brown 3.2241 0.2496 5.1882 0.2663
Cook 103.5431 5.1882 194.6210 7.4464
Dane 4.9242 0.2663 7.4464 0.4173
$Fiber
AllElse Brown Cook Dane
AllElse 52.0338 9.0594 81.5136 9.2456
Brown 9.0594 2.6356 15.3493 1.8794
Cook 81.5136 15.3493 161.3176 14.7208
Dane 9.2456 1.8794 14.7208 1.9677
$Grains
AllElse Brown Cook Dane
AllElse 52.6976 9.3430 82.8958 8.3525
Brown 9.3430 2.7679 15.8956 1.7289
Cook 82.8958 15.8956 164.7330 13.3539
Dane 8.3525 1.7289 13.3539 1.5856
$Hogs
AllElse Brown Cook Dane
AllElse 51.4767 6.4403 87.2919 6.4117
Brown 6.4403 1.3464 11.8118 0.9366
Cook 87.2919 11.8118 187.0012 11.0506
Dane 6.4117 0.9366 11.0506 0.9565
$Info
AllElse Brown Cook Dane
AllElse 30.1971 5.7936 75.5399 2.3221
Brown 5.7936 1.8574 15.6751 0.5201
Cook 75.5399 15.6751 238.7235 5.9039
Dane 2.3221 0.5201 5.9039 0.2139
Code
Toy_Gravity_mat_Rho <-as.list(c())for (i in1:nrow(Toy_Rho_mat)){ Toy_Gravity_mat_Rho[[i]] <- ((Toy_Rho_mat[i,] %*%t(Toy_Rho_mat[i,])) /sum(Toy_Rho_mat[i,]) * Toy_Q ) %>%round(4)rownames(Toy_Gravity_mat_Rho[[i]]) <-c("AllElse", "Brown", "Cook", "Dane")}names(Toy_Gravity_mat_Rho) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Gravity of Relative Total Use Coefficients")
[1] "Gravity of Relative Total Use Coefficients"
Code
Toy_Gravity_mat_Rho
$Extract
AllElse Brown Cook Dane
AllElse 0.0232 0.0043 0.0827 0.0080
Brown 0.0043 0.0013 0.0165 0.0017
Cook 0.0827 0.0165 0.3728 0.0292
Dane 0.0080 0.0017 0.0292 0.0034
$Fiber
AllElse Brown Cook Dane
AllElse 0.0156 0.0108 0.0586 0.0136
Brown 0.0108 0.0125 0.0439 0.0110
Cook 0.0586 0.0439 0.2784 0.0520
Dane 0.0136 0.0110 0.0520 0.0143
$Grains
AllElse Brown Cook Dane
AllElse 0.0159 0.0112 0.0600 0.0124
Brown 0.0112 0.0132 0.0458 0.0102
Cook 0.0600 0.0458 0.2862 0.0475
Dane 0.0124 0.0102 0.0475 0.0116
$Hogs
AllElse Brown Cook Dane
AllElse 0.0160 0.0080 0.0650 0.0098
Brown 0.0080 0.0066 0.0350 0.0057
Cook 0.0650 0.0350 0.3340 0.0404
Dane 0.0098 0.0057 0.0404 0.0072
$Info
AllElse Brown Cook Dane
AllElse 0.0091 0.0070 0.0547 0.0034
Brown 0.0070 0.0089 0.0452 0.0031
Cook 0.0547 0.0452 0.4152 0.0210
Dane 0.0034 0.0031 0.0210 0.0016
The Absolute Use Coefficient portrays national-level intermediate transaction proportions scaled by regional-level economic activity in the form of industry employment. In this example the Info sector is by far the largest purchaser of intermediate goods for every industry including itself. And Cook County happens to have the largest Info workforce, making it dominate the Absolute Use Coefficient table.
This is only the case, however, when we use the national-level Transactions matrix to capture the interconnectedness of industries. If instead we employ the national-level technical coefficients matrix, which specifies the amount of industry \(i\)’s commodity used to produce of one unit of industry \(j\) ’s commodity, the Absolute Use Coefficient table is dominated by the AllElse category with the overall largest workforce. In this way we may take the national-level technical coefficients matrix, \(\mathbf{A^{n}}\), as a first approximation to regional technical coefficients, \(\mathbf{\tilde{A}^r}\), and thus the numerator of the project narrative Absolute and Relative Use Coefficients becomes \(\mathbf{\tilde{A}^{r}X^{r}}\) or \(\mathbf{\tilde{Z}^{r}}\) a first approximation of the regional Transactions matrix.
The Relative Use Coefficient portrays national-level intermediate transaction proportions scaled by regional-level industry employment proportions.
The Relative and Absolute “Regional Typology Coefficients”, \(\phi\), portray a region-by-region Use Coefficient proportion for each commodity type. In other words, the row maximum assigns the best potential relationship for each commodity of that region. We can use the \(\phi\) to answer the question: For which commodities is Madison a sink or a source? And which regions are complementary to those commodity balances? Looking across each row, whenever the coefficient is greater than 0.5 that region is a source relative to the column region and vice versa. If the coefficient is less than 0.5, the row region is a relative sink to the column region, for that commodity. In this example, the Relative \(\phi\) shows only Green Bay as a potential relative source for Madison and only for the Extract, Fiber, and Hogs commodities.
The Gravity of the Absolute and Relative “Total Use Coefficients”, uses an alternate procedure to estimate the volume and sensitivity of the coefficient proportions. In this example, Madison (and Green Bay for that matter) is dominated by the “pull” of Chicago.
Location specific metrics of industry input needs
We specify a vector of location input needs as the product of the national input coefficient matrix and the location specific total output vector, \(\mathbf{Ax^{r}}\).
Similarly, let the share or relative input needs vector be given as \(\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1}\).
In addition, we define the actual or import input needs vector as, \(\max \{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\}\).
(Similarly, let the actual exports vector \(\mathbf{\tilde{x}^{r}}\) be defined as \(\vert \min \{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} \vert\))
And let the relative import input needs vector be given by, \(\max \{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\}\).
Note: one may concatenate the input needs vectors across each location to construct an industry-by-location matrix of input needs for each specification above. Specifically denoted by:
To elucidate the industrial compatibility across locations, we construct a similarity index. In general, it is defined on the metric space \((\mathbf{X}, D)\) where \(\mathbf{X}\) is a set of vectors \(\{\mathbf{x}_{i} \in \mathbb{R} : i \in \{1, \dots, n \}\}\) and \(D:\mathbf{X} \times \mathbf{X} \rightarrow \mathbb{R}\) is a distance function.
Specifying \(D\) as a Euclidean or 2-norm distance, we define the industrial compatibility distance or simple similarity index between two locations \(r\) and \(s\) as:
In addition, let the relative similarity index between two locations \(r\) and \(s\) be given by:
\[\Vert \mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2}\] The import similarity index between two locations \(r\) and \(s\) be given by:
Let the relative import similarity index between two locations \(r\) and \(s\) be given by:
\[\Vert \max\{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2}\] (I would argue that when a similarity index employs an import specification the “exporting” location should be specified instead as \(\mathbf{\tilde{x}^{s}} = \vert \min \{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} \vert\) )
One may further weight a similarity index by a geographical impedance function, \(\mathbf{Q}\). In general terms, let this metric be defined by \(\mathfrak{D}\left(\mathbf{Ax^{r}}, \mathbf{x^{s}}, \mathbf{Q}\right)\). In practice, the geographical impedance between locations \(r\) and \(s\) is given by, \(Q^{rs} = e^{-d^{rs}}\) or \(Q^{rs} = 1/(d^{rs})^2\) where \(d^{rs}\) is the great-circle distance between the two locations. Specifically,
\[d^{rs} = R \cdot \arccos(\sin(\phi_{r})\sin(\phi_{s}) + \cos(\phi_{r})\cos(\phi_{s})\cos(\vert \lambda_{r} - \lambda_{s} \vert)) \] Where \(\lambda_{r}, \phi_{r}\) and \(\lambda_{s}, \phi_{r}\) are the longitude and latitude of locations \(r\) and \(s\), and \(R\) is the radius of the reference sphere, which in this case is the mean earth radius from the WGS84 ellipsoid.
As such the spatial simple similarity index is given by
Let the spatial relative similarity index between two locations \(r\) and \(s\) be given by:
\[\sqrt{(\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})^{'}(\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})} / Q^{rs}\] Let the spatial import similarity index between two locations \(r\) and \(s\) be given by:
Let the spatial relative import similarity index between two locations \(r\) and \(s\) be given by:
\[\sqrt{([\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}] - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})^{'}([\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}] - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})} / Q^{rs}\] For any given location-by-location Similarity Index matrix specification, the minimum “distance” by row is said to be the most compatible trading partner.
Alternative specifications of the spatial distance are possible. One possibility is to discretize the impedance matrix as a binary proximity operator which is one when two locations share a boundary (or vertex) and zero otherwise.
Empirical Example
Suppose we still only have national level Input-Output tables (consisting of Make/Use tables, industry-by-industry, industry-by-commodity, and commodity-by-commodity total requirements tables, and commodity-by-industry direct requirements tables) and location specific industry activity (in the form of employment, payroll, and number of establishments). How might we implement the above methods?
As before, we can derive a national-level, industry-by-industry direct requirements matrix, \(\mathbf{A^{n}}\), from the national-level, industry-by-industry total requirements matrix, \(\mathbf{L^{n}}\), using \(\mathbf{A^{n}} = \mathbf{I} - (\mathbf{L^{n}})^{-1}\). Furthermore, we can construct the location specific total output vector, \(\mathbf{x^{r}}\) from a payroll and labor share quotient. As a first pass assume unit labor share, such that location specific industry payroll is equivalent to location specific total output.
Suppose these tables are given by:
“National Total Requirements Table”: \(\mathbf{L}\)
AllElse Brown Cook Dane
AllElse 0.4965 0.7344 0.7270 0.7420
Brown 0.4529 0.4854 0.6351 0.8035
Cook 0.7274 0.9116 0.8108 0.8840
Dane 0.2530 0.5554 0.3146 0.7064
For any given location-by-location Similarity Index matrix specification, the minimum “distance” by row is said to be the most compatible trading partner.
“The Assumption of Spatially Invariant Production Functions: The Problem of Heterogeneity All classic nonsurvey methods assume that national total use coefficients - the production functions in full - hold at every conceivable regional dimension. Various suggestions have been made as to why this is unlikely to be observed in practice. Smith and Morrison (1974 p.22) identify anything from differences in productive efficiency to climatic variation as contributing to the phenomenon. However, it is argued here that, apart from variation due to stochastic observation error, differences in total use can be attributed to the violation of one fundamental assumption of the Leontief model, and that is that each defined sector is homogeneous. Homogeneity implies that the set of coefficients which describes each defined commodity’s production is wholly unique. Hence there can be no variation in the means of production for any defined commodity, and if there is, for whatever reason, the commodity would require separate definition in order for the homogeneity assumption to hold. The reason for the assumption is quite straightforward. If each production function describes a diverse set of commodities, the pattem of linkage is hidden within average relationships (see Table 5.1 below), and hence the precision with which one can calculate, for example, multipliers is eroded. It is argued here that differences in regional and national total use derive from the fact that the production functions of the national model are heterogeneously defined. The argument is as follows. At the broadest levels of commodity definition i.e. agriculture, manufacture, regional production functions are merged together, and as such cannot be identified in the national model. As the definitions of the national model are increased, the production functions of regionally specialised commodities emerge i.e. dair>’ farming, sea fishing, fish farming. At some much higher level of disaggregation - where homogeneity is approached - the definitions o f the national model are so fine that it is possible to identify the individual factories and firms operating within the nation. At this point, the national model becomes one of infinite-regions because it is possible to extract the input-output table for any conceivable spatial subset of the nation. Not only the problem of estimating regional production functions evaporated, the trade estimation problem has ceased to exist.” - (Brand, 1998)
FIN
Source Code
---title: "Toy Story"author: "Austin Sandler"date: "3/22/2022"format: html: self-contained: false page-layout: full code-fold: true code-tools: true code_download: yes latex_engine: pdflatex---```{r preamble, include = FALSE}# When you click the **Render** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document.#For nonscientific notationoptions(scipen=999)root_dir <- rprojroot::find_rstudio_root_file()``````{r packages and libraries, include = FALSE}# List packages needed for this exercisepackages <-c("dlm","geosphere","gtools","knitr","readxl","tidyr","tools")# Load packagesinvisible(lapply(packages, library, character.only =TRUE))``````{r version control, include = FALSE}# Save snapshot of session versions (must be run outside of render)writeLines(capture.output(sessionInfo()), paste0("sessionInfo_", basename(file_path_sans_ext(script_file())), ".txt"))```This R Markdown document is a testing ground for Toy model specifications with numerical examples. ::: {.panel-tabset}### Project Narrative```{r Richard Trial, error=FALSE}# Exercise to replicate table and analysis from project narrativeUse_SUT_Framework_2007_2012_DET <-read_excel(file.path(root_dir, "data/AllTablesSUP/Use_SUT_Framework_2007_2012_DET.xlsx"), sheet ="2012", col_names =FALSE, skip =2) %>%suppressMessages()ProjNarrIO <-c(Use_SUT_Framework_2007_2012_DET[[6,4]], Use_SUT_Framework_2007_2012_DET[[6,196]], Use_SUT_Framework_2007_2012_DET[[6,211]], Use_SUT_Framework_2007_2012_DET[[6,286]], Use_SUT_Framework_2007_2012_DET[[6,291]], Use_SUT_Framework_2007_2012_DET[[197,4]], Use_SUT_Framework_2007_2012_DET[[197,196]], Use_SUT_Framework_2007_2012_DET[[197,211]], Use_SUT_Framework_2007_2012_DET[[197,286]], Use_SUT_Framework_2007_2012_DET[[197,291]], Use_SUT_Framework_2007_2012_DET[[212,4]], Use_SUT_Framework_2007_2012_DET[[212,196]], Use_SUT_Framework_2007_2012_DET[[212,211]], Use_SUT_Framework_2007_2012_DET[[212,286]], Use_SUT_Framework_2007_2012_DET[[212,291]], Use_SUT_Framework_2007_2012_DET[[287,4]], Use_SUT_Framework_2007_2012_DET[[287,196]], Use_SUT_Framework_2007_2012_DET[[287,211]], Use_SUT_Framework_2007_2012_DET[[287,286]], Use_SUT_Framework_2007_2012_DET[[287,291]], Use_SUT_Framework_2007_2012_DET[[292,4]], Use_SUT_Framework_2007_2012_DET[[292,196]], Use_SUT_Framework_2007_2012_DET[[292,211]], Use_SUT_Framework_2007_2012_DET[[292,286]], Use_SUT_Framework_2007_2012_DET[[292,291]]) ProjNarrIO[is.na(ProjNarrIO)] <-0codenames <-c(Use_SUT_Framework_2007_2012_DET[[6,1]], Use_SUT_Framework_2007_2012_DET[[197,1]], Use_SUT_Framework_2007_2012_DET[[212,1]], Use_SUT_Framework_2007_2012_DET[[287,1]], Use_SUT_Framework_2007_2012_DET[[292,1]])descriptionnames <-c(Use_SUT_Framework_2007_2012_DET[[6,2]], Use_SUT_Framework_2007_2012_DET[[197,2]], Use_SUT_Framework_2007_2012_DET[[212,2]], Use_SUT_Framework_2007_2012_DET[[287,2]], Use_SUT_Framework_2007_2012_DET[[292,2]])ProjNarrIO %<>%as.numeric() %>%matrix(nrow =5, ncol =5, byrow =TRUE, dimnames =list(descriptionnames, codenames))ProjNarrRegSales <-data.frame(A =c(0, 500, 0, 1000, 500), B =c(0, 100, 200, 500, 200) ) %>%t() %>%as.matrix()i_mat <-rep(c(1), each=ncol(ProjNarrIO)) %>%as.matrix()alpha <-sweep(ProjNarrIO[1, ] %*%t(ProjNarrRegSales), 1, ProjNarrIO[1, ] %*% i_mat, "/")rho <-sweep(alpha, 2, ProjNarrRegSales %*% i_mat, "/")colnames(ProjNarrRegSales) <- codenamessample_RUC<-cbind(ProjNarrRegSales, t(alpha), t(rho))colnames(sample_RUC)[6] <-"alpha"colnames(sample_RUC)[7] <-"rho"```**Generate Economic Catchment Area (ECA) codes** Since 1880, the Census Bureau has applied formal rules to categorize geographic areas as urban for the purposes of statistical reporting. The most recent version of this taxonomic schema classifies areas as urban areas (UA, densely settled with a population greater than 50,000) and urban clusters (UC, densely settled with populations greater than 2,500 and less than 50,000). <u> For as long as the Census Bureau has been defining urbanicity, rural has **ALWAYS** been the residual category</u>: from the 2010 Census Urban-Rural Classification Criterion, “‘rural’ encompasses all population, housing, and territory not included within an urban area.”The construction of urban places follows a complex set of rules accounting for the attributes of urban land use and residential development patterns, but the intuition is fairly straight-forward. First, identify contiguous Census blocks (sometimes, tracts) above a minimum population density thereby defining a potential urban core. Then, continue to agglomerate adjacent-*ish* (again, there are complex rules allowing for hops, jumps, exclaves, enclaves, indentations, etc.) blocks/tracts meeting other density and population thresholds until the boundaries of the place are established. The population of this agglomeration then determines whether it is a UA, UC, or rural. The RUCA codes are a further refinement constructed by ERS to account for 1) the size of urban agglomerations and 2) resident commuting patterns, while applying the urban-rural typology employed by OMB. Thus, every UA is termed a Metropolitan Area Core. UCs are delimited by population into Micropolitan Area Core (10,000 to 49,999) and Small Town Core (2,500 to 9,999). Remaining blocks/tracts are then classified according to whether a high (more than 30%) or low (10% to 30%) percentage of residents commute to an urban core, as well as the type of core of the primary flow. Secondary codes are also assigned based upon the destination type of the secondary commuter flow.We propose to use core units as the basis for building Rural-Urban Economic Catchment Areas codes from InfoGroup and the Census of Agriculture in very much the same vein as ERS constructs the RUCA codes, but with a notable design distinction: we will classify tracts from low density to high density, rather than defining rural as a catch-all, non-urban residual. Specifically:1) We will define a *primary economic catchment area* (ECA) centered on each small UC core (RUCA 7) that includes associated commuting Census tracts (RUCA 8 and 9) and nearby rural tracts that include large employers in InfoGroup. The latter condition accommodates economic activities that may employ relatively large numbers of residents but are purposefully located far from residential areas, e.g., paper milling and government military installations. We will explore the sensitivity of our coding to various definitions of nearby, e.g. 5, 10, and 20 mile radii.2) We will use the Census of Agriculture to identify the *primary* crop and livestock production activities and the County Business Patterns to identify the presence of forestry and mining activity in rural Zip codes surrounding the primary ECA (the smallest published unit for both programs is the Zip code). We anticipate that for the overwhelming majority of the country, using the county as the basic unit will suffice for identifying agricultural, forestry, and extraction, but western counties are sufficiently large to benefit from a more disaggregated reporting unit.3) We will then use the most recent Use Tables published by BEA as part of its Input-Output Accounts program (2018 for 71 two- and three-digit NAICS industries and 2012 for 415 fourdigit NAICS industries) to calculate absolute and relative *rural use coefficients* for each *primary* ECA with respect to the agricultural, forestry, and extraction (AFE) activity in its vicinity.<palign="center">$\alpha^{r}_{i} = \sum^{k} s^{k}_{i} \sum^{m} s^{m}_{i} \rho^{k}_{m}$ and $\rho^{r}_{i} = \sum^{k} s^{k}_{i} \left( \frac{\sum^{m} s^{m}_{i} \rho^{k}_{m}} {\sum^{m} \rho^{k}_{m} } \right)$ </p> where $s^{m}_{i}$ is the share of employment/sales for industry $m$ in business catchment area $i$ calculated from InfoGroup; $\rho^{k}_{m}$ is the use parameter for AFE activity $k$ in industry $m$ from the BEA Use Tables; $s^{k}_{i}$ is the share of agricultural or forestry activity $k$ of total AFE activities; and $M$ and $K$ are the set of non-AFE and AFE industries, respectively. To illustrate, a simplified five-sector economy based on the BEA 2012 Use Table might be:`r kable(ProjNarrIO, caption = "Use Factor")`4) We will then assign remaining rural tracts to *primary* ECAs using a gravity model based on the use coefficients defined above. Gravity models have been applied as far back as the 1850s (Carey, 1858) to describe social phenomena and continue to be used across disciplines (Erlander, 1980; Haynes and Fotheringham, 1984; Sen and Smith, 1995; Wilson 2000), including contemporary analyses of international trade (USITC, 2020). To illustrate, suppose that there were two *primary* ECAs with the following industry sales (in $1000s) and calculated rural use coefficients:`r kable(sample_RUC, caption = "Industry Sales", digits = c(0, 0, 0, 0, 0, 1, 3))` To simplify, suppose Area A were at 0 on the unit interval and Area B at 1 with all intervening space representing rural tracts. A gravity approach based on the use coefficient would assign the intervening rural tracts on (0,`r round(alpha[1]/(alpha[1]+alpha[2]), 3)`) to Area A and tracts (`r round(alpha[1]/(alpha[1]+alpha[2]), 3)`,1) to Area B. In contrast, a gravity approach using the adjusted use coefficient would assign the intervening rural tracts on (0,`r round(rho[1]/(rho[1]+rho[2]), 3)`) to Area A and tracts (`r round(rho[1]/(rho[1]+rho[2]), 3)`,1) to Area B. More generally, for either use coefficient, we will treat the United States as a plane over the unit simplex and apply a gravity approach to assign rural tracts to any *primary* ECA. A notable feature of this approach is that *primary* ECAs may exhibit sufficiently low gravity that they are effectively economic islands. For example, a small-town anchored by car manufacturing plant surrounded by cropland. Of course, that small-town may be associated with larger, discontiguous economic areas through the manufacturing supply chain, which will reveal itself at higher economic geographies described subsequently, but <u>**creating measures to make meaningful economic distinctions between population centers currently lumped as “small towns” is a valuable project output that will benefit policy-makers**</u>.5) We will then move to assigning the remaining tracts, beginning with Micropolitan low commuting (RUCA 6) tracts. For each tract, we will calculate the following *total use coefficients*:<palign="center">$\rho_{ij} = \sum^{M} s^{m}_{i} \left( \frac{\sum^{m} s^{n}_{j} \rho^{n}_{i}} {\rho^{n}_{i} } \right)$,$\rho_{ji} = \sum^{M} s^{m}_{j} \left( \frac{\sum^{m} s^{n}_{i} \rho^{n}_{j}} {\rho^{n}_{j} } \right)$,$\rho_{ik} = \sum^{M} s^{m}_{i} \left( \frac{\sum^{m} s^{n}_{k} \rho^{n}_{i}} {\rho^{n}_{i} } \right)$,$\rho_{ki} = \sum^{M} s^{m}_{k} \left( \frac{\sum^{m} s^{n}_{i} \rho^{n}_{k}} {\rho^{n}_{k} } \right)$</p> where $i$ denotes the tract, $j$ denotes the adjacent primary ECA, and $k$ denotes the Micropolitan core (RUCA 4) [NB: There may be multiple adjacent *primary* ECA, so that for the general case of Z adjacent *primary* ECA, 2Z+2 total use coefficients are calculated]. These coefficients summarize the economic flow to and from the tract. Thus, we will assign the tract to either the Micropolitan Core or the adjacent *primary* ECA according to the magnitude of the largest total use coefficients, or leave the tract as an economic island if flows are insufficiently strong, i.e., gravity is sufficiently weak.6) After assigning all Micropolitan and Metropolitan commuting tracts (RUCA 2, 3, and 5) following a similar approach to the one above, we will define *secondary* ECAs. To do so, we will again treat the United States as a plane with nodes defined by Micropolitan cores and *primary* ECAs attracted to those nodes by *total use coefficients*. *Primary* ECAs are assigned to *secondary* ECAs centered on Micropolitan cores when gravity is sufficiently strong. Next, contiguous groups of yet-unassigned *primary* ECAs with sufficiently strong gravity will be aggregated into *secondary* ECAs independent of a Micropolitan core; those without are classified as isolated.7) Step 6 is repeated with Metropolitan cores to define *tertiary* ECAs. The approach described above is based on the three categories of urbanized places from the RUCA system, but it generalizes broadly to an N-level hierarchy of nodes where the nth ECA is defined by the parameter pair ($\rho_{n}$, $\theta_{n}$), the upper limit on the population of the core and the gravity threshold that defines economic islands, respectively. Refinements based on the former would permit the definition of small and large metropolitan cores, or even Megapolitan cores. The latter parameter can be varied according to moments of the use coefficient distribution (the lowest quintile or decile) or absolute values with a priori relevance, e.g., a parameter from a specified decay function in a formal model. We will implement the gravity model in Python, as it is open-source and with spatial modeling packages, e.g., [SpInt](https://github.com/pysal/spint)[(Oshan 2016)](https://doi.org/10.18335/region.v3i2.175) and [GME](https://github.com/USITC-Gravity-Group/GME)[(USITC 2019)](http://dx.doi.org/10.2139/ssrn.3832208), that are already being used to model international trade, migration flows, and transportation networks.### Toy IO The static Input-Output analysis, à la Wassily Leontief, poses the question: What level of output should each industry produce, such that it will satisfy the total demand of that industries output across all industries in an economy?Simplifying assumptions:1. Each industry produces only one homogeneous commodity.1. Each industry uses a fixed input ratio.1. Each industry exhibits constant returns to scale.Suppose the **N**ational level economy is composed of 2 industries (e.g., **E**xtraction and **F**arming). Let the interindustry sales be given by,\begin{equation}\underset{(i \times i)}{\mathbf{Z}} = \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \\ \end{bmatrix} = \begin{bmatrix} 150 & 500 \\ 200 & 100 \\ \end{bmatrix}\end{equation}Given the vector of total output $\mathbf{x}' = \begin{bmatrix} 1000 & 2000 \end{bmatrix}$, the Direct Requirements matrix (input requirements), constructed as $\mathbf{A} = \mathbf{Z\hat{x}^{-1}}$, is given by, \begin{equation}\underset{(i \times i)}{\mathbf{A}} = \begin{bmatrix} \frac{z_{11}}{x_{1}} & \frac{z_{12}}{x_{2}} \\ \frac{z_{21}}{x_{1}} & \frac{z_{22}}{x_{2}} \\ \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} = \begin{bmatrix} 0.15 & 0.25 \\ 0.20 & 0.05 \\ \end{bmatrix}\end{equation}The Total Requirements matrix (Leontief inverse), constructed as $\mathbf{L} = \left(\mathbf{I} - \mathbf{A}\right)^{-1}$, is given by, \begin{equation}\underset{(i \times i)}{\mathbf{L}} = \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \\ \end{bmatrix} = \begin{bmatrix} 1.2541254 & 0.330033 \\ 0.2640264 & 1.122112 \\ \end{bmatrix}\end{equation}Given total output $\mathbf{x}$ and interindustry sales $\mathbf{Z}$, the exogenous final demand is simply the difference in total output and the row sums of interindustry transactions given by, $\mathbf{f} = \mathbf{x} - \mathbf{Zi}$ where $\mathbf{f}' = \begin{bmatrix} 350 & 1700 \end{bmatrix}$Suppose, *final demand* for extraction were to increase to \$600 next year and decrease to \$1500 for farming, such that the change in final demand is given by, $\Delta\mathbf{f}' = \begin{bmatrix} 250 & 200 \end{bmatrix}$. Using the Leontief inverse (total requirements) matrix, the expected change in total output of the economy is given by $\Delta\mathbf{x} = \mathbf{L}\Delta\mathbf{f}$ such that $\Delta\mathbf{x}' = \begin{bmatrix} 247.5248 & -158.4158 \end{bmatrix}$```{r Toy open IO}#A simple I/O analysis "by hand" presuming given national accounts are in the form of raw interindustry flows where one must generate a direct requirements matrices (Technical Coefficients) and the total requirements matrix (Leontief inverse).# Note the national IO table at present is a domestic flow specification as such the interindustry transactions table is no longer representative of a technological matrix. It rather represents the intra-national interindustry transactions, which are determined not only by technological factors, but also by trade factors. As a consequence in the domestic-flow table the balance between supply and demand is made considering only domestic production.#Toy Transaction matrixToy_Z_mat <-matrix(c(150, 500, 200, 100), nrow =2, byrow =TRUE)rownames(Toy_Z_mat) <-c("Extract", "Farm")colnames(Toy_Z_mat) <-c("Extract", "Farm")print("Transaction matrix")Toy_Z_mat#Toy Total Output vectorToy_Total_Output <-c(1000, 2000)print("Total Output vector")Toy_Total_Output#Toy Final Demand vectorToy_Final_Demand <- Toy_Total_Output - (Toy_Z_mat %*%rep(c(1), each=ncol(Toy_Z_mat)))print("Final Demand vector")Toy_Final_Demand#Toy Direct Requirements matrixToy_A_mat <- Toy_Z_mat %*%diag(1/Toy_Total_Output)rownames(Toy_A_mat) <-c("Extract", "Farm")colnames(Toy_A_mat) <-c("Extract", "Farm")print("Direct Requirements matrix")Toy_A_mat#Toy Total Requirements matrixToy_L_mat <-solve(diag(nrow(Toy_A_mat)) - Toy_A_mat)print("Total Requirements matrix")Toy_L_mat#Toy change in Final Demand vectorToy_Delta_Demand <-c(250, -200)print("Final Demand vector change")Toy_Delta_Demand#Toy change in Total Output vectorToy_Delta_Output <- Toy_L_mat %*% Toy_Delta_Demandprint("Total Output vector change")Toy_Delta_Output``````{r Toy necessary and sufficient tests}# Note, to allow for the presence of final demand and primary inputs, in an open model, the sum of each column in $A$ must be less than 1. # Simple sufficient, but not necessary condition of sustainability. All must be TRUE.all(colSums(Toy_A_mat) <1) %>% {paste0("Technology matrix structure passes sufficient test of sustainability: ", . )}#Hawkins – Simon condition check. All must be TRUE.local({PM =NULLfor(n in2:nrow(Toy_A_mat)){ PM[1:n] = (det((diag(nrow(Toy_A_mat)) - Toy_A_mat)[1:n,1:n]) >0)}all(all(PM) & (diag(diag(nrow(Toy_A_mat)) - Toy_A_mat) >0) & (det(diag(nrow(Toy_A_mat)) - Toy_A_mat) >0)) %>% {paste0("Leontief matrix structure passes necessary and sufficient test of practibility and viability: ", . )}})```### Toy TUC **Retrospective** One objective of the project narative is to quantify and match the economic connectedness of Micropolitian and Urban clusters to large Metropolitan economies. This is done by analyzing how well the commodity output mix of Micropolitian and Urban clusters to the commodity uses of industries in large Metropolitan cores.In practice, for two *primary ECAs* $A$ and $B$, we construct the absolute rural use coefficients as:<palign="center">$\displaystyle \alpha^{r}_{A} = \frac{\sum^{n}_{j=1} u_{cj} s_{Aj}} {\sum^{n}_{j=1} u_{cj}}$,$\displaystyle \alpha^{r}_{B} = \frac{\sum^{n}_{j=1} u_{cj} s_{Bj}} {\sum^{n}_{j=1} u_{cj}}$</p>where $u_{cj}$ the the value of purchases of a single commodity $c$ by industry $j$ from the BEA's the *Use* matrix, and $s_{Aj}$ and $s_{Bj}$ are the annual sales of industry $j$ in *primary* ECA regions $A$ and $B$, for $j = 1, \dots, n$ industries. Similarly, we construct the relative rural use coefficients as:<palign="center">$\displaystyle \rho^{r}_{A} = \frac{\alpha^{r}_{A}} {\sum^{n}_{j=1} s_{Aj}}$,$\displaystyle \rho^{r}_{B} = \frac{\alpha^{r}_{B}} {\sum^{n}_{j=1} s_{Bj}}$</p>In the absolute specification, the numerator is a uniform modification of the elements in a row of the national *Use* matrix. The national value of purchases of commodity $c$ by each industry $j$ weighted by the regional value of "sales" by each industry $j$ (CBP will offer values of county-level industry employment, Q1 payroll, and annual payroll). This is divided by the total intermediate value of commodity $c$ at the national level.The relative specification then takes the absolute coefficient divided by the sum of the regional "sales" by each industry $j$. **Toy Model**Suppose there is 1 **N**ational level economy composed of 2 industries (e.g., Hydrocarbon **E**xtraction and **F**arming), 3 commodities (e.g., **G**asoline, **H**ogs, and **I**ce cream), and 4 regions (e.g., **D**ane county (Madison , WI), **C**ook county (Chicago, IL), **B**rown County (Green Bay, WI), and the **A**ll the rest as a collective residual). Available data consists of a $3 \times 2$ **N**ational level commodity by industry *Use table* ($\underset{(3 \times 2)}{\mathbf{U}} = [u_{ij}]$) that depicts the value of purchases of commodity $i$ by industry $j$. (In conjunction with a column vector of total industry outputs ($\underset{(2 \times 1)}{\mathbf{x}} = [x_{j}]$), this gives rise to the standard *Use coefficients* $\mathbf{B} = [b_{ij}] = u_{ij}/x_{j}$ in which column $j$ represents the value of inputs of each commodity per dollar’s worth $\mathbf{B}$ are therefore commodities-by-industries.) Similarly, other available data consists of a $2 \times 3$ **N**ational level commodity by industry *Make table* ($\underset{(2 \times 3)}{\mathbf{V}} = [v_{ij}]$) that depicts the value of output of commodity $j$ that is produced by industry $i$. (Note, the row sums of $\mathbf{V}$ are total industry output $\mathbf{x}$ and column sums of $\mathbf{V}$ are total [commodity] output $\mathbf{q}'$.)Available data of at the region level consists of sector specific industry activity ($\underset{(2 \times 4)}{\mathbf{x}^r} = [x^{r}_{j}]$) in the form of either employment, payroll, or simply number of establishments. As such let, \begin{equation}\underset{(c \times i)}{\mathbf{U}^{n}} = \begin{bmatrix} u^{n}_{11} & u^{n}_{12} \\ u^{n}_{21} & u^{n}_{22} \\ u^{n}_{31} & u^{n}_{32} \\ \end{bmatrix} = \begin{bmatrix} u^{n}_{GE} & u^{n}_{GF} \\ u^{n}_{HE} & u^{n}_{HF} \\ u^{n}_{IE} & u^{n}_{IF} \\ \end{bmatrix}, \; \text{ and} \; \;\underset{(i \times g)}{\mathbf{X}^{r}_{empl/pay}} = \begin{bmatrix} x^{1}_{1} & x^{2}_{1} & x^{3}_{1} & x^{4}_{1} \\ x^{1}_{2} & x^{2}_{2} & x^{3}_{2} & x^{4}_{2} \\ \end{bmatrix} = \begin{bmatrix} x^{A}_{E} & x^{B}_{E} & x^{C}_{E} & x^{D}_{E} \\ x^{A}_{F} & x^{B}_{F} & x^{C}_{F} & x^{D}_{F} \\ \end{bmatrix}\end{equation}As such the project narrative use coefficients (as demonstrated in the excel trial) are:$$\underset{(c \times g)}{\mathbf{\alpha}} = (\hat{\mathbf{Ui}}^{n})^{-1}\mathbf{U}^{n}\mathbf{X}^{r} = \begin{bmatrix} \frac{u^{n}_{GE}x^{A}_{E}+u^{n}_{GF}x^{A}_{F}}{u^{n}_{GE}+u^{n}_{GF}} & \frac{u^{n}_{GE}x^{B}_{E}+u^{n}_{GF}x^{B}_{F}}{u^{n}_{GE}+u^{n}_{GF}} & \frac{u^{n}_{GE}x^{C}_{E}+u^{n}_{GF}x^{C}_{F}}{u^{n}_{GE}+u^{n}_{GF}} & \frac{u^{n}_{GE}x^{D}_{E}+u^{n}_{GF}x^{D}_{F}}{u^{n}_{GE}+u^{n}_{GF}} \\ \frac{u^{n}_{HE}x^{A}_{E}+u^{n}_{HF}x^{A}_{F}}{u^{n}_{HE}+u^{n}_{HF}} & \frac{u^{n}_{HE}x^{B}_{E}+u^{n}_{HF}x^{B}_{F}}{u^{n}_{HE}+u^{n}_{HF}} & \frac{u^{n}_{HE}x^{C}_{E}+u^{n}_{HF}x^{C}_{F}}{u^{n}_{HE}+u^{n}_{HF}} & \frac{u^{n}_{HE}x^{D}_{E}+u^{n}_{HF}x^{D}_{F}}{u^{n}_{HE}+u^{n}_{HF}} \\ \frac{u^{n}_{IE}x^{A}_{E}+u^{n}_{IF}x^{A}_{F}}{u^{n}_{IE}+u^{n}_{IF}} & \frac{u^{n}_{IE}x^{B}_{E}+u^{n}_{IF}x^{B}_{F}}{u^{n}_{IE}+u^{n}_{IF}} & \frac{u^{n}_{IE}x^{C}_{E}+u^{n}_{IF}x^{C}_{F}}{u^{n}_{IE}+u^{n}_{IF}} & \frac{u^{n}_{IE}x^{D}_{E}+u^{n}_{IF}x^{D}_{F}}{u^{n}_{IE}+u^{n}_{IF}} \\\end{bmatrix}$$and $$\underset{(c \times g)}{\mathbf{\rho}} = (\hat{\mathbf{Ui}}^{n})^{-1}\mathbf{U}^{n}\mathbf{X}^{r}(\hat{\mathbf{iX}}^{r})^{-1} = \begin{bmatrix} \frac{u^{n}_{GE}x^{A}_{E}+u^{n}_{GF}x^{A}_{F}}{(u^{n}_{GE}+u^{n}_{GF})(x^{A}_{E} + x^{A}_{F})} & \frac{u^{n}_{GE}x^{B}_{E}+u^{n}_{GF}x^{B}_{F}}{(u^{n}_{GE}+u^{n}_{GF})(x^{B}_{E} + x^{B}_{F})} & \frac{u^{n}_{GE}x^{C}_{E}+u^{n}_{GF}x^{C}_{F}}{(u^{n}_{GE}+u^{n}_{GF})(x^{C}_{E} + x^{C}_{F})} & \frac{u^{n}_{GE}x^{D}_{E}+u^{n}_{GF}x^{D}_{F}}{(u^{n}_{GE}+u^{n}_{GF})(x^{D}_{E} + x^{D}_{F})} \\ \frac{u^{n}_{HE}x^{A}_{E}+u^{n}_{HF}x^{A}_{F}}{(u^{n}_{HE}+u^{n}_{HF})(x^{A}_{E} + x^{A}_{F})} & \frac{u^{n}_{HE}x^{B}_{E}+u^{n}_{HF}x^{B}_{F}}{(u^{n}_{HE}+u^{n}_{HF})(x^{B}_{E} + x^{B}_{F})} & \frac{u^{n}_{HE}x^{C}_{E}+u^{n}_{HF}x^{C}_{F}}{(u^{n}_{HE}+u^{n}_{HF})(x^{C}_{E} + x^{C}_{F})} & \frac{u^{n}_{HE}x^{D}_{E}+u^{n}_{HF}x^{D}_{F}}{(u^{n}_{HE}+u^{n}_{HF})(x^{D}_{E} + x^{D}_{F})} \\ \frac{u^{n}_{IE}x^{A}_{E}+u^{n}_{IF}x^{A}_{F}}{(u^{n}_{IE}+u^{n}_{IF})(x^{A}_{E} + x^{A}_{F})} & \frac{u^{n}_{IE}x^{B}_{E}+u^{n}_{IF}x^{B}_{F}}{(u^{n}_{IE}+u^{n}_{IF})(x^{B}_{E} + x^{B}_{F})} & \frac{u^{n}_{IE}x^{C}_{E}+u^{n}_{IF}x^{C}_{F}}{(u^{n}_{IE}+u^{n}_{IF})(x^{C}_{E} + x^{C}_{F})} & \frac{u^{n}_{IE}x^{D}_{E}+u^{n}_{IF}x^{D}_{F}}{(x^{C}_{E} + x^{C}_{F})(x^{D}_{E} + x^{D}_{F})} \\\end{bmatrix}$$**What are the interpretations of these coefficients explicitly?**From here we can ask the question: Is Madison more economically connected to Chicago or Green Bay or is it an economic island? More specifically, is the commodity(?) output (mix?) of Madison more congruent with the commodity(?) output (mix?) of Chicago or Green Bay? (Note: there is no claim/evidence of trade and commodity flows across space using this specification, instead we are measuring possible (one direction) sector compatibility.)Given the commodity by geography, relative use coefficient matrix $\mathbf{\rho}$, we can now construct a geography by geography (from-to) score for each commodity given by:$$\mathbf{\phi}_{Gasoline} = \hat{\rho}_{G}\mathbf{i}\mathbf{i}'[\hat{\rho}_{G}\mathbf{i}\mathbf{i}'+\hat{\rho}'_{G}\mathbf{i}\mathbf{i}']^{-1} = \begin{bmatrix} \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{A}} & \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{B}} & \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{C}} & \frac{\rho_{G}^{A}}{\rho_{G}^{A} + \rho_{G}^{D}} \\ \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{A}} & \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{B}} & \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{C}} & \frac{\rho_{G}^{B}}{\rho_{G}^{B} + \rho_{G}^{D}} \\ \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{A}} & \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{B}} & \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{C}} & \frac{\rho_{G}^{C}}{\rho_{G}^{C} + \rho_{G}^{D}} \\ \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{A}} & \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{B}} & \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{C}} & \frac{\rho_{G}^{D}}{\rho_{G}^{D} + \rho_{G}^{D}} \\ \end{bmatrix}$$The maximum value by row of $\mathbf{\phi}_{G}$ determines which typology each region is assigned on a per commodity basis.Similarly the other commodities are given by:$$\mathbf{\phi}_{Hogs} = \hat{\rho}_{H}\mathbf{i}\mathbf{i}'[\hat{\rho}_{H}\mathbf{i}\mathbf{i}'+\hat{\rho}'_{H}\mathbf{i}\mathbf{i}']^{-1} = \begin{bmatrix} \frac{\rho_{H}^{A}}{\rho_{H}^{A} + \rho_{H}^{A}} & \frac{\rho_{H}^{A}}{\rho_{H}^{A} + \rho_{H}^{B}} & \frac{\rho_{H}^{A}}{\rho_{H}^{A} + \rho_{H}^{C}} & \frac{\rho_{H}^{A}}{\rho_{H}^{A} + \rho_{H}^{D}} \\ \frac{\rho_{H}^{B}}{\rho_{H}^{B} + \rho_{H}^{A}} & \frac{\rho_{H}^{B}}{\rho_{H}^{B} + \rho_{H}^{B}} & \frac{\rho_{H}^{B}}{\rho_{H}^{B} + \rho_{H}^{C}} & \frac{\rho_{H}^{B}}{\rho_{H}^{B} + \rho_{H}^{D}} \\ \frac{\rho_{H}^{C}}{\rho_{H}^{C} + \rho_{H}^{A}} & \frac{\rho_{H}^{C}}{\rho_{H}^{C} + \rho_{H}^{B}} & \frac{\rho_{H}^{C}}{\rho_{H}^{C} + \rho_{H}^{C}} & \frac{\rho_{H}^{C}}{\rho_{H}^{C} + \rho_{H}^{D}} \\ \frac{\rho_{H}^{D}}{\rho_{H}^{D} + \rho_{H}^{A}} & \frac{\rho_{H}^{D}}{\rho_{H}^{D} + \rho_{H}^{B}} & \frac{\rho_{H}^{D}}{\rho_{H}^{D} + \rho_{H}^{C}} & \frac{\rho_{H}^{D}}{\rho_{H}^{D} + \rho_{H}^{D}} \\ \end{bmatrix}$$$$\mathbf{\phi}_{Ice cream} = \hat{\rho}_{I}\mathbf{i}\mathbf{i}'[\hat{\rho}_{I}\mathbf{i}\mathbf{i}'+\hat{\rho}'_{I}\mathbf{i}\mathbf{i}']^{-1} = \begin{bmatrix} \frac{\rho_{I}^{A}}{\rho_{I}^{A} + \rho_{I}^{A}} & \frac{\rho_{I}^{A}}{\rho_{I}^{A} + \rho_{I}^{B}} & \frac{\rho_{I}^{A}}{\rho_{I}^{A} + \rho_{I}^{C}} & \frac{\rho_{I}^{A}}{\rho_{I}^{A} + \rho_{I}^{D}} \\ \frac{\rho_{I}^{B}}{\rho_{I}^{B} + \rho_{I}^{A}} & \frac{\rho_{I}^{B}}{\rho_{I}^{B} + \rho_{I}^{B}} & \frac{\rho_{I}^{B}}{\rho_{I}^{B} + \rho_{I}^{C}} & \frac{\rho_{I}^{B}}{\rho_{I}^{B} + \rho_{I}^{D}} \\ \frac{\rho_{I}^{C}}{\rho_{I}^{C} + \rho_{I}^{A}} & \frac{\rho_{I}^{C}}{\rho_{I}^{C} + \rho_{I}^{B}} & \frac{\rho_{I}^{C}}{\rho_{I}^{C} + \rho_{I}^{C}} & \frac{\rho_{I}^{C}}{\rho_{I}^{C} + \rho_{I}^{D}} \\ \frac{\rho_{I}^{D}}{\rho_{I}^{D} + \rho_{I}^{A}} & \frac{\rho_{I}^{D}}{\rho_{I}^{D} + \rho_{I}^{B}} & \frac{\rho_{I}^{D}}{\rho_{I}^{D} + \rho_{I}^{C}} & \frac{\rho_{I}^{D}}{\rho_{I}^{D} + \rho_{I}^{D}} \\ \end{bmatrix}$$Such that $$\mathbf{\phi} = \begin{bmatrix}\mathbf{\phi}_{G} & 0 & 0 \\0 & \mathbf{\phi}_{H} & 0 \\0 & 0 & \mathbf{\phi}_{I} \\\end{bmatrix}$$The maximum value by row of $\mathbf{\phi}$ determines which typology each region is assigned on an overall commodity basis.(Alternative assignment procedures may incorporate a weighted mix of commodities or alternatively measure the "gravity" of regions by commodities to determine a regions commodity center of gravity.)Suppose, \begin{equation}\underset{(c \times i)}{\mathbf{U}^{n}} = \begin{bmatrix} 18 & 12 \\ 20 & 16 \\ 2 & 6 \\ \end{bmatrix}, \; \text{ and} \; \;\underset{(i \times g)}{\mathbf{X}^{r}_{employment}} = \begin{bmatrix} 500 & 10 & 30 & 4 \\ 6000 & 200 & 400 & 50 \\ \end{bmatrix}\end{equation}```{r toyexample}Toy_U_mat <-matrix(c(18, 12, 20, 16, 2, 6), nrow =3, byrow =TRUE)rownames(Toy_U_mat) <-c("Gas", "Hogs", "Ice")colnames(Toy_U_mat) <-c("Extract", "Farm")Toy_X_mat <-matrix(c(500, 10, 30, 4, 6000, 200, 400, 50), nrow =2, byrow =TRUE)rownames(Toy_X_mat) <-c("Extract", "Farm")colnames(Toy_X_mat) <-c("AllElse", "Brown", "Cook", "Dane")Toy_u_vec <-as.vector(Toy_U_mat %*%rep(c(1), each=ncol(Toy_U_mat)))Toy_x_vec <-as.vector(t(rep(c(1), each=nrow(Toy_X_mat))) %*% Toy_X_mat)Toy_Alpha_mat <-solve(matrix(diag(Toy_u_vec, nrow =3), ncol=3)) %*% Toy_U_mat %*% Toy_X_matrownames(Toy_Alpha_mat) <-c("Gas", "Hogs", "Ice")print("Alpha matrix")Toy_Alpha_matToy_Rho_mat <- Toy_Alpha_mat %*%solve(matrix(diag(Toy_x_vec, nrow =4), ncol=4)) colnames(Toy_Rho_mat) <-c("AllElse", "Brown", "Cook", "Dane")print("Rho matrix")Toy_Rho_matToy_Phi_tilde_mat_G <-matrix(diag(Toy_Rho_mat[1,], nrow =4), ncol=4) %*%rep(c(1), each=ncol(Toy_X_mat)) %*%t(rep(c(1), each=ncol(Toy_X_mat)))Toy_Phi_mat_G <- Toy_Phi_tilde_mat_G / (Toy_Phi_tilde_mat_G +t(Toy_Phi_tilde_mat_G))rownames(Toy_Phi_mat_G) <-c("AllElse", "Brown", "Cook", "Dane")colnames(Toy_Phi_mat_G) <-c("AllElse", "Brown", "Cook", "Dane")print("Phi matrix - Gas")Toy_Phi_mat_GToy_Phi_tilde_mat_H <-matrix(diag(Toy_Rho_mat[2,], nrow =4), ncol=4) %*%rep(c(1), each=ncol(Toy_X_mat)) %*%t(rep(c(1), each=ncol(Toy_X_mat)))Toy_Phi_mat_H <- Toy_Phi_tilde_mat_H / (Toy_Phi_tilde_mat_H +t(Toy_Phi_tilde_mat_H))rownames(Toy_Phi_mat_H) <-c("AllElse", "Brown", "Cook", "Dane")colnames(Toy_Phi_mat_H) <-c("AllElse", "Brown", "Cook", "Dane")print("Phi matrix - Hogs")Toy_Phi_mat_HToy_Phi_tilde_mat_I <-matrix(diag(Toy_Rho_mat[3,], nrow =4), ncol=4) %*%rep(c(1), each=ncol(Toy_X_mat)) %*%t(rep(c(1), each=ncol(Toy_X_mat)))Toy_Phi_mat_I <- Toy_Phi_tilde_mat_I / (Toy_Phi_tilde_mat_I +t(Toy_Phi_tilde_mat_I))rownames(Toy_Phi_mat_I) <-c("AllElse", "Brown", "Cook", "Dane")colnames(Toy_Phi_mat_I) <-c("AllElse", "Brown", "Cook", "Dane")print("Phi matrix - Ice")Toy_Phi_mat_IToy_Phi_mat <-bdiag(Toy_Phi_mat_G, Toy_Phi_mat_H, Toy_Phi_mat_I)```### Redux"Quantify the economic connectedness of places through production supply chains."Simply put at the core of our analysis lies the question: How similar are the inputs of Chicago to the outputs of Madison?Given only national level Input-Output tables (consisting of Make/Use tables, industry-by-industry, industry-by-commodity, and commodity-by-commodity total requirements tables, and commodity-by-industry direct requirements tables) and regional level industry activity (in the form of employment, payroll, and number of establishments), we are left with a handful of possible methods to derive an answer.*Possible Methodologies*Seemingly, our first task is to "regionalize" the national level I/O accounts down to the desired regional scale. * The most comprehensive approach is to execute an *interregional input–output (IRIO)* model. However, this approach really only serves as a platonic abstraction given the vast and detailed necessary exogenous inputs. * One possible approach is to implement parts or all of a *multiregional input–output model*. The regional tables of an MRIO employ a regional *technical* coefficients matrix, $\mathbf{A^{r}}$. Under this paradigm, information regarding the region of origin is ignored. These transactions are denoted $z^{\cdot r}_{ij}$, with the coefficients estimated as $a^{r}_{ij} = \frac{z^{\cdot r}_{ij}}{x^{r}_{j}}$ where $\mathbf{A^{r}} = \left[ a^{r}_{ij} \right]$. In practice, estimates are can be made using a *product-mix approach*. This technique essentially uses detailed level national technical coefficients and detailed level regional outputs to produce regional coefficients at a less detailed summary level (from weighted averages of the national detailed coefficients). This approach may be feasible given the available data, however, the necessary agglomeration of industry classes is not ideal. Furthermore, extending this approach into the realm of interregional tables, requires data on dollar flow of goods between regions, $z^{rs}_{i}$, irrespective of the sector of destination in the receiving region. Again, however, this extension is beyond the scope of the available data. * Another possible approach is to employ the *RAS technique*, however, instead of the typical deployment for updating coefficients across time, we may update input coefficients across space. This approach requires $3n$ sets of information for the location of interest. These are, *total gross outputs*, $x^{r}_{j}$, *total interindustry sales*, by sector $u_{i} = \sum^{n}_{j=1} z^{r}_{ij}$, and *total interindustry purchases*, by sector $v_{i} = \sum^{n}_{i=1} z^{r}_{ij}$. But even this slimmed down approach is still beyond the scope of the data.* Yet another possible alternative approach might be to estimate *regional supply percentages*, $p^{r}_{i}$. This approach requires knowledge of total regional output of each sector, $x^{r}_{i}$, and the exports of the product of each sector from the region, $e^{r}_{i}$, and the imports of all goods into the region, $m^{r}_{i}$. With knowledge of *regional supply percentages* one can scale the national level *direct/input requirements matrix* to estimate a regional *input requirements matrix*, $\mathbf{A^{rr}} = \mathbf{\hat{p}^{r}A}$. Clearly, this too is beyond the scope of the available data. However in this manner, when trying to estimate $a^{rr}_{ij}$ from national data, one can generalize the problem as a two step process of: (1) estimate a regional technical coefficient, $a^{r}_{ij}$, from the corresponding national coefficient, $a^{n}_{ij}$, and then (2) estimate the regional input coefficient, $a^{rr}_{ij}$, as some proportion of the regional technical coefficient; that is, $a^{rr}_{ij} = p^{r}_{ij}a^{r}_{ij}$ (where $0 \leq p^{r}_{ij} \leq 1$). The procedure for estimating $a^{rr}_{ij}$ from $a^{n}_{ij}$ is therefore: (1) find $\alpha^{r}_{ij} \geq 0$ such that $a^{r}_{ij} = (\alpha^{r}_{ij}) (a^{n}_{ij})$, and(2) find $\beta^{r}_{ij}$ ($0 \leq \beta^{r}_{ij} \leq 1$) such that $a^{rr}_{ij} = (\beta^{r}_{ij}) (a^{r}_{ij})$If we assume regional production recipes are identical then $\alpha^{r}_{ij} = 1$ and $a^{r}_{ij} = a^{n}_{ij}$ for all $i$ and $j$. Or if we assume each regional purchaser buys the same proportion of inputs from within the region, then $\beta^{r}_{ij} = p^{r}_{i}$ for all $i$. * Given this structure, and assuming $a^{r}_{ij} = a^{n}_{ij}$, we may propose various functional estimates of $\beta^{r}_{ij}$ to derive $a^{rr}_{ij}$ from $a^{n}_{ij}$, i.e., *Location Quotients*. The $LQ$s only require knowledge of gross output of sector $i$ in region $r$ and total output of all sectors in the region, $x^{r}_{i}$ and $x^{r}$, along with these values at the national level: $LQ^{r}_{i} = \left( \frac{x^{r}_{i} / x^{r}} {x^{n}_{i} / x^{n}} \right)$. Furthermore, many regional analysts will use employment rather than output as an alternative to the relevant measures of regional and national activity. This is a feasible option given the available data.* Another possible approach is to employ *Regional Purchase Coefficients*. The $RPC$ is defined as the proportion of regional demand for that sector’s output that is fulfilled from regional production. Much in the same way that location quotients operate, we may designate $\beta^{r}_{ij} = p^{r}_{i} = RPC^{r}_{i}$, where $RPC^{r}_{i} = z^{rr}_{i} / (z^{rr}_{i} + z^{sr}_{i}) = 1/[1 + 1/(z^{rr}_{i}/z^{sr}_{i})]$. The *relative shipment* term $z^{rr}_{i}/z^{sr}_{i}$, may itself be estimated from published sources such as *Census of Transportation*, unfortunately these are beyond the current scope and available data. * The final standard approach is to implement a *Gravity Model* formulation to estimate commodity flows between regions. The *Gravity Model* estimates the flow of goods between regions as a function of (1) some measure of the total output of $i$ in $r$, $x^{r}_{i}$, (2) some measure of the total purchases of $i$ in $s$, $x^{s}_{i}$, and (3) the distance (as a measure of “impedance”) between the two regions, $d^{rs}$. The relatively simplified gravity model specification is given by: $z^{rs}_{i} = \frac{x^{r \cdot}_{i}x^{\cdot s}_{i}} {x^{\cdot \cdot}_{i}} Q^{rs}_{i}$ where $x^{r \cdot}_{i}$ is the “supply pool” of good $i$ in region $r$, $x^{\cdot s}_{i}$ is the “demand pool” of good $i$ in region $s$, $x^{\cdot \cdot}_{i}$ is the total production of commodity $i$ in the system and $Q^{rs}_{i}$ is a parameter to be estimated. In practice the interregional flows per industry $i$ are given by $[\mathbf{{x'}}_{i}\mathbf{x}_{i}](\mathbf{x}_{i}\mathbf{i})^{-1} \odot \mathbf{Q}$, where $\mathbf{Q} = (\mathbf{d \odot d})^{-1}$ and $\mathbf{d}$ is the great-circle distance matrix derived from longitude and latitude vector pairs from region centroid. This approach may be feasible given the available data, too, (again with harsh simplifying assumptions). *Assumptions and Caveats*Note that strictly speaking, the $x^{r}_{j}$ terms throughout are specified as gross outputs of each sector in the region. In practice, however, these outputs are approximated through the use of employment, payroll, or number of establishments data, and they are not technically equivalent substitutes. Furthermore, the "supply pool" and "demand pool" total outputs called for in in the simple Gravity specification are even greater abstractions. First we assume $\alpha^{r}_{ij} = 1$ such that $a^{r}_{ij} = a^{n}_{ij}$, then assuming regional industry activity is equivalent to regional gross output $x^{r}_{j}$ and regional final demand is approximated by a single region model (i.e., $\mathbf{f^{r}} = \mathbf{(I} - \mathbf{A^{rr})x^{r}}$), which restricts by definition regional supply for good $i$ to be equivalent to regional demand for good $i$ (i.e., $x^{r \cdot}_{i} = x^{\cdot r}_{i}$). Note in fact, no national accounting data are needed for this gravity approach, one only needs regional industry activity and the impedance function, $Q^{rs}_{i}$, which is given by $e^{-d^{rs}}$ (or $1/(d^{rs})^2$) where $d^{rs}$ is the great-circle distance between each region.### ExposéNote that given an industry-by-industry BEA total requirements matrix, $\mathbf{L}$, we can derive an industry based technology, industry-by-industry direct requirements matrix, $\mathbf{A}$ from the accounting identity $\mathbf{L} = (\mathbf{I} - \mathbf{A})^{-1}$. And total industry output, $\mathbf{x}$, is also exogenously obtainable from the BEA Make/Use tables, allowing the further derivation of the transactions matrix, $\mathbf{Z}$, from the accounting identity $\mathbf{A} = \mathbf{Z\hat{x}^{-1}}$.*Data*```{r toyBaseData}Industry_Count <-5Toy_Total_mat <-matrix(c(1.2523, 0.0143, 0.0055, 0.0276, 0.0892,0.0251, 1.1230, 0.1450, 0.0331, 0.0709,0.0203, 0.0236, 1.0420, 0.0128, 0.0237,0.0092, 0.0171, 0.0088, 1.0048, 0.0105,0.4020, 0.2675, 0.1055, 0.5102, 1.7703), nrow = Industry_Count, byrow =TRUE)rownames(Toy_Total_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_Total_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")#kable(Toy_Total_mat, caption = "Total Requirements Table ( Leontief inverse )") print("Total Requirements Table (Leontief inverse)")Toy_Total_matToy_Direct_mat <-diag(ncol(Toy_Total_mat)) -solve(Toy_Total_mat)rownames(Toy_Direct_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_Direct_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")#kable(round(Toy_Direct_mat, 4), caption = "Direct Requirements Table (input coefficients / structural matrix / national technical coefficients)") print("Direct Requirements Table (input coefficients / structural matrix / national technical coefficients)")Toy_Direct_mat %>%round(4)Toy_Total_Industry_Output <-c(263919, 171317, 244999, 708461, 3883386)Toy_Transactions_mat <-round(Toy_Direct_mat %*%diag(Toy_Total_Industry_Output), 2)colnames(Toy_Transactions_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Interindustry Transactions Table")Toy_Transactions_mat %>%round(4)Region_Count <-4Toy_Xemp_mat <-matrix(c(500, 13, 100, 50, 400, 34, 30, 88,300, 120, 20, 123,200, 200, 75, 40,100, 10, 400, 4), nrow = Industry_Count, byrow =TRUE)rownames(Toy_Xemp_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_Xemp_mat) <-c("AllElse", "Brown", "Cook", "Dane")#kable(Toy_Xemp_mat, caption = "Regional Industry-Employment Table") print("Regional Industry-Employment Table" )Toy_Xemp_matToy_Direct_mat_r <-kronecker(diag(1, Region_Count), Toy_Direct_mat)```*Location Quotients*```{r toyLQ}############ Simple Location QuotientToy_LQ <-matrix(0, nrow=Industry_Count, ncol = Region_Count)for (i in1:Industry_Count){for (j in1:Region_Count){ Toy_LQ[i,j] <-((Toy_Xemp_mat[i,j])/sum(Toy_Xemp_mat[,j]))/(sum(Toy_Xemp_mat[i,])/sum(Toy_Xemp_mat)) }}rownames(Toy_LQ) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_LQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Simple Location Quotients")round(Toy_LQ, 4)Toy_LQc <-matrix(0, nrow = Industry_Count, ncol = Region_Count)for (i in1:Industry_Count){for (j in1:Region_Count){if (Toy_LQ[i,j] >1){ Toy_LQc[i,j] <-1 }else{ Toy_LQc[i,j] <- Toy_LQ[i,j] } }}rownames(Toy_LQc) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_LQc) <-c("AllElse", "Brown", "Cook", "Dane")#round(Toy_LQc, 4)Toy_Dir_LQ <-c()for (i in1:ncol(Toy_LQc)){ Toy_Dir_LQ[[i]] <- (diag(Toy_LQc[,i]) %*% Toy_Direct_mat) %>%round(4)rownames(Toy_Dir_LQ[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")}names(Toy_Dir_LQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Regionial Direct Requirement Coeffecients from Simple Location Quotients")Toy_Dir_LQ``````{r toyCLQ}############ Cross-Industry Location Quotient#This allows for differing modifiers within a given row of the national matrix; that is, it allows for differing cell-by-cell adjustments within An rather than uniform adjustments along each row.Toy_CLQ <-c()for (i in1:Region_Count){ Toy_CLQ[[i]] <- (Toy_LQ[,i] %*%t(1/Toy_LQ[,i])) %>%round(4)rownames(Toy_CLQ[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info") }names(Toy_CLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Cross-Industry Location Quotients") Toy_CLQToy_CLQc <- Toy_CLQfor (r in1:length(Toy_CLQc)){for (i in1:nrow(Toy_CLQc[[r]])){for (j in1:ncol(Toy_CLQc[[r]])){if (Toy_CLQc[[r]][i,j] >1){ Toy_CLQc[[r]][i,j] <-1 }else{ Toy_CLQc[[r]][i,j] <- Toy_CLQc[[r]][i,j] } }}}Toy_Dir_CLQ <-c()for (i in1:length(Toy_CLQc)){ Toy_Dir_CLQ[[i]] <- (Toy_CLQc[[i]] * Toy_Direct_mat) %>%round(4)}names(Toy_Dir_CLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Regionial Direct Requirement Coeffecients from Cross-Industry Location Quotients")Toy_Dir_CLQ``````{r toySLQ}############ Semilogarithmic Location QuotientToy_SLQ <-c()for (i in1:Region_Count){ Toy_SLQ[[i]] <- (Toy_LQ[,i] %*%t(1/(log2(1+ Toy_LQ[,i])))) %>%round(4)rownames(Toy_SLQ[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info") }names(Toy_SLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Semilogarithmic Location Quotients")Toy_SLQToy_SLQc <- Toy_SLQfor (r in1:length(Toy_SLQc)){for (i in1:nrow(Toy_SLQc[[r]])){for (j in1:ncol(Toy_SLQc[[r]])){if (Toy_SLQc[[r]][i,j] >1){ Toy_SLQc[[r]][i,j] <-1 }else{ Toy_SLQc[[r]][i,j] <- Toy_SLQc[[r]][i,j] } }}}Toy_Dir_SLQ <-c()for (i in1:length(Toy_SLQc)){ Toy_Dir_SLQ[[i]] <- (Toy_SLQc[[i]] * Toy_Direct_mat) %>%round(4)}names(Toy_Dir_SLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Regionial Direct Requirement Coeffecients from Semilogarithmic Location Quotients")Toy_Dir_SLQ``````{r toyFLQ}############ Flegg Location QuotientToy_FLQ_delta <- .3Toy_lambda <- (log2(1+ (colSums(Toy_Xemp_mat)/sum(Toy_Xemp_mat))))^(Toy_FLQ_delta)Toy_FLQ <-c()for (i in1:Region_Count){ Toy_FLQ[[i]] <- ((Toy_LQ[,i] %*%t(1/Toy_LQ[,i])) * Toy_lambda[i]) %>%round(4)rownames(Toy_FLQ[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info") }names(Toy_FLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Flegg Location Quotients")Toy_FLQ Toy_FLQc <- Toy_FLQfor (r in1:length(Toy_FLQc)){for (i in1:nrow(Toy_FLQc[[r]])){for (j in1:ncol(Toy_FLQc[[r]])){if (Toy_FLQc[[r]][i,j] >1){ Toy_FLQc[[r]][i,j] <-1 }else{ Toy_FLQc[[r]][i,j] <- Toy_FLQc[[r]][i,j] } }}}Toy_Dir_FLQ <-c()for (i in1:length(Toy_FLQc)){ Toy_Dir_FLQ[[i]] <- (Toy_FLQc[[i]] * Toy_Direct_mat) %>%round(4)}names(Toy_Dir_FLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Regionial Direct Requirement Coeffecients from Flegg Location Quotients")Toy_Dir_FLQ``````{r toyAFLQ}############ Augmented Flegg Location QuotientToy_LQa <- Toy_LQfor (i in1:nrow(Toy_LQa)){for (j in1:ncol(Toy_LQa)){if (Toy_LQ[i,j] >1){ Toy_LQa[i,j] <-log2(1+ Toy_LQ[i,j]) }else{ Toy_LQa[i,j] <-1 }}}Toy_AFLQ <- Toy_FLQfor (r in1:length(Toy_FLQ)){ Toy_AFLQ[[r]] <- (Toy_FLQ[[r]] %*%diag(Toy_LQa[,r])) %>%round(4)colnames(Toy_AFLQ[[r]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")}print("Augmented Flegg Location Quotients")Toy_AFLQ Toy_Dir_AFLQ <-c()for (i in1:length(Toy_AFLQ)){ Toy_Dir_AFLQ[[i]] <- (Toy_AFLQ[[i]] * Toy_Direct_mat) %>%round(4)}names(Toy_Dir_AFLQ) <-c("AllElse", "Brown", "Cook", "Dane")print("Regionial Direct Requirement Coeffecients from Augmented Flegg Location Quotients")Toy_Dir_AFLQ```Each Location Quotient is a different estimate of the regional supply proportion of good $i$. Using a Location Quotient in conjunction with the national-level Direct Requirements matrix, one can estimate an *intra*regional Direct Requirements (input/ technical coefficients) matrix. *Reprise*However, to get back to the central question: How similar are the inputs of Chicago to the outputs of Madison? we may only need to look as far as the Location Quotients themselves. We have established that if $LQ_{i}^{r} > 1$, then industry $i$ is concentrated in region $r$ relative to the nation. As such we can expect all demand of good $i$ to be satisfied regionally and regional surplus will be exported to other less concentrated regions. Similarly, if $LQ_{i}^{r} < 1$, then industry $i$ is less concentrated in region $r$ relative to the nation. Thus if we look for industries in Madison where $LQ_{i}^{r} > 1$, **and** in Chicago where $LQ_{i}^{r} < 1$, we may use this matching to address the central question. For example, the Fiber and Grains simple LQ for Dane County (i.e., Madison) are both greater than 1. Similarly, only the Fiber simple LQ for Brown County (i.e., Green Bay), and only the Grain simple LQ for All Else is less than 1, making either only a partial partner. However, the Fiber and Grains simple LQ for Cook County (i.e., Chicago) are both less than 1, implying it is the strongest sink and best matched for using goods for which Madison is concentrated. ```{r toyOverlay}Toy_export <- Toy_CLQ$DaneToy_import <- Toy_CLQ$Cookfor (i in1:Industry_Count){for (j in1:Industry_Count){if (Toy_export[i,j] <1){ Toy_export[i,j] <-0 }else{ Toy_export[i,j] <- Toy_export[i,j] } }}for (i in1:Industry_Count){for (j in1:Industry_Count){if (Toy_import[i,j] >1){ Toy_import[i,j] <-0 }else{ Toy_import[i,j] <- Toy_import[i,j] } }}print("Sub-industry CLQ Import/Export Compatibility - Madison as Exporter, Chicago as Importer")(Toy_export/Toy_import) %>%round(4)```Similarly, for the other sub-industry location quotients, if sector $i$ at the region-level is relatively smaller than sector $j$ at the region-level (i.e, $CLQ^{r}_{ij} <1$), then some of $j$'s need for $i$ in $r$ will need to be imported. One can divide two sub-industry location quotients, an export region by an import region, and derive the following relationships:NaN -> Exp < 1 & Imp > 1 Inf -> Exp >1 & Imp > 10.0 -> Exp < 1 & Imp < 1'##' -> Exp > 1 & Imp < 1In this example, Chicago will need to import: Extract, Fiber, Grains, and Hog commodities for the Info sector, Fiber for the Extract sector, Fiber and Grains for the Hogs sector, Nothing for the Grains sector, Grains for the Fiber sector, and finally Fiber, Grains, and Hogs for the Extract sector. While Madison will export its excess in the exact same categories! *MAPE*```{r toyMAPE}#MAPElocal({ MAT <-bdiag(Toy_Dir_LQ) n <-4 Toy_MAPE_mat <<-c()for (i inseq(from =1, to =length(Toy_LQ), by =5 )){for (j inseq(from =1, to =length(Toy_LQ), by =5 )){ Toy_MAPE_mat <<-c(Toy_MAPE_mat, mean((abs(c(MAT[j:j+n,j:j+n]) -c(MAT[i:i+n,i:i+n]))/c(MAT[i:i+n,i:i+n]))*100)) } } Toy_MAPE_mat <<-matrix(Toy_MAPE_mat, nrow = n, byrow =FALSE)}) colnames(Toy_MAPE_mat) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_MAPE_mat) <-c("AllElse", "Brown", "Cook", "Dane")Toy_MAPE_mat```Alternatively, we can ask the question: Are the inter-industry flows of Chicago similar to the inter-industry flows of Madison? Or in other words do they have congruent intraregional economic activity? To answer this we can compare the estimated regional level Direct Requirements (technical coefficients) through the application of Mean Absolute Percent Error. Let $\text{MAPE} = \left(\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{|a_{ij} - \tilde{a}_{ij}|}{a_{ij}}\right) \times 100$. For example, using the regional Direct Requirements matrix derived from the simple location quotient procedure, we can see that among the possible matches, the inter-industry flows of Madison are most similar to Green Bay, with Chicago and Madison displaying the most incongruent intraregional economic activity. In this case more incongruity points toward more gains from trade. *Gravity*```{r toyGravity}Toy_Lat <-c(42.3, 44.49, 41.84, 43.07)Toy_Lon <-c(-89.04, -88.03, -87.77, -89.39)Toy_Dist_mat <-distm(cbind(Toy_Lon, Toy_Lat))/1000000colnames(Toy_Dist_mat) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_Dist_mat) <-c("AllElse", "Brown", "Cook", "Dane")#Toy_Q <- (1/(Toy_Dist_mat)^2)Toy_Q <-exp(-Toy_Dist_mat)Toy_Gravity_mat <-as.list(c())for (i in1:nrow(Toy_Xemp_mat)){ Toy_Gravity_mat[[i]] <- (Toy_Xemp_mat[i,] %*%t(Toy_Xemp_mat[i,])) /sum(Toy_Xemp_mat[i,]) * Toy_Qrownames(Toy_Gravity_mat[[i]]) <-c("AllElse", "Brown", "Cook", "Dane")}names(Toy_Gravity_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Gravity Model Formulations - Interreginal Flows by Industry")Toy_Gravity_mat ```Similarly, the Gravity approach represents interregional interindustry transaction estimates. In this example, the "All Else" category dominates the analysis, and rightly so given that it is an aggregate residual category. Furthermore the “impedance” (distance) measure is undefinable for an aggregate "All Else" category. For illustration an arbitrary proxy location is used. Ignoring "AllElse", we may employ the Gravity approach to answer the question: Does Madison have more interregional interindustry transaction with Chicago or Green Bay? In this example, Chicago is a stronger trading partner for Extraction and Info, whereas Green Bay is a stronger trading partner for Fiber, Grains, and Hogs. Taking one step further and comparing the aggregate magnitudes of interregional interindustry transactions reveals that Green Bay has nearly twice as much trading value than Chicago does with Madison. Note, however, that when we employ an exponential distance decay impediance function, such that *intra*regional interindustry transactions are definable, then Madison will actually have more "pull" with itself than Greenbay for Fiber and Grains. In this case, we would estimate Madison to have interregional interindustry transactions with Chicago for Extraction and Info commodities, and only Hogs with Green Bay. And as for Grains and Fiber, interindustry transactions are mostly dominated by *intra*regional exchange. ### Extension*A nested and hierarchical approach*We have shown that it is feasible to construct regional-level technical coefficients matrix from a national-level Leontief matrix and regional industry employment. This then brings about some possible extensions. One is the possibility of performing a multistage regionalization, whereby first or second-level administrative divisions (e.g., states) are used to establish a coarse regionalization. Then either place specific technical coefficients can be derived using the administrative division technical coefficients or the administrative division regionalization can be used as the basis for constructing interregional compatibility. To derive an administrative technical coefficients matrix, we aggregate the place specific regional industry employment to the administrative level. Let $\mathbf{X^{r}}$ be the industry-by-region measures economic activity. If all $j$ regions make up the desired administrative division, then the aggregate administrative measure of economic activity is given by $\mathbf{x^{R}} = \mathbf{X^{r}i}$, i.e., the row sums of the industry-by-region economic activity matrix. The administrative division Direct Requirements (technical coefficients) matrix can then be derived using a Location Quotient approach. The simple administrative division location quotient is given by: $$LQ^{R}_{i} = \left( \frac{x^{R}_{i} / x^{R}} {x^{n}_{i} / x^{n}} \right) = \left( \frac{(\mathbf{X^{r}i})_{i} / \mathbf{iX^{r}i}} {x^{n}_{i} / x^{n}} \right) $$And the administrative Direct Requirements (technical coefficients) matrix is generated by:$$a^{R}_{ij} = \left\{\begin{align*}(LQ^{R}_{i})a^{n}_{ij} \space\space \text{if} \space\space LQ^{R}_{i} < 1 \\a^{n}_{ij} \space\space \text{if} \space\space LQ^{R}_{i} \geq 1\end{align*}\right\}$$From here we may iterate and derive regional technical coefficients from the administrative technical coefficients matrix using a location quotient approach. The second stage simple regional division location quotient is given by: $$\widetilde{LQ}^{r}_{i} = \left( \frac{x^{r}_{i} / x^{r}} {x^{R}_{i} / x^{R}} \right) = \left( \frac{x^{r}_{i} / x^{r}} {(\mathbf{X^{r}i})_{i} / \mathbf{iX^{r}i}} \right) $$And the administrative Direct Requirements (technical coefficients) matrix is generated by:$$\widetilde{a}^{rr}_{ij} = \left\{\begin{align*}(\widetilde{LQ}^{r}_{i})a^{R}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} < 1 \\a^{R}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} \geq 1\end{align*}\right\}$$We can then examine what, if any, functional differences exist between a deriving regional coefficients directly from the national level or from a two stage, national to administrative division to regional approach. Note, under the simple LQ approach, whenever $LQ \geq 1 \vdash a^{n}_{ij} = a^{rr}_{ij} = a^{R}_{ij} = \widetilde{a}^{rr}_{ij}$. Combining steps, we can specify the Administrative Direct Requirements directly from the National Direct Requirements as:$$\widetilde{a}^{rr}_{ij} = \left\{\begin{align*}(LQ^{r}_{i})a^{n}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} < 1 \text{ } \& \text{ } LQ^{R}_{i} < 1 \\(\widetilde{LQ}^{r}_{i})a^{n}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} < 1 \text{ } \& \text{ } LQ^{R}_{i} \geq 1 \\(LQ^{R}_{i})a^{n}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} \geq 1 \text{ } \& \text{ } LQ^{R}_{i} < 1 \\a^{n}_{ij} \space\space \text{if} \space\space \widetilde{LQ}^{r}_{i} \geq 1 \text{ } \& \text{ } LQ^{R}_{i} \geq 1 \end{align*}\right\}$$Note $$\widetilde{LQ}^{r}_{i} \cdot LQ^{R}_{i} = \left( \frac{x^{r}_{i} / x^{r}} {x^{R}_{i} / x^{R}} \right) \cdot \left( \frac{x^{R}_{i} / x^{R}} {x^{n}_{i} / x^{n}} \right) = \left( \frac{x^{r}_{i} / x^{r}} {x^{n}_{i} / x^{n}} \right) = LQ^{r}_{i}$$In this example, the national-level Direct Requirements matrix is, $a^{n}_{ij}$, is given by: ```{r}print("National Direct Requirement Coeffecients")Toy_Direct_mat %>%round(4)```And the intraregional Direct Requirements matrix is, $a^{rr}_{ij} \space \forall \space r$ using the simple LQ approach is is given by: ```{r}print("Regionial Direct Requirement Coeffecients from Simple Location Quotients")Toy_Dir_LQ```Suppose $R = \{\text{Brown, Cook, Dane}\}$, such that administrative-level Direct Requirements matrix, $a^{R}_{ij}$, using the simple LQ approach is is given by: ```{r}Toy_Xemp_mat_R <- Toy_Xemp_mat[,2:4] %*%rep(c(1), each =ncol(Toy_Xemp_mat[,2:4]))colnames(Toy_Xemp_mat_R) <-c("ADM1")Toy_LQ_R <-matrix(0, nrow=nrow(Toy_Xemp_mat_R), ncol =ncol(Toy_Xemp_mat_R))for (i in1:nrow(Toy_Xemp_mat_R)){for (j in1:ncol(Toy_Xemp_mat_R)){ Toy_LQ_R[i,j] <-((Toy_Xemp_mat_R[i,j])/sum(Toy_Xemp_mat_R[,j]))/(sum(Toy_Xemp_mat[i,])/sum(Toy_Xemp_mat)) }}rownames(Toy_LQ_R) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_LQ_R) <-c("ADM1")Toy_LQc_R <- Toy_LQ_Rfor (i in1:nrow(Toy_Xemp_mat_R)){for (j in1:ncol(Toy_Xemp_mat_R)){if (Toy_LQ_R[i,j] >1){ Toy_LQc_R[i,j] <-1 }else{ Toy_LQc_R[i,j] <- Toy_LQ_R[i,j] } }}rownames(Toy_LQc_R) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_LQc_R) <-c("ADM1")Toy_Dir_LQ_R <-c()for (i in1:ncol(Toy_LQc_R)){ Toy_Dir_LQ_R[[i]] <- (diag(Toy_LQc_R[,i]) %*% Toy_Direct_mat) %>%round(4)rownames(Toy_Dir_LQ_R[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")}names(Toy_Dir_LQ_R) <-c("ADM1")#print("Administrative Division Direct Requirement Coefficients from Simple Location Quotients")Toy_Dir_LQ_R```And administrative intraregional Direct Requirements matrix, $\widetilde{a}^{rr}_{ij}$ using the simple LQ approach is is given by: ```{r}Toy_Xemp_mat_Rt <- Toy_Xemp_mat[,2:4]Toy_LQ_Rt <-matrix(0, nrow=nrow(Toy_Xemp_mat_Rt), ncol =ncol(Toy_Xemp_mat_Rt))for (i in1:nrow(Toy_Xemp_mat_Rt)){for (j in1:ncol(Toy_Xemp_mat_Rt)){ Toy_LQ_Rt[i,j] <-((Toy_Xemp_mat_Rt[i,j])/sum(Toy_Xemp_mat_Rt[,j]))/(sum(Toy_Xemp_mat_R[i,])/sum(Toy_Xemp_mat_R)) }}rownames(Toy_LQ_Rt) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_LQ_Rt) <-c("Brown", "Cook", "Dane")Toy_LQc_Rt <- Toy_LQ_Rtfor (i in1:nrow(Toy_Xemp_mat_Rt)){for (j in1:ncol(Toy_Xemp_mat_Rt)){if (Toy_LQ_Rt[i,j] >1){ Toy_LQc_Rt[i,j] <-1 }else{ Toy_LQc_Rt[i,j] <- Toy_LQ_Rt[i,j] } }}rownames(Toy_LQc_Rt) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_LQc_Rt) <-c("Brown", "Cook", "Dane")Toy_Dir_LQ_Rt <-c()for (i in1:ncol(Toy_LQc_Rt)){ Toy_Dir_LQ_Rt[[i]] <- (diag(Toy_LQc_Rt[,i]) %*% Toy_Direct_mat) %>%round(4)rownames(Toy_Dir_LQ_Rt[[i]]) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")}names(Toy_Dir_LQ_Rt) <-c("Brown", "Cook", "Dane")#print("Regional Direct Requirement Coefficients from Administrative Division Coefficients from Simple Location Quotients")Toy_Dir_LQ_Rt```*Reprise (Again)*``` {r}print("Simple Location Quotients")Toy_LQ %>%round(4)print("Administrative Division Simple Location Quotients")Toy_LQ_Rt %>%round(4)```Remember that under the original national paradigm Madison was an exporter (i.e., $LQ^{r}_{i} > 1$) for only Fiber and Grains and Chicago was the only region that was an importer (i.e., $LQ^{r}_{i} < 1$) of **both** Fiber and Grains. However, under the nested hierarchical paradigm, Madison is an exporter of Fiber and Grains and Extraction (i.e., $\widetilde{LQ}^{r}_{i} \geq 1$). Within the administrative division cluster regime, there is now no longer a region that is an importer of all three of Madison's exports. Chicago is still an importer of Fiber and Grains, but not Extraction. Whereas Green Bay is an importer of Extract and Fiber, but not Grains. ### Remission*Project Narrative Models*In addition to the methods established in the literature deployed to answer the question: How similar are the inputs of Chicago to the outputs of Madison? we may also adapt the Project Narrative methods, too (within the data restrictions). Let the "Absolute Total Use Coefficient" be given by $\mathbf{\alpha} = (\widehat{\mathbf{Zi}})^{-1}\mathbf{Z}\mathbf{X}^{r}$ and the "Relative Total Use Coefficient" be given by ${\mathbf{\rho}} = (\widehat{\mathbf{Zi}})^{-1}\mathbf{Z}\mathbf{X}^{r}(\widehat{\mathbf{iX}^{r}})^{-1}$. We may also specify "Absolute Regional Typology Coefficients" and "Relative Regional Typology Coefficients" for each commodity $i$ as $\mathbf{\phi}_{i} =\hat{\alpha}_{i}\mathbf{i}\mathbf{i}'[\hat{\alpha}_{i}\mathbf{i}\mathbf{i}'+\hat{\alpha}'_{i}\mathbf{i}\mathbf{i}']^{-1}$ and $\mathbf{\varphi}_{i} =\hat{\rho}_{i}\mathbf{i}\mathbf{i}'[\hat{\rho}_{i}\mathbf{i}\mathbf{i}'+\hat{\rho}'_{i}\mathbf{i}\mathbf{i}']^{-1}$. In addition, we may also specify "Gravity of Absolute Total Use Coefficients" and "Gravity of Relative Total Use Coefficients" for each commodity $i$ as $\theta_{i} = [\mathbf{{\alpha'}}_{i}\mathbf{\alpha}_{i}](\mathbf{\alpha}_{i}\mathbf{i})^{-1} \odot \mathbf{Q}$ and $\vartheta_{i} = [\mathbf{{\rho'}}_{i}\mathbf{\rho}_{i}](\mathbf{\rho}_{i}\mathbf{i})^{-1} \odot \mathbf{Q}$, where $\mathbf{Q} = e^{-\mathbf{d}}$ the exponential distance decay function.Alternatively, as <palign="center">$\displaystyle \alpha^{r}_{i} = \frac{\sum^{n}_{j=1} z_{ij} x^{r}_{j}} {\sum^{n}_{j=1} z_{ij}}$,$\displaystyle \rho^{r}_{i} = \frac{\alpha^{r}_{i}} {\sum^{n}_{j=1} x^{r}_{j}}$</p><palign="center">$\displaystyle \phi^{rs}_{i} = \frac{\alpha^{r}_{i}} {\alpha^{r}_{i} + \alpha^{s}_{i} }$,$\displaystyle \varphi^{rs}_{i} = \frac{\rho^{r}_{i}} {\rho^{r}_{i} + \rho^{s}_{i} }$</p><palign="center">$\displaystyle \theta^{rs}_{i} = \frac{\alpha^{r}_{i} \alpha^{s}_{i} } {\sum^{g}_{r=1} \alpha^{r}_{i}} Q^{rs}$,$\displaystyle \vartheta^{rs}_{i} = \frac{\rho^{r}_{i} \rho^{s}_{i} } {\sum^{g}_{r=1} \rho^{r}_{i}} Q^{rs}$</p>```{r toyPN}Toy_Alpha_mat <- (solve(diag(as.vector(Toy_Transactions_mat %*%rep(c(1), each=ncol(Toy_Transactions_mat))))) %*% Toy_Transactions_mat %*% Toy_Xemp_mat) %>%round(4)rownames(Toy_Alpha_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Absolute Total Use Coefficients")Toy_Alpha_matToy_Rho_mat <- (solve(diag(as.vector(Toy_Transactions_mat %*%rep(c(1), each=ncol(Toy_Transactions_mat))))) %*% Toy_Transactions_mat %*% Toy_Xemp_mat %*%solve(diag(as.vector(rep(c(1), each=nrow(Toy_Xemp_mat)) %*% Toy_Xemp_mat ))) ) %>%round(4)rownames(Toy_Rho_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_Rho_mat) <-c("AllElse", "Brown", "Cook", "Dane")print("Relative Total Use Coefficients")Toy_Rho_mat Toy_Phi_mat_A <-c()for (i in1:Industry_Count){ Toy_Phi_tilde_mat <-c(Toy_Alpha_mat[i,]) %*%t(rep(c(1), each=ncol(Toy_Xemp_mat))) Toy_Phi_mat_A[[i]] <- (Toy_Phi_tilde_mat / (Toy_Phi_tilde_mat +t(Toy_Phi_tilde_mat))) %>%round(4)rownames(Toy_Phi_mat_A[[i]]) <-c("AllElse", "Brown", "Cook", "Dane")colnames(Toy_Phi_mat_A[[i]]) <-c("AllElse", "Brown", "Cook", "Dane") }names(Toy_Phi_mat_A) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Regional Typology Coefficients - Absolute")Toy_Phi_mat_A Toy_Phi_mat_R <-c()for (i in1:Industry_Count){ Toy_Phi_tilde_mat <-c(Toy_Rho_mat[i,]) %*%t(rep(c(1), each=ncol(Toy_Xemp_mat))) Toy_Phi_mat_R[[i]] <- (Toy_Phi_tilde_mat / (Toy_Phi_tilde_mat +t(Toy_Phi_tilde_mat))) %>%round(4)rownames(Toy_Phi_mat_R[[i]]) <-c("AllElse", "Brown", "Cook", "Dane")colnames(Toy_Phi_mat_R[[i]]) <-c("AllElse", "Brown", "Cook", "Dane") }names(Toy_Phi_mat_R) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Regional Typology Coefficients - Relative")Toy_Phi_mat_RToy_Lat <-c(42.3, 44.49, 41.84, 43.07)Toy_Lon <-c(-89.04, -88.03, -87.77, -89.39)Toy_Dist_mat <-distm(cbind(Toy_Lon, Toy_Lat))/1000000colnames(Toy_Dist_mat) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_Dist_mat) <-c("AllElse", "Brown", "Cook", "Dane")#Toy_Q <- (1/(Toy_Dist_mat)^2)Toy_Q <-exp(-Toy_Dist_mat)Toy_Gravity_mat_Alpha <-as.list(c())for (i in1:nrow(Toy_Alpha_mat)){ Toy_Gravity_mat_Alpha[[i]] <- ((Toy_Alpha_mat[i,] %*%t(Toy_Alpha_mat[i,])) /sum(Toy_Alpha_mat[i,]) * Toy_Q ) %>%round(4)rownames(Toy_Gravity_mat_Alpha[[i]]) <-c("AllElse", "Brown", "Cook", "Dane")}names(Toy_Gravity_mat_Alpha) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Gravity of Absolute Total Use Coefficients")Toy_Gravity_mat_AlphaToy_Gravity_mat_Rho <-as.list(c())for (i in1:nrow(Toy_Rho_mat)){ Toy_Gravity_mat_Rho[[i]] <- ((Toy_Rho_mat[i,] %*%t(Toy_Rho_mat[i,])) /sum(Toy_Rho_mat[i,]) * Toy_Q ) %>%round(4)rownames(Toy_Gravity_mat_Rho[[i]]) <-c("AllElse", "Brown", "Cook", "Dane")}names(Toy_Gravity_mat_Rho) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")print("Gravity of Relative Total Use Coefficients")Toy_Gravity_mat_Rho```The Absolute Use Coefficient portrays national-level intermediate transaction proportions scaled by regional-level economic activity in the form of industry employment. In this example the Info sector is by far the largest purchaser of intermediate goods for every industry including itself. And Cook County happens to have the largest Info workforce, making it dominate the Absolute Use Coefficient table. This is only the case, however, when we use the national-level Transactions matrix to capture the interconnectedness of industries. If instead we employ the national-level technical coefficients matrix, which specifies the amount of industry $i$’s commodity used to produce of one unit of industry $j$ ’s commodity, the Absolute Use Coefficient table is dominated by the AllElse category with the overall largest workforce. In this way we may take the national-level technical coefficients matrix, $\mathbf{A^{n}}$, as a first approximation to regional technical coefficients, $\mathbf{\tilde{A}^r}$, and thus the numerator of the project narrative Absolute and Relative Use Coefficients becomes $\mathbf{\tilde{A}^{r}X^{r}}$ or $\mathbf{\tilde{Z}^{r}}$ a first approximation of the regional Transactions matrix.The Relative Use Coefficient portrays national-level intermediate transaction proportions scaled by regional-level industry employment proportions. The Relative and Absolute "Regional Typology Coefficients", $\phi$, portray a region-by-region Use Coefficient proportion for each commodity type. In other words, the row maximum assigns the best potential relationship for each commodity of that region. We can use the $\phi$ to answer the question: For which commodities is Madison a sink or a source? And which regions are complementary to those commodity balances? Looking across each row, whenever the coefficient is greater than 0.5 that region is a source *relative* to the column region and vice versa. If the coefficient is less than 0.5, the row region is a *relative* sink to the column region, for that commodity. In this example, the Relative $\phi$ shows only Green Bay as a potential relative source for Madison and only for the Extract, Fiber, and Hogs commodities.The Gravity of the Absolute and Relative "Total Use Coefficients", uses an alternate procedure to estimate the volume and sensitivity of the coefficient proportions. In this example, Madison (and Green Bay for that matter) is dominated by the "pull" of Chicago. ### Resurrection**Location specific metrics of industry input needs**We specify a vector of *location input needs* as the product of the national input coefficient matrix and the location specific total output vector, $\mathbf{Ax^{r}}$. Similarly, let the *share* or *relative input needs* vector be given as $\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1}$. In addition, we define the *actual* or *import input needs* vector as, $\max \{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\}$. (Similarly, let the *actual* exports vector $\mathbf{\tilde{x}^{r}}$ be defined as $\vert \min \{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} \vert$)And let the *relative import input needs* vector be given by, $\max \{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\}$.Note: one may concatenate the input needs vectors across each location to construct an industry-by-location matrix of input needs for each specification above. Specifically denoted by: $\mathbf{AX}$, $\mathbf{AX}\left(\mathbf{\widehat{iAX}}\right)^{-1}$, $\max \{\mathbf{AX} - \mathbf{X}, 0\}$, and $\max \{ \mathbf{AX}\left(\mathbf{\widehat{iAX}}\right)^{-1} - \mathbf{X}\left(\mathbf{\widehat{iX}}\right)^{-1}, 0 \}$ **Cross location industry compatibility**To elucidate the industrial compatibility across locations, we construct a *similarity index*. In general, it is defined on the metric space $(\mathbf{X}, D)$ where $\mathbf{X}$ is a set of vectors $\{\mathbf{x}_{i} \in \mathbb{R} : i \in \{1, \dots, n \}\}$ and $D:\mathbf{X} \times \mathbf{X} \rightarrow \mathbb{R}$ is a distance function. Specifying $D$ as a *Euclidean* or *2-norm distance*, we define the industrial compatibility distance or *simple similarity index* between two locations $r$ and $s$ as:$$D^{rs} = \Vert \mathbf{Ax^{r}} - \mathbf{x^{s}} \Vert_{2} = \sqrt{(\mathbf{Ax^{r}} - \mathbf{x^{s}})^{'}(\mathbf{Ax^{r}} - \mathbf{x^{s}})}$$In addition, let the *relative similarity index* between two locations $r$ and $s$ be given by:$$\Vert \mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2}$$The *import similarity index* between two locations $r$ and $s$ be given by:$$\Vert \max\{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} - \mathbf{x^{s}} \Vert_{2}$$Let the *relative import similarity index* between two locations $r$ and $s$ be given by:$$\Vert \max\{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2}$$(I would argue that when a similarity index employs an import specification the "exporting" location should be specified instead as $\mathbf{\tilde{x}^{s}} = \vert \min \{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} \vert$ )One may further weight a *similarity index* by a geographical impedance function, $\mathbf{Q}$. In general terms, let this metric be defined by $\mathfrak{D}\left(\mathbf{Ax^{r}}, \mathbf{x^{s}}, \mathbf{Q}\right)$. In practice, the geographical impedance between locations $r$ and $s$ is given by, $Q^{rs} = e^{-d^{rs}}$ or $Q^{rs} = 1/(d^{rs})^2$ where $d^{rs}$ is the great-circle distance between the two locations. Specifically, $$d^{rs} = R \cdot \arccos(\sin(\phi_{r})\sin(\phi_{s}) + \cos(\phi_{r})\cos(\phi_{s})\cos(\vert \lambda_{r} - \lambda_{s} \vert)) $$Where $\lambda_{r}, \phi_{r}$ and $\lambda_{s}, \phi_{r}$ are the longitude and latitude of locations $r$ and $s$, and $R$ is the radius of the reference sphere, which in this case is the *mean earth radius* from the WGS84 ellipsoid. As such the *spatial simple similarity index* is given by $$\mathfrak{D}^{rs} = D^{rs}/Q^{rs} = \sqrt{(\mathbf{Ax^{r}} - \mathbf{x^{s}})^{'}(\mathbf{Ax^{r}} - \mathbf{x^{s}})}/Q^{rs}$$Let the *spatial relative similarity index* between two locations $r$ and $s$ be given by:$$\sqrt{(\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})^{'}(\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})} / Q^{rs}$$Let the *spatial import similarity index* between two locations $r$ and $s$ be given by:$$\sqrt{([\mathbf{Ax^{r}} - \mathbf{x^{r}}] - \mathbf{x^{s}})^{'}([\mathbf{Ax^{r}} - \mathbf{x^{r}}] - \mathbf{x^{s}})}/Q^{rs}$$Let the *spatial relative import similarity index* between two locations $r$ and $s$ be given by:$$\sqrt{([\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}] - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})^{'}([\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}] - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1})} / Q^{rs}$$For any given location-by-location Similarity Index matrix specification, the minimum "distance" by row is said to be the most compatible trading partner.Alternative specifications of the spatial distance are possible. One possibility is to discretize the impedance matrix as a binary proximity operator which is one when two locations share a boundary (or vertex) and zero otherwise. **Empirical Example**Suppose we still only have national level Input-Output tables (consisting of Make/Use tables, industry-by-industry, industry-by-commodity, and commodity-by-commodity total requirements tables, and commodity-by-industry direct requirements tables) and location specific industry activity (in the form of employment, payroll, and number of establishments). How might we implement the above methods?As before, we can derive a national-level, industry-by-industry direct requirements matrix, $\mathbf{A^{n}}$, from the national-level, industry-by-industry total requirements matrix, $\mathbf{L^{n}}$, using $\mathbf{A^{n}} = \mathbf{I} - (\mathbf{L^{n}})^{-1}$. Furthermore, we can construct the location specific total output vector, $\mathbf{x^{r}}$ from a payroll and labor share quotient. As a first pass assume unit labor share, such that location specific industry payroll is equivalent to location specific total output. Suppose these tables are given by:"National Total Requirements Table": $\mathbf{L}$```{r}Industry_Count <-5Toy_Total_mat <-matrix(c(1.2523, 0.0143, 0.0055, 0.0276, 0.0892,0.0251, 1.1230, 0.1450, 0.0331, 0.0709,0.0203, 0.0236, 1.0420, 0.0128, 0.0237,0.0092, 0.0171, 0.0088, 1.0048, 0.0105,0.4020, 0.2675, 0.1055, 0.5102, 1.7703), nrow = Industry_Count, byrow =TRUE)rownames(Toy_Total_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_Total_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")Toy_Total_mat```"National Direct Requirements Table": $\mathbf{A} = \mathbf{I} - (\mathbf{L})^{-1}$```{r}Toy_Direct_mat <-diag(ncol(Toy_Total_mat)) -solve(Toy_Total_mat)rownames(Toy_Direct_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_Direct_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")Toy_Direct_mat %>%round(4)```"Industry-by-Location Payroll Table": $$\mathbf{X} = \begin{bmatrix} \mathbf{x^{r}} & \mathbf{x^{s}} & \cdots & \mathbf{x^{l}} \end{bmatrix}$$```{r}Region_Count <-4Toy_X_mat <-matrix(c(4979, 487, 5498, 1517, 4512, 1395, 4287, 0,19912, 20003, 11322, 1305,23685, 201144, 2670, 58896,81009, 224686, 98075, 27797), nrow = Industry_Count, byrow =TRUE)rownames(Toy_X_mat) <-c("Extract", "Fiber", "Grains", "Hogs", "Info")colnames(Toy_X_mat) <-c("AllElse", "Brown", "Cook", "Dane")Toy_X_mat```"Industry-by-Location Input Needs Table": $\mathbf{AX}$```{r}Toy_Input_mat <- (Toy_Direct_mat %*% Toy_X_mat) Toy_Input_mat %>%round(2)```"Relative Industry-by-Location Input Needs Table": $\mathbf{AX}\left(\mathbf{\widehat{iAX}}\right)^{-1}$```{r}Toy_Input_mat_rel <- (Toy_Input_mat) %*%solve(diag(c((rep(c(1), each=ncol(Toy_Direct_mat)) %*% Toy_Input_mat))))colnames(Toy_Input_mat_rel) <-c("AllElse", "Brown", "Cook", "Dane")Toy_Input_mat_rel %>%round(4)```"Import Industry-by-Location Input Needs Table": $\max \{\mathbf{AX} - \mathbf{X}, 0\}$ ```{r}Toy_Input_mat_imp <-pmax(Toy_Input_mat - Toy_X_mat, 0)Toy_Input_mat_imp %>%round(2)```"Net Exports Table": $\max \{\mathbf{AX} - \mathbf{X}, 0\}$ $\mathbf{\tilde{X}} = \vert \min \{\mathbf{AX} - \mathbf{X}, 0\} \vert$ ```{r}Toy_Input_mat_exp <-abs(pmin(Toy_Input_mat - Toy_X_mat, 0))Toy_Input_mat_exp %>%round(2)```"Relative Import Industry-by-Location Input Needs Table": $\max \{\mathbf{AX}\left(\mathbf{\widehat{iAX}}\right)^{-1} - \mathbf{X}\left(\mathbf{\widehat{iX}}\right)^{-1}, 0\}$ ```{r}Toy_Input_mat_imp_rel <-pmax(Toy_Input_mat_rel - Toy_X_mat %*%solve(diag(c((rep(c(1), each=nrow(Toy_X_mat)) %*% Toy_X_mat)))), 0)colnames(Toy_Input_mat_imp_rel) <-c("AllElse", "Brown", "Cook", "Dane")Toy_Input_mat_imp_rel %>%round(4)```Assuming a Euclidean distance function, the similarity index is given by "Simple Similarity Index": $\Vert \mathbf{Ax^{r}} - \mathbf{x^{s}} \Vert_{2} ~ \forall ~ r,s \in l$In this example denoted $$ \begin{bmatrix} \Vert\mathbf{Ax^{a}} - \mathbf{x^{a}}\Vert_{2} & \Vert\mathbf{Ax^{a}} - \mathbf{x^{b}}\Vert_{2} & \Vert\mathbf{Ax^{a}} - \mathbf{x^{c}}\Vert_{2} & \Vert\mathbf{Ax^{a}} - \mathbf{x^{d}}\Vert_{2}\\ \Vert\mathbf{Ax^{b}} - \mathbf{x^{a}}\Vert_{2} & \Vert\mathbf{Ax^{b}} - \mathbf{x^{b}}\Vert_{2} & \Vert\mathbf{Ax^{b}} - \mathbf{x^{c}}\Vert_{2} & \Vert\mathbf{Ax^{b}} - \mathbf{x^{d}}\Vert_{2}\\ \Vert\mathbf{Ax^{c}} - \mathbf{x^{a}}\Vert_{2} & \Vert\mathbf{Ax^{c}} - \mathbf{x^{b}}\Vert_{2} & \Vert\mathbf{Ax^{c}} - \mathbf{x^{c}}\Vert_{2} & \Vert\mathbf{Ax^{c}} - \mathbf{x^{d}}\Vert_{2}\\ \Vert\mathbf{Ax^{d}} - \mathbf{x^{a}}\Vert_{2} & \Vert\mathbf{Ax^{d}} - \mathbf{x^{b}}\Vert_{2} & \Vert\mathbf{Ax^{d}} - \mathbf{x^{c}}\Vert_{2} & \Vert\mathbf{Ax^{d}} - \mathbf{x^{d}}\Vert_{2}\\ \end{bmatrix}$$```{r}Toy_sim <-c()Toy_sim <-matrix(0, nrow = Region_Count, ncol = Region_Count)for (i in1:Region_Count){for (j in1:Region_Count){ Toy_sim[i,j] <-norm((Toy_Input_mat[,i] - (Toy_X_mat[,j])), type ="2") } }colnames(Toy_sim) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_sim) <-c("AllElse", "Brown", "Cook", "Dane")Toy_sim %>%round(2) %>%cbind(., RowMin =c(colnames(Toy_sim)[apply(Toy_sim, 1, which.min)]))table(colnames(Toy_sim)[apply(Toy_sim, 1, which.min)])```"Relative Similarity Index": $\Vert \mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2} ~ \forall ~ r,s \in l$```{r}Toy_sim_rel <-c()Toy_sim_rel <-matrix(0, nrow = Region_Count, ncol = Region_Count)for (i in1:Region_Count){for (j in1:Region_Count){ Toy_sim_rel[i,j] <-norm((Toy_Input_mat_rel[,i] - (Toy_X_mat[,j] %*%solve(c((rep(c(1), each=ncol(Toy_Direct_mat)) %*% Toy_X_mat[,j] ))) )), type ="2") } }colnames(Toy_sim_rel) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_sim_rel) <-c("AllElse", "Brown", "Cook", "Dane")Toy_sim_rel %>%round(4) %>%cbind(., RowMin =c(colnames(Toy_sim_rel)[apply(Toy_sim_rel, 1, which.min)]))table(colnames(Toy_sim_rel)[apply(Toy_sim_rel, 1, which.min)])```"Import Similarity Index": $\Vert \max\{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} - \mathbf{x^{s}} \Vert_{2} ~ \forall ~ r,s \in l$```{r}Toy_sim_imp <-c()Toy_sim_imp <-matrix(0, nrow = Region_Count, ncol = Region_Count)for (i in1:Region_Count){for (j in1:Region_Count){ Toy_sim_imp[i,j] <-norm((Toy_Input_mat_imp[,i] - (Toy_X_mat[,j])), type ="2") } }colnames(Toy_sim_imp) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_sim_imp) <-c("AllElse", "Brown", "Cook", "Dane")Toy_sim_imp %>%round(2) %>%cbind(., RowMin =c(colnames(Toy_sim_imp)[apply(Toy_sim_imp, 1, which.min)]))table(colnames(Toy_sim_imp)[apply(Toy_sim_imp, 1, which.min)])```"*Import Similarity Index - Net Exports": $\Vert \max\{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} - \mathbf{\tilde{x}^{s}} \Vert_{2} ~ \forall ~ r,s \in l$```{r}Toy_sim_exp <-c()Toy_sim_exp <-matrix(0, nrow = Region_Count, ncol = Region_Count)for (i in1:Region_Count){for (j in1:Region_Count){ Toy_sim_exp[i,j] <-norm((Toy_Input_mat_imp[,i] - (Toy_Input_mat_exp[,j])), type ="2") } }colnames(Toy_sim_exp) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_sim_exp) <-c("AllElse", "Brown", "Cook", "Dane")Toy_sim_exp %>%round(2) %>%cbind(., RowMin =c(colnames(Toy_sim_exp)[apply(Toy_sim_exp, 1, which.min)]))table(colnames(Toy_sim_exp)[apply(Toy_sim_exp, 1, which.min)])```"Relative Import Similarity Index": $\Vert \max\{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2} ~ \forall ~ r,s \in l$```{r}Toy_sim_imp_rel <-c()Toy_sim_imp_rel <-matrix(0, nrow = Region_Count, ncol = Region_Count)for (i in1:Region_Count){for (j in1:Region_Count){ Toy_sim_imp_rel[i,j] <-norm((Toy_Input_mat_imp_rel[,i] - (Toy_X_mat[,j] %*%solve(c((rep(c(1), each=ncol(Toy_Direct_mat)) %*% Toy_X_mat[,j] ))) )), type ="2") } }colnames(Toy_sim_imp_rel) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_sim_imp_rel) <-c("AllElse", "Brown", "Cook", "Dane")Toy_sim_imp_rel %>%round(4) %>%cbind(., RowMin =c(colnames(Toy_sim_imp_rel)[apply(Toy_sim_imp_rel, 1, which.min)]))table(colnames(Toy_sim_imp_rel)[apply(Toy_sim_imp_rel, 1, which.min)])```"*Relative Import Similarity Index - Net Exports": $\Vert \max\{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\} - \mathbf{\tilde{x}^{s}}(\mathbf{i\tilde{x}^{s}})^{-1} \Vert_{2} ~ \forall ~ r,s \in l$```{r}Toy_sim_exp_rel <-c()Toy_sim_exp_rel <-matrix(0, nrow = Region_Count, ncol = Region_Count)for (i in1:Region_Count){for (j in1:Region_Count){ Toy_sim_exp_rel[i,j] <-norm((Toy_Input_mat_imp_rel[,i] - (Toy_Input_mat_exp[,j] %*%solve(c((rep(c(1), each=ncol(Toy_Direct_mat)) %*% Toy_Input_mat_exp[,j] ))) )), type ="2") } }colnames(Toy_sim_exp_rel) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_sim_exp_rel) <-c("AllElse", "Brown", "Cook", "Dane")Toy_sim_exp_rel %>%round(4) %>%cbind(., RowMin =c(colnames(Toy_sim_exp_rel)[apply(Toy_sim_exp_rel, 1, which.min)]))table(colnames(Toy_sim_exp_rel)[apply(Toy_sim_exp_rel, 1, which.min)])```Adding the spatial elements:"Geographical Impedance": $Q^{rs} = e^{-d^{rs}} ~ \forall ~ r,s \in l$```{r}Toy_Lat <-c(42.3, 44.49, 41.84, 43.07)Toy_Lon <-c(-89.04, -88.03, -87.77, -89.39)Toy_Dist_mat <-distm(cbind(Toy_Lon, Toy_Lat))/1000000colnames(Toy_Dist_mat) <-c("AllElse", "Brown", "Cook", "Dane")rownames(Toy_Dist_mat) <-c("AllElse", "Brown", "Cook", "Dane")#Toy_Q <- (1/(Toy_Dist_mat)^2)Toy_Q <-exp(-Toy_Dist_mat)Toy_Q %>%round(4)```"Spatial Simple Similarity Index": $\Vert \mathbf{Ax^{r}} - \mathbf{x^{s}} \Vert_{2} / Q^{rs} ~ \forall ~ r,s \in l$```{r}(Toy_sim / Toy_Q) %>%round(2)```"Spatial Relative Similarity Index": $\Vert \mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2} / Q^{rs} ~ \forall ~ r,s \in l$```{r}(Toy_sim_rel / Toy_Q) %>%round(4)```"Spatial Import Similarity Index": $\Vert \max\{\mathbf{Ax^{r}} - \mathbf{x^{r}}, 0\} - \mathbf{x^{s}} \Vert_{2} / Q^{rs} ~ \forall ~ r,s \in l$```{r}(Toy_sim_imp / Toy_Q) %>%round(2)```"Spatial Relative Import Similarity Index": $\Vert \max\{\mathbf{Ax^{r}}(\mathbf{iAx^{r}})^{-1} - \mathbf{x^{r}}(\mathbf{ix^{r}})^{-1}, 0\} - \mathbf{x^{s}}(\mathbf{ix^{s}})^{-1} \Vert_{2} / Q^{rs} ~ \forall ~ r,s \in l$```{r}(Toy_sim_imp_rel / Toy_Q) %>%round(4)```For any given location-by-location Similarity Index matrix specification, the minimum "distance" by row is said to be the most compatible trading partner.**Proximity Example**```{r}Toy_prox <-matrix(rbind(c(1,1,0,1,0,0,0,0,0),c(1,1,1,0,1,0,0,0,0),c(0,1,1,0,0,1,0,0,0),c(1,0,0,1,1,0,1,0,0),c(0,1,0,1,1,1,0,1,0),c(0,0,1,0,1,1,0,0,1),c(0,0,0,1,0,0,1,1,0),c(0,0,0,0,1,0,1,1,1),c(0,0,0,0,0,1,0,1,1)), nrow =9)Toy_RUCC <-c(0,0,0,0,0,9,9,0,9)Toy_RUCC_prox <-sqrt(t(Toy_RUCC * Toy_prox ) * Toy_RUCC)diag(Toy_RUCC_prox) <-0```### Notes"The Assumption of Spatially Invariant Production Functions: The Problem of HeterogeneityAll classic nonsurvey methods assume that national total use coefficients - the production functions in full - hold at every conceivable regional dimension. Various suggestions have been made as to why this is unlikely to be observed in practice. Smith and Morrison (1974 p.22) identify anything from differences in productive efficiency to climatic variation as contributing to the phenomenon. However, it is argued here that, apart from variation due to stochastic observation error, differences in total use can be attributed to the violation of one fundamental assumption of the Leontief model, and that is that each defined sector is homogeneous. Homogeneity implies that the set of coefficients which describes each defined commodity's production is wholly unique. Hence there can be no variation in the means of production for any defined commodity, and if there is, for whatever reason, the commodity would require separate definition in order for the homogeneity assumption to hold. The reason for the assumption is quite straightforward. If each production function describes a diverse set of commodities, the pattem of linkage is hidden within average relationships (see Table 5.1 below), and hence the precision with which one can calculate, for example, multipliers is eroded.It is argued here that differences in regional and national total use derive from the fact that the production functions of the national model are heterogeneously defined. The argument is as follows. At the broadest levels of commodity definition i.e. agriculture, manufacture, regional production functions are merged together, and as such cannot be identified in the national model. As the definitions of the national model are increased, the production functions of regionally specialised commodities emerge i.e. dair>' farming, sea fishing, fish farming. At some much higher level of disaggregation - where homogeneity is approached - the definitions o f the national model are so fine that it is possible to identify the individual factories and firms operating within the nation. At this point, the national model becomes one of infinite-regions because it is possible to extract the input-output table for any conceivable spatial subset of the nation. Not only the problem of estimating regional production functions evaporated, the trade estimation problem has ceased to exist." - (Brand, 1998)FIN:::