Illustration by Rhiannon Newman for the Urban Institute

Three Lessons for Measuring Rural Strengths

Data@Urban
Jan 14, 2022

How do we describe rural America based on its strengths? Instead of focusing on traditional, deficit-based measures like poverty, population loss, and unemployment, Urban researchers wanted to consider the factors that create high quality of life in small communities and towns. But what could these shared assets between diverse communities look like? Should we look at the availability of financial resources, a person’s influence over local and federal policy, or the social fabric that holds small towns together? Ultimately, rural assets include all these and more, which leaves us with a far harder question: How do we measure these strengths?

To better identify rural assets, we used a conceptual framework called the Community Capitals Framework (CCF). Widely used in the rural economic development field to help identify strengths, the CCF groups community assets into seven different types of capital: built, cultural, financial, human, natural, political, and social. This framework helped us identify 50 measures of rural strength. We then used principal component analysis and k-means cluster analysis to construct a typology that places each inhabited rural census tract into one of seven peer groups, according to its unique assets and capacity for growth.
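To make the workflow concrete, here is a minimal sketch of a scale, reduce, and cluster pipeline on synthetic data. It is not the code behind the typology: the function names, the order of the steps, the number of components, the deterministic farthest-point initialization, and the synthetic "measures" are all assumptions chosen for illustration.

```python
import numpy as np


def min_max_scale(X):
    """Rescale each column (measure) to the 0-1 range."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on constant columns
    return (X - lo) / span


def pca_scores(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def k_means(X, k, n_iter=100):
    """Plain Lloyd's algorithm with a simple farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # Distance from every row to its nearest already-chosen center
        nearest = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


# Synthetic stand-in for tract-level measures: 60 "tracts", 5 "measures",
# drawn around three well-separated profiles (purely hypothetical values).
rng = np.random.default_rng(0)
profiles = np.array([[0, 0, 0, 0, 0], [10, 10, 0, 0, 0], [0, 10, 10, 0, 0]], dtype=float)
measures = np.vstack([p + rng.normal(scale=0.5, size=(20, 5)) for p in profiles])

scaled = min_max_scale(measures)
components = pca_scores(scaled, n_components=2)
peer_groups = k_means(components, k=3)  # the typology itself uses seven groups
```

On real data, the choices of scaling method, number of components, and number of clusters all shape the resulting groups, which is where the lessons below come in.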

During this process, we encountered new challenges and grappled with decisions that go beyond double-checking our programs and testing our output. Below, we highlight three lessons to support researchers using a similar approach to understanding local assets, guide decisions around data selection, and support a deeper understanding of results.

Lesson 1: Beware of bias

Defining and measuring assets is subjective. Extractive resources such as coal, oil, and gas could contribute to a community’s financial capital, at least for those who work in the industry or own the resources. But those same resources can also negatively affect air quality and people’s health. Similarly, proximity to a highway can be beneficial for economic development but may also have a negative effect on air quality and public health. Convention centers may be useful for building cultural capital but are costly to construct. In many cases, what is perceived as a strength depends on perspective and local context.

We knew we couldn’t make these decisions alone. To mitigate our own bias, we consulted rural development practitioners, policymakers, researchers, and community members. We asked what concepts were most important for measuring rural assets within each capital and used their responses to finalize our list of measures.

Decisions weren’t always unanimous, but this approach yielded valuable information from rural stakeholders that helped us make final decisions. For example, our initial measure for educational attainment was the percentage of the population in a census tract with a bachelor’s degree or higher. Although some experts agreed with this approach, many highlighted the importance of a high school diploma or GED for employment in rural towns and cities. Because of this feedback, we decided to use the more inclusive measure, based on a high school diploma or GED.

Lesson 2: Balance methods and usefulness

In social science research, cluster analysis is frequently used to help identify meaningful patterns. But it can be hard to know when your results are correct when so many questions (what is considered an outlier, which algorithm is best, what scaling methods should be used) do not have straightforward answers.

So, we balanced the robustness of the results with the usefulness of the typology to our audiences. We evaluated different scaling methods by looking closely at the resulting cluster output. Min-max scaling, which preserves the distribution of the raw measures but rescales them to a 0-to-1 range, produced results driven by capitals with higher-variance measures (e.g., natural capital, cultural capital, and political capital) and cluster sizes that ranged between 600 and 3,150 tracts.
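A small sketch can show why min-max scaling lets higher-variance measures drive the clusters. The two measures below are hypothetical, and the helper name is ours: after rescaling, both run from 0 to 1, but the measure whose values are spread evenly keeps a larger variance than the one whose values are bunched together, so it contributes more to the distances k-means relies on.

```python
from statistics import pvariance


def min_max_scale(values):
    """Rescale a list of numbers to the 0-1 range, keeping their relative spacing."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measures for 21 tracts
evenly_spread = list(range(0, 101, 5))                 # values vary widely across tracts
bunched = [50 + (i % 3) for i in range(20)] + [100]    # nearly identical, plus one extreme tract

scaled_even = min_max_scale(evenly_spread)
scaled_bunched = min_max_scale(bunched)

# Both measures now span 0 to 1, but their variances still differ.
print(round(pvariance(scaled_even), 3), round(pvariance(scaled_bunched), 3))
```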

We also ran our analysis using measures scaled with a standard scaling method (z-score). The use of z-scores produced more uneven group sizes, including a very small cluster of 19 remote tracts in Alaska and a large cluster of nearly 4,000 tracts. This very small group likely formed because these tracts differed sharply from other tracts on a number of measures that carried more weight under the standard scaling approach (many of them were distance measures, which had comparatively lower variances before scaling). Based on these results, we selected min-max scaling to achieve more even group sizes and results that didn’t emphasize distance measures, which are often correlated and represent only one dimension of rural life.
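The effect of standard scaling on an extreme tract can be sketched with a single hypothetical distance measure (the values and names below are invented for illustration). Z-scores give every measure unit variance, so a very remote tract ends up several standard deviations from the rest on each such measure. When several correlated distance measures behave this way, those gaps add up and can split off a small cluster.

```python
from statistics import mean, pstdev


def z_score_scale(values):
    """Center a list of numbers on 0 and rescale it to unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical distance-to-services measure (miles): 20 ordinary tracts, one very remote tract
distance = [10 + (i % 3) for i in range(20)] + [300]

z = z_score_scale(distance)
lo, hi = min(distance), max(distance)
min_max = [(v - lo) / (hi - lo) for v in distance]

# Under z-scores the remote tract sits more than 4 standard deviations out;
# under min-max scaling its value is capped at 1.
print(round(z[-1], 2), min_max[-1])
```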

In other cases, we had to remove information from the cluster analysis to increase its usefulness. Some experts expressed early concerns that the tool might alienate tribal communities by including a measure of land area that caused most tribal communities to cluster together. To see if the tool could help tribal communities identify nontribal peers, we removed this measure and reevaluated our results.

We found that the new groups were similar to the previous version based on their defining characteristics, and tribal census tracts were distributed fairly evenly across these remaining groups. This distribution suggested tribal tracts had more in common with nontribal tracts than with each other based on the remaining 50 measures. We removed the tribal land area measure from the cluster analysis but included it as a contextual measure, alongside other information on key local governance actors, on the online tool.
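One simple way to run this kind of check is to tabulate how a flagged subset of tracts is spread across the peer groups before and after dropping a measure. The sketch below uses made-up labels and a hypothetical helper name. It only illustrates the idea, not our actual code.

```python
from collections import Counter


def share_by_group(group_labels, flags):
    """Share of flagged tracts (e.g., tribal tracts) that fall in each peer group."""
    counts = Counter(g for g, flagged in zip(group_labels, flags) if flagged)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: n / total for group, n in sorted(counts.items())}


# Hypothetical cluster assignments for eight tracts and a flag for tribal tracts
groups = ["A", "A", "B", "B", "C", "C", "A", "B"]
is_tribal = [True, False, True, False, True, False, False, False]

shares = share_by_group(groups, is_tribal)
print(shares)
```

If most of the flagged tracts land in a single group, a measure may be dominating the clustering. A roughly even spread, as in this toy example, suggests it is not.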

Lesson 3: Communicate the tool’s strengths and limitations

We needed to communicate clearly what the typology was most suitable for: identifying peer groups nationally and regionally. For example, census tract 4 in New Mexico’s Los Alamos County, located in the northern part of the state, is classified within the diverse, outlying group, which is characterized by its rich racial, ethnic, and linguistic diversity. Using our tool, we can identify tract 4’s peers throughout the West and Southwest, as well as in Alaska and Hawaii.

But the tool is less useful for making state and local comparisons. New Mexico is a diverse state, and most of its tracts fall into the same group. When compared with the rest of the state, tract 4 does not rank as well for diversity: it is in the 68th percentile for racial and ethnic diversity and the 37th percentile for linguistic diversity. If we were to rerun the cluster analysis for rural New Mexico only, tract 4 would be more likely to be categorized as a center of wealth and health because it is around the 90th percentile or above for median individual income, median home value, life expectancy, and health insurance.
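The gap between national and within-state standing comes down to the reference group used for the percentile. Here is a minimal sketch with invented diversity index values (the numbers, variable names, and the "less than or equal to" definition of percentile rank are all assumptions for illustration):

```python
def percentile_rank(value, reference):
    """Percent of reference values that are less than or equal to `value`."""
    return 100 * sum(1 for v in reference if v <= value) / len(reference)


# Hypothetical diversity index values (0-100) for rural tracts
national_tracts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
state_tracts = [60, 70, 72, 78, 80, 85, 88, 90, 92, 95]  # a more diverse state
tract_value = 75

national_rank = percentile_rank(tract_value, national_tracts)  # 70.0
state_rank = percentile_rank(tract_value, state_tracts)        # 30.0
print(national_rank, state_rank)
```

The same tract looks highly diverse against the national distribution and below the middle against its own state, which is why a national typology can be a poor guide for local comparisons.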

Communicating the strengths and limitations of research helps audiences understand how they can use our results. We have also made all underlying data available through Urban’s Data Catalog so users have detailed results without having to rely on a tract’s national peer group. Users can combine our data with additional sources to inform local investment decisions.

Hopefully, these data are another tool to help drive evidence-based decisionmaking in rural places. We also hope our work supports other researchers applying similar data science methods in social science research.

-Yipeng Su

-Amanda Gold

