How We Used Machine Learning to Predict Neighborhood Change
Measuring changes in neighborhood patterns as they occur can enable timely policy action to prevent displacement in gentrifying communities, mitigate depopulation and community decline, and encourage inclusive growth. But many previous efforts to measure neighborhood change across jurisdictions relied on national datasets, such as the American Community Survey (ACS), which are collected monthly but published years later. Analyses that use these kinds of datasets can only be conducted after neighborhood changes have happened and caused harm, such as displacement or blight. To achieve more timely results, some studies have focused on single jurisdictions using more frequently updated datasets collected at the local level, such as parcel-level files or building permits. Although collected at more frequent intervals and at super-local levels, these data are not comparable across jurisdictions.
These approaches leave a gap: methods that identify neighborhood change before or as it occurs and that can be applied across jurisdictions, so local policymakers and community organizations can quickly adopt and use them. In considering how to best fill this gap, we turned to machine learning (ML) methods, which hold potential to identify patterns from complex data across neighborhoods to now-cast or forecast neighborhood change. Researchers have previously used machine learning approaches to predict neighborhood change, such as predicting housing values using census data.
Pilot machine learning approach
To “train” our machine learning model, we needed to identify datasets that offer frequent, timely updates and nationwide coverage. The US Department of Housing and Urban Development (HUD) collects two valuable datasets that we hypothesized could be helpful for this kind of research:
1. Quarterly aggregate data collected by the United States Postal Service (USPS) on the counts of total, vacant, and “no-stat” addresses (addresses that USPS does not consider deliverable). These data can be used to proxy for investment and disinvestment in neighborhoods over time.
2. Real-time data on the administration of HUD’s Housing Choice Voucher (HCV) Program. These data can show changes in the low-income renter population and landlords’ willingness to accept housing vouchers.
Last year, the Urban Institute worked with experts in HUD’s Office of Policy Development and Research on a pilot project to assess whether we could use the USPS and HCV datasets to develop a model that could accurately now-cast neighborhood change between 2013 and 2018 in four core-based statistical areas (CBSAs): Akron, Cleveland, and Youngstown, Ohio, and Washington, DC.
Before we could begin training the ML algorithm, we had to determine how to define and measure neighborhood change. We sought to measure three broad types of neighborhood change: gentrification, decline, and inclusive growth. Unfortunately, the existing housing literature has not reached a consensus on the definitions of these types of change. To arrive at meaningful definitions for our project, we scanned the literature to identify commonalities across different definitions to create our own starting definitions. We then tested and iterated on our draft definitions with local experts in our four pilot CBSAs to arrive at a set of definitions that reflected their local expertise and lived experience of change in their communities (see appendix A of the Cityscape article (PDF) on this research for more detail):
· Gentrifying: Neighborhoods with high displacement risk at the beginning of the time period that experience a reduction in the low-income population and increases in rents, home values, homeowners, residents with bachelor’s degrees, and non-low-income residents greater than the median for neighborhoods in their CBSA.
· Declining: Neighborhoods below the 75th percentile in their CBSA for home values and rents that experience a decrease in total population and increases in the proportion of vacant addresses and low-income households greater than the median for neighborhoods in their CBSA.
· Inclusively Growing: Neighborhoods where growth in both low-income and non-low-income households is positive and greater than the median for neighborhoods in their CBSA.
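As a rough illustration of how the three definitions above could be turned into a labeling rule, here is a minimal Python sketch. It is a simplification under stated assumptions, not the project's actual code: the field names, the single "low_income" measure (the definitions distinguish population from households), and the order in which the rules are checked are all hypothetical.

```python
# Hypothetical field names; each value is a tract's change over the study period.
GENTRIFY_KEYS = ("rent", "home_value", "homeowners", "bachelors", "non_low_income")

def exceeds_median(changes, medians, keys):
    """True when every listed change is an increase above the CBSA median change."""
    return all(changes[k] > 0 and changes[k] > medians[k] for k in keys)

def classify(changes, medians, high_displacement_risk, below_p75_value_and_rent):
    """Assign one label per tract; the rule order is an assumption of this sketch."""
    if (high_displacement_risk and changes["low_income"] < 0
            and exceeds_median(changes, medians, GENTRIFY_KEYS)):
        return "gentrifying"
    if (below_p75_value_and_rent and changes["population"] < 0
            and exceeds_median(changes, medians, ("vacant_share", "low_income"))):
        return "declining"
    if exceeds_median(changes, medians, ("low_income", "non_low_income")):
        return "inclusively growing"
    return "unchanging"

# Made-up numbers for illustration only.
medians = dict.fromkeys(
    GENTRIFY_KEYS + ("low_income", "population", "vacant_share"), 0.02)
tract_a = {"rent": 0.10, "home_value": 0.12, "homeowners": 0.05, "bachelors": 0.08,
           "non_low_income": 0.09, "low_income": -0.04, "population": 0.01,
           "vacant_share": -0.01}
tract_b = dict(tract_a, low_income=0.05, population=-0.03, vacant_share=0.04)
tract_c = dict(tract_a, low_income=0.05)

label_a = classify(tract_a, medians, True, False)
label_b = classify(tract_b, medians, False, True)
label_c = classify(tract_c, medians, False, False)
```

In this toy example, tract_a meets the gentrifying rule, tract_b the declining rule, and tract_c the inclusively growing rule.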
We assign each neighborhood in our analysis to one of the neighborhood change types or categorize it as “unchanging” if it doesn’t meet any of the definitions. With the final definitions, 37 (1.6 percent) of the neighborhoods in our four pilot CBSAs are classified as gentrifying, 188 (8.1 percent) as declining, 462 (19.9 percent) as inclusively growing, and 1,630 (70.3 percent) as unchanging.
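The shares follow directly from the counts, and a few lines of Python make the class imbalance explicit. The counts come from the paragraph above; the variable names are ours.

```python
# Counts reported for the four pilot CBSAs.
counts = {
    "gentrifying": 37,
    "declining": 188,
    "inclusively growing": 462,
    "unchanging": 1630,
}

total = sum(counts.values())  # total neighborhoods in the pilot sample
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Number of unchanging neighborhoods for every gentrifying one.
imbalance = counts["unchanging"] / counts["gentrifying"]

for label, share in shares.items():
    print(f"{label}: {share} percent")
```

With roughly 44 unchanging neighborhoods for every gentrifying one, the gentrifying class gives any model very few examples to learn from.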
We used the USPS and HCV data, along with 2016 ACS data (the latest ACS release available in our prediction year of 2018, covering a wide array of information about US people and communities), to generate a number of variables, or features, we believed would be predictive of each of our neighborhood change types. We then randomly split the neighborhoods in our pilot CBSAs into a training set and a testing set. We fitted models on the training data using different machine learning algorithms to predict which neighborhood change outcome each neighborhood in the training set would experience in 2018.
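A random split like the one described above can be sketched in a few lines of Python. The 80/20 ratio, the seed, and the tract identifiers are assumptions for illustration; the article does not state the split that was actually used.

```python
import random

def train_test_split_ids(ids, test_fraction=0.2, seed=0):
    """Shuffle a copy of the IDs and cut it into (train, test) lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical identifiers; 2,317 is the sum of the neighborhood counts reported above.
tract_ids = [f"tract_{i:04d}" for i in range(2317)]
train_ids, test_ids = train_test_split_ids(tract_ids)
```

Holding the testing set out of model fitting is what lets it stand in for data the model has never seen.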
We evaluated the models using the unseen testing set data to estimate how our model would perform on new data in future years. To contextualize the results, we compared the model results against two baseline approaches that approximate how policymakers and researchers might seek to evaluate neighborhood change without our model:
· Using the 2016 ACS data to measure neighborhood change: We view this as representing the status quo of relying on time-lagged national datasets for neighborhood change analysis.
· Using a set of simple rules for change in vouchers, total addresses, and active addresses between 2013 and 2018 to predict neighborhood change type in 2018: This represents a more transparent, less methodologically complex approach using the USPS and HCV data than our ML modeling approach.
For each approach, we measured precision, or the percentage of neighborhoods identified by the model as changing that actually experienced that type of change. We also looked at accuracy, or the percentage of all neighborhoods (changing and unchanging) correctly identified by the model. Across both the precision and accuracy evaluation metrics (table 1), our best model outperformed the baselines.
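To make the two metrics concrete, here is a small Python sketch that follows the definitions in the paragraph above. The function names and the toy labels are ours, and this is one straightforward reading of the definitions rather than the project's evaluation code.

```python
def precision_changing(y_true, y_pred, unchanged="unchanging"):
    """Share of tracts flagged as changing whose predicted change type is correct."""
    flagged = [(t, p) for t, p in zip(y_true, y_pred) if p != unchanged]
    if not flagged:
        return 0.0
    return sum(t == p for t, p in flagged) / len(flagged)

def accuracy(y_true, y_pred):
    """Share of all tracts, changing or not, that receive the correct label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels for six tracts (not real results).
y_true = ["gentrifying", "declining", "unchanging",
          "inclusively growing", "unchanging", "unchanging"]
y_pred = ["gentrifying", "unchanging", "unchanging",
          "inclusively growing", "declining", "unchanging"]

toy_precision = precision_changing(y_true, y_pred)  # 2 of 3 flagged tracts correct
toy_accuracy = accuracy(y_true, y_pred)             # 4 of 6 tracts correct
```

Because most neighborhoods are unchanging, accuracy alone can look strong even when few changing neighborhoods are caught, which is why precision is reported alongside it.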
Table 1: Performance of Machine Learning Model and Baselines
Although these results are promising, we found that the model suffered from two key limitations. First, the model performance was significantly better at identifying declining and inclusively growing neighborhoods than gentrifying neighborhoods, likely because of the very small number of gentrifying neighborhoods in our data. Second, given that our model only focuses on four CBSAs in Washington, DC and Ohio, further testing was needed to examine whether the results generalize to other areas.
Teaming up with IBM
Over the past several months, researchers from the Urban Institute collaborated with IBM’s Data Science and AI Elite Team to build on this foundation and make several improvements to our approach.
First, the IBM team used a novel dataset provided by Zillow, an online real estate company that provides a platform to buy, sell, and rent real estate across the US. The Zillow dataset consists of housing values and renting costs across the country, which enabled us to calculate variations in local house prices and rents. The Zillow Home Value Index reflects the average value for homes between the 35th and 65th percentiles, and the Zillow Observed Rent Index captures the average of listed rents in the 40th to 60th percentile range for all homes and apartments in a given region. Together, the two datasets cover the entire US market, are granular down to the zip-code level, and are updated and published monthly, allowing stakeholders and researchers to assess current changes in neighborhood conditions. Given these advantages, we chose to use the Zillow data to measure housing market changes instead of the USPS data, which are only available to governmental entities and nonprofit organizations.
Second, we expanded to eight CBSAs to increase the amount of data available for model training. We chose four additional CBSAs similar to the four in the original pilot in terms of economic growth and demographic composition based on data from the ACS: Baltimore, Maryland; Charlotte, North Carolina; Raleigh, North Carolina; and Richmond, Virginia.
After joining the Zillow data with the ACS and HUD HCV data, we cleaned the newly created dataset and turned to building upon the work of the original pilot to derive new insights and increase model performance.
The team tested several different machine learning algorithms on the expanded dataset before selecting XGBoost, a machine learning algorithm that trains many different models and combines what they learn to make predictions, as the best approach. The IBM team also dropped highly correlated variables from the dataset to keep only those that contributed relevant information to model performance. The resulting variables were distributed across six main categories:
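One common way to drop highly correlated variables is a greedy pass that keeps a variable only if it is not strongly correlated with any variable already kept. The Python sketch below illustrates that idea; the 0.9 threshold, the column names, and the values are hypothetical, and the article does not say which procedure or cutoff the team used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0

def drop_correlated(columns, threshold=0.9):
    """Keep each column unless it correlates above the threshold with a kept one."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Made-up values for four tracts.
columns = {
    "mean_rent": [900, 1100, 1500, 2000],
    "median_rent": [880, 1120, 1480, 2050],  # nearly duplicates mean_rent
    "voucher_change": [5, -2, 3, -4],
}
kept_columns = drop_correlated(columns)
```

The result depends on column order, because the first variable in each correlated group is the one that survives.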
· income: household income, percentage and absolute change high income, and percentage and absolute change low income
· housing demographics: percentage owner-occupied housing, renter-occupied housing, and vacant housing
· home value: mean home value and percentage and absolute change in mean home value over 4 months, 12 months, and 5 years
· rent costs: mean rent, percentage and absolute change in mean rents over 4 months, 12 months, and 5 years
· housing choice vouchers: current number and change in vouchers distributed
· age and education demographics: percentage of population older than 25 and the percentage of the population older than 25 with a bachelor’s degree and without a bachelor’s degree
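The percentage and absolute change variables listed above for home values and rents can be derived from a monthly series. Here is a minimal Python sketch, assuming a list of monthly values ordered oldest to newest; the feature names and the toy series are hypothetical.

```python
# Look-back windows in months, matching the 4-month, 12-month, and 5-year changes.
WINDOWS = {"4_months": 4, "12_months": 12, "5_years": 60}

def change_features(monthly_values, windows=WINDOWS):
    """Absolute and percentage change from each look-back point to the latest month.

    Needs at least 61 monthly values for the 5-year window; shorter series
    raise IndexError.
    """
    latest = monthly_values[-1]
    features = {}
    for name, months in windows.items():
        base = monthly_values[-1 - months]
        features[f"abs_change_{name}"] = latest - base
        features[f"pct_change_{name}"] = 100 * (latest - base) / base
    return features

# Toy series: a value that rises by 1 each month for 61 months.
series = [100 + i for i in range(61)]
features = change_features(series)
```

Pairing absolute with percentage change lets a model distinguish a large dollar increase in an expensive area from the same percentage increase in a cheaper one.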
Although the new model’s accuracy was lower than the original pilot’s, there was significant improvement in both weighted recall (the percentage of all changing neighborhoods the model identifies as experiencing the correct type of change) and weighted precision (the percentage of neighborhoods identified as changing that actually experienced that change). Weighted metrics use the distribution of the different neighborhood change types to give more importance to the correct classification of minority classes (gentrifying, declining, and inclusively growing) by penalizing each wrong majority class (unchanging) classification.
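The article does not spell out the exact weighting formula, so the Python sketch below shows only one plausible reading: compute recall and precision per class, then average them with weights inversely proportional to class frequency so that rare classes count more. The inverse-frequency weights, the function names, and the toy labels are all assumptions.

```python
def per_class_scores(y_true, y_pred, labels):
    """Recall and precision for each label, computed one class at a time."""
    scores = {}
    for label in labels:
        hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        actual = sum(t == label for t in y_true)
        predicted = sum(p == label for p in y_pred)
        scores[label] = {
            "recall": hits / actual if actual else 0.0,
            "precision": hits / predicted if predicted else 0.0,
        }
    return scores

def inverse_frequency_weights(y_true, labels):
    """Assumed weighting: rarer classes get proportionally larger weights."""
    return {label: len(y_true) / y_true.count(label) for label in labels}

def weighted_average(scores, weights, metric):
    """Weighted mean of one metric across classes."""
    total = sum(weights.values())
    return sum(weights[label] * scores[label][metric] for label in weights) / total

labels = ["gentrifying", "declining", "inclusively growing", "unchanging"]
# Toy labels for ten tracts (not real results).
y_true = (["gentrifying"] + ["declining"] * 2
          + ["inclusively growing"] * 2 + ["unchanging"] * 5)
y_pred = ["gentrifying", "declining", "unchanging",
          "inclusively growing", "inclusively growing",
          "unchanging", "unchanging", "unchanging",
          "inclusively growing", "unchanging"]

scores = per_class_scores(y_true, y_pred, labels)
weights = inverse_frequency_weights(y_true, labels)
weighted_recall = weighted_average(scores, weights, "recall")
weighted_precision = weighted_average(scores, weights, "precision")
```

Other weighting schemes are possible, so treat the numbers this produces as illustrative rather than comparable to the reported results.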
Finalizing the model
To improve the model’s performance, the IBM team took a two-step approach: train a binary XGBoost classifier to quantify the probability that a neighborhood was gentrifying, then use the probabilities output from the model as an input into a second XGBoost model predicting whether a tract was gentrifying, declining, inclusively growing, or unchanging. The team used this approach to improve results for gentrifying neighborhoods, given their small numbers in even the expanded data.
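The structure of this two-step approach can be sketched in plain Python. To keep the example dependency-free, it uses a tiny nearest-centroid learner as a stand-in, which is NOT XGBoost and is not what the team used; every class name, function name, and data value here is hypothetical. Only the wiring (binary probability first, then that probability as an extra input to the four-class model) reflects the description above.

```python
class CentroidClassifier:
    """Stand-in learner (not XGBoost): scores classes by closeness to class centroids."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = {}
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids_[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, X):
        out = []
        for x in X:
            # Inverse squared distance to each centroid, normalized to sum to 1.
            raw = [1.0 / (1e-9 + sum((a - b) ** 2
                                     for a, b in zip(x, self.centroids_[c])))
                   for c in self.classes_]
            total = sum(raw)
            out.append([r / total for r in raw])
        return out

    def predict(self, X):
        return [self.classes_[max(range(len(p)), key=p.__getitem__)]
                for p in self.predict_proba(X)]

def augment(X, binary):
    """Append the step-one probability of gentrifying to each feature row."""
    positive = binary.classes_.index(True)
    return [list(x) + [p[positive]] for x, p in zip(X, binary.predict_proba(X))]

def fit_two_stage(X, y, make_model=CentroidClassifier, target="gentrifying"):
    # Step 1: binary model, gentrifying versus everything else.
    binary = make_model().fit(X, [label == target for label in y])
    # Step 2: four-class model that also sees the step-one probability.
    multi = make_model().fit(augment(X, binary), y)
    return binary, multi

def predict_two_stage(X, binary, multi):
    return multi.predict(augment(X, binary))

# Made-up two-feature data: two tracts per class.
X = [[0.9, 0.8], [1.0, 0.9], [-0.8, -0.9], [-0.9, -1.0],
     [0.5, -0.5], [0.6, -0.4], [0.0, 0.0], [0.05, -0.05]]
y = ["gentrifying"] * 2 + ["declining"] * 2 \
    + ["inclusively growing"] * 2 + ["unchanging"] * 2

binary, multi = fit_two_stage(X, y)
fitted_labels = predict_two_stage(X, binary, multi)
new_label = predict_two_stage([[0.95, 0.9]], binary, multi)[0]
```

Any learner exposing fit, predict, and predict_proba could be passed as make_model. One caveat that goes beyond the article: feeding in-sample step-one probabilities to step two can leak label information, so out-of-fold predictions are a common safeguard.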
The team first tried to use the two-step approach for all eight CBSAs together, following the pilot’s approach of modeling multiple CBSAs using a single model. But in this approach, the model performance varied significantly by CBSA. As the team explored this result, they found that the top variables driving neighborhood change were different for each CBSA. As such, the IBM team decided to build a model tailored specifically for each CBSA. Modeling at the CBSA level improved model performance substantially for all but one of the CBSAs in the sample and considerably outperformed the previous ACS baseline.
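Fitting one model per CBSA amounts to grouping the training rows by CBSA and running the same fitting routine on each group. The Python sketch below shows that pattern; the row layout, the function names, and the majority-label stand-in learner are hypothetical placeholders for the real pipeline.

```python
from collections import Counter, defaultdict

def fit_per_cbsa(rows, fit):
    """Group (cbsa, features, label) rows by CBSA and fit one model per group."""
    grouped = defaultdict(list)
    for cbsa, features, label in rows:
        grouped[cbsa].append((features, label))
    return {cbsa: fit(group) for cbsa, group in grouped.items()}

def fit_majority(group):
    """Stand-in learner: always predicts the group's most common label."""
    most_common = Counter(label for _, label in group).most_common(1)[0][0]
    return lambda features: most_common

# Made-up rows for two of the eight CBSAs.
rows = [
    ("Akron", [0.1], "declining"),
    ("Akron", [0.2], "declining"),
    ("Akron", [0.3], "unchanging"),
    ("Baltimore", [0.4], "inclusively growing"),
    ("Baltimore", [0.5], "inclusively growing"),
]
models = fit_per_cbsa(rows, fit_majority)
akron_prediction = models["Akron"]([0.25])
baltimore_prediction = models["Baltimore"]([0.45])
```

The trade-off is that each model sees only its own CBSA's neighborhoods, so tailoring comes at the cost of less training data per model.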
Table 2: Comparing Final Model Performance and Previous ACS Baseline
Looking to the future
Future work should focus on assessing whether the most important variables that contribute to neighborhood change within each CBSA can be used to cluster CBSAs. Although the individual-CBSA results are promising, we suspect the performance could be improved by creating meaningful clusters of CBSAs, which would provide more training data (that is, more neighborhoods) for each model. One potential approach is clustering based on important variables that contribute to neighborhood change and creating one model for each cluster of CBSAs instead of a model for each individual CBSA. Then, individual jurisdictions could use the appropriate “cluster model” for their jurisdiction to predict local neighborhood change. Future work could also add high-value datasets, such as Home Mortgage Disclosure Act data, LEHD Origin-Destination Employment Statistics data, or the USPS data on active addresses used in the initial pilot that HUD makes available for work conducted by nonprofits, academic researchers, and local governments.
Throughout the partnership between the Urban Institute and IBM, the team emphasized developing tools that enabled collaborative work and asset production so policymakers and community organizations could leverage the resulting approaches and tailor them to their own communities. To that end, the teams used the IBM Cloud Pak for Data platform to foster collaboration between IBM and Urban by easily sharing assets, such as Jupyter Notebooks. The IBM team also made use of IBM Cloud Pak for Data services such as the Auto-AI capabilities in IBM Watson Studio to rapidly establish model performance baselines before moving on to more sophisticated approaches.
Extending our work to your community
Machine learning is a powerful tool for policymakers, researchers, organizers, and community members to identify neighborhoods likely to undergo change and empower those leaders to take actions that mitigate harmful effects, such as displacement of marginalized communities or population decline, and promote the positive effects, such as inclusive growth.
We produced a set of assets that aims to make our work as accessible as possible so you can apply our approach to your own community or develop your own models!
Jump into our project by downloading our Cloud Pak for Data Accelerator project, where you can review a cheat sheet of how the different pieces fit together, explore all of the data required to get started, and run all of the Jupyter Notebooks for data processing and modeling. To dig deeper into our methods, read our white paper.