How can data communicators lead inclusivity efforts for visualizations?
Following the June 2021 release of our Do No Harm Guide on how to take an equitable and inclusive approach to data communication, my coauthor Alice Feng and I have presented our work at numerous panels, conferences, and other engagements, where we’ve been asked thought-provoking questions.
I have compiled some of the most common questions, along with my responses, for those who attended our presentations and for those who did not. If you’d like to learn more about the Do No Harm Guide, please check out the full report or a recording of my presentation at the September 2020 New York City Data Visualization meetup.
1. How can we reconcile inclusiveness with an audience’s preconceptions?
In his book, How to Be an Antiracist, Ibram X. Kendi writes, “What people see in themselves and others has meaning and manifests itself in ideas and actions and policies, even if what they are seeing is an illusion.” Kendi also likens race to a “mirage” but says that doesn’t make it any less real to the viewer. As data communicators, we are not going to overcome the racial constructs we have built. Instead, we can try to guide our readers and users toward a better, more inclusive view of race, ethnicity, gender, sexuality, and other characteristics.
An audience member at the EU Open Data Days conference asked if we should be classifying people at all. That is, if race is a construct, as Kendi says, wouldn’t it be better to avoid using it? As data communicators, I don’t think we can ignore these differences — or believe we are better off if we do — especially when present and historical inequities require attention and discussion. We can contribute more by providing solutions through our data communication efforts.
2. How can we approach using icons and colors while keeping inclusivity at the forefront of visualizations?
As much as I worry about icons not being inclusive, that doesn’t mean we should abandon them altogether. We need to be careful about which icons we choose and how we use them. I’ve started moving toward abstract shapes, and, even though such shapes don’t necessarily enable the reader to directly connect with the image of a person, seeing the individual data point is useful. The more we use diverse and inclusive icons, the more readers will be exposed to and accustomed to seeing them, which hopefully means they will become more easily recognizable.
For colors, I’m not sure there are specific palettes we consider inherently inclusive. There are palettes to avoid, like those tied to skin tone or gender stereotypes. Our example in the Do No Harm Guide is the June 2020 MIT Student Diversity Dashboard, where students of color were represented in shades of red, and white students were represented in blue (the current version has a slightly different color palette). That palette groups all students of color under variations of one hue and sets white students apart with a hue of their own, which creates a perspective that white students are the default. It’s not that the red-blue color palette itself is inherently exclusive, but the way it was used was not inclusive.
I recognize we took a particularly US-centric perspective in the Do No Harm Guide and that how human characteristics and identities are categorized and discussed around the world can differ. In one presentation, I was informed that the word “colored” is included in official surveys in South Africa, where it has a different connotation than it does in the US. We are certainly interested in better understanding these international differences, and I am currently working on a new project on this. So stay tuned.
3. What other resources are available to learn more about diversity and applying inclusivity to data visualization?
Unfortunately, there isn’t a large body of work on inclusivity and equity as it applies to data visualization. We included resources at the end of the Do No Harm Guide but want to highlight a few:
• Catherine D’Ignazio and Lauren F. Klein’s Data Feminism and Caroline Criado Perez’s Invisible Women: If you want to refine your thinking about how data and data visualization fail to account for gender, I can’t recommend these books enough.
• Sarah Williams’ Data Action: Using Data for Public Good: If you want to see how data and data visualization can help solve real-world problems, this is the book for you. Williams explores several examples of how data were collected, analyzed, and communicated to empower people and improve lives.
• G. Cristina Mora’s Making Hispanics: How Activists, Bureaucrats, and Media Constructed a New American: If you’ve ever wondered how the US arrived at the words “Hispanic” and “Latino” to describe that ethnic group, this book is a fascinating read. Mora argues it was the convergence of three primary trends: the civil rights movement in the 1960s and ’70s; the federal government’s data collection efforts, primarily through the US Census Bureau; and the growth of Spanish-language television.
• There are also several resources that address different aspects of inclusivity in collecting data and writing about the results:
    ◦ Child Trends’ “How to Embed a Racial and Ethnic Equity Perspective in Research: Practical Guidance for the Research Process” (PDF)
    ◦ Actionable Intelligence for Social Policy’s “A Toolkit for Centering Racial Equity Throughout Data Integration”
    ◦ Chicago Beyond’s “Why Am I Always Being Researched?”
For learning data visualization specifically, I would recommend the work of Alberto Cairo, Cole Nussbaumer Knaflic, Andy Kirk, and Nathan Yau, as well as my most recent book, Better Data Visualizations. If you are interested in other resources, I also maintain lists of my favorite blogs, books, and other resources on my website.
4. What techniques do you recommend to be as true as possible to the reality of the data?
Mark Twain famously popularized the line, “There are three kinds of lies: lies, damned lies, and statistics.” Obviously, as data communicators, we don’t want our work to be seen as inherently untrue or misleading, so we must work to present our data as accurately as possible.
Take choropleth maps, which add color to geographic units on a map to represent the data. This approach can be problematic: On the one hand, people love maps because they are familiar, and they can see themselves in the data. On the other hand, the size of a geographic unit does not necessarily correspond to the importance of its data value, so large but sparsely populated areas can dominate the map. When making a map, we need to ask if the story we are trying to tell is inherently geographic. In other words, ask if a map is the best way to present the data.
Ultimately, data communicators should be attuned to the realities of the variation in their data, no matter the type of visualization. Many data visualizations can show variation in our data more precisely than simple aggregates like means, medians, and specific percentiles. Consider this animated GIF from Justin Matejka and George Fitzmaurice, which shows seven datasets changing over time. The three graph types all show the same datasets, but the box plot, which shows five specific percentile points, doesn’t change. The raw data and the violin plots reveal the changing distributions far more clearly.
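To make the box plot point concrete, here is a minimal Python sketch. It is not the Matejka and Fitzmaurice animation or their data: the two small datasets and the helper names `five_number_summary` and `text_dot_plot` are made up for illustration. It only shows how two differently shaped datasets can share the five values a box plot draws.

```python
from collections import Counter
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the five points a box plot draws."""
    data = sorted(values)
    # method="inclusive" treats the data as the full population, so the
    # quartiles of a 9-point dataset land exactly on its 3rd, 5th, and 7th values.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])


def text_dot_plot(values):
    """Draw one 'o' per observation, stacked by value (a crude strip plot)."""
    counts = Counter(values)
    rows = [f"{v:>2} | {'o' * counts[v]}" for v in range(min(values), max(values) + 1)]
    return "\n".join(rows)


# Two made-up datasets: one evenly spread, one bunched around 2 and 6.
evenly_spread = [0, 1, 2, 3, 4, 5, 6, 7, 8]
two_clusters = [0, 2, 2, 2, 4, 6, 6, 6, 8]

for name, data in [("evenly_spread", evenly_spread), ("two_clusters", two_clusters)]:
    print(name, five_number_summary(data))
    print(text_dot_plot(data))
```

Both datasets produce the same five-number summary, so their box plots would be identical, while the dot plots make the clustering in the second dataset obvious. Other quartile conventions (for example, the default "exclusive" method) may give slightly different values on small samples.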
Other graph types like strip plots or beeswarm charts show more of our data than specific points in the distribution. Of course, these options don’t work for every sample size: showing 200 individual data points is going to be much different from showing 200,000 individual data points. For more options, I have an ongoing collection of data visualizations in my data visualization catalog.
5. How can data communicators lead inclusivity efforts for visualizations?
In many organizations, the data visualization team is a gatekeeper to publishing the final product. Whether it’s to finalize some code or to ensure final visualizations are meeting style guide standards, the data visualization team often has the last look. As such, it’s important for us to keep an eye toward inclusivity and equity in how we represent people and communities, which can help our colleagues take the same approach throughout the data collection, analysis, and communication process.
Ultimately, ensuring inclusivity in data visualization is a constant and ongoing effort, as is our work on the Do No Harm Guide. Questions like these from active data practitioners and communicators continue to inform this work as it grows. Hopefully, these answers can help you to apply these lessons to your own data visualizations as we strive for inclusivity.