Illustration by Alysheia Shaw-Dansby

Exploring AI: How Urban Is Piloting Guidelines around Using AI in Our Research and Policy Work

May 23, 2024

Everyone I talk to these days is thinking about artificial intelligence (AI). My peers at research organizations, nonprofits, government agencies, and philanthropies are hungry to learn what others are doing so they can avoid reinventing the wheel. The Urban Institute’s own AI plan builds on the work of these peers, from whom I’ve learned extensively as I gathered information across sectors this past year.

In the spirit of giving back, I want to show how Urban has explored the use of AI in general, and generative AI more specifically, in the hopes that others find this information just as useful.

Urban’s framework for piloting AI

As soon as ChatGPT exploded on the scene last year, Urban began thinking about how to integrate generative AI tools into our research and operations work. Beginning in September 2023, Urban surveyed staff and created internal processes and committees to guide our decisions on how best to pilot and implement these tools.

Currently, as the chief information officer, I lead two AI advisory committees:

1) a group of senior leaders who represent a wide array of Urban’s research and operational centers and offices

2) a group of early career to midlevel employees who use AI regularly outside of work and are interested in ensuring its appropriate use in the workplace

Each committee is charged with advising on institutional guidelines around AI during our AI pilot but is not responsible for creating institutional policy. Though having two committees does require additional work, their small size (each has only six or seven members) and unique perspectives make for lively, informative discussions and give Urban invaluable insight on how to balance institutional risk against practical value. The senior leaders have helped steer institutional policy, establish boundaries, and discuss how to sensitively approach cultural elements of change, while earlier career colleagues have experimented extensively with AI and have helped craft detailed, grounded guardrails formed from their experience using the tools.

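For our AI pilot, approximately 10 percent of staff volunteered to participate, representing a broad swath of departments, functions, and experience. In March 2024, Urban began to pilot seven different generative AI tools: GitHub Copilot, Microsoft Copilot with commercial data protection, Copilot for Microsoft 365, Teams Premium, Elicit, Zoom AI Companion, and Scribe.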

The tools were selected based on input from Urban staff, the two advisory committees, Urban’s board, and a series of conversations on AI I undertook in 2023 with philanthropic, think tank, nonprofit, and government leaders. Ultimately, I used my judgment to select tools I felt had the best chance at making a major impact in supporting programming, summarization, and documentation while reducing risk as much as possible.

Our plan is to run the pilot through the end of the year and green-light tools that demonstrate they can sufficiently improve the quality or efficiency of Urban’s work relative to cost and improve employee experience without reducing employee autonomy or sense of self-worth.

Creating guidelines for our AI pilot

During our AI pilot, participating staff had to adhere to Urban’s existing policies and processes. Additionally, we developed the following set of pilot-specific guidelines based on general best practices from external institutions and some careful considerations customized to our work and organization:

· All outputs must be labeled as generated by generative AI, documented, and reviewed by the employee or a relevant expert for accuracy.

· Use of generative AI tools must be reported to supervisors and project leads.

· Employees must get consent from others on the project to use the AI tool, and without consent, the tool cannot be used.

· No AI tools should be used in activities that may directly affect the rights of individuals. However, research studies investigating such AI tools are permitted.

· Generative AI tools shall not be used with data subject to a specific data use agreement or institutional review board without approval from the relevant bodies overseeing those processes.

· All participants should be made aware of the risks associated with the tool and the data agreements the tool operates under.

· Only AI tools approved as part of the AI pilot are permitted for use.

· When using tools with a shorter track record of institutional trust at Urban (Scribe and Elicit), staff should initially avoid uploading or inputting any confidential or sensitive information into the tools.

Urban has created a central library organized by the seven tools. For each tool, users are asked to report use case information across three categories: tips and tricks for optimal tool use, examples of efficiency gained or lost, and recommendations about risk mitigation not previously considered. For example, users could create use case entries for the Copilot for Microsoft 365 section titled “providing writing feedback” and “summarizing Word documents,” each of which explores tips and tricks, efficiency, and risk mitigation. Pilot users will be reminded to share information on a regular basis, and some will be asked to present to staff as the pilot progresses. Pilot users who fail to share information risk having their access to the tools revoked.
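For teams that want to keep a similar library in a structured form, here is a minimal Python sketch of one way a use case entry could be modeled. It is an illustration only and not a description of Urban's actual system: the class name `UseCaseEntry`, its field names, and the helper `group_by_tool` are all hypothetical, and the note text is placeholder content.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UseCaseEntry:
    """One use case reported by a pilot user (hypothetical schema)."""

    tool: str
    title: str
    # The three reporting categories described in the post.
    tips_and_tricks: List[str] = field(default_factory=list)
    efficiency_notes: List[str] = field(default_factory=list)
    risk_mitigation: List[str] = field(default_factory=list)

    def missing_categories(self) -> List[str]:
        """Return the names of the categories that have no notes yet."""
        categories = {
            "tips_and_tricks": self.tips_and_tricks,
            "efficiency_notes": self.efficiency_notes,
            "risk_mitigation": self.risk_mitigation,
        }
        return [name for name, notes in categories.items() if not notes]


def group_by_tool(entries: List[UseCaseEntry]) -> Dict[str, List[UseCaseEntry]]:
    """Organize entries into a library keyed by tool name."""
    library = defaultdict(list)
    for entry in entries:
        library[entry.tool].append(entry)
    return dict(library)


# Entry titles come from the example in the post; the notes are placeholders.
entries = [
    UseCaseEntry(
        tool="Copilot for Microsoft 365",
        title="providing writing feedback",
        tips_and_tricks=["(placeholder tip)"],
    ),
    UseCaseEntry(
        tool="Copilot for Microsoft 365",
        title="summarizing Word documents",
    ),
]

library = group_by_tool(entries)
for tool, tool_entries in library.items():
    for entry in tool_entries:
        print(tool, "|", entry.title, "| still missing:", entry.missing_categories())
```

A check like `missing_categories` could support the regular reminders mentioned above by showing which entries are still incomplete.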

What we’ve learned so far

With our AI pilot, we wanted to understand and evaluate how each of the AI tools stores, uses, and transfers data, and whether the associated risks are acceptable. We evaluated each tool thoroughly in February 2024 and assigned each a risk rating, keeping in mind that these risks are specific to Urban and our current system configuration:

Microsoft Products (GitHub Copilot, Microsoft Copilot with commercial data protection, Copilot for Microsoft 365, and Teams Premium): Normal risk

  • Data are stored, processed, and transferred using similar processes to Urban’s existing Office applications and licenses. Data are sent to the Microsoft GPT model within the security and compliance boundaries of Microsoft. Urban may request that Microsoft delete stored data. Prompts and suggestions for GitHub Copilot are deleted immediately.
  • Internal security risks may be elevated for certain products, such as Copilot for Microsoft 365 or GitHub Copilot, depending on the configuration of an organization’s environment and how the Microsoft AI tool uses that environment. Many organizations have reported difficulty configuring their environments in a way that prevents AI tools from surfacing confidential information to people within the organization without proper permissions. And certain permissions or uses in GitHub Copilot, such as allowing suggestions that match public code or using it outside of integrated development environments for chat and code completions, may increase copyright or data security risks if they are not prohibited.

Elicit: Normal risk

  • Elicit stores papers that are uploaded to the service, along with standard site activity information. Otherwise, Elicit provides little information on how data are stored, processed, or secured.
  • The company is relatively new and is a new procurement relationship for Urban, which would normally indicate that the tool should be rated as higher risk. However, for Urban’s business case, AI pilot users conducting literature reviews are unlikely to upload sensitive documents or confidential information and are equally unlikely to enter sensitive prompts. In addition, organizations may request to work with Elicit under an enterprise agreement to establish custom privacy guarantees.

Zoom AI Companion: Higher risk

  • Although Zoom states it doesn’t train its AI models with user data, it does say it may invoke third-party AI models to provide its services. Zoom doesn’t permit those third-party companies to use user data to train or improve AI models, but this data sharing makes it more difficult for Urban to request the deletion of data or to control potential data use should any of these companies or their vendors change their terms of use.

Scribe: Higher risk

  • Scribe is a new company and vendor relationship for Urban, and internal processes are often confidential, leading to a higher risk designation for Urban. Scribe may share user data with third-party contractors as necessary. Data are encrypted at rest and in transit, and there are additional tools available (for a price) for users to mask sensitive data. Urban may request that Scribe delete data, though Scribe may retain data if it believes there’s a legal reason.
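As a compact summary, the ratings above could be recorded in a small lookup table. The Python sketch below is only an illustration: the names `Risk`, `RISK_RATINGS`, and `tools_with_rating` are hypothetical and do not describe any system Urban uses. As noted above, the ratings reflect Urban's own evaluation and configuration, so other organizations should not reuse them as-is.

```python
from enum import Enum
from typing import List


class Risk(Enum):
    """The two rating levels used in this post."""

    NORMAL = "Normal risk"
    HIGHER = "Higher risk"


# Ratings as listed in the post (Urban-specific, evaluated February 2024).
RISK_RATINGS = {
    "GitHub Copilot": Risk.NORMAL,
    "Microsoft Copilot with commercial data protection": Risk.NORMAL,
    "Copilot for Microsoft 365": Risk.NORMAL,
    "Teams Premium": Risk.NORMAL,
    "Elicit": Risk.NORMAL,
    "Zoom AI Companion": Risk.HIGHER,
    "Scribe": Risk.HIGHER,
}


def tools_with_rating(rating: Risk) -> List[str]:
    """Return the alphabetically sorted tools that carry the given rating."""
    return sorted(tool for tool, r in RISK_RATINGS.items() if r is rating)


for rating in Risk:
    print(rating.value + ":", ", ".join(tools_with_rating(rating)))
```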

Looking forward

Generative AI is evolving rapidly, and it’s clear from my conversations across Urban and with my peers that we are all seeking to learn as quickly as possible. I hope this summary of Urban’s work helps educate others across nonprofit, research, and philanthropic sectors, just as others have helped to educate us. More importantly, I hope Urban’s careful, thoughtful approach helps others implement safer, more effective, and more equitable AI tools. We’ll be sure to share an update on this blog and in other forums once we’ve learned more from this current pilot phase.

Ultimately, I remain optimistic that with hard work, partnership, and information sharing across sectors, we can all harness the benefits of this new wave of technology, empowering our staff, mitigating risk, and making technology work better for everyone. I hope our research and policy work on the subject continues to grow and improve the adoption of sensible government regulation and enhancements in industry. But we’ve got a lot of work to do. Let’s get to it.

— Graham MacDonald

Data@Urban is a place to explore the code, data, products, and processes that bring Urban Institute research to life.