Available Benchmarks
Choose a benchmark to run.
Each benchmark tests a specific AI use case in municipal government. Select the one that matches your system.
Live
311 Chatbots
Evaluates AI-powered 311 chatbots for task performance, safety, and accessibility. Tests 20 expert-designed scenarios covering pothole reports, permit questions, trash schedules, and more.
20 scenarios · ~15 minutes · Free
Roundtable pending
Generative AI for Police Reports
Evaluates AI tools that assist law enforcement with incident report writing. The practitioner roundtable is in development; the benchmark will be published once it has been validated.
In development · Expected 2026
A new benchmark is added once its practitioner roundtable is complete. Volunteer for an upcoming roundtable →