Available Benchmarks

Choose a benchmark to run.

Each benchmark tests a specific AI use case in municipal government. Select the one that matches your system.

Live

311 Chatbots

Evaluates AI-powered 311 chatbots for task performance, safety, and accessibility. Tests 20 expert-designed scenarios covering pothole reports, permit questions, trash schedules, and more.

20 scenarios  ·  ~15 minutes  ·  Free

Run this benchmark → View methodology →

Roundtable pending

Generative AI for Police Reports

Evaluates AI tools that assist law enforcement with incident report writing. A practitioner roundtable is in development; the benchmark will be published once it has been validated.

In development  ·  Expected 2026

Coming Soon

New benchmarks are added once a practitioner roundtable has been completed. Volunteer for an upcoming roundtable →