Changelog
Release Date
August 24, 2025
Weekly Digest
Tracing
- Transcript groups are now displayed in the front end, along with quality-of-life improvements for agent runs that contain many transcripts
- Disable tracing by setting an environment variable
- Miscellaneous improvements: better error messaging, more robust ingestion, support for new formats, the ability to have fine-grained control over which libraries get instrumented, and bug fixes
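As a rough illustration of the environment-variable switch mentioned above, the sketch below shows one common way such a toggle is read. The variable name `EXAMPLE_DISABLE_TRACING` and the helper `tracing_enabled` are hypothetical and are not taken from Docent; consult the Docent tracing documentation for the actual setting.

```python
import os

# Hypothetical variable name, for illustration only; not Docent's actual setting.
DISABLE_VAR = "EXAMPLE_DISABLE_TRACING"

# Values treated as "tracing is switched off" (an assumed convention).
TRUTHY = {"1", "true", "yes"}


def tracing_enabled(environ=None):
    """Return False when the disable variable is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get(DISABLE_VAR, "").strip().lower() not in TRUTHY


if __name__ == "__main__":
    # Instrumentation code would check this flag before emitting any spans.
    print("tracing enabled:", tracing_enabled())
```

Accepting the environment as a parameter keeps the check easy to test without mutating the real process environment.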
Ingestion
- Improved speed and reliability when ingesting large Inspect .eval files; uploading logs of multiple GBs now works
Rubrics and judges
- Bring your own API keys for running rubrics with OpenAI/Anthropic/Gemini models
- Choose a specific LLM when running rubrics
Release Date
August 17, 2025
Weekly Digest
Transcript ingestion
- Removed metadata schemas, so ingestion no longer requires predefined schemas
- More robust Inspect one-click ingestion: Docent used to silently drop metadata and scores when using our file upload or drag-and-drop ingestion; this is now fixed.
- Auto-reload agent run list after successful file upload
Tracing (now in public preview)
- Transcript groups: Ability to group transcripts in the Docent UI, with support for hierarchical relationships between groups.
- Incremental loading: Tracing data is now loaded into the collection as it is being run.
- Improved instrumentation: Auto-instrumentation is now more targeted and can also be manually configured.
- Improved stability: Various bug fixes, primarily related to supporting a wider range of models and formats.
- Documentation: Now publicly available.
Charts
- Export charts as PNG and CSV
- Resizable chart area
Misc
- Upgraded rubric evaluation from Claude 3.7 Sonnet to Claude Sonnet 4
- Internal infrastructure improvements
Release Date
August 10, 2025
Weekly Digest
Rubrics and judges
- Rubric versioning: Edits to rubrics are now versioned, allowing you to compare results to past iterations
Charts & metadata
- Any numeric metadata can now be used as a chart measure; previously, only scores were allowed
- Charts now use the latest rubric version to avoid plotting stale judge results
Infrastructure
- Improved scalability of the application-serving infrastructure
Release Date
August 3, 2025
Weekly Digest
Clustering & Charts
- Clustering metrics: Allow switching between counting total results vs. unique agent runs
- Interactive chart filtering: Click on chart elements to filter data dynamically
- Improved chart creation workflow: Auto-populate new charts with valid data
- Fixed chart sharing, which previously did not work
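To make the difference between the two clustering metrics concrete, here is a minimal sketch. The data layout (pairs of cluster label and agent run ID) and the function name `cluster_counts` are assumptions for illustration, not Docent's internal representation.

```python
from collections import defaultdict

# Hypothetical judge results: (cluster label, agent run ID).
# A single agent run can produce several results in the same cluster.
results = [
    ("reward hacking", "run-1"),
    ("reward hacking", "run-1"),
    ("reward hacking", "run-2"),
    ("tool misuse", "run-3"),
]


def cluster_counts(results, unique_runs=False):
    """Count results per cluster, or distinct agent runs if unique_runs is True."""
    buckets = defaultdict(list)
    for cluster, run_id in results:
        buckets[cluster].append(run_id)
    return {
        cluster: len(set(run_ids)) if unique_runs else len(run_ids)
        for cluster, run_ids in buckets.items()
    }


total = cluster_counts(results)                     # counts every result
unique = cluster_counts(results, unique_runs=True)  # counts each run once
```

Counting unique runs avoids overweighting a cluster when one long run matches it many times.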
(New) Tracing & instrumentation
This feature is still in private preview
- Auto-instrumentation of OpenAI and Anthropic Python SDKs
- Simple Python SDK with decorators and context managers for tracing agent scaffolds
- Auto transcript splitting logic that converts individual LLM calls into contiguous transcripts
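One plausible way to turn individual LLM calls into contiguous transcripts is prefix matching: a call continues an existing transcript if its input messages begin with everything that transcript already contains. The sketch below assumes that heuristic and a simple `(role, content)` message shape; it is not Docent's actual splitting logic, and `split_into_transcripts` is a hypothetical name.

```python
def split_into_transcripts(calls):
    """Group chronological LLM calls into transcripts by prefix matching.

    calls: list of (input_messages, output_message) pairs, where each
    message is a (role, content) tuple. This shape is an assumption.
    """
    transcripts = []
    for inputs, output in calls:
        full = list(inputs) + [output]
        for i, existing in enumerate(transcripts):
            # The call extends a transcript if its inputs start with
            # every message that transcript has accumulated so far.
            if len(inputs) >= len(existing) and list(inputs[: len(existing)]) == existing:
                transcripts[i] = full
                break
        else:
            # No transcript matched, so this call starts a new one.
            transcripts.append(full)
    return transcripts


calls = [
    ([("system", "A"), ("user", "hi")], ("assistant", "hello")),
    (
        [("system", "A"), ("user", "hi"), ("assistant", "hello"), ("user", "more")],
        ("assistant", "ok"),
    ),
    ([("system", "B"), ("user", "other")], ("assistant", "yo")),
]
transcripts = split_into_transcripts(calls)
```

Here the first two calls merge into one five-message transcript, while the third call, which has a different system prompt, starts a second transcript.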
Infrastructure
- VPN networking infrastructure for private access to Docent
- End-to-end PostHog analytics for internal observability
Release Date
July 27, 2025
Major Release: v0.1.1-alpha
(New) Rubric search
- Complete rewrite of global search: Significantly better search and clustering reliability, especially across concurrent users
- New rubric schema for searches that enforces inclusion and exclusion rules for more precise results
(New) Quantitative visualization
- Charts for plotting agent runs, judge results, cluster centroids, and statistics. Allows for multi-dimensional slicing and filtering.
Data & ingestion
- Inspect logs in the web UI: Import & view Inspect logs directly from the website
- Embeddings at upload: Trigger embeddings computation as part of the upload flow
SDK & APIs
- Query agent runs via SDK for easier scripting/automation
Release Date
July 1, 2025
Major Release: v0.1.0-alpha
Performance & scalability
- Performance improvements: Docent now smoothly supports 50-100K transcripts, up from 0.5-1K previously
- Faster search: Added re-ranking to global search, returning results ~5x faster
- Robust background processing: Searches now run as background batch jobs, resilient to frontend disconnections
(New) Multi-user collaboration
- Multi-user support and access controls: Share Docent results with others and control what they can do
- Self-hosting ready: Deploy once and support any number of users on the same instance
(New) Multi-agent
- Basic multi-agent transcript support: An initial transcript format for runs involving multiple agents
Bug Fixes & Stability
- Fixed numerous bugs and improved overall system reliability
Release Date
March 24, 2025
Major Release: Research Preview
The initial release of Docent, a system for analyzing and intervening on agent behavior.