Starter page — expand with the evaluation dashboard and drift workflow.
What the hosted evaluation adds
- Continuous re-evaluation — re-run the eval pairs on a schedule, not just once, and track accuracy over time.
- Drift detection — flag when an upstream dbt model a definition depends on changes, so a human re-confirms before the agent trusts stale context.
- Correction harvesting — turn analyst corrections into new labeled ground truth that feeds the next evaluation.
