# Evaluation as you build it

> The interview harvests labeled ground truth, so you can measure agent accuracy with and without context.

<Info>Starter page — expand with a sample eval report and the harness interface.</Info>

The distinctive thing about building context through an interview is that **the measurement comes
for free**. Every disambiguation the analyst makes ("active client means X, not Y") is at once a
context entry and a labeled eval pair. Building context *is* harvesting ground truth.
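
To make the "one answer, two artifacts" idea concrete, here is a minimal sketch of how a single disambiguation could yield both a context entry and an eval pair. The class names, field names, and the `harvest` function are hypothetical and chosen for illustration; they are not the product's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disambiguation:
    """One analyst decision captured during the interview (hypothetical shape)."""
    term: str       # e.g. "active client"
    question: str   # the question the agent should later answer correctly
    chosen: str     # the meaning the analyst confirmed ("X")
    rejected: str   # the meaning the analyst ruled out ("Y")


@dataclass(frozen=True)
class ContextEntry:
    term: str
    definition: str


@dataclass(frozen=True)
class EvalPair:
    question: str
    expected: str


def harvest(d: Disambiguation) -> tuple[ContextEntry, EvalPair]:
    """Derive both artifacts from the same analyst answer."""
    entry = ContextEntry(term=d.term, definition=d.chosen)
    pair = EvalPair(question=d.question, expected=d.chosen)
    return entry, pair


decision = Disambiguation(
    term="active client",
    question="What counts as an active client?",
    chosen="billed in the last 90 days",
    rejected="has an open account",
)
entry, pair = harvest(decision)
```

Because the context entry and the expected answer come from the same confirmed meaning, the eval pair needs no separate labeling pass.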

## The local eval delta

After you define a domain, run the delta to see how much the context helped:

```
"Run the eval delta on session-financials."
```

The open-source `eval_harness/` runs your agent against the harvested pairs twice, once **with** the
context and once **without** it, and reports the difference in accuracy: a single, concrete number
you can share.
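
The arithmetic behind the delta can be sketched as follows. This is an illustration under stated assumptions, not the `eval_harness/` interface: the `Agent` signature, the exact-match scoring, and the `toy_agent` stand-in are all hypothetical.

```python
from typing import Callable, Sequence

# Assumed agent signature: (question, context) -> answer.
Agent = Callable[[str, str], str]


def accuracy(agent: Agent, pairs: Sequence[tuple[str, str]], context: str) -> float:
    """Fraction of pairs answered correctly (case-insensitive exact match)."""
    if not pairs:
        raise ValueError("need at least one eval pair")
    correct = sum(
        agent(question, context).strip().lower() == expected.strip().lower()
        for question, expected in pairs
    )
    return correct / len(pairs)


def eval_delta(agent: Agent, pairs: Sequence[tuple[str, str]], context: str) -> dict[str, float]:
    """Score the same agent with and without context and report the difference."""
    with_ctx = accuracy(agent, pairs, context)
    without_ctx = accuracy(agent, pairs, "")
    return {
        "with_context": with_ctx,
        "without_context": without_ctx,
        "delta": with_ctx - without_ctx,
    }


def toy_agent(question: str, context: str) -> str:
    """Stand-in agent: looks the question up in 'term: definition' lines."""
    for line in context.splitlines():
        term, _, definition = line.partition(":")
        if term.strip().lower() == question.strip().lower():
            return definition.strip()
    return "unknown"


pairs = [
    ("active client", "billed in the last 90 days"),
    ("churned client", "no invoice in 180 days"),
]
context = "active client: billed in the last 90 days"
report = eval_delta(toy_agent, pairs, context)
```

In this toy run the context covers one of the two terms, so the agent scores 0.5 with context and 0.0 without, for a delta of 0.5. A real harness would likely use a more forgiving scorer than exact match.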

## Format-agnostic

The harness reads ACF, dbt models and docs, or raw markdown, normalizes them, and measures the
delta the same way — so you can evaluate context you already have, not just ACF.
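
As a rough illustration of what normalization could look like, the sketch below maps two of the inputs into one common shape, a list of `(term, definition)` tuples. The function names and the target shape are assumptions; the dbt input is an already-parsed `schema.yml` dictionary using dbt's standard `models` / `columns` / `description` keys, and ACF is left out because its format is not described here.

```python
import re


def from_dbt_schema(schema: dict) -> list[tuple[str, str]]:
    """Collect model and column descriptions from a parsed dbt schema.yml."""
    entries = []
    for model in schema.get("models", []):
        if model.get("description"):
            entries.append((model["name"], model["description"]))
        for column in model.get("columns", []):
            if column.get("description"):
                entries.append((f'{model["name"]}.{column["name"]}', column["description"]))
    return entries


def from_markdown(text: str) -> list[tuple[str, str]]:
    """Treat each heading as a term and the text under it as its definition."""
    entries = []
    heading, body = None, []
    for line in text.splitlines():
        match = re.match(r"#{1,6}\s+(.*)", line)
        if match:
            if heading and body:
                entries.append((heading, " ".join(body)))
            heading, body = match.group(1).strip(), []
        elif heading is not None and line.strip():
            body.append(line.strip())
    if heading and body:
        entries.append((heading, " ".join(body)))
    return entries


dbt_entries = from_dbt_schema({
    "models": [{
        "name": "clients",
        "description": "One row per client",
        "columns": [
            {"name": "is_active", "description": "Billed in the last 90 days"},
            {"name": "id"},  # no description, so it is skipped
        ],
    }]
})
md_entries = from_markdown("# Active client\nBilled in the last 90 days.\n\n# Empty\n")
```

Once every source is reduced to the same shape, the with/without comparison does not need to know where the context came from.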

## From one-shot to continuous

The one-shot, run-locally eval delta is **free**. Continuous re-evaluation, drift detection, and
observability across a team are the hosted product —
[see enterprise evaluation](/enterprise/evaluation).
