Skip to content

Specs

Implementation-ready (or implementation-planned) specifications for changes that require code modifications.

Proposed (Prediction Modes)

Spec Title Description
061 Total PHQ-8 Score Prediction Predict total score (0-24) instead of item-level
062 Binary Depression Classification Binary classification (PHQ-8 >= 10)
063 Severity Inference Prompt Policy Allow inference from temporal/intensity markers

These specs address the task validity problem: PHQ-8 item-level frequency scoring is often underdetermined from DAIC-WOZ transcripts (see docs/clinical/task-validity.md).

Deferred

Archived (Implemented)

Implemented specs are distilled into canonical (non-archive) documentation under docs/:

Pipeline Robustness (Specs 053-057) - PR #92, 2026-01-03

Spec Title Canonical Doc Location
053 Evidence Hallucination Detection Evidence Extraction, Features
054 Strict Evidence Schema Validation Evidence Extraction, Exceptions
055 Embedding NaN Detection Artifact Generation, Debugging
056 Failure Pattern Observability Error Handling, Debugging
057 Embedding Dimension Strict Mode Artifact Generation, Configuration

JSON Reliability (Specs 058-060) - 2026-01-04

Spec Title Canonical Doc Location
058 Increase PydanticAI Default Retries Configuration, JSON audit
059 json-repair Fallback Evidence Extraction, JSON audit
060 Retry Telemetry Metrics Error Handling, Debugging

Other Implemented Specs

Historical spec texts remain in docs/_archive/specs/ for provenance, but the active documentation should not require them.