ADR 0008 — Hybrid stack: schema-driven priority chains + reactive vision

Date: 2026-05-22
Status: Accepted
Feature: URL-import
Affects: url_import_spec.md § IV (pipeline), § V (schemas)

Context

After ADR 0007, the code-first principle is accepted. But how exactly are code and vision combined?

The user stated the architecture explicitly:

"Why didn't you consider a mix of these two approaches? We need to design a clear stack for pulling information from code; if the stack is not filled to 100% by the code extractors, we bring in the screenshot mechanisms."

This is not "code OR vision" (a binary choice) but "code first, vision only for unfilled fields" (a priority chain with a reactive fallback).

Decision

Three-layer architecture:

1. Schema-driven extraction (per-field priority chains)

Each ComponentSpec field has its own independent chain of sources:

color:      source_map → CSS variable → computed_style → vision
spacing:    CSS variable → computed_style → vision
state:      ARIA → JS event listeners → DOM diff → vision
typography: CSS variable → computed_style → vision

Consequence: a partial failure of one layer does not fail the whole pipeline. Coverage is computed per field, not monolithically.
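The per-field chain can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: `Extractor`, `resolveField`, and the stubbed extractors are hypothetical names, and the only behavior taken from the text is "try sources in priority order, stop at the first one that fills the field". A tier that throws is treated the same as a tier that finds nothing.

```typescript
// Hypothetical sketch: none of these names come from the spec.
type Source = "source_map" | "css_variable" | "computed_style" | "vision";

type Resolved = { value: string; source: Source; tier: number };

// An extractor returns a value, or undefined when its layer cannot fill the field.
type Extractor = { source: Source; extract: () => string | undefined };

// Walk the chain in priority order and stop at the first tier that yields a value.
function resolveField(chain: Extractor[]): Resolved | undefined {
  for (let i = 0; i < chain.length; i++) {
    const tier = chain[i];
    let value: string | undefined;
    try {
      value = tier.extract();
    } catch {
      // A failing tier counts as "no value", so the next tier still runs.
      value = undefined;
    }
    if (value !== undefined) {
      return { value, source: tier.source, tier: i + 1 };
    }
  }
  return undefined; // the field stays unfilled and counts against coverage
}

// Stubbed chain for `color`: the first tier fails, the second finds nothing.
const colorChain: Extractor[] = [
  { source: "source_map", extract: () => { throw new Error("no source map"); } },
  { source: "css_variable", extract: () => undefined },
  { source: "computed_style", extract: () => "rgb(17, 24, 39)" },
  { source: "vision", extract: () => "#111827" }, // not reached in this run
];

const color = resolveField(colorChain);
console.log(color);
```

Because each field is resolved independently, the source-map failure above affects only how `color` was obtained; other fields would run their own chains unaffected.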

2. Coverage gate (after code extraction)

coverage = filled_required_fields / total_required_fields
if (coverage >= 0.90) → skip LLM enrichment, go to TSX generation
if (coverage < 0.90)  → text LLM enrichment (Gemini Flash-Lite, ~$0.0005/component)
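A small sketch of the gate, assuming a field counts as "filled" when it is neither `undefined` nor `null`. The helper names (`coverage`, `coverageGate`, `NextStep`) and the treatment of an empty required list are assumptions made for illustration; only the ratio and the 0.90 threshold come from the text above.

```typescript
// Hypothetical sketch of the coverage gate; names are illustrative.
type FieldMap = Record<string, unknown>;
type NextStep = "tsx_generation" | "llm_enrichment";

const COVERAGE_THRESHOLD = 0.9; // threshold stated in the gate above

// coverage = filled_required_fields / total_required_fields
function coverage(spec: FieldMap, required: string[]): number {
  if (required.length === 0) return 1; // assumption: nothing required means fully covered
  const filled = required.filter(
    (name) => spec[name] !== undefined && spec[name] !== null
  ).length;
  return filled / required.length;
}

function coverageGate(spec: FieldMap, required: string[]): NextStep {
  return coverage(spec, required) >= COVERAGE_THRESHOLD
    ? "tsx_generation"
    : "llm_enrichment";
}

// Ten required fields, named f0..f9 for the example.
const requiredFields = Array.from({ length: 10 }, (_, i) => "f" + i);

// Nine of ten filled: 0.9, which passes the gate.
const nineFilled: FieldMap = {};
requiredFields.slice(0, 9).forEach((name) => { nineFilled[name] = "x"; });

// Eight of ten filled: 0.8, which routes to text LLM enrichment.
const eightFilled: FieldMap = {};
requiredFields.slice(0, 8).forEach((name) => { eightFilled[name] = "x"; });

console.log(coverageGate(nineFilled, requiredFields));
console.log(coverageGate(eightFilled, requiredFields));
```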

3. Reactive vision (only after an acceptance failure)

if (acceptanceGate(generatedTsx).ok) → done (mode 'code-only')
else → visionEnrich(failedFields) → re-generate → re-acceptance

See ADR 0011 for details on reactive activation.
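The control flow above can be sketched as a single bounded pass. Everything except `acceptanceGate`, `visionEnrich`, `failedFields`, and the `'code-only'` mode is an assumption: the `Deps` injection shape, the `'vision-enriched'` mode name, the idea that the gate result carries `failedFields`, and the synchronous signatures (a real pipeline would likely be async) are chosen only to keep the example self-contained.

```typescript
// Hypothetical sketch of the reactive vision step, with stubbed dependencies.
type Spec = { color?: string };
type Acceptance = { ok: boolean; failedFields: string[] };
type Result = { tsx: string; mode: "code-only" | "vision-enriched"; accepted: boolean };

type Deps = {
  generateTsx: (spec: Spec) => string;
  acceptanceGate: (tsx: string) => Acceptance;
  visionEnrich: (spec: Spec, failedFields: string[]) => Spec;
};

function runReactiveVision(spec: Spec, deps: Deps): Result {
  const firstTsx = deps.generateTsx(spec);
  const first = deps.acceptanceGate(firstTsx);
  if (first.ok) {
    // Acceptance passed on code-derived data alone: vision is never called.
    return { tsx: firstTsx, mode: "code-only", accepted: true };
  }
  // Exactly one vision pass, restricted to the fields that failed acceptance.
  const enriched = deps.visionEnrich(spec, first.failedFields);
  const secondTsx = deps.generateTsx(enriched);
  const second = deps.acceptanceGate(secondTsx);
  return { tsx: secondTsx, mode: "vision-enriched", accepted: second.ok };
}

// Stubs: the gate fails while `color` is missing; vision supplies it.
let visionCalls = 0;
const deps: Deps = {
  generateTsx: (spec) => `<Button color="${spec.color ?? ""}" />`,
  acceptanceGate: (tsx) =>
    tsx.includes('color=""')
      ? { ok: false, failedFields: ["color"] }
      : { ok: true, failedFields: [] },
  visionEnrich: (spec, failedFields) => {
    visionCalls++;
    return failedFields.includes("color") ? { ...spec, color: "#111827" } : spec;
  },
};

const needsVision = runReactiveVision({}, deps);
const codeOnly = runReactiveVision({ color: "#000000" }, deps);
console.log(needsVision.mode, codeOnly.mode, visionCalls);
```

The function has no loop: after the single vision pass it reports whatever the second acceptance check returns, which matches the rejection of multi-pass refinement later in this ADR.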

Consequences

Pros:

Cost: the paid path is activated ONLY when needed (~30-38% of URLs actually trigger vision)
Robustness: the failure of one layer does not sink the pipeline
Provenance: every field knows where its value came from (debug + audit)
Extensibility: adding a new source means adding a tier to the priority chain

Cons:

The pipeline is harder to test (per-field paths require comprehensive tests)
Provenance bookkeeping adds storage overhead (~5-10% more per ComponentSpec)

Schema impact

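type ComponentSpec = {
  // ... every field carries provenance
  props: { [name: string]: { type: string; required: boolean; default?: unknown; provenance: Provenance } };
  // style overrides for the state sit alongside provenance (the value type is an assumption)
  states: { [name: string]: { [styleOverride: string]: unknown; provenance: Provenance } };
  // every token carries provenance (the `value` field name is an assumption)
  tokens: { [name: string]: { value: unknown; provenance: Provenance } };
  // ...
};

type Provenance = {
  source: 'source_map' | 'css_variable' | 'computed_style' | 'dom' | 'aria'
        | 'llm_inference' | 'vision';
  layer: 1 | 2 | 3 | 4 | 5 | 6 | 7;
  confidence: number;      // in the range 0 to 1
  raw_value?: unknown;
  extracted_at: string;    // ISO 8601 timestamp
  model_version?: string;  // required when source is 'llm_inference' or 'vision'
};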
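Two constraints on Provenance (confidence within 0 to 1, and `model_version` required for model-backed sources) cannot be expressed in a plain type alias, so they would need a runtime check. A minimal sketch, assuming a hypothetical `validateProvenance` helper that returns a list of error messages; note that `Date.parse` accepts more than strict ISO 8601, so the timestamp check here is deliberately loose.

```typescript
// Hypothetical runtime validator; the helper name and error strings are illustrative.
type ProvenanceSource =
  | "source_map" | "css_variable" | "computed_style" | "dom" | "aria"
  | "llm_inference" | "vision";

type Provenance = {
  source: ProvenanceSource;
  layer: 1 | 2 | 3 | 4 | 5 | 6 | 7;
  confidence: number;
  raw_value?: unknown;
  extracted_at: string;
  model_version?: string;
};

function validateProvenance(p: Provenance): string[] {
  const errors: string[] = [];
  // Written as a negated range check so that NaN is rejected too.
  if (!(p.confidence >= 0 && p.confidence <= 1)) {
    errors.push("confidence must be within [0, 1]");
  }
  // Loose check: any timestamp Date.parse understands is accepted.
  if (Number.isNaN(Date.parse(p.extracted_at))) {
    errors.push("extracted_at must be a parseable timestamp");
  }
  const modelBacked = p.source === "llm_inference" || p.source === "vision";
  if (modelBacked && !p.model_version) {
    errors.push("model_version is required for llm_inference and vision");
  }
  return errors;
}

const fromCss: Provenance = {
  source: "css_variable",
  layer: 2,
  confidence: 0.98,
  raw_value: "var(--color-primary)",
  extracted_at: "2026-05-22T10:00:00Z",
};

// Model-backed source without model_version, plus an out-of-range confidence.
const fromVision: Provenance = {
  source: "vision",
  layer: 7,
  confidence: 1.2,
  extracted_at: "2026-05-22T10:00:05Z",
};

console.log(validateProvenance(fromCss));
console.log(validateProvenance(fromVision));
```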

Alternatives rejected

A. Monolithic extraction with a single confidence

❌ Any failed sub-extraction invalidates the entire spec
❌ Zero debuggability: no per-field provenance

B. Vision-always (not reactive)

❌ See ADR 0011: false vision activations

C. Multi-pass refinement (cycle code/vision/code...)

❌ Unbounded cost spike potential
❌ Diminishing returns: a single vision pass is usually enough

Cross-references

Main spec § IV — pipeline implementation
Main spec § V.2 — Provenance schema
ADR 0007 — code-first foundation
ADR 0011 — reactive vision activation
