ADR 0011 - Vision activation: reactive, not predictive

Date: 2026-05-22
Status: Accepted
Feature: URL-import
Affects: url_import_spec.md § IV Phase 7

Context

After the acceptance gate (ADR 0009), if the code path did not pass, vision has to be brought in. The question is when exactly.

Initial draft (rejected): predictive triggers

"If the component contains a canvas → vision"
"If CSS-in-JS is detected → vision"
"If 4+ states → vision (complex)"

Problems:

Heuristics are arbitrary, not data-derived
False positives: vision activates for cases the code path would have handled
Cost waste: $0.005 per false activation × 20-30% of URLs is real overhead
Hard to evolve: adding a new rule means revisiting all existing ones
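
The cost-waste item can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it takes the $0.005 per activation and the 20-30% false-positive share from the list above, and the $0.001 code-only cost from the pipeline snippet in the Decision section. The function name is hypothetical.

```javascript
// Illustrative arithmetic, not production code.
const VISION_COST_USD = 0.005;    // per vision activation (from the list above)
const CODE_ONLY_COST_USD = 0.001; // per URL on the code-only path (from the Decision snippet)

// Hypothetical helper: average cost wasted per URL by false vision activations.
function wastedCostPerUrl(falsePositiveRate) {
  return falsePositiveRate * VISION_COST_USD;
}

for (const rate of [0.2, 0.3]) {
  const wasted = wastedCostPerUrl(rate);
  const ratio = wasted / CODE_ONLY_COST_USD;
  console.log(
    `false-positive rate ${(rate * 100).toFixed(0)}%: ` +
    `$${wasted.toFixed(4)} wasted per URL (${ratio.toFixed(1)}x the code-only cost)`
  );
}
```

Under these figures the average waste is $0.0010 to $0.0015 per URL, which is as much as or more than the entire code-only cost of a URL.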

Decision

Reactive activation: vision is engaged ONLY after the code path fails the acceptance gate:

// `screenshot`: reference capture of the page, assumed to be supplied by the caller.
async function pipeline(url, screenshot) {
  const spec = await codeExtract(url);                    // Phase 1-4
  const tsx = await generateTsx(spec);                    // Phase 5
  const test = await acceptanceGate(tsx, screenshot);     // Phase 6

  if (test.ok) {
    return { tsx, mode: 'code-only', cost: 0.001 };
  }

  // Reactive: vision runs only now, and only on the failed fields
  const enriched = await visionEnrich(spec, test.failedFields, screenshot);  // Phase 7
  const tsx2 = await generateTsx(enriched);
  const test2 = await acceptanceGate(tsx2, screenshot);

  if (test2.ok) return { tsx: tsx2, mode: 'code+vision', cost: 0.005 };

  // Full vision fallback
  const visionSpec = await visionExtractFull(url, screenshot);
  // ...
}

Key property: every vision activation is a response to a concrete acceptance failure, not speculation.
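
The key property can be checked with a small harness. This is a sketch under stated assumptions: every phase function is a hypothetical stub injected through a `phases` object, the "TSX" is just a JSON string, and the `needs-full-vision` mode is a placeholder for the fallback branch that the snippet above leaves elided.

```javascript
// Same control flow as the pipeline, with the phases injected so they can be stubbed.
async function reactivePipeline(url, screenshot, phases) {
  const spec = await phases.codeExtract(url);
  const tsx = await phases.generateTsx(spec);
  const test = await phases.acceptanceGate(tsx, screenshot);
  if (test.ok) return { tsx, mode: 'code-only' };

  // Vision is reachable only through a recorded gate failure.
  const enriched = await phases.visionEnrich(spec, test.failedFields, screenshot);
  const tsx2 = await phases.generateTsx(enriched);
  const test2 = await phases.acceptanceGate(tsx2, screenshot);
  return { tsx: tsx2, mode: test2.ok ? 'code+vision' : 'needs-full-vision' };
}

// Hypothetical stubs: the gate fails any field whose value is null.
function makeStubPhases({ codePathPasses }) {
  const calls = { vision: 0, failedFieldsSeen: null };
  const phases = {
    codeExtract: async () => ({ layout: 'grid', color: codePathPasses ? '#222' : null }),
    generateTsx: async (spec) => JSON.stringify(spec),
    acceptanceGate: async (tsx) => {
      const fields = JSON.parse(tsx);
      const failedFields = Object.keys(fields).filter((k) => fields[k] === null);
      return { ok: failedFields.length === 0, failedFields };
    },
    visionEnrich: async (spec, failedFields) => {
      calls.vision += 1;                    // count every vision activation
      calls.failedFieldsSeen = failedFields;
      const enriched = { ...spec };
      for (const f of failedFields) enriched[f] = 'from-vision';
      return enriched;
    },
  };
  return { phases, calls };
}

async function demo() {
  const passing = makeStubPhases({ codePathPasses: true });
  const a = await reactivePipeline('https://example.com/a', 'shot-a', passing.phases);
  const failing = makeStubPhases({ codePathPasses: false });
  const b = await reactivePipeline('https://example.com/b', 'shot-b', failing.phases);
  console.log(a.mode, 'vision calls:', passing.calls.vision);
  console.log(b.mode, 'vision calls:', failing.calls.vision, failing.calls.failedFieldsSeen);
  return { a, b, passing, failing };
}

demo();
```

When the stubbed code path passes the gate, the vision counter stays at zero. When it fails, vision is called once and receives only the fields the gate reported.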

Why reactive wins

Zero false positives: vision never runs "just in case"
Targeted enrichment: only failedFields, not the whole spec again
Cost transparency: $0.005 per vision activation, accounted per URL
Pipeline self-tuning: improvements to a weak code path automatically reduce vision activations
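
Targeted enrichment can be sketched as a merge that touches only the fields the gate rejected. The spec shape (a flat map of field name to value), the function name, and the sample values are all assumptions made for illustration. The real `visionEnrich` contract is defined in the main spec, not here.

```javascript
// Hypothetical merge step: overwrite only fields listed in failedFields.
function applyVisionToFailedFields(spec, failedFields, visionValues) {
  const enriched = { ...spec }; // shallow copy, the input spec is left untouched
  for (const field of failedFields) {
    // Skip fields for which vision returned nothing.
    if (Object.prototype.hasOwnProperty.call(visionValues, field)) {
      enriched[field] = visionValues[field];
    }
  }
  return enriched;
}

const spec = { padding: '8px', color: 'rgb(0,0,0)', radius: '0px' };
const failedFields = ['color'];
// Suppose the VLM answered for more fields than were asked; the extras are ignored.
const visionValues = { color: 'rgb(34,34,34)', padding: '12px' };

const enriched = applyVisionToFailedFields(spec, failedFields, visionValues);
console.log(enriched);
```

Fields that already passed the gate (`padding`, `radius`) keep their code-extracted values, so a vision answer cannot regress them.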

Trade-off

Latency: a code-path failure adds a vision call (~5-15 s)
- Mitigated: OmniParser pre-filter (Tier 2+) reduces VLM tokens 5-20×
- Acceptable: the total p95 < 30 s budget still holds
Worst case: 2 generations per component if the first fails (regeneration after vision enrichment)
- Mitigated: vision activates only on failed fields, not the whole spec

Consequences

Pros:

0 false vision activations
Cost scales with complexity automatically (simple sites are cheap, complex ones pay more)
Adding new code extraction layers reduces vision usage without code changes

Cons:

Slower on the failure path (sequential: code → fail → vision → retry)
Not "fail fast": a complex site is recognized as expensive only after the code path has been tried and has failed

Distribution impact

Under reactive activation:

Bootstrap (M1-3): 38% of URLs hit vision (code path immature)
Steady (M6+): 30% of URLs hit vision (cache + atoms + LoRA mature)
Mature (M12+): 18% of URLs hit vision (compound effects)

vs predictive activation: 50%+ baseline vision usage because of false positives.
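
These shares translate into an expected cost per URL. The sketch below is a rough estimate under explicit assumptions: a URL costs $0.001 on the code-only path and $0.005 once vision is involved (the values returned by the pipeline snippet), the full vision fallback is ignored because its cost is not stated here, and the predictive baseline is taken at exactly 50% with the same per-activation cost.

```javascript
const COST_CODE_ONLY_USD = 0.001;   // mode: 'code-only'
const COST_CODE_VISION_USD = 0.005; // mode: 'code+vision'

// Expected cost per URL given the share of URLs that hit vision.
function expectedCostPerUrl(visionRate) {
  return (1 - visionRate) * COST_CODE_ONLY_USD + visionRate * COST_CODE_VISION_USD;
}

const scenarios = [
  ['Bootstrap (M1-3)', 0.38],
  ['Steady (M6+)', 0.30],
  ['Mature (M12+)', 0.18],
  ['Predictive baseline (assumed 50%)', 0.50],
];

for (const [name, rate] of scenarios) {
  console.log(`${name}: $${expectedCostPerUrl(rate).toFixed(5)} per URL`);
}
```

With these inputs the estimate falls from roughly $0.0025 per URL at bootstrap to about $0.0017 at maturity, against about $0.0030 for the predictive baseline.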

Alternatives rejected

A. Predictive triggers (original)

❌ False positives waste cost
❌ Heuristics are not calibrated

B. Always vision (vision-first, see ADR 0007)

❌ 5× cost overhead; the document rejected this approach entirely

C. Vision in parallel with code, take whichever is better

❌ Vision cost paid even when code wins (~$0.005 per URL guaranteed)
❌ "Whichever is better" requires comparison logic → back to weighted scoring (see ADR 0009)

Cross-references

Main spec § IV Phase 7
Main spec § 0.6 Pivot 5 — pivot history
ADR 0007 — code-first foundation
ADR 0009 — what defines "failure" triggering reactive
