ADRs
ADR 0011 - Vision activation: reactive, not predictive
  • Date: 2026-05-22
  • Status: Accepted
  • Feature: URL-import
  • Affects: url_import_spec.md § IV Phase 7

Context

If the code path does not pass the acceptance gate (ADR 0009), vision has to be brought in. The question is when exactly.

Initial draft (rejected): predictive triggers

  • "If the component contains a canvas → vision"
  • "If CSS-in-JS is detected → vision"
  • "If there are 4+ states → vision (complex)"

Problems:

  • Heuristics are arbitrary, not data-derived
  • False positives: vision activates in cases where the code path would have coped
  • Cost waste: $0.005 per false activation × 20-30% of URLs = real overhead
  • Hard to evolve: adding a new rule means revisiting all existing ones
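The cost-waste bullet can be made concrete with a little arithmetic. The sketch below assumes that "20-30% of URLs" is the share of URLs on which a predictive trigger fires unnecessarily. The helper name `wastedCostPerUrl` and the batch size are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: average wasted spend per URL caused by false vision activations.
function wastedCostPerUrl(costPerActivation, falsePositiveRate) {
  return costPerActivation * falsePositiveRate;
}

const VISION_COST_USD = 0.005; // per activation, figure quoted in the ADR

// Assumed reading: 20-30% of all URLs trigger vision unnecessarily.
const low = wastedCostPerUrl(VISION_COST_USD, 0.20);
const high = wastedCostPerUrl(VISION_COST_USD, 0.30);

// Batch size is an arbitrary illustration, not a number from the ADR.
const BATCH = 10000;
console.log(`wasted per URL: $${low.toFixed(4)} to $${high.toFixed(4)}`);
console.log(`wasted per ${BATCH} URLs: $${(low * BATCH).toFixed(2)} to $${(high * BATCH).toFixed(2)}`);
```

Under these assumptions the average waste is roughly $0.001 to $0.0015 per URL, the same order of magnitude as the $0.001 code-only cost used in the pipeline.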

Decision

Reactive activation: vision is engaged ONLY after the code path fails the acceptance gate:

// `screenshot` is the reference capture of the target page. Passing it in as a
// parameter is an assumption; the draft used it without defining its source.
async function pipeline(url, screenshot) {
  const spec = await codeExtract(url);                    // Phase 1-4
  const tsx = await generateTsx(spec);                    // Phase 5
  const test = await acceptanceGate(tsx, screenshot);     // Phase 6

  if (test.ok) {
    return { tsx, mode: 'code-only', cost: 0.001 };
  }

  // Reactive: vision runs only now, and only on the failed fields
  const enriched = await visionEnrich(spec, test.failedFields, screenshot);  // Phase 7
  const tsx2 = await generateTsx(enriched);
  const test2 = await acceptanceGate(tsx2, screenshot);

  if (test2.ok) return { tsx: tsx2, mode: 'code+vision', cost: 0.005 };

  // Full vision fallback
  const visionSpec = await visionExtractFull(url, screenshot);
  // ...
}

Key property: every vision activation is a response to a concrete acceptance failure, not speculation.
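To check this property in isolation, the control flow can be exercised with injected stubs. The following is a minimal sketch, not the real pipeline: `reactivePipeline`, `makeStubPhases`, the `needs-full-vision` mode label, and the rule "the gate passes when the spec has a `color` field" are all hypothetical stand-ins.

```javascript
// Sketch only: every phase is an injected stub, not the real implementation.
async function reactivePipeline(url, screenshot, phases) {
  const spec = await phases.codeExtract(url);
  const tsx = await phases.generateTsx(spec);
  const test = await phases.acceptanceGate(tsx, screenshot);
  if (test.ok) return { tsx, mode: 'code-only' };

  // Vision is reachable only through a failed gate, and only for the failed fields.
  const enriched = await phases.visionEnrich(spec, test.failedFields, screenshot);
  const tsx2 = await phases.generateTsx(enriched);
  const test2 = await phases.acceptanceGate(tsx2, screenshot);
  return { tsx: tsx2, mode: test2.ok ? 'code+vision' : 'needs-full-vision' };
}

// Hypothetical stubs: the gate passes when the generated TSX mentions a "color" field.
function makeStubPhases(initialSpec) {
  const calls = { vision: 0 };
  const phases = {
    codeExtract: async () => ({ ...initialSpec }),
    generateTsx: async (spec) => `<Box style={${JSON.stringify(spec)}} />`,
    acceptanceGate: async (tsx) =>
      tsx.includes('"color"')
        ? { ok: true, failedFields: [] }
        : { ok: false, failedFields: ['color'] },
    visionEnrich: async (spec, failedFields) => {
      calls.vision += 1; // count activations so the property can be checked
      const out = { ...spec };
      for (const field of failedFields) out[field] = 'from-vision';
      return out;
    },
  };
  return { phases, calls };
}

// Demo: one spec the code path handles, one it does not.
async function demo() {
  const good = makeStubPhases({ width: 320, color: '#fff' });
  const bad = makeStubPhases({ width: 320 });
  const r1 = await reactivePipeline('https://example.test/a', 'a.png', good.phases);
  const r2 = await reactivePipeline('https://example.test/b', 'b.png', bad.phases);
  console.log(r1.mode, 'vision calls:', good.calls.vision);
  console.log(r2.mode, 'vision calls:', bad.calls.vision);
}
demo();
```

Counting calls inside the stub gives a direct way to check that no code path reaches vision without a failed gate first.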

Why reactive wins

  1. Zero false positives: vision never runs "just in case"
  2. Targeted enrichment: only failedFields, not the whole spec again
  3. Cost transparency: $0.005 per vision activation, tracked per URL
  4. Pipeline self-tuning: improvements to a weak code path automatically reduce vision activations
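Targeted enrichment (point 2) can be pictured as a merge that accepts vision output only for the fields that failed the gate. This is a sketch under assumptions: `mergeVisionFields`, the flat spec shape, and the field names are hypothetical and not taken from url_import_spec.md.

```javascript
// Hypothetical merge step: take vision values only for fields that failed the gate.
function mergeVisionFields(spec, failedFields, visionValues) {
  const enriched = { ...spec }; // never mutate the code-extracted spec
  for (const field of failedFields) {
    if (field in visionValues) {
      enriched[field] = visionValues[field];
    }
  }
  return enriched;
}

// Illustrative data: the code path could not resolve `background`.
const codeSpec = { width: 320, padding: 16, background: null };
const visionValues = { width: 300, background: 'linear-gradient(#111, #333)' };

const enrichedSpec = mergeVisionFields(codeSpec, ['background'], visionValues);
console.log(enrichedSpec);
```

In this sketch `width` stays at the code-extracted 320 even though vision proposed 300, because `width` passed the gate and was never listed in the failed fields.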

Trade-off

  • Latency: when the code path fails, a vision call is added (~5-15s)
    • Mitigated: OmniParser pre-filter (Tier 2+) reduces VLM tokens 5-20×
    • Acceptable: the total p95 < 30s budget still holds
  • Worst case: 2 generations per component if the first fails (regeneration after vision enrich)
    • Mitigated: vision activates only on the failed fields, not the whole spec

Consequences

Pros:

  • 0 false vision activations
  • Cost scales with complexity automatically (simple sites are cheap, complex ones pay more)
  • Adding new code extraction layers reduces vision usage without code changes

Cons:

  • Slower on the failure path (sequential: code → fail → vision → retry)
  • Not "fail fast": complex sites are recognized as expensive only after the code path has already run and failed

Distribution impact

With reactive activation:

  • Bootstrap (M1-3): 38% URLs hit vision (code path immature)
  • Steady (M6+): 30% URLs hit vision (cache + atoms + LoRA mature)
  • Mature (M12+): 18% URLs hit vision (compound effects)

vs predictive activation: 50%+ baseline vision usage due to false positives.
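These shares translate into an expected cost per URL. The sketch below is a rough estimate under stated assumptions: it treats the $0.001 and $0.005 figures from the pipeline as the total cost of a code-only and a code+vision URL respectively, ignores the full vision fallback, and takes the predictive baseline at exactly 50%. The function name `expectedCostPerUrl` is hypothetical.

```javascript
// Costs as returned by the pipeline in this ADR (assumed to be totals per URL).
const CODE_ONLY_COST = 0.001;
const CODE_VISION_COST = 0.005;

// Expected cost per URL when `visionShare` of URLs end up on the vision path.
function expectedCostPerUrl(visionShare) {
  return (1 - visionShare) * CODE_ONLY_COST + visionShare * CODE_VISION_COST;
}

// Shares from the distribution above; `predictive` uses the 50% lower bound.
const scenarios = { bootstrap: 0.38, steady: 0.30, mature: 0.18, predictive: 0.50 };

for (const [name, share] of Object.entries(scenarios)) {
  console.log(`${name}: $${expectedCostPerUrl(share).toFixed(5)} per URL`);
}
```

Under these assumptions the expected cost falls from about $0.0025 per URL at bootstrap to about $0.0017 at maturity, against about $0.0030 for the predictive baseline.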

Alternatives rejected

A. Predictive triggers (original)

  • ❌ False positives waste cost
  • ❌ Heuristics не calibrated

B. Always vision (vision-first, see ADR 0007)

  • ❌ 5× cost overhead; the document rejected this approach entirely

C. Vision in parallel with code, take whichever is better

  • ❌ Vision cost paid even when code wins (~$0.005 per URL guaranteed)
  • ❌ "Whichever is better" requires comparison logic → back to weighted scoring (see ADR 0009)

Cross-references