Web Tools Policy Optimization Roadmap

Related Linear issue: MUL-267

Context

The current web evidence guard solved the immediate quality issue:

It enforces web_search -> web_fetch evidence coverage in runtime.
It blocks snippet-only finalization in key web-dependent cases.

However, semantic intent detection still relies on hard-coded regex cue groups in packages/core/src/agent/web-tools-policy.ts. The approach is deterministic, but it is hard to maintain over time and brittle across languages.

Problem Statement

Current limitations:

Semantic classification logic is tightly coupled with runtime enforcement code.
Pattern lists are code-level constants, making iteration high-friction.
Without a stronger benchmark loop, expanding pattern coverage risks overfitting and regressions.

Target Architecture

Use a hybrid policy model:

Deterministic guardrail layer (must keep)

Tool-trace based invariants (e.g. search/fetch sequencing, minimum successful fetch count).

Semantic decision layer (new)

Lightweight model/classifier returns decision + confidence + reason codes.

Rulepack fallback layer (refactor existing patterns)

Externalized locale-aware cue packs for conservative fallback only.
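
As an illustration of the deterministic guardrail layer, the tool-trace invariants could be checked roughly as follows. This is a minimal sketch, not the existing implementation: the `ToolCall` shape, the `checkEvidenceCoverage` name, and the default minimum of one successful fetch are all assumptions made for the example.

```typescript
// Hypothetical trace shape; the real runtime's tool-trace types may differ.
type ToolCall = {
  tool: "web_search" | "web_fetch";
  ok: boolean; // whether the call completed successfully
};

type InvariantResult = {
  satisfied: boolean;
  violations: string[];
};

// Assumed invariant: once a search has succeeded, at least
// `minSuccessfulFetches` successful fetches must follow it.
// Fetches that happen before any search are not counted as coverage.
function checkEvidenceCoverage(
  trace: ToolCall[],
  minSuccessfulFetches = 1,
): InvariantResult {
  const violations: string[] = [];
  let searchSeen = false;
  let fetchesAfterSearch = 0;

  for (const call of trace) {
    if (call.tool === "web_search" && call.ok) {
      searchSeen = true;
    } else if (call.tool === "web_fetch" && call.ok && searchSeen) {
      fetchesAfterSearch += 1;
    }
  }

  if (searchSeen && fetchesAfterSearch < minSuccessfulFetches) {
    violations.push("snippet_only: too few successful fetches after search");
  }

  return { satisfied: violations.length === 0, violations };
}
```

Because the check reads only the recorded trace, it stays deterministic regardless of what the semantic or fallback layers report.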

Migration Plan

Phase 1: Decouple configuration

Move regex cue groups out of web-tools-policy.ts into a policy registry.
Keep behavior equivalent.
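
One possible shape for such a registry is sketched below. The `CuePack` and `PolicyRegistry` names, the fields, and the sample patterns are hypothetical placeholders, not the cue groups currently in web-tools-policy.ts. Patterns are stored as strings so that they can later be loaded from config rather than from code.

```typescript
// Hypothetical shapes; none of these names come from the existing codebase.
type CuePack = {
  id: string;
  locale: string; // e.g. "en"
  version: string;
  patterns: string[]; // regex sources kept as data, not RegExp literals
};

class PolicyRegistry {
  private compiled = new Map<string, RegExp[]>();

  // Compile a pack once at registration and group its patterns by locale.
  register(pack: CuePack): void {
    const existing = this.compiled.get(pack.locale) ?? [];
    const added = pack.patterns.map((src) => new RegExp(src, "iu"));
    this.compiled.set(pack.locale, [...existing, ...added]);
  }

  // True if any registered pattern for the locale matches the text.
  matches(locale: string, text: string): boolean {
    const patterns = this.compiled.get(locale) ?? [];
    return patterns.some((re) => re.test(text));
  }
}

// Illustrative pack only; real cue groups would be migrated unchanged.
const registry = new PolicyRegistry();
registry.register({
  id: "freshness-en",
  locale: "en",
  version: "1",
  patterns: ["\\blatest\\b", "\\btoday\\b"],
});
```

Migrating the existing cue groups into packs without editing the patterns themselves is what would keep Phase 1 behavior equivalent.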

Phase 2: Add semantic classifier path

Add an optional semantic decision step with confidence threshold.
Preserve deterministic tool-trace constraints as final authority.
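
The precedence between the three layers might look like the sketch below. All type names, field names, and the 0.8 threshold are assumptions for illustration. "Final authority" is interpreted here as: a tool-trace violation blocks finalization no matter what the classifier returns.

```typescript
// Hypothetical classifier output: decision + confidence + reason codes.
type SemanticDecision = {
  requiresWebEvidence: boolean;
  confidence: number; // assumed to be in the range 0..1
  reasonCodes: string[];
};

type PolicyInput = {
  traceViolation: boolean; // a tool-trace invariant failed
  hasFetchedEvidence: boolean; // at least one successful fetch happened
  semantic: SemanticDecision | null; // null when the feature flag is off
  fallbackMatched: boolean; // result of the rulepack fallback
};

type PolicyOutcome = {
  allowFinalize: boolean;
  source: "tool-trace" | "semantic" | "fallback-pattern";
};

function decide(input: PolicyInput, threshold = 0.8): PolicyOutcome {
  // 1. Deterministic constraints win unconditionally.
  if (input.traceViolation) {
    return { allowFinalize: false, source: "tool-trace" };
  }

  // 2. Start from the conservative fallback, then let a confident
  //    semantic decision replace it.
  let source: PolicyOutcome["source"] = "fallback-pattern";
  let requiresEvidence = input.fallbackMatched;
  const semantic = input.semantic;
  if (semantic !== null && semantic.confidence >= threshold) {
    source = "semantic";
    requiresEvidence = semantic.requiresWebEvidence;
  }

  // 3. Finalization is allowed unless evidence is required but missing.
  return {
    allowFinalize: !requiresEvidence || input.hasFetchedEvidence,
    source,
  };
}
```

With `semantic` set to `null`, the function reduces to trace checks plus fallback patterns, which is one way to keep the new path optional.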

Phase 3: Observability and tuning

Emit run-log fields for policy decision source:
- tool-trace
- semantic
- fallback-pattern
Add benchmark slices focused on false-positive/false-negative policy triggers.
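
A run-log entry and a misfire summary for a benchmark slice could be sketched as follows. The field names, the `web_policy_decision` event name, and the definition of misfire rate as (false positives + false negatives) / total cases are assumptions, not an existing log schema.

```typescript
type DecisionSource = "tool-trace" | "semantic" | "fallback-pattern";

// Hypothetical run-log entry for one policy decision.
type PolicyLogEntry = {
  event: "web_policy_decision";
  source: DecisionSource;
  blocked: boolean;
  confidence?: number; // only meaningful when source is "semantic"
  reasonCodes: string[];
};

// Serialize one entry as a single JSON line for the run log.
function toLogLine(entry: PolicyLogEntry): string {
  return JSON.stringify(entry);
}

// A labeled benchmark case: what the policy should have done vs. what it did.
type BenchmarkCase = {
  expectedBlock: boolean;
  entry: PolicyLogEntry;
};

function misfireStats(cases: BenchmarkCase[]) {
  let falsePositives = 0; // blocked, but should not have been
  let falseNegatives = 0; // not blocked, but should have been
  for (const c of cases) {
    if (c.entry.blocked && !c.expectedBlock) falsePositives += 1;
    if (!c.entry.blocked && c.expectedBlock) falseNegatives += 1;
  }
  const total = cases.length;
  const misfireRate =
    total === 0 ? 0 : (falsePositives + falseNegatives) / total;
  return { falsePositives, falseNegatives, misfireRate };
}
```

Grouping cases by `entry.source` before calling `misfireStats` would show which layer is responsible for a given class of misfire.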

Phase 4: Reduce hard-coded fallback

Keep only minimal safety patterns in code.
Shift language/phrase evolution to versioned config updates.

Acceptance Criteria

No large hard-coded regex arrays in runtime policy file.
Semantic decision path is independently testable and feature-flagged.
Baseline behavior remains backward-compatible for existing guard cases.
Benchmark report shows a policy misfire rate equal to or lower than the current regex-based baseline.

Non-goals

Replacing deterministic tool-trace enforcement with pure model decisions.
Expanding scope to unrelated tool policy domains in the same iteration.
