IntelligenceX Roadmap

Status: In progress

Ops Agent Backlog

For the "smart colleague" chat agent roadmap (ADPlayground/TestimoX/ComputerX/EventViewerX + runtime parallelism), see Docs/agent-superpowers-backlog.md.

Chat/Tools Decoupling Migration

Track contract-first Chat/Tools migration execution against PLAN.md and PLAN-EXECUTION-ORDER.md.
Keep ADR state current in InternalDocs/architecture/adr-0001-chat-tools-contract-boundary.md.

PR Feedback Churn

Reviewer E2E Launch Plan

Status: In progress

Goal: reviewer + static analysis + onboarding (CLI + Web) feel "done" end-to-end for a new repo.

Acceptance (Definition Of Done)

A new user can run intelligencex setup wizard on a clean machine and reach "PR created" without manual repo edits.
A new user can run intelligencex setup web and reach "PR created" without manual repo edits.
First merged onboarding PR produces a successful review on the next PR (sticky summary + inline when supported).
Review comment always includes reviewed SHA and an explicit diff-range label (base -> head).
Static analysis runs before review, publishes artifacts, and the review comment always renders analysis status (pass/unavailable) even when findings are zero.
Static analysis gate behavior is predictable: failing types/severities are documented and match observed CI results.
Dependabot identity limitation is documented and visible during onboarding (reviews may be authored by github-actions).

Phase A — Onboarding UX (CLI + Web)

CLI wizard: add "Enable static analysis" toggle and pack picker (default all-50).
Web UI: add "Enable static analysis" toggle and pack picker (default all-50).
CLI + Web: show a final "Effective config" preview (review + analysis) before Apply.
CLI + Web: surface the Dependabot secrets limitation in the UI copy (why bot identity may differ).
CLI + Web: add a post-Apply "Verify" step (workflow present, config present if requested, required secrets present, last runs links).

Phase B — Setup Output (Workflow + reviewer.json)

Setup config writer: include analysis section in .intelligencex/reviewer.json when static analysis is enabled (create + merge paths).
Setup presets: define recommended tiers for analysis packs (all-50, all-100, all-500) and a "no analysis" option.
Ensure workflow/config generation stays stable across upgrades (managed block upgrades do not delete user customization outside managed block).

Phase C — Review Reliability (Reduce Churn, Increase Continuity)

Add a diff range option for incremental review (for example reviewDiffRange: last-reviewed uses the last reviewed commit from the sticky summary as base).
Add a deterministic "replay" mode for debugging: load a saved PR snapshot + artifacts and run review formatting without provider calls.
Ensure thread triage does not repeatedly re-suggest already-addressed items; prefer dedupe and summary stability over rewriting.

Phase D — Static Analysis Productization

Wizard: explain analysis gate semantics (which types/severities fail the check) and link to docs.
Add "list packs" affordance in onboarding (CLI and Web) so users can browse available packs.
Provide an optional "export analyzer config" path for IDE support (explicit opt-in, never default).
Add a CI guardrail: intelligencex analyze validate-catalog and pack integrity checks run on every PR that touches Analysis/Catalog or Analysis/Packs.

Phase E — Docs + Samples

Promote Docs/reviewer/static-analysis.md from Draft to stable docs (align examples with actual wizard output).
Add "First PR checklist" doc: what to expect after merging onboarding PR and how to debug common issues.
Add screenshots (CLI + Web) for the "Configure" step and the "Verify" step.

Phase F — End-To-End Tests

Add tests for setup plan generation: ensure enabling analysis produces analysis in reviewer.json and does not regress existing review settings.
Add tests for config merge behavior (existing reviewer.json + enable analysis preserves unrelated user keys).
Add unit tests for SetupAnalysisPacks.TryNormalizeCsv (empty/default, invalid chars, max ids/length, dedupe).
Add tests for web setup validation: analysis fields rejected when not applicable (config override, update-secret/cleanup), and rejected when analysisEnabled != true but gate/packs provided.

Engine Roadmap

Status: In progress

Now — Phase 1–2 (concrete)

Add error classification enum + mapping (S)
Add diagnostic context in reviewer output (request id, retry count, provider) (S)
Add retry policy options in config (backoff + max attempts) (M)
Gate fail-open to transient errors only (S)
Add connectivity preflight (DNS/TLS) with actionable errors (M)
Add diff-range selection + default (current/pr-base/first-review) (M)
Add include/exclude glob filters for files (S)
Add smart chunking (group related hunks) (M)
Add review intent presets (security/perf/maintainability) (S)

Phase 0 — Scope + success criteria

Define "engine" scope (review pipeline, providers, context builder, formatter, thread triage).
Capture success metrics (review latency, failure rate, reviewer usefulness score).
Decide default review mode + model policy (safe defaults).

Phase 1 — Reliability + diagnostics

Classify error types (transient vs auth vs config vs provider) with explicit codes.
Add structured diagnostics block to reviewer output (request id, retry count, provider).
Add connectivity preflight (DNS/TLS) with actionable error messages.
Add configurable retry policy with exponential backoff + jitter.
Add fail-open gating for transient errors only (explicit config).

Phase 2 — Context quality

Add diff-range strategy options (current/pr-base/first-review) with default.
Add file filters (include/exclude globs).
Add binary/generated file skipping.
Add smart chunking (keep related hunks together; avoid orphaned changes).
Add language-aware hints (prompt includes detected languages).
Add "review intent" presets (security/perf/maintainability).

Phase 3 — Review output + UX

Add optional "reasoning level" label in header (low/medium/high) when provider supports it.
Add optional usage/limits line near model header (opt-in; ChatGPT auth).
Add structured findings schema for bots/automation (severity + file + line).
Add summary stability (avoid noisy rewording across reruns).
Add “triage mode” that only checks open threads.

Phase 4 — Thread triage + auto-resolve

Keep thread triage in main review comment (configurable placement).
Support "explain why not resolved" replies (optional).
Add diff-based auto-resolve checks (explicit evidence required).
Add per-bot policies (auto-resolve only for our bot by default).
Add PR comment summarizing what was auto-resolved.

Phase 5 — Provider abstraction

Centralize provider contracts (capabilities, limits, streaming, auth).
Add provider capability flags (usage API, reasoning level, streaming).
Add opt-in provider fallback (e.g., OpenAI → Copilot).
Add provider health checks and circuit breaker.

Phase 5.5 — Code host support (Azure DevOps Services)

Define code-host interface (PR metadata, files, diff, comments, threads).
Add ADO auth options (PAT, System.AccessToken) + env var mapping.
Phase 1: summary-only PR comments (no inline) using ADO REST APIs (PR-level changes endpoint for full file list).
Document Azure auth scheme heuristic + override guidance.
Document PR-level changes behavior (uses pull request changes endpoint).
Phase 2: inline comments with iteration + line mapping support.
Phase 3: thread triage + auto-resolve via thread status updates.
Add CLI flags/config: provider=azure, azureOrg, azureProject, azureRepo, azureBaseUrl, azureTokenEnv.
Add ADO pipeline templates (onboarding, permissions, secrets).

Phase 6 — Performance + cost

Add response streaming where supported (show partial progress).
Add cache for context artifacts (diff, file lists, PR metadata).
Add concurrency controls to avoid API throttling.
Consider shared HttpClient/IHttpClientFactory for Azure DevOps client.
Add token budgeting per file/group with hard caps.
Add optional "budget exceeded" summary behavior.

Phase 7 — Security + trust

Redact sensitive data before prompt (secrets, tokens, private keys).
Sanitize Azure DevOps API error payloads in logs.
Add "untrusted PR" guardrails (no secret access, no write actions).
Add workflow integrity check (block self-modifying workflow runs).
Add audit logging for secrets usage.

Phase 8 — Testing + validation

Add deterministic test harness with recorded provider responses.
Add golden-file tests for formatter output stability.
Add smoke tests for thread triage + auto-resolve.
Add integration test for usage/limits display.

Phase 9 — Developer experience

Provide local "engine replay" CLI (load PR snapshot + run offline).
Provide structured JSON output mode for integrations.
Add config validator with helpful errors + schema links.

Onboarding Roadmap (Wizard + CLI)

Status: In progress

Phase 0 — Goals & constraints

Confirm onboarding goals (wizard + PR-only path + upgrade path)
Confirm default auth choice (vendor OAuth for single repo, BYO App for org)
Confirm secret handling policy (auto if Sodium, manual fallback)
Confirm UI choice (local web UI + Spectre.Console wizard)

Phase 1 — Core setup architecture (shared by CLI + UI)

Add SetupHost orchestration layer (single source of truth)
Add wizard state model (repos, config, auth, apply mode)
Implement Plan → Apply flow with dry-run output
Implement upgrade/modify detection (existing workflow/config)

Phase 2 — CLI Wizard (Spectre.Console)

Phase 3 — GitHub App Manifest (BYO App)

Manifest generation (pre-filled app definition)
Open GitHub "Create App from Manifest"
Handle callback and exchange code for app id + PEM
App install flow (select repos / all repos)
Store app credentials locally for reuse

Phase 4 — Secrets handling

Auto-encrypt and upload secrets when Sodium available
Manual secret fallback (print export + instructions)
Support INTELLIGENCEX_AUTH_KEY for encrypted store

Phase 5 — Local Web UI Wizard

Phase 6 — Reviewer improvements

Early auth validation with actionable errors
Safer auth store handling in reviewer
Explicit secrets in workflow (no secrets: inherit)
Retry extra ResponseEnded + fail-open summary option

Phase 7 — Docs & README

README rewrite (what it is, trust model, quickstart)
Docs: wizard onboarding, CLI quickstart, security/trust
Screenshot placeholders + asset folder

Phase 8 — Copilot (experimental)

Keep CLI Copilot optional provider
Research native Copilot feasibility
Add provider toggle in wizard

Phase 9 — DevEx automation

Auto-resolve IntelligenceX bot review threads after fixes (CLI command or GitHub App action)

Review Feedback Backlog (Bots)

Collapsed by PR. Includes only explicit checklist items found in bot reviews/comments.

PR #95 Fix duplicate weekly labels in usage summary

PR #109 Improve static-analysis visibility in review comments

PR #112 Address remaining static-analysis follow-up TODOs

PR #113 Improve static analysis policy readability with enabled-rules preview

PR #114 Improve static analysis policy with explicit failing/clean rule previews

Make AddOutcomeLines null-safe for findings input and preserve safe policy rendering when findings are null. Links: #114 (comment)
Refactor TryBuildBasePolicy multi-out return shape to a typed context object for maintainability. Links: #114 (comment)
Add docs note describing deterministic ordering and truncation behavior for outcome preview lines. Links: #114 (comment), #114 (comment)
Add regression test for null findings with a present load report. Links: #114 (comment)
Add regression test asserting deterministic ordering for failing and outside-pack preview sections. Links: #114 (comment)
Sort failing-rule preview by finding count (desc) then rule id to keep truncation behavior focused on highest-impact failures. Links: #114 (comment)
Align static-analysis docs sample with actual truncation behavior when only 5 enabled rules are shown. Links: #114 (comment)
Keep explicit aggregate outside-pack count assertion in analysis policy tests to protect status/count semantics. Links: #114 (comment)

PR #115 Harden static analysis policy load path and preview tests

PR #116 Cleanup static-analysis review followups and TODO backlog

Remove raw-string snapshot literal in analysis-policy test to keep Windows/net472 compilation compatible. Links: https://github.com/EvotecIT/IntelligenceX/actions/runs/21765222612/job/62799147040
Set analysis config mode explicitly in snapshot test setup to avoid default-coupled expectations. Links: #116 (comment)
Make AssertTextBlockEquals trim both ends to reduce non-semantic formatting noise failures. Links: #116 (comment)
Make NormalizeNewlines null-safe for future helper reuse. Links: #116 (comment)
Keep full policy block snapshot strict in this scenario to intentionally catch line-order and formatting regressions. Links: #116 (comment)

PR #85 Static analysis catalog + CLI export

Update analysis loading to use reviewFiles so analysis findings respect filters. Links: #85 (review)
Add resilience around catalog loading to avoid failing reviews when the catalog is missing or unreadable. Links: #85 (review)
Expand severity normalization to handle “critical” (and other high-severity values if expected). Links: #85 (review)
Null-guard pack rules before adding to policy output. Links: #85 (comment)
Normalize SeverityOverrides into a case-insensitive dictionary. Links: #85 (comment)
Catch malformed glob patterns to avoid ArgumentException. Links: #85 (comment)
Treat unknown severities distinctly from none to avoid silent suppression. Links: #85 (comment)
Log malformed glob patterns so config errors are visible. Links: #85 (comment)

PR #99 Add maintainability LOC rule and split setup runner

PR #127 Add tiered analysis packs and rule inventory formats

list-rules should recognize --help, -h, and help and return usage text with exit 0. Links: #127 (comment), #127 (comment)
Keep --format json output machine-parseable by writing warnings to stderr instead of stdout. Links: #127 (comment)
Emit [] for empty rule sets in JSON mode instead of text output. Links: #127 (comment)
Test helper should capture both stdout and stderr so diagnostics stream moves do not hide regressions. Links: #127 (comment)

PR #129 Expand C# static-analysis catalog and tier pack coverage

Correct malformed CA5350 description text in generated catalog metadata. Links: #129 (comment), #129 (comment)
Normalize CA5389 title casing in generated metadata for user-facing consistency. Links: #129 (comment), #129 (comment)
Document interpreter-based catalog refresh command in README for portability. Links: #129 (comment), #129 (comment)
Avoid selecting defaultSeverity: none rules in tier packs so enabled packs do not silently ship disabled analyzer entries. Links: #129 (comment)
Select latest NetAnalyzers NuGet package using semantic version ordering. Links: #129 (comment)

Uh oh!

FilesExpand file tree

TODO.md

Latest commit

History

TODO.md

File metadata and controls

IntelligenceX Roadmap

Ops Agent Backlog

Chat/Tools Decoupling Migration

PR Feedback Churn

Reviewer E2E Launch Plan

Acceptance (Definition Of Done)

Phase A — Onboarding UX (CLI + Web)

Phase B — Setup Output (Workflow + reviewer.json)

Phase C — Review Reliability (Reduce Churn, Increase Continuity)

Phase D — Static Analysis Productization

Phase E — Docs + Samples

Phase F — End-To-End Tests

Engine Roadmap

Now — Phase 1–2 (concrete)

Phase 0 — Scope + success criteria

Phase 1 — Reliability + diagnostics

Phase 2 — Context quality

Phase 3 — Review output + UX

Phase 4 — Thread triage + auto-resolve

Phase 5 — Provider abstraction

Phase 5.5 — Code host support (Azure DevOps Services)

Phase 6 — Performance + cost

Phase 7 — Security + trust

Phase 8 — Testing + validation

Phase 9 — Developer experience

Onboarding Roadmap (Wizard + CLI)

Phase 0 — Goals & constraints

Phase 1 — Core setup architecture (shared by CLI + UI)

Phase 2 — CLI Wizard (Spectre.Console)

Phase 3 — GitHub App Manifest (BYO App)

Phase 4 — Secrets handling

Phase 5 — Local Web UI Wizard

Phase 6 — Reviewer improvements

Phase 7 — Docs & README

Phase 8 — Copilot (experimental)

Phase 9 — DevEx automation

Review Feedback Backlog (Bots)