Releases: orenlab/codeclone
CodeClone 2.0.0: a baseline-aware structural review platform for Python.
CodeClone 2.0.0 is the first stable 2.x release: a baseline-aware structural review platform for Python with CLI, MCP, VS Code, Claude Desktop, Codex, and GitHub Action surfaces built on one canonical report contract.
Highlights
- Canonical package layout — global architecture refactor completed in 2.0.0b6; the entire runtime now lives in stable, directly importable modules.
- Stable CLI, report, MCP, and baseline contracts for the full 2.x line.
- Adaptive dependency-depth scoring — project-relative model replacing the old fixed threshold; avg/p95/max chain metrics surface in reports and CLI.
- Security Surfaces — report-only trust-boundary inventory: subprocess, filesystem mutations, crypto, dynamic loading. No vulnerability claims, no score impact.
- Coverage Join — fuse external Cobertura XML into the structural review to surface hotspots against real pytest coverage.
- Native client surfaces — VS Code extension, Claude Desktop bundle, Codex plugin, and composite GitHub Action, all over the same codeclone-mcp contract.
- Dedicated PyPI README and refreshed CodeClone branding.
Install
uv tool install codeclone
uv tool install "codeclone[mcp]"
See CHANGELOG.md for the full contract and migration details.
CodeClone 2.0.0b7: a beta hotfix for packaging-only issues found after the 2.0.0b6 publish.
Packaging
- Constrain the optional MCP extra to httpx>=0.27.1,<1 so prerelease install flows such as uv tool install --pre "codeclone[mcp]" do not resolve incompatible httpx 1.0.dev* builds through the upstream MCP dependency graph.
- Pin the preview VS Code extension packaging tool to @vscode/vsce@2.25.0, removing the vulnerable transitive uuid<14 chain from package-lock.json while preserving .vsix packaging.
- Keep local pre-commit runs stable after package builds by letting mypy use the configured source roots and ignoring generated build/ and site/ artifacts.
CodeClone 2.0.0b6: land the architecture split, adaptive dependency profiling, and security review surfaces
The global package refactor lands here: the entire runtime moves onto the canonical module layout and legacy shims are removed for good. On top of that, dependency-depth scoring is replaced with an adaptive project-relative model, and the report/cache contracts advance to surface the new depth profile and the report-only security_surfaces layer.
Package layout and contracts
- Move the runtime fully onto the canonical package layout: main + surfaces/cli, surfaces/mcp, core, analysis, baseline, cache, contracts, report/document, report/renderers, and report/html.
- Remove remaining legacy root shims and stale compatibility modules in favor of direct canonical imports.
- Remove stale deleted-file cache entries and trim post-refactor import tails that were inflating dependency depth and clone pressure.
- Bump report schema to 2.10 and cache schema to 2.6 for additive dependency depth profile fields and security_surfaces facts; keep clone baseline schema 2.1 and metrics-baseline schema 1.2 unchanged.
- Preserve deterministic contracts and read-only MCP semantics across the new layout.
Dependency depth scoring
- Replace the old fixed dependency-depth penalty (max_depth > 8) with an adaptive internal-graph profile based on avg_depth, p95_depth, and max_depth.
- Keep dependency cycles as the hard signal; treat acyclic depth as adaptive pressure relative to the project's own dependency profile.
- Limit dependency-depth scoring to the internal module graph instead of external imports such as typing or argparse.
- Surface the dependency depth profile in the canonical report, HTML Dependencies tab, and CLI/CI summaries.
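As an illustration of the depth profile described above, the following is a minimal sketch and not CodeClone's implementation. It assumes the internal graph is a plain mapping from module name to the internal modules it imports, that the graph is acyclic (cycles are reported separately as the hard signal), and that p95 uses the nearest-rank method. The function name depth_profile and the sample module names are hypothetical.

```python
from __future__ import annotations

import math
from functools import lru_cache


def depth_profile(graph: dict[str, set[str]]) -> dict[str, float]:
    """Return avg/p95/max import-chain depth for an acyclic internal graph.

    `graph` maps a module to the internal modules it imports. External
    imports (typing, argparse, ...) are assumed to be filtered out already.
    """

    @lru_cache(maxsize=None)
    def depth(module: str) -> int:
        # Depth is the length of the longest import chain starting here.
        deps = graph.get(module, set())
        return 0 if not deps else 1 + max(depth(dep) for dep in deps)

    depths = sorted(depth(module) for module in graph)
    if not depths:
        return {"avg_depth": 0.0, "p95_depth": 0.0, "max_depth": 0.0}
    # Nearest-rank percentile: smallest depth that covers 95% of modules.
    rank = max(1, math.ceil(0.95 * len(depths)))
    return {
        "avg_depth": sum(depths) / len(depths),
        "p95_depth": float(depths[rank - 1]),
        "max_depth": float(depths[-1]),
    }


# Hypothetical three-module project: cli -> core -> models.
INTERNAL = {
    "app.cli": {"app.core"},
    "app.core": {"app.models"},
    "app.models": set(),
}
profile = depth_profile(INTERNAL)
```

A scorer built on such a profile could compare each chain against the project's own p95 rather than a fixed constant, which is the idea behind "adaptive pressure"; how CodeClone turns the profile into a score is not specified here.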
Security surfaces
- Add metrics.families.security_surfaces: a report-only exact inventory of security-relevant capability surfaces and trust-boundary code.
- Surface compact security_surfaces facts in canonical report JSON, CLI Metrics, HTML Quality, text/markdown projections, and MCP summaries / metrics_detail.
- Keep the layer honest: no vulnerability claims, no score impact, no gates, no SARIF security findings, and no baseline truth.
Tooling, docs, and UX
- Refresh AGENTS, docs/book, and changelog content for the b6 package layout and report schema 2.10.
- Tighten preview client metadata and install guidance for VS Code, Claude Desktop, and Codex.
- Replace the Codex plugin shell snippet with a repo-local shell-free launcher, and parallelize VS Code post-run MCP artifact hydration.
- Add a quiet one-time VS Code extension hint in interactive VS Code terminals, tracked per CodeClone version next to the resolved project cache path.
CodeClone 2.0.0b5: coverage-aware metrics and baseline-honest review surfaces
Contracts, metrics, and review surfaces
- Report schema 2.8: add coverage_adoption, api_surface, coverage_join, and optional clones.suppressed.* (for golden_fixture_paths); separate coverage hotspots vs scope gaps.
- Baselines: clone 2.1, metrics 1.2; compact api_surface payload (local_name on disk, qualnames at runtime); read-compatible with 2.0 / 1.1.
- Add public/private visibility classification for public-symbol metrics (no clone/fingerprint changes).
- Add annotation/docstring adoption coverage: parameter, return, public docstrings, explicit Any.
- Add opt-in API surface inventory + baseline diff (snapshots, additions, breaking changes).
- Add coverage join (--coverage): per-function facts + findings for below-threshold or missing-in-scope functions; current-run only (not baseline truth, no fingerprint impact).
- Add golden_fixture_paths: exclude matching clone groups from health/gates while keeping suppressed facts.
- Add gates: --min-typing-coverage, --min-docstring-coverage, --fail-on-typing-regression, --fail-on-docstring-regression, --fail-on-api-break, --fail-on-untested-hotspots, --coverage-min.
- Surface adoption/API/coverage-join in MCP, CLI Metrics, report payloads, and HTML (Overview + Quality subtab).
- Preserve embedded metrics and optional api_surface in unified baselines.
- Cache 2.5: make analysis-profile compatibility API-surface-aware; invalidate stale non-API warm caches; preserve parameter order; align warm/cold API diffs.
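To make the coverage join bullet above concrete, here is a rough sketch of joining Cobertura line hits onto function spans. It is an assumption-laden illustration, not the CodeClone code path: the helper names (line_hits, join_function), the status strings, and the 0.5 default threshold are all hypothetical, and only the standard Cobertura class/line elements are read.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Tiny hand-written Cobertura document used only for this example.
COBERTURA = """<coverage>
  <packages><package name="pkg"><classes>
    <class name="mod.py" filename="pkg/mod.py"><lines>
      <line number="1" hits="1"/>
      <line number="2" hits="0"/>
      <line number="3" hits="0"/>
      <line number="10" hits="4"/>
    </lines></class>
  </classes></package></packages>
</coverage>"""


def line_hits(xml_text: str) -> dict[str, dict[int, int]]:
    """Map filename -> {line number: hit count} from Cobertura XML."""
    hits: dict[str, dict[int, int]] = {}
    for cls in ET.fromstring(xml_text).iter("class"):
        per_file = hits.setdefault(cls.get("filename", ""), {})
        for line in cls.iter("line"):
            per_file[int(line.get("number", "0"))] = int(line.get("hits", "0"))
    return hits


def join_function(
    hits: dict[str, dict[int, int]],
    filename: str,
    start: int,
    end: int,
    minimum: float = 0.5,  # hypothetical threshold, not a CodeClone default
) -> dict[str, object]:
    """Return a per-function coverage fact for the line span [start, end]."""
    measured = {n: h for n, h in hits.get(filename, {}).items() if start <= n <= end}
    if not measured:
        # No measured lines at all: the function is missing from the coverage data.
        return {"status": "missing", "coverage": None}
    ratio = sum(1 for h in measured.values() if h > 0) / len(measured)
    status = "ok" if ratio >= minimum else "below_threshold"
    return {"status": status, "coverage": ratio}


hits = line_hits(COBERTURA)
low = join_function(hits, "pkg/mod.py", 1, 3)
fine = join_function(hits, "pkg/mod.py", 10, 12)
absent = join_function(hits, "pkg/other.py", 1, 5)
```

Because the facts are derived from whatever coverage file is passed on the current run, they naturally stay "current-run only" and never need to enter a baseline.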
MCP, HTML, and client interpretation
- Surface effective analysis profile in report meta, MCP summary/triage, and HTML subtitle.
- Add health_scope, focus, new_by_source_kind to MCP summary/triage.
- Make baseline mismatch explicit (python tags + no-valid-baseline signal).
- Surface Coverage Join facts and the optional coverage MCP help topic in the VS Code extension when the connected server supports them.
- Prefer workspace-local launchers over PATH (Poetry fallback).
- Add workspace_root to force project .venv selection.
Safety and maintenance
- Validate git_diff_ref as safe single-revision expressions.
- Replace segment digest repr() with canonical JSON bytes (determinism).
- Align CI coverage gate (fail_under = 99) and refresh actions/checkout pin.
- Refresh branch metadata/docs for 2.0.0b5; update README badge to 89 (B).
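The switch from repr() to canonical JSON bytes for digests can be illustrated with a short sketch. The exact canonicalization CodeClone uses is not stated in these notes; the version below assumes sorted keys, compact separators, UTF-8 encoding, and SHA-256, and the function name canonical_digest is hypothetical.

```python
import hashlib
import json


def canonical_digest(payload: object) -> str:
    """Hash a JSON-serializable payload via a canonical byte encoding."""
    data = json.dumps(
        payload,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Two logically identical payloads built in a different key order.
a = {"kind": "segment", "size": 12}
b = {"size": 12, "kind": "segment"}

same_digest = canonical_digest(a) == canonical_digest(b)
same_repr = repr(a) == repr(b)
```

Here same_digest is true while same_repr is false, which is why a repr()-based digest can drift between runs that build the same data in a different order.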
CodeClone 2.0.0b4: first-class MCP, VS Code, Claude, and Codex surfaces
MCP server
- Add help(topic=...) tool for workflow guidance, baseline semantics, analysis profile, and review-state routing (tool count: 20 → 21).
- Add analysis_profile help topic for explicit conservative-first / deeper-review threshold guidance.
- Enrich _SERVER_INSTRUCTIONS with triage-first workflow, budget-aware drill-down, and conservative-first threshold guidance so MCP-capable clients receive structured behavioral context on connect.
- Optimize MCP payloads: short finding IDs (sha256-based for block clones), compact derived section projection, bounded metrics_detail with pagination.
- Fix MCP initialize metadata so serverInfo.version reports the CodeClone package version rather than the underlying mcp runtime version.
Report contract
- Bump canonical report schema to 2.3.
- Add metrics.overloaded_modules — report-only module-hotspot ranking by size, complexity, and coupling pressure.
- Surface Overloaded Modules across JSON, text/markdown, HTML, and MCP without affecting findings, health, or gates.
- Normalize the canonical family name and MCP/report output to overloaded_modules; god_modules remains accepted as a read-only MCP input alias during transition.
CLI and HTML
- Align CLI and HTML scope summaries with canonical inventory totals.
- Redesign Overview tab: Executive Summary becomes 2-column (Issue Breakdown + Source Breakdown) with scan scope in
the section subtitle; Overloaded Modules section replaces the earlier stretched module-hotspot layout.
Documentation
- Add Health Score chapter: scoring inputs, report-only layers, phased expansion policy.
- Document that future releases may lower scores due to broader scoring model, not only worse code.
IDE and client integration (preview)
- Add VS Code extension (codeclone-mcp client) with baseline-aware triage, source drill-down, Explorer decorations, and HTML-report bridging.
- Add conservative, deeper-review, and custom analysis profiles to the VS Code extension and pass them through to MCP.
- Add limited Restricted Mode: onboarding works in untrusted workspaces, analysis stays gated until trust is granted.
- Add Node unit tests, extension-host smoke tests, and .vsix packaging.
- Tighten the VS Code extension to current VS Code UX guidance: one primary editor action, titled Quick Picks, per-view icons, non-button tree details, and a hard minimum local CodeClone version gate (>= 2.0.0b4).
- Add Claude Desktop .mcpb bundle wrapper for the local codeclone-mcp launcher with pre-loaded review instructions, explicit launcher settings, platform auto-discovery (macOS, Linux, Windows), local-stdio enforcement, signal forwarding, and deterministic package build smoke.
- Add a native Codex plugin with repo-local discovery metadata, bundled codeclone-mcp config, pre-loaded instructions, and two skills: conservative-first full review and quick hotspot discovery.
Internal
- Extract shared _json_io module for deterministic JSON serialization across baseline, cache, and report paths.
- Remove low-signal structural clone noise surfaced by stricter analysis passes without touching golden fixture debt.
CodeClone 2.0.0b3: MCP, UX and Platform Tightening
2.0.0b3 is the release where CodeClone stops looking like "a strong analyzer with extras" and starts looking like a coherent platform: canonical-report-first, agent-facing, CI-native, and product-grade.
Licensing & packaging
- Re-license source code to MPL-2.0 while keeping documentation under MIT.
- Ship dual LICENSE / LICENSE-docs files and sync SPDX headers.
MCP server (new)
- Add optional codeclone[mcp] extra with codeclone-mcp launcher (stdio and streamable-http).
- Introduce a read-only MCP surface with 20 tools, fixed resources, and run-scoped URIs for analysis, changed-files review, run comparison, findings / hotspots / remediation, granular checks, and gate preview.
- Add bounded run retention (--history-limit), --allow-remote guard, and reject cache_policy=refresh to preserve read-only semantics.
- Optimize MCP payloads for agents with short ids, compact summaries/cards, bounded metrics_detail, and slim changed-files / compare-runs responses — without changing the canonical report contract.
- Make MCP explicitly triage-first and budget-aware: clients are guided toward summary/triage → hotspots / check_* → single-finding drill-down instead of broad early listing.
- Add cache.freshness marker and get_production_triage / codeclone://latest/triage for compact production-first overview.
- Improve run-comparison honesty: compare_runs now reports mixed / incomparable, and clones_only runs surface health: unavailable instead of placeholder values.
- Harden repository safety: MCP analysis now requires an absolute repository root and rejects relative roots like "." to avoid analyzing the wrong directory.
- Fix hotlist key resolution for production_hotspots and test_fixture_hotspots.
- Bump cache schema to 2.3 (stale metric entries rebuilt, not reused).
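The absolute-root rule in the list above is simple to picture. The following is only a sketch of that kind of guard, with a hypothetical function name and error message; it does not reproduce the MCP server's actual validation.

```python
from pathlib import Path


def require_absolute_root(root: str) -> Path:
    """Accept only absolute repository roots; refuse cwd-dependent ones."""
    path = Path(root)
    if not path.is_absolute():
        # A relative root such as "." depends on the server's working
        # directory, which may not be the directory the client meant.
        raise ValueError(f"repository root must be absolute, got {root!r}")
    return path


accepted = require_absolute_root(str(Path.cwd()))

try:
    require_absolute_root(".")
    rejected = False
except ValueError:
    rejected = True
```

Rejecting instead of silently resolving "." keeps a long-running server from analyzing whatever directory it happened to be started in.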
Report contract
- Bump canonical report schema to 2.2.
- Add canonical meta.analysis_thresholds.design_findings provenance and move threshold-aware design findings fully into the canonical report, so MCP and HTML read the same design-finding universe.
- Add derived.overview.directory_hotspots and render it in the HTML Overview tab as Hotspots by Directory.
CLI
- Add --changed-only, --diff-against, and --paths-from-git-diff for changed-scope review and gating with first-class summary output.
SARIF
- Stabilize primaryLocationLineHash (line numbers excluded), add run-unique automationDetails.id / startTimeUtc, set explicit kind: "fail", and move ancillary fields to properties.
HTML report
- Add Hotspots by Directory to the Overview tab, surfacing directory-level concentration for all, clones, and low-cohesion findings with scope-aware badges and compact counts.
- Add IDE picker (PyCharm, IDEA, VS Code, Cursor, Fleet, Zed) with persistent selection.
- Add clickable file-path deep links across all tabs and stable finding-{id} anchors.
GitHub Action
- Ship Composite Action v2 with configurable quality gates, SARIF upload to Code Scanning, and PR summary comments.
CodeClone 2.0.0b2: fix UI errors and update deps
Dependencies
- Upgrade requests (dev dep) to 2.33.0 for extract_zipped_paths security fix (CVE-2026-25645)
HTML
- Fix page-level horizontal scrolling in wide table tabs by constraining overflow to local table wrappers (#14).
- Fix mobile header brand block layout on narrow viewports (#15).
- Make mobile navigation tabs sticky and horizontally scrollable with scroll-shadow affordance.
- Keep Overview KPI micro-badges inside cards at extreme browser/mobile widths.
- Restyle Report Provenance summary badges to match the card-style badge language used across the report.
CodeClone 2.0.0b1: evolves from a structural clone detector into a baseline-aware code-health and CI governance tool for Python
Major upgrade: CodeClone evolves from a structural clone detector into a baseline-aware code-health and CI governance tool for Python.
Architecture
- Stage-based pipeline (pipeline.py): discovery → processing → analysis → reporting → gating.
- Domain layers: models.py, metrics/, report/, grouping.py.
- Baseline schema 2.0, report schema 2.1, cache schema 2.2; fingerprint_version remains 1.
Code-Health Analysis
- Seven health dimensions: clones, complexity, coupling, cohesion, dead code, dependencies, coverage.
- Piecewise clone scoring curve: mild penalty below 5% density, steep 5–20%, aggressive above 20%.
- Dimension weights: clones 25%, complexity 20%, cohesion 15%, coupling 10%, dead code 10%, dependencies 10%, coverage 10%.
- Grade bands: A ≥90, B ≥75, C ≥60, D ≥40, F <40.
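The weights and grade bands listed above can be combined as in the sketch below. It assumes each dimension has already been reduced to a 0-100 sub-score and that the overall score is their weighted mean; how CodeClone derives each sub-score (including the piecewise clone curve) is not reproduced here, and the function names are hypothetical.

```python
# Dimension weights in percent, as listed in the release notes (sum = 100).
WEIGHTS = {
    "clones": 25,
    "complexity": 20,
    "cohesion": 15,
    "coupling": 10,
    "dead_code": 10,
    "dependencies": 10,
    "coverage": 10,
}

# Grade bands as (inclusive lower bound, letter); anything below 40 is F.
GRADE_BANDS = [(90, "A"), (75, "B"), (60, "C"), (40, "D")]


def health_score(sub_scores: dict[str, float]) -> float:
    """Weighted mean of per-dimension sub-scores (each assumed 0-100)."""
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS) / 100


def grade(score: float) -> str:
    """Map a 0-100 score onto the A-F bands."""
    for floor, letter in GRADE_BANDS:
        if score >= floor:
            return letter
    return "F"


# Hypothetical project: perfect clone score, 80 everywhere else.
example = {name: 80 for name in WEIGHTS}
example["clones"] = 100
score = health_score(example)
letter = grade(score)
```

With these inputs the score is 85.0 and the grade is B; a score of 89 also lands in B because the A band starts at 90.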
Detection Thresholds
- Lowered function-level --min-loc from 15 to 10 (configurable via CLI / pyproject.toml).
- Lowered block fragment gate from loc≥40/stmt≥10 to loc≥20/stmt≥8.
- Lowered segment fragment gate from loc≥30/stmt≥12 to loc≥20/stmt≥10.
- All six thresholds configurable via [tool.codeclone] in pyproject.toml.
Detection Quality
- Conservative dead-code detector: skips tests, dunders, visitors, protocol stubs.
- Module-level PEP 562 hooks (__getattr__, __dir__) are treated as non-actionable dead-code candidates.
- Exact qualname-based liveness with import-alias resolution.
- Canonical inline suppression syntax: # codeclone: ignore[dead-code] on declarations.
- Structural finding families: duplicated_branches, clone_guard_exit_divergence, clone_cohort_drift.
Configuration and CLI
- Config from pyproject.toml under [tool.codeclone]; precedence: CLI > pyproject.toml > defaults.
- Optional-value report flags: --html, --json, --md, --sarif, --text with deterministic default paths.
- --open-html-report, --timestamped-report-paths, --ci preset.
- Explicit --no-progress / --progress, --no-color / --color flag pairs.
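The precedence rule stated above (CLI > pyproject.toml > defaults) can be sketched with a layered lookup. This is an illustration only: the key names min_loc and progress and the resolve_config helper are hypothetical, and the real option parsing is not shown.

```python
from collections import ChainMap


def resolve_config(cli: dict, pyproject: dict, defaults: dict) -> dict:
    """Merge three config layers; earlier layers win over later ones."""
    # Options the user did not pass arrive as None and must not mask
    # values coming from pyproject.toml or the built-in defaults.
    explicit_cli = {key: value for key, value in cli.items() if value is not None}
    return dict(ChainMap(explicit_cli, pyproject, defaults))


defaults = {"min_loc": 10, "progress": True}   # assumed built-in defaults
pyproject = {"min_loc": 12}                    # e.g. read from [tool.codeclone]
cli = {"min_loc": None, "progress": False}     # only --no-progress was passed

effective = resolve_config(cli, pyproject, defaults)
```

Here min_loc comes from pyproject.toml because the CLI left it unset, while progress comes from the CLI because it was given explicitly.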
HTML Report
- Overview: KPI grid with health gauge (baseline delta arc), Executive Summary (issue breakdown + source breakdown), Health Profile radar chart.
- KPI cards show baseline-aware tone: ✓ baselined pill when all items are accepted debt, +N red badge for regressions.
- Get Badge modal: grade-only and score+grade variants, shields.io preview, Markdown/HTML embeds, copy feedback.
- Report Provenance modal with section cards, SVG icons, boolean badges.
- Responsive layout with dark/light theme toggle and system theme detection.
Baseline and Contracts
- Unified baseline flow: clone keys + optional metrics in one file.
- Metrics snapshot integrity via meta.metrics_payload_sha256.
- Report contract: canonical meta / inventory / findings / metrics + derived suggestions / overview + integrity.
- SARIF: %SRCROOT% anchoring, baselineState, rich rule metadata.
- Cache compatibility now keys off the full six-threshold analysis profile (function + block + segment thresholds), not only the top-level function gate.
Performance
- Unified AST collection pass (merged 3 separate walks).
- Suppression fast-path: skip tokenization when the codeclone: marker is absent.
- Cache dirty flag: skip save() on warm path when nothing changed.
- Adaptive multiprocessing, batch statement hashing, deferred HTML import.
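The suppression fast-path mentioned above amounts to a cheap substring check before any tokenizing. A minimal sketch using the standard tokenize module follows; the function name suppression_comments and the returned shape are assumptions, not the project's internals.

```python
import io
import tokenize

MARKER = "codeclone:"


def suppression_comments(source: str) -> list[tuple[int, str]]:
    """Return (line, comment text) for comments containing the marker."""
    # Fast path: a substring scan is far cheaper than tokenizing, and most
    # files contain no suppression marker at all.
    if MARKER not in source:
        return []
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only real comments count; the marker inside a string literal does not.
        if tok.type == tokenize.COMMENT and MARKER in tok.string:
            found.append((tok.start[0], tok.string))
    return found


with_marker = suppression_comments(
    "def f():  # codeclone: ignore[dead-code]\n    return 1\n"
)
in_string = suppression_comments('x = "codeclone: no"\n')
without = suppression_comments("def g():\n    return 2\n")
```

The slow path still has to tokenize, because the marker may appear inside a string literal rather than a comment, as the second call shows.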
Docs and Publishing
- MkDocs site with Material theme and GitHub Pages workflow.
- Live sample reports (HTML, JSON, SARIF).
- PyPI-facing README now uses published docs URLs instead of repo-relative doc links.
Packaging
- Package metadata stays explicitly beta (2.0.0b1, Development Status :: 4 - Beta).
- pyproject.toml moved to SPDX-style license = "MIT" and project.license-files for modern setuptools builds without release-time deprecation warnings.
Stability
- Exit codes unchanged: 0 / 2 / 3 / 5.
- Fingerprint contract unchanged: BASELINE_FINGERPRINT_VERSION = "1".
- Coverage gate: >=99%.
CodeClone 1.4.4: Performance Fix
Performance
- Optimized HTML snippet rendering hot path:
  - file snippets now reuse cached full-file lines and slice ranges without repeated full-file scans
  - Pygments modules are loaded once per importer identity instead of re-importing for each snippet
- Optimized block explainability range stats:
  - replaced repeated full ast.walk() scans per range with a per-file statement index + bisect window lookup
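The "statement index + bisect window lookup" idea above can be sketched as follows. This is a simplified illustration that only counts statements by start line; the real range stats are presumably richer, and both helper names are hypothetical.

```python
import ast
from bisect import bisect_left, bisect_right


def build_statement_index(source: str) -> list[int]:
    """Walk the AST once and return the sorted start lines of all statements."""
    tree = ast.parse(source)
    return sorted(
        node.lineno for node in ast.walk(tree) if isinstance(node, ast.stmt)
    )


def count_statements(index: list[int], start: int, end: int) -> int:
    """Count statements starting within [start, end] using binary search."""
    return bisect_right(index, end) - bisect_left(index, start)


SOURCE = "a = 1\nif a:\n    b = 2\n    c = 3\nd = 4\n"
index = build_statement_index(SOURCE)   # one walk per file
in_block = count_statements(index, 2, 4)
past_end = count_statements(index, 6, 9)
```

Each range query is then O(log n) against the shared index instead of a fresh full-tree walk per range.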
Tests
- Preserved existing golden/contract behavior for 1.4.x and kept report output semantics unchanged while reducing runtime overhead.
Contract Notes
- No baseline/cache/report schema changes.
- No clone detection or fingerprint semantic changes.
CodeClone 1.4.3: Cache compatibility now respects --min-loc/--min-stmt
Cache Contract
- Cache schema bumped from v1.2 to v1.3.
- Added signed analysis profile to cache payload:
  - payload.ap.min_loc
  - payload.ap.min_stmt
- Cache compatibility now requires payload.ap to match current CLI analysis thresholds. On mismatch, cache is ignored with cache_status=analysis_profile_mismatch and analysis continues without cache.
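The compatibility check described above can be pictured with the sketch below. It assumes the cache document is a dict with a payload.ap mapping, ignores the signature step entirely, and uses a hypothetical "ok" status for the matching case; only analysis_profile_mismatch is taken from the notes.

```python
def check_analysis_profile(cache: dict, min_loc: int, min_stmt: int) -> str:
    """Compare the stored analysis profile with the active CLI thresholds."""
    stored = cache.get("payload", {}).get("ap")
    current = {"min_loc": min_loc, "min_stmt": min_stmt}
    # Any difference (or a missing profile) means cached results were
    # produced under other thresholds and must not be reused.
    return "ok" if stored == current else "analysis_profile_mismatch"


# Hypothetical cache written by a run with --min-loc 15 --min-stmt 6.
cache_doc = {"payload": {"ap": {"min_loc": 15, "min_stmt": 6}}}

matching = check_analysis_profile(cache_doc, min_loc=15, min_stmt=6)
changed = check_analysis_profile(cache_doc, min_loc=10, min_stmt=6)
legacy = check_analysis_profile({"payload": {}}, min_loc=15, min_stmt=6)
```

On a mismatch the caller would simply proceed as if no cache existed, which matches the "analysis continues without cache" behavior.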
CLI
- CLI now constructs cache context with effective --min-loc and --min-stmt values, so cache reuse is consistent with active analysis thresholds.
Tests
- Added regression coverage for analysis-profile cache mismatch/match behavior in:
  - tests/test_cache.py
  - tests/test_cli_inprocess.py
Contract Notes
- Baseline contract is unchanged (schema v1.0, fingerprint version 1).
- Report schema is unchanged (v1.1); cache metadata adds a new cache_status enum value.