Changelog

Every release, every fix. Keep-a-changelog format, semver.

All notable changes to llmwiki will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Versions below 1.0 are pre-production — API and file formats may change.

[Unreleased]

Added

  • Image download + local storage pipeline (#96) -- new llmwiki/image_pipeline.py module (stdlib-only) that finds remote ![alt](https://...) image references in converted markdown, downloads them to raw/assets/ with content-addressable filenames (sha256(url)[:16].<ext> via urllib.request + hashlib), and rewrites the markdown refs to point at the local copies. Four public functions: find_remote_images() (regex scanner returning (url, alt, line_number) tuples), download_image() (single-URL fetcher with graceful failure -- never crashes, logs warnings), rewrite_image_refs() (URL-to-local substitution), process_markdown_images() (orchestrator for one file returning (downloaded, failed, skipped) counts). Rate-limited at 1 req/s via time.sleep(1). New --download-images flag on llmwiki sync runs the pipeline over every converted .md file after conversion. Build step (build.py) now copies raw/assets/ to site/assets/ when the directory exists so downloaded images are served alongside the HTML. 30 new tests covering find (https found, local skipped, empty, multi-line, dedup, title syntax, query params), download (hash filename, cache hit, graceful network/HTTP failure, missing extension fallback, auto-mkdir), rewrite (replacement, preservation of local refs, mixed content), and process (dry-run counting, full rewrite, failed-keeps-original, duplicate dedup, rate-limit sleep verification).

  • Cursor adapter graduated to production (#37) -- llmwiki.adapters.cursor now has SUPPORTED_SCHEMA_VERSIONS, platform-aware session_store_path covering macOS/Linux/Windows, synthetic fixture at tests/fixtures/cursor/minimal.jsonl, snapshot test, converter round-trip test, graceful-degradation test (unknown record types silently skipped), and doc page at docs/adapters/cursor.md. The adapter discovers .jsonl files under Cursor's per-workspace storage directories and derives project slugs from the workspace hash.

  • Gemini CLI adapter graduated to production (#38) -- llmwiki.adapters.gemini_cli now has SUPPORTED_SCHEMA_VERSIONS, XDG-aware session_store_path covering ~/.gemini, ~/.config/gemini, and ~/.local/share/gemini, synthetic fixture at tests/fixtures/gemini_cli/minimal.jsonl, snapshot test, converter round-trip test, graceful-degradation test, and doc page at docs/adapters/gemini-cli.md. Discovers both .jsonl and chat-*.json/session-*.json patterns.
  • PDF adapter graduated to production (#39) -- llmwiki.adapters.pdf now has SUPPORTED_SCHEMA_VERSIONS, enabled config flag, extract_text() returning (body_md, metadata) tuple with encrypted-PDF handling (tries empty-password decrypt, skips gracefully), convert_pdf() producing frontmatter'd markdown (slug, project, title, date, pages, author, type), min_pages/max_pages filtering, redact callback support, and PDF metadata extraction (title, author, creation date from document info). Convert pipeline (convert.py) routes .pdf files through convert_pdf(). Test fixture at tests/fixtures/pdf/sample.pdf (2-page PDF with title + author metadata), 14 tests covering extraction, conversion, discovery, filtering, redaction, and availability. Doc page at docs/adapters/pdf.md.
  • GitLab Pages deployment workflow (#49) — new .gitlab-ci.yml.example template with three-stage pipeline: build_site (installs llmwiki + builds to public/), privacy_check (greps for PII patterns), and pages (deploys to GitLab Pages on default branch). New docs/deploy/gitlab-pages.md with quick start, configuration (custom domain, private projects, Python version), and troubleshooting. README updated to link both GitHub Pages and GitLab Pages deployment docs.
  • PyPI release automation via GitHub Actions (#42) — new .github/workflows/release.yml triggered on version tag push (v*.*.*). Builds sdist + wheel via python -m build, publishes to PyPI via OIDC trusted publisher (no long-lived API tokens), signs artifacts with Sigstore, and creates a GitHub Release with build artifacts + signature bundles attached. Pre-release tags (rc, alpha, beta, dev) are automatically marked as pre-release. Updated docs/maintainers/RELEASE_PROCESS.md with the new automated flow.
  • Lazy-load search index in per-project chunks (#47) — split the monolithic search-index.json into a small meta index (projects + static pages) plus per-project chunk files under search-chunks/<project>.json. The command palette, wikilink preview, and related-pages sidebar all use a shared chunked loader that fetches the meta index first (instant palette rendering with project entries) then fetches session chunks in parallel on first demand. Backwards-compatible with the old flat-array format. Reduces initial page transfer by ~50%+ on large wikis (the meta index is typically <1 KB vs 400+ KB for the full index).
  • Scheduled sync templates (#48) — ready-to-install templates for running llmwiki sync on a daily schedule: macOS launchd plist, Linux systemd timer + service, and Windows Task Scheduler XML. Each template defaults to 03:00 daily with catch-up-on-wake/boot support. New docs/scheduled-sync.md with install/uninstall instructions for all three platforms, plus a privacy note confirming scheduled runs use the same config and redaction rules as manual runs. Cross-linked from README.
  • WCAG 2.1 AA accessibility audit + fixes (#46) — ran axe-core via axe-playwright-python against 4 page types (home, projects index, sessions index, session detail) and fixed every violation. Darkened --text-muted from #94a3b8 to #6b7280 in light mode (2.56:1 → 4.84:1 on white) and from #64748b to #8b9bb5 in dark mode (4.09:1 → 6.97:1 on dark bg) for WCAG AA contrast compliance on all muted text. Overrode highlight.js GitHub theme .hljs-keyword / .hljs-type from #d73a49 to #c23a40 in light mode (4.17:1 → 4.82:1 on --bg-code). Added underlines to footer and breadcrumb links so they are distinguishable without relying on color alone (WCAG 1.4.1). Added a skip-to-content link (<a class="skip-link">) visible on first Tab press that jumps to <main id="main-content">. New scripts/a11y_audit.py for re-running the scan locally. Documented everything in docs/accessibility.md with a contrast table, keyboard nav checklist, and the full list of fixes. Current status: 0 axe-core violations across all 4 page types.
  • Playwright + pytest-bdd end-to-end tests (#45) — new tests/e2e/ harness that builds a minimal demo site, serves it on a random free port via stdlib http.server, and drives a real Chromium browser with Playwright. Scenarios are written in Gherkin via pytest-bdd so the feature files read like plain English specs. 11 feature files / 62 scenarios covering: homepage (nav + hero + projects grid), session detail page (breadcrumbs, hljs code blocks, article sections), command palette (Cmd+K opens + focuses input + filters results), keyboard navigation (g h / g p / g s / ?), mobile bottom nav (Search button opens palette, Theme button toggles dark mode), theme toggle (flips data-theme + hljs stylesheet swap), copy-as-markdown (clipboard contains session content), responsive (9 viewport widths × 3 pages = horizontal-scroll + hero-visible regression gate), edge cases (empty palette, rapid typing, 404 handling, print-media layout, clean console), accessibility (non-empty nav text, aria-labels, tab order into header, Escape trap, prefers-reduced-motion), and visual regression (12 screenshots per breakpoint × theme combo uploaded as CI artifacts). Shared step library in tests/e2e/steps/ui_steps.py uses Playwright's auto-waiting locators; console-error + pageerror events are recorded per scenario via an autouse fixture so the "clean console" assertion works without per-test wiring. New [e2e] opt-in extras group in pyproject.toml (installs pytest-playwright + pytest-bdd + pytest-html). Default pytest tests/ run excludes tests/e2e/ via --ignore=tests/e2e so fast unit iteration is unaffected. New .github/workflows/e2e.yml runs the suite on PRs touching build.py, viz modules, or tests/e2e/** — with cached Chromium binaries, retain-on-failure Playwright trace uploads, a self-contained pytest-html report, and the screenshots directory uploaded as a separate artifact. New "Running E2E tests" section in the README covers install + run commands. No impact on the existing 439-test unit suite.
  • Maintainer governance scaffold (#62) — complete governance surface for the project: new CODE_OF_CONDUCT.md (Contributor Covenant 2.1), SECURITY.md (disclosure process + in-scope/out-of-scope + privacy-first architecture summary), and a new docs/maintainers/ folder with 7 canonical docs — README.md (index), ARCHITECTURE.md (one-page system diagram + layer boundaries + what NOT to add), REVIEW_CHECKLIST.md (canonical code-review criteria with blocker/nit classification), RELEASE_PROCESS.md (step-by-step version-bump checklist), TRIAGE.md (label taxonomy + stale policy + escalation rules), ROADMAP.md (living near-term plan + release theme table), DECLINED.md (graveyard of declined ideas with date + rationale, pre-seeded with 13 entries covering all v0.5-v0.9 non-goals). New .github/ meta files: .github/ISSUE_TEMPLATE/{feature_request,bug_report,chore}.yml (structured YAML forms), .github/PULL_REQUEST_TEMPLATE.md (rewritten to match the REVIEW_CHECKLIST), .github/CODEOWNERS (per-layer ownership), .github/workflows/pr-lint.yml (enforces conventional-commit title + CHANGELOG updated + no new runtime deps). Four Claude Code slash commands: /review-pr <N> (runs the REVIEW_CHECKLIST against a PR), /triage-issue <N> (applies TRIAGE.md taxonomy), /release <version> (walks RELEASE_PROCESS.md step by step), /maintainer (meta-skill that loads every governance doc). New "For maintainers" section in README.md linking to every doc and slash command. This is the scaffolding that lets the project scale past single-maintainer without losing consistency.
  • llmwiki export-qmd — qmd collection exporter (#59) — new llmwiki/export_qmd.py module + export-qmd CLI subcommand. Writes a self-contained tobi/qmd collection to any output directory: qmd.yaml manifest (with glob patterns for each wiki layer — sources/, entities/, concepts/, syntheses/, projects/, top-level), a README.md with step-by-step instructions for qmd index + Claude Desktop MCP config snippet, an executable index.sh one-liner, and a full copy of every .md under wiki/ preserving structure (including _context.md folder stubs from #60, which qmd's own context system can read). Non-goals: shipping qmd as a dep (it's TypeScript; llmwiki stays Python), running it automatically from llmwiki build. Karpathy's LLM Wiki gist explicitly recommends qmd for hybrid search at scale — this closes the loop without adding a runtime dep. 19 new tests cover manifest rendering, README structure, script shebang + executable bit, wiki-tree copy (preserves structure, skips non-markdown, preserves _context.md), end-to-end export (empty wiki, nested tree, idempotence). Real smoke test against the current wiki copied 27 files and produced a valid collection directory.
  • Auto-generated vs-comparison pages (#58) — new llmwiki/compare.py module + build step. Scans every model entity from #55, walks all 2-combinations, scores each pair by number of shared structured fields (provider + context window + pricing + benchmarks + modalities). Pairs above min_shared_fields (default 3) get an auto-generated page at /vs/<slug-a>-vs-<slug-b>.html with (1) a side-by-side info table with difference highlighting, (2) a shared-benchmark SVG bar chart, (3) a price-delta paragraph identifying the cheaper model and % difference, and (4) a stub ## Summary section for the user or LLM to fill in later. Pairs are sorted by score descending and capped at max_pairs (default 500) so the combinatorial explosion stays under control on large wikis. Slugs are alphabetically enforced so every pair has one canonical URL. User overrides at wiki/vs/<slug>.md replace the auto-gen for that URL. New /vs/index.html page with a sortable table, new "Compare" nav link. Seeded a second model entity (wiki/entities/GPT5.md) so the demo build ships with a real Claude Sonnet 4 vs GPT-5 comparison. 22 new tests.
  • Project topics — GitHub-style tag chips on project cards + pages — new llmwiki/project_topics.py module and wiki/projects/<slug>.md per-project profile convention. Each profile file carries frontmatter with a topics: list ([rust, blog, ssg]), optional description, and optional homepage URL. Build-time surfaces topics as pill-shaped chips on the home-page project cards (4 chips + overflow collapse into +N more) AND as a hero strip on each project detail page (up to 12 chips, with the description paragraph and a clickable homepage link if present). When no profile exists, falls back to aggregating session tags: frontmatter with universal noise tags (claude-code, session-transcript, demo) filtered out and a min_count ≥ 2 threshold so one-off stragglers don't crowd the strip. Full dark-mode styling via theme vars; chips hover with the accent color. Seeded four profiles to ship with the repo (demo-blog-engine, demo-ml-pipeline, demo-todo-api, llm-wiki) and added a gitignore exception for wiki/projects/ so user profiles are committed by default. 24 new tests cover profile loading, topics normalization (lowercase + dedup), session-tag aggregation with noise filtering, precedence rules, chip rendering, overflow collapse, HTML escaping, URL encoding for linked chips.
  • Append-only changelog field + timeline + pricing sparkline (#56) — new llmwiki/changelog_timeline.py module consumes an optional changelog: list in model entity frontmatter and renders three surfaces: (1) a vertical timeline widget on each model detail page (newest-first, with from→to deltas colored by direction — price cuts green, price hikes red, benchmark lifts green, numeric context expansions shown with an up arrow), (2) an inline pricing sparkline (stdlib SVG) that appears when the changelog has ≥2 dated input_per_1m changes so readers can see the trend at a glance, and (3) a "Recently updated · last 30 days" card on the home page listing any model entity that changed recently. Append-only by design — if an entry is wrong, add a correcting entry rather than rewriting history. The frontmatter parser's naive comma-split on bracketed JSON arrays is papered over by a stitch-and-reparse fallback in parse_changelog(), with a regression test locking it in place. Numeric deltas get K/M suffix formatting, string deltas (e.g. license changes) render as strike-through → bold. All HTML-escaped. 27 new tests. Seeded wiki/entities/ClaudeSonnet4.md with a real 4-entry changelog (launch → price cut → context expansion → SWE-bench update) so the feature is visible immediately on the live build.
  • Structured model-profile schema + /models/ section (#55) — new llmwiki/schema.py (stdlib-only TypedDict validator) and llmwiki/models_page.py (renderer) add an opt-in schema for entity pages with entity_kind: ai-model. Pages can declare provider, inline-JSON model / pricing / benchmarks blocks, and a modalities list. The build pipeline discovers every valid model page under wiki/entities/, validates it (bad data → warnings, not build crashes), renders a structured info-card at the top of a per-model detail page (/models/<slug>.html), and emits a sortable /models/index.html table with every benchmark key used anywhere as a column. 13 well-known benchmark keys get pretty labels (gpqa_diamond → "GPQA Diamond", swe_bench → "SWE-bench", mmlu → "MMLU", etc.); unknown keys pass through for forward compatibility. Price formatting supports USD/EUR/GBP and falls back to currency-prefixed for anything else. New nav-bar link "Models" active on the detail and index pages. Full docs in docs/reference/entity-schema.md, seeded example in wiki/entities/ClaudeSonnet4.md. 36 new tests across test_schema.py (21) and test_models_page.py (15): happy path, minimum-viable page, validation warnings, non-numeric benchmarks, out-of-range scores rejected, malformed JSON treated as empty + warning, unknown benchmark keys allowed, HTML escaping, benchmark sort order, table column union.
  • Folder-level _context.md files (#60) — new llmwiki/context_md.py module + convention, borrowed from tobi/qmd's context pattern. An optional _context.md file can sit alongside pages in any wiki folder (e.g. wiki/entities/_context.md) to describe what the folder is for and which queries should traverse it. When a Claude Code /wiki-query session walks the tree, it reads the folder's context file first and uses the summary to decide whether to descend — saving context tokens on every deep query instead of sampling random pages to infer a folder's purpose. build.py's discover_sources() now skips _context.md so these files never pollute the session index, search index, or AI-consumable exports. CLAUDE.md Query + Lint workflows document the convention, and find_uncontexted_folders() powers a new /wiki-lint warning for folders with >10 pages but no context stub. Ships with three seeded stubs (wiki/entities/_context.md, wiki/concepts/_context.md, wiki/sources/_context.md) to show the pattern. 19 new tests.
  • Token usage card, project timeline, and site-wide stats (#66) — new llmwiki/viz_tokens.py module (stdlib-only) consumes the token_totals frontmatter key (#63) and renders three related views. Session card shows Input / Cache creation / Cache read / Output in a stacked-bar layout with a cache-hit-ratio badge (green ≥80%, yellow 50–79%, red <50%) — formula matches Anthropic's definition (cache_read / (cache_read + cache_creation + input), excludes output). Project timeline is a log-scale area chart of daily total tokens over the project lifetime + aggregate cache hit ratio in the header. Site-wide stats on index.html show four cards: total tokens, average per session, best-cache-hit project (linked), heaviest project (linked). Human-readable K/M/B number formatting via shared format_tokens() helper. Full dark-mode palette via --token-input / --token-cache-creation / --token-cache-read / --token-output / --token-area-fill / --token-area-stroke CSS custom properties. Sessions missing token_totals (older converter output) degrade gracefully — card renders nothing instead of crashing. 45 new tests.
  • Tool-calling bar chart (#65) — new llmwiki/viz_tools.py module (stdlib-only, pure-SVG) renders a horizontal bar chart of tool usage from the tool_counts frontmatter key (#63). One chart per session page (sorted descending, top 10 tools + "Other (N tools)" overflow row) and one aggregate chart per project page. Category-based coloring: I/O (Read/Write/Edit — blue), Search (Grep/Glob/WebSearch — purple), Execution (Bash/Skill — orange), Network (WebFetch/mcp__* — green), Planning (Agent/TodoWrite/ExitPlanMode — slate). Tooltips show {tool}: {count} calls ({pct}%). Long tool names (e.g. mcp__Claude_in_Chrome__tabs_context_mcp) are truncated to 28 chars in the label column but preserved in the tooltip. Dark-mode variants for every category via --tool-cat-* CSS custom properties. 38 new tests cover: JSON-string and dict frontmatter shapes, aggregation across sessions, category mapping for every standard + MCP tool, sort order, overflow collapse, tooltip grammar, XSS defense, bar-width scaling.
  • GitLab/GitHub-style 365-day activity heatmap (#64, #72) — new llmwiki/viz_heatmap.py module (stdlib-only, pure-SVG) renders a full-year rolling contribution grid at build time. One aggregate heatmap on site/index.html counting every main session across all projects, one per-project heatmap on each projects/<slug>.html page scoped to that project. Sunday-aligned 53-column grid (369–371 cells depending on where the build date lands in the week), five-level quantile bucketing computed over non-zero days so sparse projects don't collapse into a single color, empty days always render as level-0 so the grid dimensions stay constant. Dark-mode variant via --heatmap-0..4 CSS custom properties (light: ebedf0→216e39, dark: 161b22→39d353). A11y: role="img" + descriptive aria-label, per-cell <title> tooltips. Replaces the v0.4 JS-based tiny-strip heatmap. 22 new tests lock the window bounds, quantile math, SVG structure, and sparse-data edge cases.
  • Session metrics frontmatter (#63) — converter now emits five new keys per session as JSON inline: tool_counts, token_totals (input / cache_creation / cache_read / output), turn_count, hour_buckets (UTC-normalised ISO-hour → activity count), and duration_seconds. Foundation for the v0.8 visualization stack (#64 heatmap / #65 tool chart / #66 token card). Stdlib-only; byte-identical on re-run. 24 new tests.
  • Changelog page (#72) — CHANGELOG.md now renders as a first-class page at site/changelog.html with a nav-bar link, narrow reading column, keep-a-changelog typography, and the same theme/print styles as the rest of the wiki.
  • highlight.js syntax highlighting (#73) — replaced server-side Pygments/codehilite with client-side highlight.js v11.9.0 loaded from a pinned jsdelivr CDN. Both GitHub light (github.min.css) and GitHub dark (github-dark.min.css) themes are preloaded; the runtime swaps the disabled flag on <link> tags when the theme toggles so code blocks stay in sync with the rest of the page. Code fences now emit plain <pre><code class="language-xxx"> via the fenced_code extension. Lighter build (no optional Python dep), consistent look across every page, auto-detection for untagged blocks. 15 new tests.
  • Public demo deployment (#73) — .github/workflows/pages.yml now builds a demo site from the eight dummy sessions under examples/demo-sessions/ on every push to master and deploys it to GitHub Pages. No personal data. Three fictional projects (demo-blog-engine Rust SSG, demo-ml-pipeline DistilBERT fine-tune, demo-todo-api FastAPI CRUD) with realistic code fences so visitors can see highlight.js and the full session UX immediately.
  • README screenshots (#73) — added six embedded screenshots under docs/images/ (home, sessions index, session detail, changelog, projects index, code-heavy session) captured from the demo site with headless Chrome at 2x device pixel ratio.

Changed

  • No optional highlight dependency (#73) — pip install -e '.[highlight]' is now a no-op alias kept only for backwards-compatibility with v0.4 install docs. setup.sh, setup.bat, pyproject.toml, and CI workflows no longer install Pygments.

Fixed

  • Horizontal overflow on small viewports (#45) — three bugs in the nav/layout CSS caused the body to scroll horizontally below 1024px and made the mobile-bottom-nav breakpoint mismatch the common 768 tablet cutoff. (1) The top-nav text anchors (Home / Projects / Sessions / Models / Compare / Changelog) stayed inline at every width and overflowed a 768px viewport — now hidden via @media (max-width: 1023px) so tablet + mobile use the command palette or the bottom nav for navigation instead. (2) The .mobile-bottom-nav breakpoint was max-width: 720px — now max-width: 767px so the 767/768 tablet boundary is respected. (3) The .recently-updated-item grid had a fixed 180px 100px 1fr template that overflowed sub-400px viewports — now collapses to a single column below 640px and truncates overflowing cell content with text-overflow: ellipsis. Caught by the new responsive E2E scenarios in #45. Desktop (1024+) layout is unchanged.
  • Raw HTML in session prose leaking into the DOM (#74) — a session transcript that mentioned <textarea> (or any tag-shaped substring) in prose used to pass through the markdown library unescaped, leaving an unclosed element that swallowed every following tag — including the <script> that boots highlight.js. The v0.5 swap from server-side Pygments to client-side hljs (#73) made this pre-existing bug catastrophic: once the script was inside a stuck textarea, no code block on the page ever got highlighted. Fixed by a new _EscapeRawHtmlPreprocessor that runs in the Python markdown pipeline after fenced_code (priority 25) and before html_block (priority 20), escaping <tagname> / </tagname> patterns outside inline backtick spans. Inline/fenced code, HTML comments (<!-- llmwiki:metadata -->), bare < in math, blockquotes, tables, headings, and link syntax are all untouched. 9 new regression tests lock it down. Verified on a real 169-code-block session page: 0/172 → 172/172 highlighted after the fix.
  • Code-fence truncation eating pages (#72) — truncate_chars / truncate_lines used to cut content mid-code-block, leaving the opening ``` without a closing fence. The markdown parser then swallowed everything that followed as one giant block (user-visible example: the "Full Directory Tree" section on subagent pages). Fixed by counting unbalanced fences in the kept portion and injecting a closing fence before the truncation marker. 5 new tests; 30 previously-mangled session files regenerated.
  • Sync crash on corrupt JSONL bytes (#72) — a single stray non-UTF-8 byte in a session transcript used to abort the entire llmwiki sync run with UnicodeDecodeError. parse_jsonl now opens with errors="replace" and silently drops non-dict records (rare stray scalars from partial writes that previously crashed filter_records with AttributeError).

[0.4.0] — 2026-04-08

Theme: AI + human dual-format. Every page ships both as HTML for humans AND as machine-readable .txt + .json siblings for AI agents, alongside site-level exports that follow open standards (llms.txt, JSON-LD, sitemap, RSS).

Added

Part A — AI-consumable exports (llmwiki/exporters.py)

  • llms.txt — short index per the llmstxt.org spec with project list, machine-readable links, and AI-agent entry points
  • llms-full.txt — flattened plain-text dump of every wiki page, ordered project → date, capped at 5 MB for pasteable LLM context
  • graph.jsonld — schema.org JSON-LD @graph representation with CreativeWork nodes for the wiki, projects, and individual sessions, all linked via isPartOf relations
  • sitemap.xml — standard sitemap with lastmod timestamps and priority hints
  • rss.xml — RSS 2.0 feed of the newest 50 sessions
  • robots.txt — with explicit llms.txt + sitemap.xml references for AI-agent-aware crawlers
  • ai-readme.md — AI-specific entry point explaining navigation structure, machine-readable siblings, and MCP tool surface
  • Per-page .txt siblings next to every sessions/<project>/<slug>.html — plain text version stripped of all markdown/HTML for fast AI consumption
  • Per-page .json siblings with structured frontmatter + body text + SHA-256 + outbound wikilinks — ideal for RAG or structured-data agents
  • Schema.org microdata on every session page (itemscope/itemtype="https://schema.org/Article" + headline + datePublished + inLanguage)
  • <link rel="canonical"> on every session page for SEO and duplicate-indexing prevention
  • Open Graph tags (og:type, og:title, og:description, article:published_time)
  • <!-- llmwiki:metadata --> HTML comment at the top of every session page — AI agents scraping HTML can parse metadata without fetching the separate .json sibling
  • wiki_export MCP tool (7th tool on the MCP server) — returns any AI-consumable export format by name (llms-txt, llms-full-txt, jsonld, sitemap, rss, manifest, or list). Capped at 200 KB per response.

Part B — Human polish

  • Reading time estimates on every session page (X min read in the metadata strip)
  • Related pages panel at the bottom of session pages (3-5 related sessions computed from shared project + entities, all client-side from search-index.json)
  • Activity heatmap on the home page — SVG cells with per-day intensity gradient
  • Mark highlighting support (<mark> styled with the accent color) for search results
  • Deep-link icons on every h2/h3/h4 in the content — hover to reveal, click to copy a canonical URL with #anchor to the clipboard
  • .txt and .json download buttons in the session-actions strip next to Copy-as-markdown

Part C — Cross-cutting infra

  • Build manifest (llmwiki/manifest.py) — generates site/manifest.json on every build with SHA-256 hashes of all files, total sizes, perf-budget check, and budget violations list
  • Link checker (llmwiki/link_checker.py) — walks site/ verifying every internal <a href>, <link href>, and <script src> resolves to an existing file. External URLs are skipped. Strict regex filters out code-block artifacts.
  • Performance budget targets declared in manifest.py (cold build <30s, total site <150 MB, per-page <3 MB, CSS+JS <200 KB, llms-full.txt <10 MB)
  • New CLI subcommands: llmwiki check-links, llmwiki export <format>, llmwiki manifest (all with --fail-on-* flags for CI integration)

Tests

  • 24 new tests in tests/test_v04.py covering exporters, manifest, link checker, MCP wiki_export, schema.org microdata, canonical links, per-page siblings, and CLI subcommands
  • 95 tests passing total (was 71 in v0.3)

Fixed

  • Link checker rewritten to only match <a> / <link> / <script> tag hrefs (not URLs inside code blocks). The initial naive regex was catching runaway multi-line matches from rendered tool-result output.
  • Canonical URLs and .txt/.json sibling links now use the actual HTML filename stem (date-slug) instead of the frontmatter slug field, which was causing broken link reports.

[0.3.0] — 2026-04-08

Added

  • pyproject.toml — full PEP 621 metadata, PyPI-ready. Optional dep groups: highlight (pygments), pdf (pypdf), dev (pytest+ruff), all. Declared entry point llmwiki = llmwiki.cli:main.
  • Eval framework (llmwiki/eval.py) — 7 structural quality checks (orphans, broken links, frontmatter coverage, type coverage, cross-linking, size bounds, contradiction tracking) totalling 100 points. New CLI: llmwiki eval [--check ...] [--json] [--fail-below N]. Zero LLM calls, pure structural analysis, runs in under a second on a 300-page wiki.
  • Codex CLI adapter graduated from v0.2 stub → production with SUPPORTED_SCHEMA_VERSIONS = ["v0.x", "v1.0"], two session store roots, config override, and hashed-path slug derivation.
  • i18n docs scaffold — translations of getting-started.md in Chinese (zh-CN), Japanese (ja), and Spanish (es) under docs/i18n/. Each linked back to the English master with a sync date.
  • 15 new tests covering the eval framework, pyproject, i18n scaffold, and version bump.

Deferred to v0.5+

  • OpenCode / OpenClaw adapter
  • Homebrew formula
  • Local LLM via Ollama (optional synthesis backend)

(per explicit user direction — none of these block a v0.3.0 release)

[0.2.0] — 2026-04-08

Added

  • Three new slash commands: /wiki-update (surgical in-place page update), /wiki-graph (knowledge graph generator), /wiki-reflect (higher-order self-reflection)
  • llmwiki/graph.py — walks every [[wikilink]] and produces graph/graph.json (canonical) + graph/graph.html (vis.js). Reports top-linked, top-linking, orphans, broken edges. CLI: llmwiki graph [--format json|html|both].
  • llmwiki/watch.py — file watcher with polling + debounce. Detects mtime changes in agent session stores and auto-runs llmwiki sync after the debounce window. CLI: llmwiki watch [--adapter ...] [--interval N] [--debounce M]. Stdlib only, no watchdog dep.
  • llmwiki/obsidian_output.py — bidirectional Obsidian output mode. Copies the compiled wiki into a subfolder of an Obsidian vault with backlinks and a README. CLI: llmwiki export-obsidian --vault PATH [--subfolder NAME] [--clean] [--dry-run].
  • Full MCP server (llmwiki/mcp/server.py) — graduated from v0.1 2-tool stub to 6 production tools: wiki_query (keyword search + page content), wiki_search (raw grep), wiki_list_sources, wiki_read_page (path-traversal guarded), wiki_lint (structural report), wiki_sync (trigger converter).
  • Cursor adapter (llmwiki/adapters/cursor.py) — detects Cursor IDE install on macOS/Linux/Windows, discovers workspace storage.
  • Gemini CLI adapter (llmwiki/adapters/gemini_cli.py) — detects ~/.gemini/ sessions.
  • PDF adapter (llmwiki/adapters/pdf.py) — optional pypdf dep, user-configurable roots, disabled by default.
  • Hover-to-preview wikilinks in the HTML viewer — floating preview cards fetched from the client-side search index.
  • Timeline view on the sessions index — compact SVG sparkline showing session frequency per day.
  • CLAUDE.md extended with /wiki-update, /wiki-graph, /wiki-reflect slash command docs and new page types (comparisons/, questions/, archive/).
  • 21 new tests covering adapters, graph builder, Obsidian output, MCP server, file watcher, and CLI subcommands.

[0.1.0] — 2026-04-08

Initial public release.

Added

  • Python CLI (python3 -m llmwiki) with sync, build, serve, init subcommands
  • Claude Code adapter (llmwiki.adapters.claude_code) — converts ~/.claude/projects/*/*.jsonl to markdown
  • Codex CLI adapter stub (llmwiki.adapters.codex_cli) — scaffold for v0.2
  • Karpathy-style wiki schema in CLAUDE.md and AGENTS.md
  • God-level HTML generator (llmwiki.build)
  • Inter + JetBrains Mono typography
  • Light/dark theme toggle with data-theme attribute + system preference
  • Global search via pre-built JSON index
  • Cmd+K command palette
  • Keyboard shortcuts (/ search, g h home, j/k next/prev session)
  • Syntax highlighting via Pygments (optional dep)
  • Collapsible tool-result sections (click to expand, auto-collapse > 500 chars)
  • Breadcrumbs on session pages
  • Reading progress bar on long pages
  • Sticky table headers on the sessions index
  • Copy-as-markdown and copy-code buttons (with document.execCommand fallback for HTTP)
  • Mobile-responsive breakpoints
  • Print-friendly CSS
  • One-click scripts for macOS/Linux (setup.sh, build.sh, sync.sh, serve.sh)
  • One-click scripts for Windows (setup.bat, build.bat, sync.bat, serve.bat)
  • .claude/commands/ slash commands: wiki-sync, wiki-build, wiki-serve, wiki-query, wiki-lint
  • .claude/skills/llmwiki-sync/SKILL.md — global skill for auto-discovery
  • GitHub Actions CI workflow (.github/workflows/ci.yml) — lint + build smoke test
  • Documentation: getting-started, architecture, configuration, claude-code, codex-cli
  • Redaction config with username, API key, token, and email patterns
  • Idempotent incremental sync via .ingestion-state.json mtime tracking
  • Live-session detection — skips sessions with activity in the last 60 minutes
  • Sub-agent session support — rendered as separate pages linked from parent