All notable changes to llmwiki will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Versions below 1.0 are pre-production — API and file formats may change.
[Unreleased]
Added
- Image download + local storage pipeline (#96) -- new
`llmwiki/image_pipeline.py` module (stdlib-only) that finds remote image references in converted markdown, downloads them to `raw/assets/` with content-addressable filenames (`sha256(url)[:16].<ext>` via `urllib.request` + `hashlib`), and rewrites the markdown refs to point at the local copies. Four public functions: `find_remote_images()` (regex scanner returning `(url, alt, line_number)` tuples), `download_image()` (single-URL fetcher with graceful failure -- never crashes, logs warnings), `rewrite_image_refs()` (URL-to-local substitution), `process_markdown_images()` (orchestrator for one file returning `(downloaded, failed, skipped)` counts). Rate-limited at 1 req/s via `time.sleep(1)`. New `--download-images` flag on `llmwiki sync` runs the pipeline over every converted `.md` file after conversion. Build step (`build.py`) now copies `raw/assets/` to `site/assets/` when the directory exists so downloaded images are served alongside the HTML. 30 new tests covering find (https found, local skipped, empty, multi-line, dedup, title syntax, query params), download (hash filename, cache hit, graceful network/HTTP failure, missing extension fallback, auto-mkdir), rewrite (replacement, preservation of local refs, mixed content), and process (dry-run counting, full rewrite, failed-keeps-original, duplicate dedup, rate-limit sleep verification).
- Cursor adapter graduated to production (#37) --
`llmwiki.adapters.cursor` now has `SUPPORTED_SCHEMA_VERSIONS`, platform-aware `session_store_path` covering macOS/Linux/Windows, synthetic fixture at `tests/fixtures/cursor/minimal.jsonl`, snapshot test, converter round-trip test, graceful-degradation test (unknown record types silently skipped), and doc page at `docs/adapters/cursor.md`. The adapter discovers `.jsonl` files under Cursor's per-workspace storage directories and derives project slugs from the workspace hash.
- Gemini CLI adapter graduated to production (#38) --
`llmwiki.adapters.gemini_cli` now has `SUPPORTED_SCHEMA_VERSIONS`, XDG-aware `session_store_path` covering `~/.gemini`, `~/.config/gemini`, and `~/.local/share/gemini`, synthetic fixture at `tests/fixtures/gemini_cli/minimal.jsonl`, snapshot test, converter round-trip test, graceful-degradation test, and doc page at `docs/adapters/gemini-cli.md`. Discovers both `.jsonl` and `chat-*.json` / `session-*.json` patterns.
- PDF adapter graduated to production (#39) --
`llmwiki.adapters.pdf` now has `SUPPORTED_SCHEMA_VERSIONS`, `enabled` config flag, `extract_text()` returning `(body_md, metadata)` tuple with encrypted-PDF handling (tries empty-password decrypt, skips gracefully), `convert_pdf()` producing frontmatter'd markdown (slug, project, title, date, pages, author, type), `min_pages` / `max_pages` filtering, `redact` callback support, and PDF metadata extraction (title, author, creation date from document info). Convert pipeline (`convert.py`) routes `.pdf` files through `convert_pdf()`. Test fixture at `tests/fixtures/pdf/sample.pdf` (2-page PDF with title + author metadata), 14 tests covering extraction, conversion, discovery, filtering, redaction, and availability. Doc page at `docs/adapters/pdf.md`.
- GitLab Pages deployment workflow (#49) -- new
`.gitlab-ci.yml.example` template with three-stage pipeline: `build_site` (installs llmwiki + builds to `public/`), `privacy_check` (greps for PII patterns), and `pages` (deploys to GitLab Pages on default branch). New `docs/deploy/gitlab-pages.md` with quick start, configuration (custom domain, private projects, Python version), and troubleshooting. README updated to link both GitHub Pages and GitLab Pages deployment docs.
- PyPI release automation via GitHub Actions (#42) -- new
`.github/workflows/release.yml` triggered on version tag push (`v*.*.*`). Builds sdist + wheel via `python -m build`, publishes to PyPI via OIDC trusted publisher (no long-lived API tokens), signs artifacts with Sigstore, and creates a GitHub Release with build artifacts + signature bundles attached. Pre-release tags (`rc`, `alpha`, `beta`, `dev`) are automatically marked as pre-release. Updated `docs/maintainers/RELEASE_PROCESS.md` with the new automated flow.
- Lazy-load search index in per-project chunks (#47) -- split the monolithic
`search-index.json` into a small meta index (projects + static pages) plus per-project chunk files under `search-chunks/<project>.json`. The command palette, wikilink preview, and related-pages sidebar all use a shared chunked loader that fetches the meta index first (instant palette rendering with project entries), then fetches session chunks in parallel on first demand. Backwards-compatible with the old flat-array format. Reduces initial page transfer by 50% or more on large wikis (the meta index is typically <1 KB vs 400+ KB for the full index).
- Scheduled sync templates (#48) -- ready-to-install templates for running
`llmwiki sync` on a daily schedule: macOS launchd plist, Linux systemd timer + service, and Windows Task Scheduler XML. Each template defaults to 03:00 daily with catch-up-on-wake/boot support. New `docs/scheduled-sync.md` with install/uninstall instructions for all three platforms, plus a privacy note confirming scheduled runs use the same config and redaction rules as manual runs. Cross-linked from README.
- WCAG 2.1 AA accessibility audit + fixes (#46) -- ran axe-core via
`axe-playwright-python` against 4 page types (home, projects index, sessions index, session detail) and fixed every violation. Changed `--text-muted` from `#94a3b8` to `#6b7280` in light mode (darker: 2.56:1 → 4.84:1 on white) and from `#64748b` to `#8b9bb5` in dark mode (lighter: 4.09:1 → 6.97:1 on dark bg) for WCAG AA contrast compliance on all muted text. Overrode highlight.js GitHub theme `.hljs-keyword` / `.hljs-type` from `#d73a49` to `#c23a40` in light mode (4.17:1 → 4.82:1 on `--bg-code`). Added underlines to footer and breadcrumb links so they are distinguishable without relying on color alone (WCAG 1.4.1). Added a skip-to-content link (`<a class="skip-link">`) visible on first Tab press that jumps to `<main id="main-content">`. New `scripts/a11y_audit.py` for re-running the scan locally. Documented everything in `docs/accessibility.md` with a contrast table, keyboard nav checklist, and the full list of fixes. Current status: 0 axe-core violations across all 4 page types.
- Playwright + pytest-bdd end-to-end tests (#45) -- new
`tests/e2e/` harness that builds a minimal demo site, serves it on a random free port via stdlib `http.server`, and drives a real Chromium browser with Playwright. Scenarios are written in Gherkin via pytest-bdd so the feature files read like plain English specs. 11 feature files / 62 scenarios covering: homepage (nav + hero + projects grid), session detail page (breadcrumbs, hljs code blocks, article sections), command palette (Cmd+K opens + focuses input + filters results), keyboard navigation (`g h` / `g p` / `g s` / `?`), mobile bottom nav (Search button opens palette, Theme button toggles dark mode), theme toggle (flips `data-theme` + hljs stylesheet swap), copy-as-markdown (clipboard contains session content), responsive (9 viewport widths × 3 pages = horizontal-scroll + hero-visible regression gate), edge cases (empty palette, rapid typing, 404 handling, print-media layout, clean console), accessibility (non-empty nav text, aria-labels, tab order into header, Escape trap, prefers-reduced-motion), and visual regression (12 screenshots per breakpoint × theme combo uploaded as CI artifacts). Shared step library in `tests/e2e/steps/ui_steps.py` uses Playwright's auto-waiting locators; console-error + pageerror events are recorded per scenario via an autouse fixture so the "clean console" assertion works without per-test wiring. New `[e2e]` opt-in extras group in `pyproject.toml` (installs `pytest-playwright` + `pytest-bdd` + `pytest-html`). Default `pytest tests/` run excludes `tests/e2e/` via `--ignore=tests/e2e` so fast unit iteration is unaffected. New `.github/workflows/e2e.yml` runs the suite on PRs touching `build.py`, viz modules, or `tests/e2e/**`, with cached Chromium binaries, retain-on-failure Playwright trace uploads, a self-contained pytest-html report, and the screenshots directory uploaded as a separate artifact. New "Running E2E tests" section in the README covers install + run commands. No impact on the existing 439-test unit suite.
- Maintainer governance scaffold (#62) — complete governance surface for the project: new
`CODE_OF_CONDUCT.md` (Contributor Covenant 2.1), `SECURITY.md` (disclosure process + in-scope/out-of-scope + privacy-first architecture summary), and a new `docs/maintainers/` folder with 7 canonical docs: `README.md` (index), `ARCHITECTURE.md` (one-page system diagram + layer boundaries + what NOT to add), `REVIEW_CHECKLIST.md` (canonical code-review criteria with blocker/nit classification), `RELEASE_PROCESS.md` (step-by-step version-bump checklist), `TRIAGE.md` (label taxonomy + stale policy + escalation rules), `ROADMAP.md` (living near-term plan + release theme table), `DECLINED.md` (graveyard of declined ideas with date + rationale, pre-seeded with 13 entries covering all v0.5-v0.9 non-goals). New `.github/` meta files: `.github/ISSUE_TEMPLATE/{feature_request,bug_report,chore}.yml` (structured YAML forms), `.github/PULL_REQUEST_TEMPLATE.md` (rewritten to match the REVIEW_CHECKLIST), `.github/CODEOWNERS` (per-layer ownership), `.github/workflows/pr-lint.yml` (enforces conventional-commit title + CHANGELOG updated + no new runtime deps). Four Claude Code slash commands: `/review-pr <N>` (runs the REVIEW_CHECKLIST against a PR), `/triage-issue <N>` (applies TRIAGE.md taxonomy), `/release <version>` (walks RELEASE_PROCESS.md step by step), `/maintainer` (meta-skill that loads every governance doc). New "For maintainers" section in `README.md` linking to every doc and slash command. This is the scaffolding that lets the project scale past single-maintainer without losing consistency.
- `llmwiki export-qmd` -- qmd collection exporter (#59) -- new `llmwiki/export_qmd.py` module + `export-qmd` CLI subcommand. Writes a self-contained tobi/qmd collection to any output directory: `qmd.yaml` manifest (with glob patterns for each wiki layer: `sources/`, `entities/`, `concepts/`, `syntheses/`, `projects/`, top-level), a `README.md` with step-by-step instructions for `qmd index` + Claude Desktop MCP config snippet, an executable `index.sh` one-liner, and a full copy of every `.md` under `wiki/` preserving structure (including `_context.md` folder stubs from #60, which qmd's own context system can read).
Non-goals: shipping qmd as a dep (it's TypeScript; llmwiki stays Python), running it automatically from `llmwiki build`. Karpathy's LLM Wiki gist explicitly recommends qmd for hybrid search at scale; this closes the loop without adding a runtime dep. 19 new tests cover manifest rendering, README structure, script shebang + executable bit, wiki-tree copy (preserves structure, skips non-markdown, preserves `_context.md`), end-to-end export (empty wiki, nested tree, idempotence). Real smoke test against the current wiki copied 27 files and produced a valid collection directory.
- Auto-generated vs-comparison pages (#58) -- new
`llmwiki/compare.py` module + build step. Scans every model entity from #55, walks all 2-combinations, scores each pair by number of shared structured fields (provider + context window + pricing + benchmarks + modalities). Pairs above `min_shared_fields` (default 3) get an auto-generated page at `/vs/<slug-a>-vs-<slug-b>.html` with (1) a side-by-side info table with difference highlighting, (2) a shared-benchmark SVG bar chart, (3) a price-delta paragraph identifying the cheaper model and % difference, and (4) a stub `## Summary` section for the user or LLM to fill in later. Pairs are sorted by score descending and capped at `max_pairs` (default 500) so the combinatorial explosion stays under control on large wikis. The two slugs in each URL are always in alphabetical order, so every pair has one canonical URL. User overrides at `wiki/vs/<slug>.md` replace the auto-gen for that URL. New `/vs/index.html` page with a sortable table, new "Compare" nav link. Seeded a second model entity (`wiki/entities/GPT5.md`) so the demo build ships with a real Claude Sonnet 4 vs GPT-5 comparison. 22 new tests.
- Project topics -- GitHub-style tag chips on project cards + pages -- new
`llmwiki/project_topics.py` module and `wiki/projects/<slug>.md` per-project profile convention. Each profile file carries frontmatter with a `topics:` list (`[rust, blog, ssg]`), optional `description`, and optional `homepage` URL. The build surfaces topics as pill-shaped chips on the home-page project cards (4 chips + overflow collapse into `+N more`) AND as a hero strip on each project detail page (up to 12 chips, with the description paragraph and a clickable homepage link if present). When no profile exists, falls back to aggregating session `tags:` frontmatter with universal noise tags (`claude-code`, `session-transcript`, `demo`) filtered out and a `min_count ≥ 2` threshold so one-off stragglers don't crowd the strip. Full dark-mode styling via theme vars; chips hover with the accent color. Seeded four profiles to ship with the repo (`demo-blog-engine`, `demo-ml-pipeline`, `demo-todo-api`, `llm-wiki`) and added a gitignore exception for `wiki/projects/` so user profiles are committed by default. 24 new tests cover profile loading, topics normalization (lowercase + dedup), session-tag aggregation with noise filtering, precedence rules, chip rendering, overflow collapse, HTML escaping, URL encoding for linked chips.
- Append-only changelog field + timeline + pricing sparkline (#56) -- new
`llmwiki/changelog_timeline.py` module consumes an optional `changelog:` list in model entity frontmatter and renders three surfaces: (1) a vertical timeline widget on each model detail page (newest-first, with from→to deltas colored by direction: price cuts green, price hikes red, benchmark lifts green, numeric context expansions shown with an up arrow), (2) an inline pricing sparkline (stdlib SVG) that appears when the changelog has ≥2 dated `input_per_1m` changes so readers can see the trend at a glance, and (3) a "Recently updated · last 30 days" card on the home page listing any model entity that changed recently. Append-only by design: if an entry is wrong, add a correcting entry rather than rewriting history. The frontmatter parser's naive comma-split on bracketed JSON arrays is papered over by a stitch-and-reparse fallback in `parse_changelog()`, with a regression test locking it in place. Numeric deltas get K/M suffix formatting, string deltas (e.g. license changes) render as strike-through → bold. All HTML-escaped. 27 new tests. Seeded `wiki/entities/ClaudeSonnet4.md` with a real 4-entry changelog (launch → price cut → context expansion → SWE-bench update) so the feature is visible immediately on the live build.
- Structured model-profile schema +
`/models/` section (#55) -- new `llmwiki/schema.py` (stdlib-only TypedDict validator) and `llmwiki/models_page.py` (renderer) add an opt-in schema for entity pages with `entity_kind: ai-model`. Pages can declare `provider`, inline-JSON `model` / `pricing` / `benchmarks` blocks, and a `modalities` list. The build pipeline discovers every valid model page under `wiki/entities/`, validates it (bad data → warnings, not build crashes), renders a structured info-card at the top of a per-model detail page (`/models/<slug>.html`), and emits a sortable `/models/index.html` table with every benchmark key used anywhere as a column. 13 well-known benchmark keys get pretty labels (`gpqa_diamond` → "GPQA Diamond", `swe_bench` → "SWE-bench", `mmlu` → "MMLU", etc.); unknown keys pass through for forward compatibility. Price formatting supports USD/EUR/GBP and falls back to currency-prefixed for anything else. New nav-bar link "Models" active on the detail and index pages. Full docs in `docs/reference/entity-schema.md`, seeded example in `wiki/entities/ClaudeSonnet4.md`. 36 new tests across `test_schema.py` (21) and `test_models_page.py` (15): happy path, minimum-viable page, validation warnings, non-numeric benchmarks, out-of-range scores rejected, malformed JSON treated as empty + warning, unknown benchmark keys allowed, HTML escaping, benchmark sort order, table column union.
- Folder-level
`_context.md` files (#60) -- new `llmwiki/context_md.py` module + convention, borrowed from tobi/qmd's context pattern. An optional `_context.md` file can sit alongside pages in any wiki folder (e.g. `wiki/entities/_context.md`) to describe what the folder is for and which queries should traverse it. When a Claude Code `/wiki-query` session walks the tree, it reads the folder's context file first and uses the summary to decide whether to descend, saving context tokens on every deep query instead of sampling random pages to infer a folder's purpose. `build.py`'s `discover_sources()` now skips `_context.md` so these files never pollute the session index, search index, or AI-consumable exports. `CLAUDE.md` Query + Lint workflows document the convention, and `find_uncontexted_folders()` powers a new `/wiki-lint` warning for folders with >10 pages but no context stub. Ships with three seeded stubs (`wiki/entities/_context.md`, `wiki/concepts/_context.md`, `wiki/sources/_context.md`) to show the pattern. 19 new tests.
- Token usage card, project timeline, and site-wide stats (#66) -- new
`llmwiki/viz_tokens.py` module (stdlib-only) consumes the `token_totals` frontmatter key (#63) and renders three related views. Session card shows Input / Cache creation / Cache read / Output in a stacked-bar layout with a cache-hit-ratio badge (green ≥80%, yellow 50–79%, red <50%); the formula matches Anthropic's definition (`cache_read / (cache_read + cache_creation + input)`, excludes output). Project timeline is a log-scale area chart of daily total tokens over the project lifetime + aggregate cache hit ratio in the header. Site-wide stats on `index.html` show four cards: total tokens, average per session, best-cache-hit project (linked), heaviest project (linked). Human-readable K/M/B number formatting via shared `format_tokens()` helper. Full dark-mode palette via `--token-input` / `--token-cache-creation` / `--token-cache-read` / `--token-output` / `--token-area-fill` / `--token-area-stroke` CSS custom properties. Sessions missing `token_totals` (older converter output) degrade gracefully: the card renders nothing instead of crashing. 45 new tests.
- Tool-calling bar chart (#65) -- new
`llmwiki/viz_tools.py` module (stdlib-only, pure-SVG) renders a horizontal bar chart of tool usage from the `tool_counts` frontmatter key (#63). One chart per session page (sorted descending, top 10 tools + "Other (N tools)" overflow row) and one aggregate chart per project page. Category-based coloring: I/O (Read/Write/Edit: blue), Search (Grep/Glob/WebSearch: purple), Execution (Bash/Skill: orange), Network (WebFetch/`mcp__*`: green), Planning (Agent/TodoWrite/ExitPlanMode: slate). Tooltips show `{tool}: {count} calls ({pct}%)`. Long tool names (e.g. `mcp__Claude_in_Chrome__tabs_context_mcp`) are truncated to 28 chars in the label column but preserved in the tooltip. Dark-mode variants for every category via `--tool-cat-*` CSS custom properties. 38 new tests cover: JSON-string and dict frontmatter shapes, aggregation across sessions, category mapping for every standard + MCP tool, sort order, overflow collapse, tooltip grammar, XSS defense, bar-width scaling.
- GitLab/GitHub-style 365-day activity heatmap (#64, #72) -- new
`llmwiki/viz_heatmap.py` module (stdlib-only, pure-SVG) renders a full-year rolling contribution grid at build time. One aggregate heatmap on `site/index.html` counting every main session across all projects, one per-project heatmap on each `projects/<slug>.html` page scoped to that project. Sunday-aligned 53-column grid (369–371 cells depending on where the build date lands in the week), five-level quantile bucketing computed over non-zero days so sparse projects don't collapse into a single color, empty days always render as level-0 so the grid dimensions stay constant. Dark-mode variant via `--heatmap-0..4` CSS custom properties (light: `ebedf0` → `216e39`, dark: `161b22` → `39d353`). A11y: `role="img"` + descriptive `aria-label`, per-cell `<title>` tooltips. Replaces the v0.4 JS-based tiny-strip heatmap. 22 new tests lock the window bounds, quantile math, SVG structure, and sparse-data edge cases.
- Session metrics frontmatter (#63) -- converter now emits five new keys per session as JSON inline:
`tool_counts`, `token_totals` (input / cache_creation / cache_read / output), `turn_count`, `hour_buckets` (UTC-normalised ISO-hour → activity count), and `duration_seconds`. Foundation for the v0.8 visualization stack (#64 heatmap / #65 tool chart / #66 token card). Stdlib-only; byte-identical on re-run. 24 new tests.
- Changelog page (#72) --
`CHANGELOG.md` now renders as a first-class page at `site/changelog.html` with a nav-bar link, narrow reading column, keep-a-changelog typography, and the same theme/print styles as the rest of the wiki.
- highlight.js syntax highlighting (#73) -- replaced server-side Pygments/codehilite with client-side highlight.js v11.9.0 loaded from a pinned jsdelivr CDN. Both GitHub light (
`github.min.css`) and GitHub dark (`github-dark.min.css`) themes are preloaded; the runtime swaps the `disabled` flag on `<link>` tags when the theme toggles so code blocks stay in sync with the rest of the page. Code fences now emit plain `<pre><code class="language-xxx">` via the `fenced_code` extension. Lighter build (no optional Python dep), consistent look across every page, auto-detection for untagged blocks. 15 new tests.
- Public demo deployment (#73) --
`.github/workflows/pages.yml` now builds a demo site from the eight dummy sessions under `examples/demo-sessions/` on every push to `master` and deploys it to GitHub Pages. No personal data. Three fictional projects (`demo-blog-engine` Rust SSG, `demo-ml-pipeline` DistilBERT fine-tune, `demo-todo-api` FastAPI CRUD) with realistic code fences so visitors can see highlight.js and the full session UX immediately.
- README screenshots (#73) -- added six embedded screenshots under
`docs/images/` (home, sessions index, session detail, changelog, projects index, code-heavy session) captured from the demo site with headless Chrome at 2x device pixel ratio.
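The cache-hit-ratio formula and badge thresholds quoted in the token usage card entry (#66) can be sketched as below. This is an illustration only, not the code in `llmwiki/viz_tokens.py`: the function names `cache_hit_ratio` and `badge_color`, the exact dictionary key spellings, and the choice to report 0.0 for a session with no input-side tokens are all assumptions.

```python
def cache_hit_ratio(token_totals: dict) -> float:
    """cache_read / (cache_read + cache_creation + input); output tokens are excluded."""
    cache_read = token_totals.get("cache_read", 0)
    denominator = (
        cache_read
        + token_totals.get("cache_creation", 0)
        + token_totals.get("input", 0)
    )
    # Assumption: a session with no input-side tokens reports a ratio of 0.0.
    return cache_read / denominator if denominator else 0.0


def badge_color(ratio: float) -> str:
    """Map a ratio to the badge colors in the entry: green >= 80%, yellow >= 50%, else red."""
    if ratio >= 0.80:
        return "green"
    if ratio >= 0.50:
        return "yellow"
    return "red"


# Hypothetical session totals; "output" is present but ignored by the formula.
totals = {"input": 1_000, "cache_creation": 1_000, "cache_read": 8_000, "output": 5_000}
ratio = cache_hit_ratio(totals)
print(f"{ratio:.0%} -> {badge_color(ratio)}")  # 80% -> green
```

Leaving `output` out of the denominator keeps the ratio a statement about how much of the prompt was served from cache, regardless of how long the reply was.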
Changed
- No optional highlight dependency (#73) -- `pip install -e '.[highlight]'` is now a no-op alias kept only for backwards-compatibility with v0.4 install docs. `setup.sh`, `setup.bat`, `pyproject.toml`, and CI workflows no longer install Pygments.
Fixed
- Horizontal overflow on small viewports (#45) — three bugs in the nav/layout CSS caused the body to scroll horizontally below 1024px and made the mobile-bottom-nav breakpoint mismatch the common 768 tablet cutoff. (1) The top-nav text anchors (Home / Projects / Sessions / Models / Compare / Changelog) stayed inline at every width and overflowed a 768px viewport — now hidden via
`@media (max-width: 1023px)` so tablet + mobile use the command palette or the bottom nav for navigation instead. (2) The `.mobile-bottom-nav` breakpoint was `max-width: 720px`; it is now `max-width: 767px` so the 767/768 tablet boundary is respected. (3) The `.recently-updated-item` grid had a fixed `180px 100px 1fr` template that overflowed sub-400px viewports; it now collapses to a single column below 640px and truncates overflowing cell content with `text-overflow: ellipsis`. Caught by the new responsive E2E scenarios in #45. Desktop (1024+) layout is unchanged.
- Raw HTML in session prose leaking into the DOM (#74) -- a session transcript that mentioned
`<textarea>` (or any tag-shaped substring) in prose used to pass through the markdown library unescaped, leaving an unclosed element that swallowed every following tag, including the `<script>` that boots highlight.js. The v0.5 swap from server-side Pygments to client-side hljs (#73) made this pre-existing bug catastrophic: once the script was inside a stuck textarea, no code block on the page ever got highlighted. Fixed by a new `_EscapeRawHtmlPreprocessor` that runs in the Python `markdown` pipeline after `fenced_code` (priority 25) and before `html_block` (priority 20), escaping `<tagname>` / `</tagname>` patterns outside inline backtick spans. Inline/fenced code, HTML comments (`<!-- llmwiki:metadata -->`), bare `<` in math, blockquotes, tables, headings, and link syntax are all untouched. 9 new regression tests lock it down. Verified on a real 169-code-block session page: 0/172 → 172/172 highlighted after the fix.
- Code-fence truncation eating pages (#72) --
`truncate_chars` / `truncate_lines` used to cut content mid-code-block, leaving the opening `` ``` `` without a closing fence. The markdown parser then swallowed everything that followed as one giant block (user-visible example: the "Full Directory Tree" section on subagent pages). Fixed by counting unbalanced fences in the kept portion and injecting a closing fence before the truncation marker. 5 new tests; 30 previously-mangled session files regenerated.
- Sync crash on corrupt JSONL bytes (#72) -- a single stray non-UTF-8 byte in a session transcript used to abort the entire
`llmwiki sync` run with `UnicodeDecodeError`. `parse_jsonl` now opens with `errors="replace"` and silently drops non-dict records (rare stray scalars from partial writes that previously crashed `filter_records` with `AttributeError`).
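The code-fence truncation fix (#72) described above, counting fences in the kept portion and closing an open one before the truncation marker, might look roughly like this. The function name `truncate_lines_balanced` and the marker text are hypothetical, and the sketch only recognises backtick fences (no `~~~` fences, no nested longer fences), which may be narrower than what the project handles.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def truncate_lines_balanced(text: str, max_lines: int,
                            marker: str = "[... truncated ...]") -> str:
    """Keep the first max_lines lines, closing any code fence left open by the cut."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    kept = lines[:max_lines]
    # Each fence line toggles "inside a code block"; an odd count means one is still open.
    fence_count = sum(1 for line in kept if line.lstrip().startswith(FENCE))
    if fence_count % 2 == 1:
        kept.append(FENCE)  # inject the missing closing fence
    kept.append(marker)
    return "\n".join(kept)


sample = "\n".join(["intro", FENCE + "text", "tree line 1", "tree line 2", FENCE, "outro"])
print(truncate_lines_balanced(sample, 3))
```

With `max_lines=3` the cut lands inside the block, so the result ends with a closing fence followed by the marker, and whatever follows on the page is parsed as normal markdown again.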
[0.4.0] - 2026-04-08
Theme: AI + human dual-format. Every page ships both as HTML for humans AND as machine-readable .txt + .json siblings for AI agents, alongside site-level exports that follow open standards (llms.txt, JSON-LD, sitemap, RSS).
Added
Part A -- AI-consumable exports (`llmwiki/exporters.py`)
- `llms.txt` -- short index per the llmstxt.org spec with project list, machine-readable links, and AI-agent entry points
- `llms-full.txt` -- flattened plain-text dump of every wiki page, ordered project → date, capped at 5 MB for pasteable LLM context
- `graph.jsonld` -- schema.org JSON-LD `@graph` representation with `CreativeWork` nodes for the wiki, projects, and individual sessions, all linked via `isPartOf` relations
- `sitemap.xml` -- standard sitemap with `lastmod` timestamps and priority hints
- `rss.xml` -- RSS 2.0 feed of the newest 50 sessions
- `robots.txt` -- with explicit `llms.txt` + `sitemap.xml` references for AI-agent-aware crawlers
- `ai-readme.md` -- AI-specific entry point explaining navigation structure, machine-readable siblings, and MCP tool surface
- Per-page `.txt` siblings next to every `sessions/<project>/<slug>.html` -- plain text version stripped of all markdown/HTML for fast AI consumption
- Per-page `.json` siblings with structured frontmatter + body text + SHA-256 + outbound wikilinks -- ideal for RAG or structured-data agents
- Schema.org microdata on every session page (`itemscope` / `itemtype="https://schema.org/Article"` + `headline` + `datePublished` + `inLanguage`)
- `<link rel="canonical">` on every session page for SEO and duplicate-indexing prevention
- Open Graph tags (`og:type`, `og:title`, `og:description`, `article:published_time`)
- `<!-- llmwiki:metadata -->` HTML comment at the top of every session page -- AI agents scraping HTML can parse metadata without fetching the separate `.json` sibling
- `wiki_export` MCP tool (7th tool on the MCP server) -- returns any AI-consumable export format by name (`llms-txt`, `llms-full-txt`, `jsonld`, `sitemap`, `rss`, `manifest`, or `list`). Capped at 200 KB per response.
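As a rough picture of the per-page `.json` sibling listed above (structured frontmatter + body text + SHA-256 + outbound wikilinks), the payload could be assembled as follows. The function name `build_json_sibling`, the field names, hashing the UTF-8 body, and the `[[Target]]` / `[[Target|label]]` link syntax are assumptions for the sketch; the real layout lives in `llmwiki/exporters.py`.

```python
import hashlib
import json
import re

# Captures the target of [[Target]], [[Target|label]] or [[Target#anchor]] (assumed syntax).
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)")


def build_json_sibling(frontmatter: dict, body: str) -> str:
    """Serialise one page as JSON: frontmatter, body, body hash, outbound wikilinks."""
    links = sorted({target.strip() for target in WIKILINK_RE.findall(body)})
    payload = {
        "frontmatter": frontmatter,
        "body": body,
        "sha256": hashlib.sha256(body.encode("utf-8")).hexdigest(),
        "wikilinks": links,
    }
    # sort_keys keeps the output stable across runs, which helps diffing and caching.
    return json.dumps(payload, indent=2, sort_keys=True, ensure_ascii=False)


body = "Compared [[ClaudeSonnet4]] with [[GPT5|GPT-5]], then [[ClaudeSonnet4]] again."
print(build_json_sibling({"title": "Model comparison", "project": "demo"}, body))
```

A consumer can compare the `sha256` field against a cached value to skip re-embedding pages whose body has not changed.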
Part B -- Human polish
- Reading time estimates on every session page (`X min read` in the metadata strip)
- Related pages panel at the bottom of session pages (3-5 related sessions computed from shared project + entities, all client-side from `search-index.json`)
- Activity heatmap on the home page -- SVG cells with per-day intensity gradient
- Mark highlighting support (`<mark>` styled with the accent color) for search results
- Deep-link icons on every `h2` / `h3` / `h4` in the content -- hover to reveal, click to copy a canonical URL with `#anchor` to the clipboard
- `.txt` and `.json` download buttons in the session-actions strip next to Copy-as-markdown
Part C -- Cross-cutting infra
- Build manifest (`llmwiki/manifest.py`) -- generates `site/manifest.json` on every build with SHA-256 hashes of all files, total sizes, perf-budget check, and budget violations list
- Link checker (`llmwiki/link_checker.py`) -- walks `site/` verifying every internal `<a href>`, `<link href>`, and `<script src>` resolves to an existing file. External URLs are skipped. Strict regex filters out code-block artifacts.
- Performance budget targets declared in `manifest.py` (cold build <30s, total site <150 MB, per-page <3 MB, CSS+JS <200 KB, `llms-full.txt` <10 MB)
- New CLI subcommands: `llmwiki check-links`, `llmwiki export <format>`, `llmwiki manifest` (all with `--fail-on-*` flags for CI integration)
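The build manifest entry above (SHA-256 per file, total size, budget violations) can be illustrated with a small sketch. It is not the implementation in `llmwiki/manifest.py`: `build_manifest`, the result keys, treating "3 MB" as 3 × 1024 × 1024 bytes, and applying the per-page budget only to `.html` files are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path

PER_PAGE_BUDGET = 3 * 1024 * 1024  # "per-page <3 MB" from the budget list (byte size assumed)


def build_manifest(site_dir: Path, per_page_budget: int = PER_PAGE_BUDGET) -> dict:
    """Hash every file under site_dir and list HTML pages at or over the page budget."""
    files = {}
    violations = []
    total_bytes = 0
    for path in sorted(p for p in site_dir.rglob("*") if p.is_file()):
        data = path.read_bytes()
        rel = path.relative_to(site_dir).as_posix()
        files[rel] = {"sha256": hashlib.sha256(data).hexdigest(), "bytes": len(data)}
        total_bytes += len(data)
        if path.suffix == ".html" and len(data) >= per_page_budget:
            violations.append(rel)
    return {"files": files, "total_bytes": total_bytes, "violations": violations}


with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "index.html").write_text("<h1>hi</h1>", encoding="utf-8")
    manifest = build_manifest(site)
    print(manifest["total_bytes"], manifest["violations"])
```

A CI job could fail the build whenever `violations` is non-empty, which is the kind of gate the `--fail-on-*` flags suggest.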
Tests
- 24 new tests in `tests/test_v04.py` covering exporters, manifest, link checker, MCP `wiki_export`, schema.org microdata, canonical links, per-page siblings, and CLI subcommands
- 95 tests passing total (was 71 in v0.3)
Fixed
- Link checker rewritten to only match `href` / `src` attributes on `<a>` / `<link>` / `<script>` tags (not URLs inside code blocks). The initial naive regex was catching runaway multi-line matches from rendered tool-result output.
- Canonical URLs and `.txt` / `.json` sibling links now use the actual HTML filename stem (`date-slug`) instead of the frontmatter `slug` field, which was causing broken link reports.
[0.3.0] - 2026-04-08
Added
- `pyproject.toml` -- full PEP 621 metadata, PyPI-ready. Optional dep groups: `highlight` (pygments), `pdf` (pypdf), `dev` (pytest+ruff), `all`. Declared entry point `llmwiki = llmwiki.cli:main`.
- Eval framework (`llmwiki/eval.py`) -- 7 structural quality checks (orphans, broken links, frontmatter coverage, type coverage, cross-linking, size bounds, contradiction tracking) totalling 100 points. New CLI: `llmwiki eval [--check ...] [--json] [--fail-below N]`. Zero LLM calls, pure structural analysis, runs in under a second on a 300-page wiki.
- Codex CLI adapter graduated from v0.2 stub → production with `SUPPORTED_SCHEMA_VERSIONS = ["v0.x", "v1.0"]`, two session store roots, config override, and hashed-path slug derivation.
- i18n docs scaffold -- translations of `getting-started.md` in Chinese (`zh-CN`), Japanese (`ja`), and Spanish (`es`) under `docs/i18n/`. Each linked back to the English master with a sync date.
- 15 new tests covering the eval framework, pyproject, i18n scaffold, and version bump.
Deferred to v0.5+
- OpenCode / OpenClaw adapter
- Homebrew formula
- Local LLM via Ollama (optional synthesis backend)
(per explicit user direction — none of these block a v0.3.0 release)
[0.2.0] - 2026-04-08
Added
- Three new slash commands: `/wiki-update` (surgical in-place page update), `/wiki-graph` (knowledge graph generator), `/wiki-reflect` (higher-order self-reflection)
- `llmwiki/graph.py` -- walks every `[[wikilink]]` and produces `graph/graph.json` (canonical) + `graph/graph.html` (vis.js). Reports top-linked, top-linking, orphans, broken edges. CLI: `llmwiki graph [--format json|html|both]`.
- `llmwiki/watch.py` -- file watcher with polling + debounce. Detects mtime changes in agent session stores and auto-runs `llmwiki sync` after the debounce window. CLI: `llmwiki watch [--adapter ...] [--interval N] [--debounce M]`. Stdlib only, no `watchdog` dep.
- `llmwiki/obsidian_output.py` -- bidirectional Obsidian output mode. Copies the compiled wiki into a subfolder of an Obsidian vault with backlinks and a README. CLI: `llmwiki export-obsidian --vault PATH [--subfolder NAME] [--clean] [--dry-run]`.
- Full MCP server (`llmwiki/mcp/server.py`) -- graduated from v0.1 2-tool stub to 6 production tools: `wiki_query` (keyword search + page content), `wiki_search` (raw grep), `wiki_list_sources`, `wiki_read_page` (path-traversal guarded), `wiki_lint` (structural report), `wiki_sync` (trigger converter).
- Cursor adapter (`llmwiki/adapters/cursor.py`) -- detects Cursor IDE install on macOS/Linux/Windows, discovers workspace storage.
- Gemini CLI adapter (`llmwiki/adapters/gemini_cli.py`) -- detects `~/.gemini/` sessions.
- PDF adapter (`llmwiki/adapters/pdf.py`) -- optional `pypdf` dep, user-configurable roots, disabled by default.
- Hover-to-preview wikilinks in the HTML viewer -- floating preview cards fetched from the client-side search index.
- Timeline view on the sessions index -- compact SVG sparkline showing session frequency per day.
- CLAUDE.md extended with `/wiki-update`, `/wiki-graph`, `/wiki-reflect` slash command docs and new page types (`comparisons/`, `questions/`, `archive/`).
- 21 new tests covering adapters, graph builder, Obsidian output, MCP server, file watcher, and CLI subcommands.
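The graph builder entry above (walk every `[[wikilink]]`, report orphans and broken edges) can be sketched over an in-memory set of pages. `build_graph`, the result keys, the link syntax, and the definition of an orphan as a page with no inbound links are assumptions; the real module reads pages from disk and also writes `graph.json` and `graph.html`.

```python
import re

# Captures the target of [[Target]], [[Target|label]] or [[Target#anchor]] (assumed syntax).
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)")


def build_graph(pages: dict[str, str]) -> dict:
    """Return edges between known pages, broken edges, and pages nothing links to."""
    edges = []
    broken = []
    inbound = {name: 0 for name in pages}
    for source, text in pages.items():
        # A set so that repeated links from one page count as a single edge.
        for target in {t.strip() for t in WIKILINK_RE.findall(text)}:
            if target in pages:
                edges.append((source, target))
                inbound[target] += 1
            else:
                broken.append((source, target))
    orphans = sorted(name for name, count in inbound.items() if count == 0)
    return {"edges": sorted(edges), "broken": sorted(broken), "orphans": orphans}


pages = {
    "index": "See [[Rust]] and [[Missing]].",
    "Rust": "Back to [[index]].",
    "Notes": "Mentions [[Rust]] twice: [[Rust]].",
}
graph = build_graph(pages)
print(graph["orphans"], graph["broken"])  # ['Notes'] [('index', 'Missing')]
```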
[0.1.0] - 2026-04-08
Initial public release.
Added
- Python CLI (`python3 -m llmwiki`) with `sync`, `build`, `serve`, `init` subcommands
- Claude Code adapter (`llmwiki.adapters.claude_code`) -- converts `~/.claude/projects/*/*.jsonl` to markdown
- Codex CLI adapter stub (`llmwiki.adapters.codex_cli`) -- scaffold for v0.2
- Karpathy-style wiki schema in `CLAUDE.md` and `AGENTS.md`
- God-level HTML generator (`llmwiki.build`)
- Inter + JetBrains Mono typography
- Light/dark theme toggle with `data-theme` attribute + system preference
- Global search via pre-built JSON index
- Cmd+K command palette
- Keyboard shortcuts (`/` search, `g h` home, `j` / `k` next/prev session)
- Syntax highlighting via Pygments (optional dep)
- Collapsible tool-result sections (click to expand, auto-collapse > 500 chars)
- Breadcrumbs on session pages
- Reading progress bar on long pages
- Sticky table headers on the sessions index
- Copy-as-markdown and copy-code buttons (with `document.execCommand` fallback for HTTP)
- Mobile-responsive breakpoints
- Print-friendly CSS
- One-click scripts for macOS/Linux (`setup.sh`, `build.sh`, `sync.sh`, `serve.sh`)
- One-click scripts for Windows (`setup.bat`, `build.bat`, `sync.bat`, `serve.bat`)
- `.claude/commands/` slash commands: `wiki-sync`, `wiki-build`, `wiki-serve`, `wiki-query`, `wiki-lint`
- `.claude/skills/llmwiki-sync/SKILL.md` -- global skill for auto-discovery
- GitHub Actions CI workflow (`.github/workflows/ci.yml`) -- lint + build smoke test
- Documentation: getting-started, architecture, configuration, claude-code, codex-cli
- Redaction config with username, API key, token, and email patterns
- Idempotent incremental sync via `.ingestion-state.json` mtime tracking
- Live-session detection -- skips sessions with activity in the last 60 minutes
- Sub-agent session support -- rendered as separate pages linked from parent
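The last few entries (mtime tracking in `.ingestion-state.json` and skipping sessions active in the last 60 minutes) combine into a simple selection step, sketched here. `sessions_to_sync`, the state-file layout (path → mtime), and using file mtime as the "activity" signal are assumptions rather than the converter's actual logic.

```python
import json
import os
import tempfile
import time
from pathlib import Path

LIVE_WINDOW_SECONDS = 60 * 60  # "activity in the last 60 minutes"


def sessions_to_sync(session_files, state_path: Path, now=None):
    """Return files that are neither live nor unchanged, and record their mtimes."""
    now = time.time() if now is None else now
    state = {}
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    pending = []
    for path in session_files:
        mtime = path.stat().st_mtime
        if now - mtime < LIVE_WINDOW_SECONDS:
            continue  # session looks live: leave it for a later run
        if state.get(str(path)) == mtime:
            continue  # unchanged since the last sync: nothing to do
        pending.append(path)
        state[str(path)] = mtime
    state_path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
    return pending


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    now = 1_700_000_000.0  # fixed clock so the demo is deterministic
    finished, live = root / "finished.jsonl", root / "live.jsonl"
    for path, age in ((finished, 2 * 3600), (live, 60)):
        path.write_text("{}\n", encoding="utf-8")
        os.utime(path, (now - age, now - age))
    state_file = root / ".ingestion-state.json"
    first = sessions_to_sync([finished, live], state_file, now=now)
    second = sessions_to_sync([finished, live], state_file, now=now)
    print([p.name for p in first], [p.name for p in second])
```

The second call returns nothing because the finished session's mtime already matches the recorded state, which is what makes repeated syncs idempotent.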