Personal AI Knowledge Base
Built a personal AI knowledge base system based on Andrej Karpathy's LLM Wiki concept. Articles, PDFs and podcast transcripts are saved into a folder, an LLM compiles them into an interlinked wiki of synthesised concept pages, and Claude Code queries the whole thing to build sourced arguments. Two knowledge bases are running so far, covering Australian economics and enterprise AI workflows.
Why I built it
I wanted a knowledge base on my computer that I owned and controlled. When Karpathy published his approach, it was an opportunity to rebuild a system designed by one of AI's most respected practitioners and properly understand how LLMs compound value over time. The concept that an LLM could read everything you save, maintain a structured synthesis of it, and get meaningfully better with every article added was something I wanted to test firsthand.

I chose to build from scratch because owning the full architecture meant understanding every design decision and being able to replicate or extend the system later.
How I approached it
The architecture follows Karpathy's pattern. Each knowledge base has a raw/ folder for original sources that never get modified, a wiki/ folder where the LLM writes synthesised concept pages, and an outputs/ folder for generated analysis. The core of the system is a file called AGENTS.md that instructs the LLM how to operate as the wiki maintainer, covering domain guidance, page format, writing style, and procedures for ingestion, compilation, querying and health checks.
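As a rough sketch of that layout, scaffolding a new knowledge base looks something like the snippet below. The folder names follow the structure described above; the AGENTS.md headings and the example path are illustrative placeholders rather than my actual instructions file.

```python
from pathlib import Path

# Illustrative AGENTS.md skeleton; the real file carries much more detailed
# domain guidance, page format rules, writing style and procedures.
AGENTS_TEMPLATE = """\
# AGENTS.md: wiki maintainer instructions

## Domain guidance
What this knowledge base covers and what counts as in scope.

## Page format and writing style
Every wiki page carries a synthesis of the theme, claims cited back to files
in raw/, and links to related concept pages.

## Procedures
- Ingestion: new sources land in raw/ and are never modified.
- Compilation: create or update synthesised concept pages in wiki/.
- Querying: answer questions with citations to specific raw/ sources.
- Health check: flag orphaned pages, broken links and uncited claims.
"""

def scaffold_knowledge_base(root: str) -> None:
    """Create the raw/, wiki/ and outputs/ folders plus a starter AGENTS.md."""
    base = Path(root)
    for folder in ("raw", "wiki", "outputs"):
        (base / folder).mkdir(parents=True, exist_ok=True)
    agents = base / "AGENTS.md"
    if not agents.exists():
        agents.write_text(AGENTS_TEMPLATE, encoding="utf-8")

if __name__ == "__main__":
    # The path here is hypothetical.
    scaffold_knowledge_base("knowledge-bases/australian-economics")
```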
The toolchain is deliberately simple: Obsidian as the file viewer, Obsidian Web Clipper for saving articles from the browser, and Claude Code as the terminal interface for compilation and queries.
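Queries then run from the terminal. Below is a minimal sketch of wrapping a one-off query, assuming the claude CLI is on the PATH, that its -p flag runs a single non-interactive prompt, and that the AGENTS.md instructions are surfaced to the session (for example by referencing the file from the project's CLAUDE.md).

```python
import subprocess

def ask_wiki(question: str, kb_root: str) -> str:
    """Run a one-off Claude Code query from the knowledge base root and return its answer."""
    result = subprocess.run(
        ["claude", "-p", question],   # -p: non-interactive print mode (assumed available)
        cwd=kb_root,                  # run inside the knowledge base folder
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

# Hypothetical usage: build a sourced argument from the enterprise AI wiki.
# print(ask_wiki("Summarise the wiki's view on agent evaluation, with citations.",
#                "knowledge-bases/enterprise-ai"))
```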
For the enterprise AI wiki I added a credibility scoring system that rates every source from Tier 1 (primary sources like vendor documentation) through Tier 5 (blog posts and social media). Personal analysis is tagged separately so the LLM treats it as perspective rather than external evidence. I also built support for podcast transcripts, video content via screenshots paired with transcript excerpts, and design exports with companion notes providing context for the LLM.
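To make the tiering concrete, here is a sketch of how a source record might carry its rating. Only Tier 1 and Tier 5 are defined above, so the middle tiers and the field names are assumptions for illustration.

```python
from dataclasses import dataclass

# Tiers 1 and 5 match the system as described; 2-4 are illustrative
# assumptions about what sits in between.
TIERS = {
    1: "Primary sources (vendor documentation, official releases)",
    2: "Institutional and peer-reviewed research",      # assumed
    3: "Established media and analyst coverage",        # assumed
    4: "Practitioner write-ups and conference talks",   # assumed
    5: "Blog posts and social media",
}

@dataclass
class Source:
    title: str
    path: str                        # location under raw/
    tier: int                        # 1 (strongest) to 5 (weakest)
    personal_analysis: bool = False  # tagged separately: perspective, not evidence

    def frontmatter(self) -> str:
        """Header the LLM reads when weighing this source during compilation."""
        label = "personal-analysis" if self.personal_analysis else f"tier-{self.tier}"
        return f"---\ntitle: {self.title}\ncredibility: {label}\n---"
```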

The first compile took seven articles across a mix of institutional research, media coverage and personal analysis. The LLM created concept pages for each theme it identified, cross-referenced them, and cited specific sources. That structure would have taken days to build manually.
Now there are close to 60 articles and 34 wiki pages.
What I learnt
This is an evergreen project. There are always tweaks that can be made to improve it, which is part of what makes it rewarding. I added the credibility scoring concept and cross-wiki referencing after the initial build, based on other people's approaches. That is also the value of owning the build myself rather than using a pre-built plugin: I could omit what I didn't want and personalise what I did.
One early challenge was understanding the mental model. The distinction between raw sources and the LLM's synthesised wiki about those sources required a few iterations to properly internalise. Once it clicked, the whole system made more sense.
The compounding effect Karpathy described is real, but it depends on consistent ingestion more than clever architecture. The Web Clipper reducing article saving to a two-second process matters more than any feature I built. The system is now evolving beyond personal knowledge management into a consulting delivery layer, where the wiki acts as a hub of domain expertise and each client engagement draws from it while keeping client data isolated.