
What Is Graphify? The Open-Source Tool That Gives AI Assistants Codebase Memory

Published on: 5 June 2026
AI Tools · Developer Productivity · Open Source · Knowledge Graphs

AI assistants are brilliant at reading a single file. They are terrible at understanding how 200 files connect. Graphify fixes that. And it went viral overnight.

Auriga IT Editorial Team · June 2026 · 14 min read · Tags: Claude Code · Knowledge Graphs · Open Source
In a hurry? Start here
  • The problem: AI coding assistants read files sequentially with no structural map. They burn context limits and miss cross-module connections on every single query.
  • The solution: Graphify (/graphify) converts your entire project into a queryable knowledge graph. Code, docs, PDFs, images, all of it.
  • The impact: 71.5x fewer tokens per query, measured on a 52-file test corpus, plus persistent codebase memory that survives session resets.
  • Privacy: Source code is parsed 100% locally using Tree-sitter. It never leaves your machine.
  • Install: pip install graphifyy (double y) · 58,300+ GitHub stars · 1.2M PyPI downloads · YC S26.

01 The problem every developer knows

You are onboarding to a large codebase. Or you have been on the project for months but the architecture is sprawling. Microservices, multiple databases, API layers, scattered documentation. You open your AI coding assistant, ask how the authentication module connects to the user data layer, and get a file-by-file scan that burns thousands of tokens and still misses the structural picture.

This is not a model intelligence problem. It is a context problem.

AI coding assistants like Claude Code, Codex, Cursor, and Gemini CLI operate on flat-file context. They have no map of how your codebase is actually organised, how functions call each other across modules, why certain architectural decisions were made, or how a Figma doc relates to the API handler it describes. On every query, they start from scratch.

"AI assistants are brilliant at reading a single file. They are terrible at understanding how 200 files connect. Graphify fixes that."

02 What is Graphify?

Graphify is an open-source AI coding assistant skill. It is a slash command (/graphify) you invoke inside AI coding assistants like Claude Code, OpenAI Codex, Cursor, Gemini CLI, OpenCode, Aider, Kiro, and more.

When you run it, it reads every file in your project folder and builds a queryable knowledge graph showing the structure, relationships, and reasoning behind everything in your project. Instead of your AI assistant grepping through files on every query, it navigates a compact, pre-built map.

  • 71.5x token reduction
  • 58.3k+ GitHub stars
  • 1.2M+ PyPI downloads
  • 33 languages

71.5x fewer tokens on a 52-file test corpus:
  • Without Graphify (raw file scan): 123,000 tokens per query
  • With Graphify (graph-navigated query): 1,700 tokens per query
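
As a quick sanity check on the figures above, the reduction factor is simply the ratio of the two per-query token counts. The sketch below uses the rounded counts quoted in this article. The function name is illustrative only, and because the inputs are rounded, the result lands near the 71.5x headline without matching it exactly.

```python
def reduction_factor(tokens_without: int, tokens_with: int) -> float:
    """Return how many times fewer tokens the graph-navigated query uses."""
    if tokens_with <= 0:
        raise ValueError("tokens_with must be positive")
    return tokens_without / tokens_with

# Rounded per-query figures quoted in this article (52-file test corpus).
factor = reduction_factor(123_000, 1_700)
print(f"{factor:.1f}x fewer tokens")
```

With these rounded inputs the ratio comes out at roughly 72x. The small gap from the 71.5x headline is presumably rounding in the published per-query counts.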

03 The origin story: born in 48 hours

On April 1, 2026, Andrej Karpathy, former Tesla AI director and OpenAI co-founder, posted on X describing a workflow he wished existed. A way to drop papers, tweets, screenshots, and code notes into a folder and actually be able to query them meaningfully later, with an AI that understood the relationships between all of it.

Within 48 hours, Safi Shamsi published Graphify on GitHub. Shamsi holds an MSc in Data Science with Distinction from the University of Birmingham, where his thesis focused specifically on knowledge-graph-based hybrid RAG systems for academic search. He had the exact technical background to turn Karpathy's vision into working software immediately.

The community response was immediate. A tweet announcing the project earned over 12,000 likes, framed simply as: "Karpathy asked for LLM knowledge graphs, and someone built it." Within 10 days, Graphify had 22,000 GitHub stars. By June 2026 it had crossed 58,300 stars and 1.2 million PyPI downloads.

From idea to 58,000+ stars in 8 weeks:
  • Apr 1: Karpathy posts the vision on X
  • Apr 3: Graphify published, within 48 hours
  • Apr 13: 22,000 stars in the first 10 days
  • Jun 2026: 58,300+ stars · 1.2M downloads

04 How Graphify works: the 3-pass architecture

Pass 1

Deterministic AST extraction

Graphify calls Tree-sitter, a deterministic rule-based parser. No language model involved. No API call made. Your source code never leaves your machine during this pass.

Extracts: every function, class, import, call graph, docstring, and rationale comment across 33 languages including Python, TypeScript, Go, Rust, Java, C, C++, and Ruby.

100% local · zero network calls
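
To make the idea of deterministic extraction concrete, here is a minimal sketch. It uses Python's built-in ast module as a stand-in for Tree-sitter, so it only handles Python source. The function name extract_entities and the shape of its output are assumptions for illustration, not Graphify's actual internals.

```python
import ast

def extract_entities(source: str) -> dict:
    """Collect functions, classes, imports, and (caller, callee) pairs from Python source."""
    tree = ast.parse(source)
    out = {"functions": [], "classes": [], "imports": [], "calls": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            out["functions"].append(node.name)
            # Record calls to plain names made anywhere inside this function.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    out["calls"].append((node.name, inner.func.id))
        elif isinstance(node, ast.ClassDef):
            out["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            out["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            out["imports"].append(node.module or "")
    return out

# Hypothetical sample source, parsed but never executed.
sample = '''
import hashlib
from db import users

def hash_password(pw):
    return hashlib.sha256(pw.encode()).hexdigest()

def login(name, pw):
    user = lookup(name)
    return user and check(hash_password(pw))
'''
entities = extract_entities(sample)
print(entities["functions"])
print(entities["calls"])
```

The same input always yields the same entities, with no model and no network call involved, which is the property this pass relies on. Method calls such as hashlib.sha256(...) are deliberately skipped here to keep the sketch short.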
Pass 2

Graph construction

Extracted entities become nodes. Relationships become edges. Leiden clustering surfaces communities: groups of tightly related components you might never have noticed.

The algorithm also identifies god nodes: the highest-connectivity components in your codebase. If one breaks, Graphify shows the blast radius before you touch it.

100% local · NetworkX + Leiden
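
A rough picture of this pass, using only the standard library: entities become keys in an adjacency map, relationships become edges, and the most-connected nodes are ranked by degree. This is a sketch under assumptions. It does not implement Leiden clustering, and the node names, the god_nodes helper, and the use of plain degree as the connectivity measure are illustrative choices, not Graphify's code.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj

def god_nodes(adj, top_n=1):
    """Return the top_n nodes ranked by number of distinct neighbours."""
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    return ranked[:top_n]

# Hypothetical relationships, e.g. as produced by an extraction pass.
sample_edges = [
    ("login", "hash_password"),
    ("login", "db.users"),
    ("signup", "hash_password"),
    ("signup", "db.users"),
    ("reset_password", "db.users"),
    ("reset_password", "send_email"),
]
graph = build_graph(sample_edges)
print(god_nodes(graph))
```

In this toy graph, db.users touches three other components, more than any other node, so it is the one a refactor should approach most carefully.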
Pass 3

Semantic extraction

For PDFs, Markdown docs, architecture diagrams, and whiteboard photos, Graphify uses parallel Claude subagents. API calls flow directly from your machine to Anthropic or OpenAI using your own API key. Graphify never sees your document content.

Every relationship is tagged EXTRACTED (from code) or INFERRED (from model reasoning).

API calls via your key only
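
One way to picture the provenance tag is as a field on each edge. The sketch below is an assumption about shape only: the Edge class, its field names, and the sample relationships are hypothetical, and only the two tag values come from the description above.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    EXTRACTED = "EXTRACTED"  # read directly from code by the parser
    INFERRED = "INFERRED"    # proposed by model reasoning

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str
    provenance: Provenance

# Hypothetical edges: one from code, one from a document.
tagged_edges = [
    Edge("login", "hash_password", "calls", Provenance.EXTRACTED),
    Edge("auth_spec.pdf", "login", "describes", Provenance.INFERRED),
]

def only(edges, provenance):
    """Keep edges with the given provenance tag."""
    return [e for e in edges if e.provenance is provenance]

print([e.target for e in only(tagged_edges, Provenance.EXTRACTED)])
```

Keeping the tag on every edge lets a reader, or an assistant, treat parser-derived facts as firm and model-derived links as suggestions to verify.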

05 What you get as output

After running /graphify . in your project folder, three files land in graphify-out/:

graph.html
An interactive, clickable visualisation of your entire project. Filter by node type, click any node to explore its connections, navigate between detected communities, and search across the whole graph. Open in a browser to see your codebase as a system.
GRAPH_REPORT.md
A plain-English summary covering god nodes, surprising connections between distant parts of the codebase, and suggested questions the AI assistant is now positioned to answer. This is what your assistant reads first before navigating the graph.
graph.json
The structured graph file your AI assistant can query at any time: weeks later, after your session has ended, even after context has been cleared. It is persistent knowledge that does not disappear when you close the terminal.
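
To give a feel for what querying a persisted graph file can look like, here is a small sketch. The layout is an assumption: it uses a node-link structure with "nodes" and "links" keys, and the real graph.json schema may differ, so inspect the file before relying on these key names. The neighbours helper and the sample data are hypothetical.

```python
import json

# ASSUMPTION: a node-link layout with "nodes" and "links" keys.
# The real graph.json schema may differ; check the file first.
SAMPLE = '''
{
  "nodes": [{"id": "login"}, {"id": "hash_password"}, {"id": "db.users"}],
  "links": [
    {"source": "login", "target": "hash_password"},
    {"source": "login", "target": "db.users"}
  ]
}
'''

def neighbours(graph: dict, node_id: str) -> list:
    """Return ids directly linked to node_id, in either direction, sorted."""
    found = set()
    for link in graph["links"]:
        if link["source"] == node_id:
            found.add(link["target"])
        elif link["target"] == node_id:
            found.add(link["source"])
    return sorted(found)

graph_data = json.loads(SAMPLE)
print(neighbours(graph_data, "login"))
```

To try this on a real build, load graphify-out/graph.json with json.load instead of the inline sample, and adjust the key names to whatever the file actually contains.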

06 How to install and set up Graphify

Install the Python package

Requires Python 3.10+. Install from PyPI:

pip install graphifyy

The package is spelled graphifyy, with a double y. The single-y package (graphify) is a completely different and unrelated tool. This is the most common mistake first-time installers make.

Register as a skill in your AI coding assistant

This registers the /graphify slash command inside Claude Code, Cursor, Gemini CLI, and all other supported assistants:

graphify install

Build the knowledge graph

Navigate to your project folder inside your AI coding assistant and run:

/graphify .

On Windows PowerShell, use graphify . without the leading slash. Three files will appear in graphify-out/: graph.html, GRAPH_REPORT.md, and graph.json.
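
If you want to confirm the build finished, a few lines of Python can check for the three output files. This is a convenience sketch and not part of Graphify. The helper name missing_outputs is hypothetical, and it assumes graphify-out/ sits directly inside the project folder, as described above.

```python
from pathlib import Path

# File names as listed in this article.
EXPECTED_OUTPUTS = ("graph.html", "GRAPH_REPORT.md", "graph.json")

def missing_outputs(project_dir: str = ".") -> list:
    """Return the expected output files that are absent from graphify-out/."""
    out_dir = Path(project_dir) / "graphify-out"
    return [name for name in EXPECTED_OUTPUTS if not (out_dir / name).is_file()]

missing = missing_outputs(".")
print("All outputs present" if not missing else f"Missing: {missing}")
```

Run it from the project root after the build. An empty result means all three files were written.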

07 Which AI coding assistants does Graphify support?

As of June 2026, Graphify works as a registered skill across 10+ AI coding platforms:

Claude Code (Anthropic)
OpenAI Codex
Cursor
Gemini CLI (Google)
OpenCode
Aider
Kiro
Factory Droid
Trae
+ 10 more platforms

For Claude Code and Gemini CLI, Graphify installs a hook that fires automatically before search-style tool calls, nudging your assistant toward graph queries instead of raw file reads. For other platforms, it writes persistent instruction files (AGENTS.md, .cursor/rules/) that achieve the same behaviour.

08 Is your code safe?

This is the question every security-conscious developer should ask before using any tool that touches source code. Here is the full breakdown.

100% local

Pass 1 and 2: source code

Tree-sitter is a deterministic parser. No LLM is invoked. No API call is made. Your source code does not leave your machine during Passes 1 and 2. Run with --code-only mode to use Graphify without any API key at all.

API via your key

Pass 3: docs, PDFs, images

API calls flow directly from your machine to Anthropic or OpenAI using your own key. Graphify has no relay server and never sees your document content. Governed by your agreement with your AI provider, not Graphify.

09 Who should use Graphify?

Right fit
  • Developers onboarding to large unfamiliar codebases
  • Teams on repos with 100+ files where context limits hurt answers
  • Research and doc-heavy teams with mixed assets: code, PDFs, diagrams
  • AI-first engineers who want architectural reasoning, not just autocomplete
  • CTOs doing blast-radius analysis before refactoring high-connectivity components
Probably overkill
  • Small projects where a quick grep already answers the question
  • Workflows limited to inline code completion, with no interest in repo-level memory
  • Environments without Python 3.10+ or assistant hook setup
  • Teams working exclusively with standalone prose documents

10 What actually changes when you use it

The clearest way to describe the difference: without Graphify, your AI assistant is a very fast reader. With Graphify, it becomes a senior engineer who already knows the codebase.

The difference is most visible on structural questions like these:

Cross-module refactoring

"If I change validate_card, what else breaks?"

Without the graph: partial answer from a file scan. With the graph: full blast radius traversal in one query.
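
A blast-radius question like this reduces to walking the call graph backwards from the changed function. The sketch below shows that traversal on a made-up set of call edges. Only the name validate_card comes from the example question; every other function name, and the blast_radius helper itself, is hypothetical.

```python
from collections import defaultdict, deque

def blast_radius(calls, changed):
    """Return every function that directly or transitively calls `changed`."""
    callers = defaultdict(set)
    for caller, callee in calls:
        callers[callee].add(caller)
    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers[current]:
            if caller not in seen and caller != changed:
                seen.add(caller)
                queue.append(caller)
    return seen

# Hypothetical (caller, callee) edges.
calls = [
    ("charge_customer", "validate_card"),
    ("checkout", "charge_customer"),
    ("api_checkout_handler", "checkout"),
    ("refund", "lookup_order"),
]
print(sorted(blast_radius(calls, "validate_card")))
```

A file scan tends to find only the direct caller. The reverse traversal also reaches checkout and api_checkout_handler, while leaving the unrelated refund path out.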

Architectural intent

"Why was this schema designed this way?"

Without the graph: no context. With the graph: rationale comments from the decision-time code are connected to the schema node.

Risk identification

"Which modules are highest risk to refactor?"

Graphify surfaces god nodes automatically so you know which components carry the most blast radius risk before you start.

Spec-to-code tracing

"How does this PDF spec connect to the API handler?"

Pass 3 extracts concepts from the PDF and connects them to the code entities that implement them. Questions that were impossible become one-query answers.

"The 71.5x token reduction is real and measurable. But the bigger change is qualitative. Answers shift from 'here is what I found in this file' to 'here is how this fits into the whole system.'"

This kind of structural AI reasoning is exactly the capability that separates surface-level AI adoption from real AI-driven engineering. At Auriga IT, we apply this thinking across products for clients like Ferns N Petals and Yes Bank, building data and AI layers that give software systems genuine contextual intelligence.

11 How Auriga IT can help your organisation

Graphify is a strong tool for individual developers and small teams. But for organisations looking to embed AI deeply into engineering workflows, across large codebases, multi-team environments, and enterprise-grade products, there is more ground to cover than a single open-source skill can address.

At Auriga IT, we work with startups, enterprises, and product companies to design and implement AI-assisted development workflows, build knowledge graph and RAG architectures, modernise legacy codebases so AI tools have clean systems to work with, and deliver Data, AI, and Analytics products from initial assessment through to production deployment.

We have delivered 1,000+ projects across e-commerce, fintech, healthcare, and infrastructure. If you are evaluating how AI tooling fits into your engineering roadmap, talk to our team for a no-commitment initial conversation.

12 The road ahead: from Graphify to Penpax

The Graphify project, maintained by Safi Shamsi and now 71 contributors with releases roughly every other day, is moving toward Penpax: an always-on layer that applies the same graph approach to your entire working life. Meetings, browser history, emails, files, and code, updated continuously in the background.

The underlying insight is powerful. The knowledge graph is not just a codebase tool. It is a general-purpose memory architecture for AI agents. As agentic AI systems become more capable, the challenge of what they remember and how they navigate it becomes the critical constraint. Graphify is an early, well-executed answer to that challenge.

13 Quick reference

Fact | Detail
Creator | Safi Shamsi, MSc Data Science, University of Birmingham
Released | April 2026
GitHub stars | 58,300+ (as of June 2026)
PyPI downloads | 1.2 million+
Install command | pip install graphifyy (double y)
Invoke command | /graphify . inside your AI assistant
Supported platforms | Claude Code, Codex, Cursor, Gemini CLI, Aider, Kiro, OpenCode + 10 more
Languages | 33 (via Tree-sitter AST)
Token reduction | 71.5x vs raw file context on a 52-file test corpus
License | MIT open source, free to use
Backed by | YC S26
Contributors | 71 contributors, releases roughly every other day

14 Frequently asked questions

Is Graphify free to use?
Yes. Graphify is fully open-source under the MIT license. The only costs are the API tokens used during Pass 3 for document semantic extraction, billed to your own Anthropic or OpenAI account. The tool itself has no subscription or licensing fee.
Does Graphify work without an API key?
Partially. Pass 1 and Pass 2 require no API key and run fully locally. Pass 3 for doc, image, and PDF semantic extraction requires your existing AI provider API key. Run with --code-only to use Graphify entirely without any API key.
How long does it take to build the graph?
For small to medium codebases under 100 files, typically under a minute. For large mixed-media repositories, it varies depending on how many non-code documents require LLM processing during Pass 3.
Does the graph update automatically when I change files?
Yes, if you run it with the --watch flag for auto-sync: graphify . --watch. In that mode it rebuilds the affected portions of the graph as code files are edited.
What is the difference between graphify and graphifyy?
graphifyy with a double y is the correct PyPI package for this tool. graphify with a single y is an unrelated package. Always install with pip install graphifyy. This is the most common installation mistake new users make.
What is a god node and why does it matter?
A god node is one of the components with the highest betweenness centrality in your codebase: it connects the most communities in your knowledge graph. Graphify surfaces these automatically so you understand which components carry the most blast-radius risk before you start a refactor. When a god node breaks, it often causes cascading failures across multiple modules.
How is Graphify different from RAG?
RAG retrieves semantically similar text chunks using embedding models. Graphify builds a deterministic structural knowledge graph using AST parsing, capturing explicit code relationships like function calls, class inheritance, and import chains. For code, structural relationships are often more important than semantic similarity, and that is exactly what Graphify captures.
Can I use Graphify on a private company codebase?
Yes. Source code is processed entirely locally via Tree-sitter in Passes 1 and 2. For documentation in Pass 3, content travels directly from your machine to your configured AI provider using your own API key, not through Graphify's infrastructure. For maximum security, use --code-only mode to restrict all processing to the fully local passes.
Do I need Obsidian or any other tool to view the graph?
No. Obsidian or any other third-party tool integration is not required. Graphify generates a standalone graph.html file inside your graphify-out/ folder. You can open that file directly in any web browser and immediately explore your entire codebase as an interactive graph. No plugins, no extra software, no setup needed.
Final verdict

Graphify earned its GitHub stars not through marketing but through solving a real, well-understood pain point with an approach that is structurally correct for code. The 48-hour origin story is compelling, but the reason it stuck is simpler: developers tried it, it worked, and they told other developers.

If you are building with AI coding assistants on any project beyond trivial size, it is worth the five-minute setup. Install graphifyy (double y), run /graphify ., and your assistant's codebase answers get measurably better from the next query onward. The context window problem is not going away. Graphify is the most practical solution to it available today.
