
name: data-structure-protocol
description: "Give agents persistent structural memory of a codebase — navigate dependencies, track public APIs, and understand why connections exist without re-reading the whole repo."
risk: safe
source: "https://github.com/k-kolomeitsev/data-structure-protocol"
date_added: "2026-02-27"
Data Structure Protocol (DSP)
LLM coding agents lose context between tasks. On large codebases they spend most of their tokens on "orientation" — figuring out where things live, what depends on what, and what is safe to change. DSP solves this by externalizing the project's structural map into a persistent, queryable graph stored in a .dsp/ directory next to the code.
DSP is NOT documentation for humans and NOT an AST dump. It captures three things: meaning (why an entity exists), boundaries (what it imports and exposes), and reasons (why each connection exists). This is enough for an agent to navigate, refactor, and generate code without loading the entire source tree into the context window.
When to Use
Use this skill when:
- The project has a .dsp/ directory (DSP is already set up)
- The user asks to set up DSP, bootstrap, or map a project's structure
- Creating, modifying, or deleting code files in a DSP-tracked project (to keep the graph updated)
- Navigating project structure, understanding dependencies, or finding specific modules
- The user mentions DSP, dsp-cli, .dsp, or structure mapping
- Performing impact analysis before a refactor or dependency replacement
Core Concepts
Code = graph
DSP models the codebase as a directed graph. Nodes are entities, edges are imports and shared/exports.
Two entity kinds exist:
- Object: any "thing" that isn't a function (module/file/class/config/resource/external dependency)
- Function: an exported function/method/handler/pipeline
Identity by UID, not by file path
Every entity gets a stable UID: obj-<8hex> for objects, func-<8hex> for functions. File paths are attributes that can change; UIDs survive renames, moves, and reformatting.
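The UID shape described above can be sketched in a few lines of Python. This is only an illustration: new_uid and is_valid_uid are hypothetical helpers, and the assumptions that UIDs are random and use lowercase hex are mine, since this section does not say how dsp-cli.py actually generates them.

```python
import re
import secrets

# Shape taken from the text above: obj-<8hex> or func-<8hex>.
# Assumption: hex digits are lowercase.
UID_PATTERN = re.compile(r"(obj|func)-[0-9a-f]{8}")


def new_uid(kind: str) -> str:
    """Return a fresh random UID for kind 'obj' or 'func' (hypothetical helper)."""
    if kind not in ("obj", "func"):
        raise ValueError(f"unknown entity kind: {kind!r}")
    # 4 random bytes render as exactly 8 hex characters.
    return f"{kind}-{secrets.token_hex(4)}"


def is_valid_uid(uid: str) -> bool:
    """Check that the whole string is a prefix plus 8 hex digits."""
    return UID_PATTERN.fullmatch(uid) is not None


print(is_valid_uid("func-7f3a9c12"))  # True
print(is_valid_uid("src/cart.js"))    # False: a path is an attribute, not an identity
```

Because the UID carries no path information, renaming or moving a file only changes the path recorded for the entity, not the key everything else points at.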
For entities inside a file, the UID is anchored with a comment marker in source code:
// @dsp func-7f3a9c12
export function calculateTotal(items) { ... }

# @dsp obj-e5f6a7b8
class UserService:
    ...
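As a rough illustration of how such markers could be located, the sketch below scans source text for @dsp comments. The regular expression and the find_markers helper are assumptions made for this example; the real CLI may recognize markers differently or support other comment styles.

```python
import re

# Matches "@dsp <uid>" after a "//" or "#" comment opener, as in the examples above.
MARKER = re.compile(r"(?://|#)\s*@dsp\s+((?:obj|func)-[0-9a-f]{8})\b")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, uid) pairs for every @dsp marker in the text."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            found.append((lineno, match.group(1)))
    return found


sample = "\n".join([
    "// @dsp func-7f3a9c12",
    "export function calculateTotal(items) { return 0; }",
])
print(find_markers(sample))  # [(1, 'func-7f3a9c12')]
```

Since the marker travels with the code, an entity can be found again after it moves to another file.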
Every connection has a "why"
When an import is recorded, DSP stores a short reason explaining why that dependency exists. This lives in the exports/ reverse index of the imported entity. A dependency graph without reasons tells you what imports what; reasons tell you what is safe to change and who will break.
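To make the idea concrete, here is a minimal in-memory sketch of a reverse index that stores a reason per importer. The Entity class and the record_import and who_breaks functions are hypothetical names for illustration only; they are not part of the DSP CLI.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    uid: str
    purpose: str
    imports: list[str] = field(default_factory=list)
    # Reverse index: importer UID -> why that importer depends on this entity.
    exports: dict[str, str] = field(default_factory=dict)


def record_import(graph: dict[str, Entity], importer: str, imported: str, why: str) -> None:
    """Store the edge on the importer and the reason on the imported entity."""
    if imported not in graph[importer].imports:
        graph[importer].imports.append(imported)
    graph[imported].exports[importer] = why


def who_breaks(graph: dict[str, Entity], uid: str) -> list[tuple[str, str]]:
    """List (importer, reason) pairs to review before changing `uid`."""
    return sorted(graph[uid].exports.items())


graph = {
    "obj-a1b2c3d4": Entity("obj-a1b2c3d4", "checkout module"),
    "func-7f3a9c12": Entity("func-7f3a9c12", "computes the order total"),
}
record_import(graph, "obj-a1b2c3d4", "func-7f3a9c12", "needs totals for the order summary")
print(who_breaks(graph, "func-7f3a9c12"))
# [('obj-a1b2c3d4', 'needs totals for the order summary')]
```

Keeping the reason on the imported side means a single lookup answers both "who depends on this?" and "what do they need it for?".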
Storage format
Each entity gets a small directory under .dsp/:
.dsp/
├── TOC                        # ordered list of all entity UIDs from root
├── obj-a1b2c3d4/
│   ├── description            # source path, kind, purpose (1-3 sentences)
│   ├── imports                # UIDs this entity depends on (one per line)
│   ├── shared                 # UIDs of public API / exported entities
│   └── exports/               # reverse index: who imports this and why
│       ├── <importer_uid>     # file content = "why" text
│       └── <shared_uid>/
│           ├── description    # what is exported
│           └── <importer_uid> # why this specific export is imported
└── func-7f3a9c12/
    ├── description
    ├── imports
    └── exports/
Everything is plain text. Diffable. Reviewable. No database needed.
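Because the layout is just files and directories, it can be read with the standard library alone. The sketch below builds a throwaway .dsp/ tree with made-up contents and loads one entity. read_entity is a hypothetical helper, the description file is treated as opaque text because its internal format is not specified here, and per-export <shared_uid>/ subdirectories are skipped for brevity.

```python
import tempfile
from pathlib import Path


def read_entity(dsp_dir: Path, uid: str) -> dict:
    """Load one entity directory into a plain dict, following the layout above."""
    entity_dir = dsp_dir / uid
    imports_file = entity_dir / "imports"
    exports_dir = entity_dir / "exports"
    # imports holds one UID per line, so whitespace splitting is enough.
    imports = imports_file.read_text().split() if imports_file.exists() else []
    reasons = {}
    if exports_dir.is_dir():
        for entry in sorted(exports_dir.iterdir()):
            if entry.is_file():  # direct importer: file content is the "why"
                reasons[entry.name] = entry.read_text().strip()
    return {
        "description": (entity_dir / "description").read_text().strip(),
        "imports": imports,
        "imported_by": reasons,
    }


# Build a temporary tree with illustrative contents, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    dsp = Path(tmp) / ".dsp"
    exports = dsp / "func-7f3a9c12" / "exports"
    exports.mkdir(parents=True)
    (dsp / "func-7f3a9c12" / "description").write_text("computes the order total\n")
    (dsp / "func-7f3a9c12" / "imports").write_text("obj-0a1b2c3d\n")
    (exports / "obj-a1b2c3d4").write_text("needs totals for the order summary\n")
    entity = read_entity(dsp, "func-7f3a9c12")

print(entity["imports"])      # ['obj-0a1b2c3d']
print(entity["imported_by"])  # {'obj-a1b2c3d4': 'needs totals for the order summary'}
```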
Full import coverage
Every file or artifact that is imported anywhere must be represented in .dsp as an Object — code, images, styles, configs, JSON, wasm, everything. External dependencies (npm packages, stdlib, etc.) are recorded as kind: external but their internals are never analyzed.
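One way to check this rule is to compare the set of UIDs that appear in any imports list against the set of entities that actually exist. The sketch below does this over an in-memory mapping; missing_entities is a hypothetical helper and the UIDs are made up.

```python
def missing_entities(imports_by_uid: dict[str, list[str]]) -> set[str]:
    """Return UIDs that are imported somewhere but have no entity of their own."""
    known = set(imports_by_uid)
    referenced = {uid for deps in imports_by_uid.values() for uid in deps}
    return referenced - known


graph = {
    # A module that imports a function and, say, a JSON config file.
    "obj-a1b2c3d4": ["func-7f3a9c12", "obj-0a1b2c3d"],
    "func-7f3a9c12": [],
}
print(missing_entities(graph))  # {'obj-0a1b2c3d'}
```

An empty result means every imported artifact, whether code or not, has its own entity.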
How It Works
Initial Setup
The skill relies on a standalone Python CLI script dsp-cli.py. If it is missing from the project, download it:
curl -O https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/skills/data-structure-protocol/scripts/dsp-cli.py
Requires Python 3.10+. All commands use python dsp-cli.py --root <project-root> <command>.
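When driving the CLI from another Python script, the invocation pattern above can be wrapped in a small helper. This sketch only builds the argument list and does not run anything. dsp_command is a hypothetical wrapper, and any arguments after the command name are passed through untouched because their exact form is not described in this section.

```python
import sys
from pathlib import Path


def dsp_command(root: str, command: str, *args: str, script: str = "dsp-cli.py") -> list[str]:
    """Build the argv list for one dsp-cli.py call without running it."""
    # sys.executable reuses the interpreter that is running this script.
    return [sys.executable, script, "--root", str(Path(root)), command, *args]


argv = dsp_command(".", "read-toc")
print(argv[1:])  # ['dsp-cli.py', '--root', '.', 'read-toc']
# To execute it: subprocess.run(argv, check=True, capture_output=True, text=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the project root contains spaces.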
Bootstrap (initial mapping)
If .dsp/ is empty, traverse the project from root entrypoint(s) via DFS on imports:
- Identify root entrypoints (package.json main, framework entry, main.py, etc.)
- Document the root file: create-object, create-function for each export, create-shared, add-import for all dependencies
- Take the first non-external import, document it fully, descend into its imports
- Backtrack when no unvisited local imports remain; continue until all reachable files are documented
- External dependencies: create-object --kind external, add to TOC, but never descend into node_modules/site-packages/etc.
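The traversal order described by these steps can be sketched as a depth-first walk over an import map. This is a simplified model, not the CLI's behavior: bootstrap_order is a hypothetical function, the import map and file names are invented, and "documenting" an entity is reduced to appending it to a list.

```python
def bootstrap_order(root: str, imports: dict[str, list[str]], external: set[str]) -> list[str]:
    """Return entities in the order a DFS bootstrap would document them."""
    documented: list[str] = []
    seen: set[str] = set()

    def visit(node: str) -> None:
        if node in seen:
            return
        seen.add(node)
        documented.append(node)      # document the entity itself
        if node in external:
            return                   # recorded, but never descended into
        for dep in imports.get(node, []):
            visit(dep)               # descend; returning from here is the backtrack

    visit(root)
    return documented


imports = {
    "main.py": ["requests", "app/service.py", "app/config.json"],
    "app/service.py": ["app/config.json", "requests"],
}
print(bootstrap_order("main.py", imports, external={"requests"}))
# ['main.py', 'requests', 'app/service.py', 'app/config.json']
```

The seen set keeps shared files and import cycles from being processed twice. For very deep import chains, an explicit stack would avoid Python's recursion limit.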
Workflow Rules
- Before changing code: Find affected entities via search, find-by-source, or read-toc. Read their description and imports to understand context.