# Knowledge graph

> Implicit directed edges from markdown links, tag clustering, directory hierarchy, BFS traversal, neighbor/backlink queries, and optional NetworkX export via `GraphExtractor`.

- Repository: EliaszDev/hermes-okf
- GitHub: https://github.com/EliaszDev/hermes-okf
- Human docs: https://www.grok-wiki.com/public/docs/eliaszdev-hermes-okf-b71befaafe02
- Complete Markdown: https://www.grok-wiki.com/public/docs/eliaszdev-hermes-okf-b71befaafe02/llms-full.txt

## Source Files

- `src/hermes_okf/graph.py`
- `src/hermes_okf/bundle.py`
- `docs/ARCHITECTURE.md`
- `tests/test_graph.py`
- `src/hermes_okf/cli.py`

---


Hermes OKF builds an implicit knowledge graph over every concept file in an OKF bundle. Nodes are concept IDs (relative paths without `.md`); directed edges come from markdown links in concept bodies. `GraphExtractor` layers neighbor/backlink queries, tag clustering, directory co-location, BFS traversal, and optional NetworkX export on top of `OKFBundle.get_graph_edges()`. No RDF store, Cypher engine, or graph database is required — the graph is reconstructed on demand from plain markdown files.

## Graph model

The graph has three complementary views. Only markdown links form directed edges; directory layout and tags provide structural and semantic grouping without creating automatic edges.

| View | Mechanism | Edge type | API surface |
|------|-----------|-----------|-------------|
| Link graph | `[label](target.md)` in concept bodies | Directed (`source` → `target`) | `get_edges`, `get_neighbors`, `get_backlinks`, `traverse` |
| Directory hierarchy | Co-located `.md` files in the same folder | Sibling listing (not parent→child edges) | `get_children` |
| Tag clustering | Shared `tags` frontmatter values | Soft clusters (no edges) | `get_tag_clusters` |

```mermaid
flowchart LR
  subgraph persistence [Filesystem OKF bundle]
    MD["Concept .md files"]
  end
  subgraph extraction [Edge extraction]
    BE["OKFBundle.get_graph_edges()"]
  end
  subgraph navigation [GraphExtractor]
    GN["get_neighbors / get_backlinks"]
    GT["traverse (BFS)"]
    GC["get_children / get_tag_clusters"]
    NX["to_networkx()"]
  end
  subgraph surfaces [Callers]
    CLI["hermes-okf graph-*"]
    SDK["Python agents / scripts"]
  end
  MD --> BE
  BE --> GN
  BE --> GT
  BE --> NX
  MD --> GC
  GN --> CLI
  GN --> SDK
  GT --> SDK
  GC --> SDK
  NX --> SDK
```

<Info>
Reserved files `index.md` and `log.md` are excluded from edge scanning, concept listing, and tag clustering. They never appear as graph nodes.
</Info>

## Link-based directed edges

`OKFBundle.get_graph_edges()` scans every `*.md` file under the bundle root (except `index.md` and `log.md`), matches markdown links with the regex `\[([^\]]+)\]\(([^)]+)\)`, and emits one directed edge per match.

<ResponseField name="edge" type="object">
  <ResponseField name="source" type="string">Concept ID of the file containing the link (relative path without `.md`, POSIX separators).</ResponseField>
  <ResponseField name="target" type="string">Link destination after `.md` suffix removal. Relative paths are not resolved against the bundle root.</ResponseField>
  <ResponseField name="context" type="string">Link label text (the bracketed portion of the markdown link).</ResponseField>
</ResponseField>

**Inclusion rules:**

- Targets ending in `.md` have the suffix stripped (e.g. `b.md` → `b`, `projects/foo.md` → `projects/foo`).
- External URLs (`http://`, `https://`) are skipped entirely.
- Links in `log.md` and `index.md` are never scanned.

**Example:** A concept `a` with body `[see b](b.md)` and a concept `b` with body `Body` produce one edge:

```text
a -> b  (see b)
```

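Write link targets as paths from the bundle root so that they match concept IDs. Targets are kept verbatim apart from the `.md` suffix: a link `[config](config/agent.md)` inside `tools/search_web` yields target `config/agent`. It is not resolved relative to the linking file's directory, which would have given `tools/config/agent`.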
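The inclusion rules above can be sketched with the standard library alone. This is an illustrative re-implementation, not the code in `bundle.py`: the function name `extract_edges` is hypothetical, reserved files are assumed to be matched by file name at any depth, and the whole file text (frontmatter included) is scanned for simplicity.

```python
import re
import tempfile
from pathlib import Path

LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")
RESERVED = {"index.md", "log.md"}  # assumption: matched by file name at any depth


def extract_edges(root: Path) -> list[dict[str, str]]:
    """Hypothetical helper that mirrors the documented rules."""
    edges = []
    for path in sorted(root.rglob("*.md")):
        if path.name in RESERVED:
            continue
        # Concept ID: path relative to the bundle root, no suffix, POSIX separators.
        source = path.relative_to(root).with_suffix("").as_posix()
        for label, target in LINK_RE.findall(path.read_text(encoding="utf-8")):
            if target.startswith(("http://", "https://")):
                continue  # external URLs never become edges
            if target.endswith(".md"):
                target = target[: -len(".md")]
            edges.append({"source": source, "target": target, "context": label})
    return edges


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "tools").mkdir()
    (root / "a.md").write_text("[see b](b.md) and [docs](https://example.com)", encoding="utf-8")
    (root / "b.md").write_text("Body", encoding="utf-8")
    (root / "tools" / "search_web.md").write_text("[config](config/agent.md)", encoding="utf-8")
    (root / "index.md").write_text("[a](a.md)", encoding="utf-8")
    edges = extract_edges(root)

for edge in edges:
    print(f'{edge["source"]} -> {edge["target"]}  ({edge["context"]})')
```

The link in `index.md` and the external URL yield no edges, and the target `config/agent` is recorded exactly as written in the link.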

## GraphExtractor API

`GraphExtractor` is exported from `hermes_okf` and takes an `OKFBundle` instance. It delegates edge extraction to the bundle and adds navigation helpers.

```python
from hermes_okf import OKFBundle, GraphExtractor

bundle = OKFBundle("./my_knowledge")
extractor = GraphExtractor(bundle)
```

### Link navigation

| Method | Returns | Description |
|--------|---------|-------------|
| `get_edges()` | `list[dict[str, str]]` | All directed link edges in the bundle |
| `get_neighbors(concept_id)` | `list[str]` | Target concept IDs of outgoing edges from `concept_id` |
| `get_backlinks(concept_id)` | `list[str]` | Source concept IDs of incoming edges to `concept_id` |

<Note>
`OKFBundle.get_neighbors(concept_id)` also exists but returns full edge dicts (`source`, `target`, `context`), not bare target IDs. Prefer `GraphExtractor` for ID-only neighbor lists and backlink queries.
</Note>
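Conceptually, both queries are filters over the edge list. The sketch below works on plain edge dicts of the shape returned by `get_edges()`. The helper names `neighbors` and `backlinks` are hypothetical stand-ins, not the library's implementation.

```python
edges = [
    {"source": "decisions/api_provider", "target": "tools/search_web", "context": "tool access"},
    {"source": "tools/search_web", "target": "context/firecrawl_config", "context": "Firecrawl"},
    {"source": "tools/search_web", "target": "config/agent", "context": "OpenRouter key"},
]


def neighbors(edges: list[dict[str, str]], concept_id: str) -> list[str]:
    """Targets of edges that start at concept_id (outgoing links)."""
    return [e["target"] for e in edges if e["source"] == concept_id]


def backlinks(edges: list[dict[str, str]], concept_id: str) -> list[str]:
    """Sources of edges that end at concept_id (incoming links)."""
    return [e["source"] for e in edges if e["target"] == concept_id]


print(neighbors(edges, "tools/search_web"))
print(backlinks(edges, "tools/search_web"))
```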

### Directory siblings

`get_children(concept_id)` lists other concept IDs in the same directory as the given concept. It excludes `index.md` and the concept itself. This reflects filesystem co-location, not a parent→child link in the markdown graph.

For concepts `sub/a` and `sub/b` in the same folder, `get_children("sub/a")` includes `sub/b`.
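As a rough model of this behavior, sibling lookup compares the directory part of each concept ID. The sketch below operates on a list of IDs instead of the filesystem. The name `siblings` is hypothetical, and excluding `log` alongside `index` is an assumption based on the reserved-file note earlier on this page.

```python
import posixpath

RESERVED = {"index", "log"}  # assumption: both reserved names are skipped


def siblings(concept_ids: list[str], concept_id: str) -> list[str]:
    """Other concept IDs whose directory part equals that of concept_id."""
    folder = posixpath.dirname(concept_id)
    return [
        cid
        for cid in concept_ids
        if cid != concept_id
        and posixpath.dirname(cid) == folder
        and posixpath.basename(cid) not in RESERVED
    ]


ids = ["sub/a", "sub/b", "sub/index", "top", "other/c"]
print(siblings(ids, "sub/a"))
```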

### Tag clustering

`get_tag_clusters()` returns `dict[str, list[str]]` mapping each tag string to the concept IDs that carry it in frontmatter. A concept with multiple tags appears in multiple clusters. Tags do not create graph edges — they group concepts for filtering and recall alongside the link graph.
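The returned shape can be pictured as an inverted index from tag to concept IDs. A minimal sketch, assuming the tags of each concept have already been read from frontmatter into a dict (the function name `tag_clusters` and the sample data are hypothetical):

```python
from collections import defaultdict


def tag_clusters(tags_by_concept: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert concept -> tags into tag -> concepts."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for concept_id, tags in tags_by_concept.items():
        for tag in tags:
            clusters[tag].append(concept_id)
    return dict(clusters)


sample = {
    "tools/search_web": ["web", "tool"],
    "context/firecrawl_config": ["web"],
    "decisions/api_provider": [],
}
print(tag_clusters(sample))
```

A concept without tags, such as `decisions/api_provider` here, appears in no cluster.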

### BFS traversal

`traverse(start_id, max_depth=3)` performs a breadth-first walk of the link graph from `start_id` and returns a nested dict subtree.

<ResponseField name="traverse result" type="object">
  <ResponseField name="id" type="string">Concept ID at this node.</ResponseField>
  <ResponseField name="title" type="string">Concept title from frontmatter, or the ID if missing.</ResponseField>
  <ResponseField name="type" type="string">OKF `type` field, or `"Unknown"`.</ResponseField>
  <ResponseField name="depth" type="integer">Depth from `start_id` (0 at root).</ResponseField>
  <ResponseField name="children" type="array">Nested traverse nodes for outgoing link targets (optional).</ResponseField>
</ResponseField>

Traversal follows outgoing markdown links only. Cycles are bounded by `max_depth`; a node may appear in multiple branches of the returned tree. Nodes beyond `max_depth` are omitted.

```python
tree = extractor.traverse("decisions/api_provider", max_depth=2)
# tree["id"] == "decisions/api_provider"
# tree["children"][0]["id"] == outgoing link target
```
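To make the depth bound concrete, here is a stdlib sketch of a breadth-first walk that builds the nested shape. It is not the library's code: the adjacency dict stands in for the link graph, `title` and `type` are left out because they come from frontmatter, and it assumes that nodes at exactly `max_depth` are included but not expanded.

```python
from collections import deque


def traverse(adjacency: dict[str, list[str]], start_id: str, max_depth: int = 3) -> dict:
    """Breadth-first expansion of outgoing links into a nested dict."""
    root = {"id": start_id, "depth": 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node["depth"] >= max_depth:
            continue  # assumption: nodes at max_depth are kept but not expanded
        targets = adjacency.get(node["id"], [])
        if targets:
            node["children"] = [{"id": t, "depth": node["depth"] + 1} for t in targets]
            queue.extend(node["children"])
    return root


# "a" and "b" link to each other; no visited set is used, so the cycle
# is cut only by the depth limit and "a" appears twice in the tree.
adjacency = {"a": ["b", "c"], "b": ["a"]}
tree = traverse(adjacency, "a", max_depth=2)
print(tree)
```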

### NetworkX export

`to_networkx()` builds a `networkx.DiGraph` with one node per concept (attributes from `concept.metadata`) and directed edges from link extraction (edge attribute `context` holds the link label).

NetworkX is **not** a core dependency, and it is not packaged as an optional extra either. Install it separately:

```bash
pip install networkx
```

If NetworkX is missing, `to_networkx()` raises `ImportError` with install instructions.

```python
graph = extractor.to_networkx()
# graph.nodes["projects/my_project"]["type"]  -> frontmatter fields
# graph.edges["a", "b"]["context"]            -> link label
```
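The export and its import guard can be approximated as follows. This is a sketch under stated assumptions rather than the code in `graph.py`: `build_digraph` is a hypothetical name, the inputs are plain dicts instead of a bundle, and the error message wording is invented for illustration.

```python
def build_digraph(metadata_by_concept: dict[str, dict], edges: list[dict[str, str]]):
    """Build a networkx.DiGraph from concept metadata and link edges."""
    try:
        import networkx as nx  # imported lazily so the core stays dependency-free
    except ImportError as exc:
        raise ImportError(
            "networkx is required for graph export; install it with: pip install networkx"
        ) from exc

    graph = nx.DiGraph()
    for concept_id, metadata in metadata_by_concept.items():
        graph.add_node(concept_id, **metadata)
    for edge in edges:
        # add_edge creates missing endpoints, so a broken link target
        # becomes a node without attributes.
        graph.add_edge(edge["source"], edge["target"], context=edge["context"])
    return graph


try:
    demo = build_digraph(
        {"a": {"type": "Note"}, "b": {"type": "Tool"}},
        [{"source": "a", "target": "b", "context": "see b"}],
    )
    print(demo.edges["a", "b"]["context"])
except ImportError as exc:
    print(exc)
```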

## CLI graph inspection

The standalone CLI exposes two graph subcommands. Both accept `--path` (default `.`) to select the bundle root.

<Steps>
<Step title="List all link edges">

```bash
hermes-okf graph-edges --path ./my_knowledge
```

<ResponseExample>

```text
decisions/api_provider -> tools/search_web  (tool access)
tools/search_web -> context/firecrawl_config  (Firecrawl)
```

</ResponseExample>

Prints `No edges found.` when the bundle has no internal markdown links.

</Step>

<Step title="List outgoing neighbors for one concept">

```bash
hermes-okf graph-neighbors --path ./my_knowledge tools/search_web
```

<ResponseExample>

```text
context/firecrawl_config
config/agent
```

</ResponseExample>

Prints `No neighbors found.` when the concept has no outgoing links.

</Step>
</Steps>

Backlinks, tag clusters, BFS traversal, directory siblings, and NetworkX export have no CLI subcommand. Use the Python SDK for them (`GraphExtractor`, which builds on `OKFBundle.get_graph_edges()`).
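A script that wants the same line format as the `graph-edges` output shown above can build it from edge dicts. The helper `format_edge` is hypothetical, and the two-space gap before the parenthesis is copied from the sample output, not taken from the CLI source.

```python
def format_edge(edge: dict[str, str]) -> str:
    """Render one edge as 'source -> target  (context)'."""
    return f'{edge["source"]} -> {edge["target"]}  ({edge["context"]})'


edges = [
    {"source": "decisions/api_provider", "target": "tools/search_web", "context": "tool access"},
    {"source": "tools/search_web", "target": "context/firecrawl_config", "context": "Firecrawl"},
]

lines = [format_edge(e) for e in edges]
print("\n".join(lines) if lines else "No edges found.")
```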

## Authoring linked concepts

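Link concepts in markdown bodies to grow the graph as agents write memory. A typical chain connects decisions, tools, and configuration. The two files below correspond to the concept IDs `decisions/api_provider` and `tools/search_web` used elsewhere on this page: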

```markdown
---
type: Decision
title: API Provider Choice
---

Selected OpenRouter for [tool access](tools/search_web.md).
```

```markdown
---
type: Tool
title: search_web
---

Search the web using [Firecrawl](context/firecrawl_config.md).
Requires [OpenRouter key](config/agent.md) for rate limits.
```

After both files exist, `get_edges()` reports directed edges from the decision to the tool and from the tool to its dependencies. Agents can then call `get_backlinks("config/agent")` to find what references a configuration, or `traverse("decisions/api_provider")` to walk the reasoning chain.

<Warning>
Broken links still produce edges: if `target.md` does not exist, the edge is recorded but `read_concept(target)` returns `None`. Validate bundle structure separately with `hermes-okf validate`.
</Warning>
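One way to surface such dangling edges in a script is to compare edge targets against the set of known concept IDs. This is a sketch only: `dangling_edges` is a hypothetical helper, and the ID set is assumed to come from whatever concept listing the caller already has.

```python
def dangling_edges(edges: list[dict[str, str]], existing_ids: set[str]) -> list[dict[str, str]]:
    """Edges whose target does not correspond to any known concept."""
    return [e for e in edges if e["target"] not in existing_ids]


edges = [
    {"source": "decisions/api_provider", "target": "tools/search_web", "context": "tool access"},
    {"source": "tools/search_web", "target": "config/agent", "context": "OpenRouter key"},
]
existing = {"decisions/api_provider", "tools/search_web"}

for edge in dangling_edges(edges, existing):
    print(f'missing target: {edge["target"]} (linked from {edge["source"]})')
```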

## Design constraints

| Constraint | Behavior |
|------------|----------|
| No graph database | Graph is recomputed from the filesystem on each call |
| No automatic hierarchy edges | Parent directories do not link to child concepts unless markdown links exist |
| No external link edges | `http://` and `https://` targets are ignored |
| Relative path targets | Only `.md` suffix stripping; `../` paths are not normalized to concept IDs |
| Core dependency footprint | Graph logic uses stdlib + `pyyaml` only; NetworkX is opt-in |

The OKF v0.1 conformance model treats markdown links as the canonical edge type. Types and tags are user-defined — the graph does not enforce a fixed ontology.

## Related pages

<CardGroup>
<Card title="OKF bundle model" href="/okf-bundle-model">
Concept files, frontmatter fields, reserved paths, and how concept IDs map to the filesystem.
</Card>
<Card title="Standalone CLI workflows" href="/standalone-cli-workflows">
Operate a bundle without Hermes: init, validate, search, and graph inspection with `--path`.
</Card>
<Card title="Python SDK reference" href="/python-sdk-reference">
Full `GraphExtractor` and `OKFBundle` graph method signatures and return shapes.
</Card>
<Card title="OKF bundle basics example" href="/example-okf-bundle-basics">
Copy-paste recipe: write concepts, search by tag, and print graph edges.
</Card>
<Card title="Enable RAG" href="/enable-rag">
Optional vector retrieval over the same bundle — complements link traversal with semantic search.
</Card>
</CardGroup>
