# Visualize OKF bundles

> Generate self-contained viz.html graph viewers from OKF bundles with force-directed layouts, concept detail panels, backlinks, and in-browser markdown rendering.

- Repository: GoogleCloudPlatform/knowledge-catalog
- GitHub: https://github.com/GoogleCloudPlatform/knowledge-catalog
- Human docs: https://www.grok-wiki.com/public/docs/googlecloudplatform-knowledge-catalog-9cee6ee3cba5
- Complete Markdown: https://www.grok-wiki.com/public/docs/googlecloudplatform-knowledge-catalog-9cee6ee3cba5/llms-full.txt

## Source Files

- `okf/README.md`
- `okf/src/enrichment_agent/cli.py`
- `okf/src/enrichment_agent/viewer/generator.py`
- `okf/src/enrichment_agent/viewer/templates/viz.html`
- `okf/src/enrichment_agent/viewer/static/viz.js`
- `okf/src/enrichment_agent/viewer/static/viz.css`

---


The `enrichment-agent visualize` subcommand (also invokable as `python -m enrichment_agent visualize`) walks an OKF bundle directory, extracts concepts and cross-links from its markdown files, and writes a single `viz.html` file with the graph data, CSS, and JavaScript embedded alongside an in-browser markdown renderer. Generation needs no model credentials, network access, or backend server. The only network dependency is at view time: the page loads Cytoscape.js and marked from a CDN when you open it in a browser.

<Note>
The visualizer is a proof-of-concept **consumer** of OKF bundles. Any tool that reads markdown can consume bundles; this viewer is one bundled option for exploring graph-shaped knowledge.
</Note>

## Prerequisites

- The `enrichment-agent` package installed from the `okf/` directory (see [Installation](/installation)).
- An OKF bundle directory on disk — either produced by the enrichment agent or authored by hand. Bundles in this repository include `okf/bundles/ga4/`, `okf/bundles/stackoverflow/`, and `okf/bundles/crypto_bitcoin/`, each with a pre-generated `viz.html`.

No BigQuery, Vertex AI, or Gemini credentials are needed for visualization. Generation is entirely local file I/O.

## Generate a visualization

<Steps>
<Step title="Point at a bundle directory">

Pass `--bundle` with the root of an OKF bundle (the directory that contains concept markdown files and optional `index.md` navigation files).

</Step>
<Step title="Run the visualize subcommand">

<CodeGroup>
```bash title="Module invocation"
.venv/bin/python -m enrichment_agent visualize \
    --bundle ./bundles/ga4
```

```bash title="Console script"
enrichment-agent visualize \
    --bundle ./bundles/crypto_bitcoin
```
</CodeGroup>

</Step>
<Step title="Verify output">

On success, the CLI prints counts to stderr and writes the HTML file.

<ResponseExample>
```text
Wrote 14 concept(s), 42 edge(s), 287431 bytes → bundles/ga4/viz.html
```
</ResponseExample>

Open the output path in a browser. The default location is `<bundle>/viz.html`.

</Step>
</Steps>

### Custom output path and display name

```bash
.venv/bin/python -m enrichment_agent visualize \
    --bundle ./bundles/crypto_bitcoin \
    --out /tmp/btc.html \
    --name "Bitcoin OKF"
```

The `--name` value appears in the viewer header and browser title. When omitted, the bundle directory name is used.

## CLI reference

<ParamField body="--bundle" type="path" required>
Root directory of the OKF bundle to visualize.
</ParamField>

<ParamField body="--out" type="path">
Output HTML path. Defaults to `<bundle>/viz.html`.
</ParamField>

<ParamField body="--name" type="string">
Display name shown in the viewer header. Defaults to the bundle directory name.
</ParamField>

| Flag | Default | Description |
|------|---------|-------------|
| `--bundle` | *(required)* | Bundle root directory |
| `--out` | `<bundle>/viz.html` | Output HTML path |
| `--name` | bundle directory name | Header display name |

## How generation works

```mermaid
flowchart LR
  subgraph cli ["enrichment_agent/cli.py"]
    V["visualize subcommand"]
  end
  subgraph gen ["viewer/generator.py"]
    W["_walk_concepts"]
    G["_build_graph"]
    E["embed template + assets"]
  end
  subgraph out ["Output"]
    H["viz.html"]
  end
  V --> W
  W --> G
  G --> E
  E --> H
```

`generate_visualization(bundle_root, out_path, bundle_name=None)` in `enrichment_agent.viewer` performs four steps:

1. **Walk concepts** — Recursively find every `*.md` file under the bundle root.
2. **Parse frontmatter** — Load each file with `OKFDocument.parse`. Files that fail parsing are skipped silently.
3. **Extract links** — Scan markdown bodies for relative `.md` link targets and resolve them to concept IDs.
4. **Embed assets** — Inline `viz.css` and `viz.js` into `viz.html`, inject the graph JSON, and write a single HTML file.

<ResponseField name="return value" type="dict">
Generation returns counts: `concepts` (node count), `edges` (directed edge count), and `bytes` (output file size).
</ResponseField>

### Concept discovery rules

| Rule | Behavior |
|------|----------|
| `index.md` files | Excluded from the graph (navigation indexes, not concepts) |
| Parse failures | Skipped; bundle generation continues |
| Concept ID | Relative path from bundle root without `.md` suffix (e.g. `tables/events_`) |
| Frontmatter fields used | `type`, `title`, `description`, `resource`, `tags` |
| Missing frontmatter keys | Falls back to `"Unknown"` type, concept ID as title, empty strings for optional fields |
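
The discovery rules in the table above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in `generator.py`: the helper names `discover_concept_ids` and `node_fields` are hypothetical, frontmatter is assumed to be an already-parsed dict, and the empty-list fallback for `tags` is an assumption.

```python
from pathlib import Path


def discover_concept_ids(bundle_root: Path) -> list[str]:
    """Return concept IDs for every non-index markdown file under bundle_root."""
    ids = []
    for path in sorted(bundle_root.rglob("*.md")):
        if path.name == "index.md":
            continue  # navigation index, not a concept
        # Concept ID: path relative to the bundle root, without the .md suffix.
        rel = path.relative_to(bundle_root).with_suffix("")
        ids.append(rel.as_posix())
    return ids


def node_fields(concept_id: str, frontmatter: dict) -> dict:
    """Apply the documented fallbacks for missing frontmatter keys."""
    return {
        "id": concept_id,
        "type": frontmatter.get("type") or "Unknown",
        "label": frontmatter.get("title") or concept_id,
        "description": frontmatter.get("description") or "",
        "resource": frontmatter.get("resource") or "",
        "tags": frontmatter.get("tags") or [],  # assumption: empty list when absent
    }
```

With this sketch, a file at `tables/events_.md` yields the ID `tables/events_`, and a concept with empty frontmatter is labeled with its own ID and typed `"Unknown"`.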

### Link extraction and edges

Cross-links are detected with a regex that matches markdown link targets ending in `.md`, optionally followed by an anchor fragment.

| Link form | Included as edge? |
|-----------|-------------------|
| Relative (`../tables/users.md`, `events.md`) | Yes, if target resolves inside the bundle |
| External (`https://…`) | No, skipped during extraction |
| Absolute path starting with `/` (`/tables/users.md`) | No edge, skipped during extraction; the detail panel still rewires these links for in-viewer navigation at view time |
| Dangling target (file not in bundle) | No edge created |
| Self-link | No edge created |
| Duplicate source→target pair | Deduplicated |

Edges are **directed**: source is the citing concept, target is the linked concept.
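
A rough sketch of how such an extraction pass might look is shown below. The regex and the `extract_edges` helper are illustrative assumptions, not the pattern or function used in `generator.py`. The input is assumed to be a mapping from concept ID to raw markdown body.

```python
import posixpath
import re

# Markdown link whose target ends in .md, optionally followed by a #anchor.
# Hypothetical pattern; the generator's actual regex may differ.
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s#]+\.md)(?:#[^)\s]*)?\)")


def extract_edges(bodies: dict[str, str]) -> list[tuple[str, str]]:
    """Return deduplicated (source, target) pairs between known concept IDs."""
    edges = []
    seen = set()
    for source, body in bodies.items():
        base = posixpath.dirname(source)
        for match in MD_LINK.finditer(body):
            target = match.group(1)
            if "://" in target or target.startswith("/"):
                continue  # external URL or absolute path: skipped
            # Resolve the link relative to the citing concept's directory.
            resolved = posixpath.normpath(posixpath.join(base, target))
            concept_id = resolved[: -len(".md")]
            if concept_id == source or concept_id not in bodies:
                continue  # self-link or dangling target: no edge
            if (source, concept_id) not in seen:
                seen.add((source, concept_id))
                edges.append((source, concept_id))
    return edges
```

For example, a body in `datasets/ga4` containing `[events](../tables/events_.md#schema)` would produce the edge `("datasets/ga4", "tables/events_")`, provided `tables/events_` exists in the bundle.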

## Embedded graph data model

The generator serializes a JSON blob into `window.BUNDLE` inside the HTML:

```text
BUNDLE
├── nodes[]          # Cytoscape node elements
│   └── data
│       ├── id, label, type, description, resource, tags
│       ├── color    # from type palette
│       └── size     # 30 + min(60, len(body) // 200)
├── edges[]          # Cytoscape edge elements
│   └── data: { id, source, target }
├── bodies{}         # concept id → raw markdown body
├── types[]          # sorted unique type strings
└── palette{}        # known type → hex color
```

### Node color palette

| Concept type | Color |
|--------------|-------|
| `BigQuery Dataset` | `#8b5cf6` |
| `BigQuery Table` | `#3b82f6` |
| `Reference` | `#10b981` |
| Any other type | `#94a3b8` (default) |

Node diameter scales with body length (capped), so concepts with more prose appear slightly larger on the graph.
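
The size formula and palette lookup described above can be written out directly. The function names `node_size` and `node_color` are hypothetical; the constants come from the data model and palette table in this page.

```python
PALETTE = {
    "BigQuery Dataset": "#8b5cf6",
    "BigQuery Table": "#3b82f6",
    "Reference": "#10b981",
}
DEFAULT_COLOR = "#94a3b8"


def node_size(body: str) -> int:
    """Base size of 30, plus 1 per 200 characters of body, capped at +60."""
    return 30 + min(60, len(body) // 200)


def node_color(concept_type: str) -> str:
    """Known types get their palette color; everything else gets the default."""
    return PALETTE.get(concept_type, DEFAULT_COLOR)
```

Under this formula sizes range from 30 to 90, and the cap is reached once a body is 12,000 characters long.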

## Browser viewer

The generated `viz.html` is a split-pane application: a Cytoscape.js graph on the left (~60% width) and a concept detail panel on the right (~40%).

### Graph interactions

| Control | Behavior |
|---------|----------|
| Click node | Opens detail panel; selects and centers the node |
| Click canvas background | Clears selection |
| Search box | Dims nodes whose title, concept ID, or tags do not match the query |
| Type filter | Dims all nodes except the selected `type` |
| Layout selector | Re-layouts graph: `cose` (force-directed, default), `concentric`, `breadthfirst`, `circle`, `grid` |
| Reset view | Fits graph to viewport and clears selection |

On load, the viewer auto-selects the first `BigQuery Dataset` node if one exists; otherwise it selects the first concept.

### Detail panel

For the selected concept, the panel shows:

- **Type chip** — colored by the type palette
- **Title and concept ID**
- **Frontmatter** — description, resource (as external link), tags (as chips)
- **Rendered body** — markdown parsed in-browser with marked (GFM enabled)
- **Cited by** — reverse-link backlinks computed from the edge graph
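
The "Cited by" list is a reverse index over the directed edges. The viewer computes it in the browser; the Python sketch below only illustrates the idea, and `build_backlinks` is a hypothetical name.

```python
from collections import defaultdict


def build_backlinks(edges: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each target concept ID to the concept IDs that link to it."""
    cited_by = defaultdict(list)
    for source, target in edges:
        cited_by[target].append(source)
    return dict(cited_by)
```

A concept that no other concept links to simply has no entry in the result, which corresponds to an empty "Cited by" section.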

Internal markdown links in the form `/path/to/concept.md` are rewired to navigate within the viewer instead of loading a file path. External links open in a new tab.

<Info>
Cytoscape.js `3.28.1` and marked `12.0.0` are loaded from the jsDelivr CDN when the page opens. Bundle content itself is fully embedded in the HTML at generation time, so the browser never fetches bundle files.
</Info>

## Output layout

:::files
okf/bundles/<name>/
├── datasets/
├── tables/
├── references/
├── index.md              # navigation only; not graphed
└── viz.html              # default output (--out overrides path)
:::

The HTML file inlines all CSS and JavaScript from `enrichment_agent/viewer/static/`. Template placeholders (`__BUNDLE_NAME__`, `__BUNDLE_DATA__`) are replaced at generation time. You can commit `viz.html` next to the bundle, host it on a static file server, or share it as a standalone artifact.
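
Placeholder substitution of this kind can be sketched as follows. The `render_template` helper and the template string in the usage note are hypothetical. Escaping the display name and neutralizing `</` inside the JSON are precautions assumed here for illustration, not confirmed behavior of the generator.

```python
import html
import json


def render_template(template: str, bundle_name: str, bundle: dict) -> str:
    """Fill the two documented placeholders in an HTML template string."""
    # Escape "</" so a body containing "</script>" cannot close the script tag.
    data = json.dumps(bundle).replace("</", "<\\/")
    # Replace the name first so placeholder-like text inside the data
    # is never substituted a second time.
    return (
        template
        .replace("__BUNDLE_NAME__", html.escape(bundle_name))
        .replace("__BUNDLE_DATA__", data)
    )
```

Given a template such as `<script>window.BUNDLE = __BUNDLE_DATA__;</script>`, the result is a page whose script tag assigns the serialized graph to `window.BUNDLE`.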

## Programmatic use

Import `generate_visualization` directly for custom pipelines:

```python
from pathlib import Path
from enrichment_agent.viewer import generate_visualization

stats = generate_visualization(
    Path("./bundles/ga4"),
    Path("./bundles/ga4/viz.html"),
    bundle_name="GA4 E-commerce",
)
# stats == {"concepts": N, "edges": M, "bytes": K}
```
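
To regenerate viewers for several bundles at once, a small wrapper along these lines may be useful. `visualize_all` is a hypothetical helper. It takes the generator function as an argument so the sketch stays self-contained, and it assumes each subdirectory of `bundles_dir` is a bundle root.

```python
from pathlib import Path
from typing import Callable


def visualize_all(bundles_dir: Path, generate: Callable[..., dict]) -> dict[str, dict]:
    """Run `generate` for each bundle directory and collect the returned stats."""
    results = {}
    for bundle in sorted(p for p in bundles_dir.iterdir() if p.is_dir()):
        # Write each viewer to the default location inside its bundle.
        results[bundle.name] = generate(
            bundle, bundle / "viz.html", bundle_name=bundle.name
        )
    return results
```

Calling `visualize_all(Path("./bundles"), generate_visualization)` would then return one stats dict per bundle, keyed by directory name.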

## Troubleshooting

<AccordionGroup>
<Accordion title="FileNotFoundError: Bundle directory not found">

`--bundle` must point to an existing directory. The generator does not create bundle content — produce a bundle first with `enrich` or author markdown concepts manually.

</Accordion>

<Accordion title="Graph has fewer nodes than expected">

- `index.md` files are intentionally excluded.
- Markdown files with invalid YAML frontmatter are skipped during parsing.
- Check that concept files use the standard `---` frontmatter delimiter.

</Accordion>

<Accordion title="Expected cross-links missing from the graph">

Links must target relative `.md` paths that resolve inside the bundle. External URLs, absolute `/` paths in link targets, and links to non-existent concepts do not produce edges. Verify link syntax matches `[label](../path/to/concept.md)`.

</Accordion>

<Accordion title="Viewer loads but graph area is empty">

Open the browser developer console. If Cytoscape.js or marked fails to load from the CDN (for example, because of a network policy or an offline environment), the graph will not render. The embedded bundle data is still present in the page source.

</Accordion>

<Accordion title="enrichment_agent module not found">

Install the package from `okf/`:

```bash
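# Run from the okf/ directory.
python3 -m venv .venv
# Quote the extras spec so shells such as zsh do not treat [dev] as a glob.
.venv/bin/pip install -e ".[dev]"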
```

</Accordion>
</AccordionGroup>

## Related pages

<CardGroup>
<Card title="Open Knowledge Format" href="/open-knowledge-format">
Bundle structure, frontmatter fields, and cross-link semantics that the visualizer reads.
</Card>
<Card title="Produce OKF bundles" href="/produce-okf-bundles">
Run the enrichment agent to generate bundle directories you can visualize.
</Card>
<Card title="OKF enrichment CLI reference" href="/okf-enrichment-cli-reference">
Full `enrich` and `visualize` subcommand reference including BigQuery source flags.
</Card>
<Card title="OKF bundle recipes" href="/okf-bundle-recipes">
Copy-paste recipes for GA4, Stack Overflow, and Bitcoin bundles with sample `viz.html` outputs.
</Card>
</CardGroup>
