# Templates vs methods

> Domain YAML templates (`general/biography_graph`, `finance/earnings_summary`, etc.) versus algorithm-driven method templates (`method/light_rag`, `method/atom`); language requirements (`--lang` for templates, English-only for methods); and selection criteria.

- Repository: yifanfeng97/Hyper-Extract
- GitHub: https://github.com/yifanfeng97/Hyper-Extract
- Human docs: https://www.grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf
- Complete Markdown: https://www.grok-wiki.com/public/docs/yifanfeng97-hyper-extract-7891c7254cdf/llms-full.txt

## Source Files

- `hyperextract/utils/template_engine/gallery.py`
- `hyperextract/methods/registry.py`
- `hyperextract/utils/template_engine/factory.py`
- `hyperextract/cli/commands/list.py`
- `hyperextract/templates/README.md`
- `hyperextract/cli/cli.py`

---


Hyper-Extract exposes two extraction paths that both produce queryable Knowledge Abstracts through the same `Template.create` / `he parse` surface: **knowledge templates** (declarative YAML presets under `hyperextract/templates/presets/`) and **method templates** (registered algorithm classes under `hyperextract/methods/`). `TemplateFactory.create` routes `method/{name}` IDs to `create_method` and all other IDs through `Gallery` plus `localize_template`.
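The routing rule described above can be sketched in a few lines. This is a minimal illustration, not the library's code: `route` and `METHOD_PREFIX` are hypothetical names, and the real `TemplateFactory.create` does more work (loading, localization, instantiation) after the branch.

```python
METHOD_PREFIX = "method/"  # hypothetical constant, for illustration only


def route(source: str) -> tuple[str, str]:
    """Return (path, name) for a template source string."""
    if source.startswith(METHOD_PREFIX):
        # Method path: strip the prefix and look the name up in the registry.
        return ("method", source[len(METHOD_PREFIX):])
    # Knowledge path: the whole string is handed to the gallery lookup.
    return ("knowledge", source)


print(route("method/light_rag"))
print(route("finance/earnings_summary"))
```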

## Two extraction paths

| Aspect | Knowledge templates | Method templates |
|--------|---------------------|------------------|
| ID format | `{domain}/{name}` (e.g. `general/biography_graph`, `finance/earnings_summary`) | `method/{name}` (e.g. `method/light_rag`, `method/atom`) |
| Definition | YAML files with `output`, `guideline`, `identifiers`, `options` | Python classes registered in `hyperextract/methods/registry.py` |
| Discovery | `Gallery` scans `templates/presets/**/*.yaml` at import | `register_method` populates `_METHOD_REGISTRY` at import |
| Schema control | Full field, entity, and relation schemas per template | Fixed by algorithm; autotype comes from registry (`graph` or `hypergraph`) |
| Language | Multilingual YAML; runtime language required (`zh` or `en`) | English prompts only; `metadata["lang"]` hardcoded to `"en"` |
| Customization | Edit YAML or author new files | Pass constructor kwargs (e.g. `observation_time` for `atom`) |

Both paths return a `BaseAutoType` instance with the same lifecycle: `feed_text`, `dump`, `load`, `build_index`, `search`, `chat`, and `show`.

```text
  Knowledge path                         Method path
  ──────────────                         ───────────

  presets/{domain}/*.yaml                methods/registry.py
         │                                      │
         ▼                                      ▼
      Gallery.get()                      get_method()
         │                                      │
         └──────────► TemplateFactory.create ◄──┘
                           │
                           ▼
                    BaseAutoType instance
                           │
                           ▼
                  Knowledge Abstract (KA)
```

## Knowledge templates

Knowledge templates are domain YAML presets that declare **what** to extract and **how** to prompt the LLM. Each file lives under a domain directory inside `hyperextract/templates/presets/` and is keyed at runtime as `{domain}/{name}`.
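To make the `{domain}/{name}` keying concrete, here is a rough sketch of a directory scan. It is an assumption-laden stand-in for `Gallery`: `scan_presets` is a hypothetical helper, and it derives the name from the file stem, whereas the real loader may read the name from the YAML itself.

```python
import tempfile
from pathlib import Path


def scan_presets(root: Path) -> dict[str, Path]:
    """Index every YAML file under root as '{domain}/{name}'."""
    index: dict[str, Path] = {}
    for path in sorted(root.rglob("*.yaml")):
        parts = path.relative_to(root).parts
        if len(parts) < 2:
            continue  # skip files that are not inside a domain directory
        # Assumption: first directory is the domain, file stem is the name.
        index[f"{parts[0]}/{path.stem}"] = path
    return index


# Build a throwaway preset tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("general/biography_graph.yaml", "finance/earnings_summary.yaml"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("type: graph\n", encoding="utf-8")
    print(sorted(scan_presets(root)))
```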

### Domains and examples

The preset library ships 38 YAML templates across six domains:

| Domain | Count | Example IDs | Typical documents |
|--------|-------|-------------|-------------------|
| `general` | 13 | `general/biography_graph`, `general/concept_graph`, `general/base_graph` | Biographies, technical docs, agent workflows |
| `finance` | 5 | `finance/earnings_summary`, `finance/event_timeline` | Earnings calls, filings, news |
| `medicine` | 5 | `medicine/treatment_map`, `medicine/hospital_timeline` | Guidelines, discharge summaries |
| `tcm` | 5 | `tcm/syndrome_reasoning`, `tcm/formula_composition` | TCM case records, formula texts |
| `industry` | 5 | `industry/operation_flow`, `industry/safety_control` | SOPs, safety handbooks |
| `legal` | 5 | `legal/contract_obligation`, `legal/case_fact_timeline` | Contracts, court judgments |

Each YAML file declares an AutoType (`model`, `list`, `set`, `graph`, `hypergraph`, `temporal_graph`, `spatial_graph`, or `spatio_temporal_graph`), multilingual `description` and `guideline` blocks, an `output` schema, and optional `identifiers`, `options`, and `display` sections. At load time, `load_template` validates every language listed in the `language` field; at runtime, `localize_template` converts multilingual fields into a single-language `TemplateCfg` before the matching `create_{type}` factory method runs.
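The localization step can be pictured as collapsing every `{lang: text}` mapping to one string. The sketch below assumes multilingual fields are stored as dicts keyed by language code; that layout, the `localize` helper, and the sample values are illustrative assumptions rather than the actual `localize_template` implementation.

```python
from typing import Any

LANGS = {"zh", "en"}  # language codes named in this page


def localize(value: Any, lang: str) -> Any:
    """Recursively collapse {lang: text} mappings to a single language."""
    if isinstance(value, dict):
        # A dict whose keys are all language codes is a multilingual leaf.
        if value and set(value) <= LANGS:
            if lang not in value:
                raise ValueError(f"template has no {lang!r} text")
            return value[lang]
        return {key: localize(item, lang) for key, item in value.items()}
    if isinstance(value, list):
        return [localize(item, lang) for item in value]
    return value


# Hypothetical config fragment; field contents are made up for the example.
raw = {
    "type": "temporal_graph",
    "language": ["zh", "en"],
    "description": {"zh": "人物传记图谱", "en": "Biography graph"},
}
print(localize(raw, "en"))
```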

<Info>
Template IDs without a domain prefix resolve only under `general/`. For example, `graph` maps to `general/graph`, not templates in other domains.
</Info>
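The bare-ID rule amounts to a default domain. A small sketch, assuming a hypothetical `resolve_template_id` helper (the library's own resolution logic may differ in detail):

```python
def resolve_template_id(template_id: str, default_domain: str = "general") -> str:
    """Prefix bare IDs with the default domain; leave qualified IDs untouched."""
    if "/" in template_id:
        return template_id
    return f"{default_domain}/{template_id}"


print(resolve_template_id("graph"))
print(resolve_template_id("legal/contract_obligation"))
```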

### When knowledge templates fit

Choose a knowledge template when the document type maps to a known schema:

- **Structured records** — `finance/earnings_summary` extracts quarterly metrics into an `AutoModel`.
- **Domain graphs** — `general/biography_graph` builds a `temporal_graph` of life events with timestamps.
- **Multilingual extraction** — prompts and field descriptions are localized to `zh` or `en`.
- **Custom schemas** — author a standalone YAML file and pass its path to `Template.create`.

## Method templates

Method templates wrap extraction **algorithms** as first-class template IDs. They do not use YAML; each method is a Python class registered with an autotype and description.

### Registered methods

Nine methods ship in the default registry, split across `hyperextract/methods/rag` (retrieval-augmented) and `hyperextract/methods/typical` (direct extraction):

| Method ID | Autotype | Category | Description |
|-----------|----------|----------|-------------|
| `method/graph_rag` | `graph` | RAG | Graph-RAG with community detection |
| `method/light_rag` | `graph` | RAG | Lightweight graph RAG with binary edges |
| `method/hyper_rag` | `hypergraph` | RAG | Hypergraph RAG with n-ary hyperedges |
| `method/hypergraph_rag` | `hypergraph` | RAG | Advanced hypergraph knowledge construction |
| `method/cog_rag` | `hypergraph` | RAG | Cognitive RAG for reasoning-focused retrieval |
| `method/itext2kg` | `graph` | Typical | High-quality triple-based extraction |
| `method/itext2kg_star` | `graph` | Typical | Enhanced iText2KG with improved quality |
| `method/kg_gen` | `graph` | Typical | Knowledge graph generator |
| `method/atom` | `graph` | Typical | Temporal knowledge graph with evidence attribution |

`TemplateFactory.create_method` instantiates the class, then stamps metadata:

```python
instance.metadata["template"] = f"method/{method_name}"
instance.metadata["lang"] = "en"
instance.metadata["type"] = autotype
```

Method-specific kwargs pass through to the constructor. For example, `atom` accepts `observation_time`:

```python
template = Template.create(
    "method/atom",
    observation_time="2024-06-15",
)
```

### When method templates fit

Choose a method template when schema flexibility matters less than extraction strategy:

- **General-purpose graph extraction** without a domain-specific field layout (`method/light_rag`).
- **Large documents** where chunking and retrieval help (`method/graph_rag`, `method/light_rag`).
- **Complex multi-entity relations** (`method/hyper_rag`, `method/hypergraph_rag`).
- **Temporal facts with evidence** (`method/atom` with `observation_time`).
- **Algorithm comparison** across RAG and typical pipelines using the same `feed_text` / `chat` surface.

<Note>
Method demos under `examples/en/methods/` use English source documents and instantiate method classes directly (e.g. `Light_RAG`) or via `Template.create("method/light_rag")`.
</Note>

## Language requirements

Language handling diverges at the `TemplateFactory.create` boundary.

### Knowledge templates: `--lang` required

Knowledge templates store prompts and schemas in multilingual YAML (`language: [zh, en]`). The runtime language selects which localized strings `localize_template` applies.

<ParamField body="--lang" type="string" required>
Language code for knowledge templates. Accepted values: `zh`, `en`. Required on `he parse` when using `-t` (or interactive template selection). Required as the `language` argument in `Template.create` for non-method sources.
</ParamField>

If `language` is omitted for a knowledge template, `TemplateFactory.create` raises:

```text
ValueError: language is required for knowledge templates. Provide a language code (e.g., 'zh', 'en').
```

The CLI enforces the same rule:

```bash
# Error: --lang missing
he parse document.md -t general/biography_graph -o ./ka/

# Correct
he parse document.md -t general/biography_graph -o ./ka/ -l en
he parse document.md -t finance/earnings_summary -o ./ka/ -l zh
```

### Method templates: English only

Method templates use English prompts baked into algorithm code. `TemplateFactory.create_method` hardcodes `metadata["lang"]` to `"en"`, and `Template.create` ignores any `language` argument for `method/` sources.

<ParamField body="--lang" type="string">
Optional for method templates. If provided, the CLI prints a note that the value is ignored and forces `lang = "en"`.
</ParamField>

```bash
# No --lang needed
he parse document.md -m light_rag -o ./ka/

# Equivalent template ID form
he parse document.md -t method/light_rag -o ./ka/
```

`he list template --lang zh` filters to Chinese-capable knowledge templates and **excludes** method templates. Use `he list method` to browse methods independently.
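The two language rules in this section can be summarized in one function. The sketch below is hypothetical (`resolve_language` is not a library function); the error message is copied from above, while the rejection of codes other than `zh` and `en` is an assumption about how unsupported values would be handled.

```python
from typing import Optional

SUPPORTED = ("zh", "en")


def resolve_language(source: str, language: Optional[str] = None) -> str:
    """Return the effective language for a template source."""
    if source.startswith("method/"):
        return "en"  # any supplied value is ignored for methods
    if language is None:
        raise ValueError(
            "language is required for knowledge templates. "
            "Provide a language code (e.g., 'zh', 'en')."
        )
    if language not in SUPPORTED:
        # Assumption: unsupported codes are rejected rather than coerced.
        raise ValueError(f"unsupported language: {language!r}")
    return language


print(resolve_language("method/light_rag", "zh"))
print(resolve_language("finance/earnings_summary", "zh"))
```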

## CLI invocation

`he parse` accepts templates and methods through separate flags that converge on one template ID string.

<Steps>
<Step title="List available options">

```bash
he list template          # Knowledge templates + methods (default lang: en)
he list template -l zh    # Chinese knowledge templates only
he list template --no-methods
he list method
he list method -q rag
```

</Step>
<Step title="Run extraction with a knowledge template">

```bash
he parse examples/en/tesla.md \
  -t general/biography_graph \
  -l en \
  -o ./tesla-ka
```

Omit `-t` for interactive template selection (knowledge templates only).

</Step>
<Step title="Run extraction with a method">

```bash
he parse examples/en/tesla.md \
  -m light_rag \
  -o ./tesla-ka-rag
```

The `-m` flag sets the internal template ID to `method/{name}`. No `-l` flag is required.

</Step>
<Step title="Verify output">

```bash
he info ./tesla-ka
he show ./tesla-ka
he search ./tesla-ka "AC motor"
```

Metadata records the template ID and language (`en` or `zh` for knowledge templates; always `en` for methods).

</Step>
</Steps>
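How `-t` and `-m` converge on one template ID can be illustrated with a small `argparse` sketch. This is not the `he` CLI's implementation: the parser layout, the helper names, and the choice to make the two flags mutually exclusive are assumptions for the example.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="he-parse-sketch")
    parser.add_argument("input")
    # Assumption: a run names either a template or a method, not both.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-t", dest="template")
    source.add_argument("-m", dest="method")
    parser.add_argument("-l", "--lang", dest="lang")
    parser.add_argument("-o", dest="output")
    return parser


def template_id_from_args(args: argparse.Namespace) -> Optional[str]:
    """Collapse -t / -m into a single template ID string."""
    if args.method:
        return f"method/{args.method}"
    return args.template  # None here would mean interactive selection


args = build_parser().parse_args(["doc.md", "-m", "light_rag", "-o", "./ka"])
print(template_id_from_args(args))
```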

## Python API

Both paths use the same `Template` facade exported from `hyperextract`.

<CodeGroup>
```python Knowledge template
from hyperextract import Template

ka = Template.create("general/biography_graph", language="en")
ka.feed_text(document_text)
ka.dump("./tesla-ka")
ka.build_index()
```

```python Method template
from hyperextract import Template

ka = Template.create("method/light_rag")
ka.feed_text(document_text)
ka.dump("./tesla-ka-rag")
```

```python Custom YAML path
ka = Template.create("/path/to/my_template.yaml", language="zh")
ka.feed_text(document_text)
```
</CodeGroup>

`Template.get` resolves configs from either source: `Gallery.get` for knowledge IDs, `get_method_cfg` for `method/` IDs. `Template.list(include_methods=True)` merges gallery results with `list_method_cfgs()`.

For direct algorithm access without the template wrapper, import classes from `hyperextract.methods.rag` or `hyperextract.methods.typical` and pass `llm_client` and `embedder` explicitly.

## Selection criteria

Use the decision tree below to pick a path before tuning autotype or provider settings.

```text
Need a specific output schema for a known document type?
│
├─ Yes → Knowledge template
│         Match domain + document type (see templates catalog)
│         Set --lang to match document language
│         Pick autotype by structure need (model/list/set/graph/…)
│
└─ No → Method template
          Pick algorithm by document size and relation complexity
          English input recommended
          Pass method kwargs (e.g. observation_time for atom)
```

### Knowledge template selection

| Scenario | Recommended template | AutoType |
|----------|---------------------|----------|
| Person biography or memoir | `general/biography_graph` | `temporal_graph` |
| Earnings call transcript | `finance/earnings_summary` | `model` |
| Multi-party contract | `legal/contract_obligation` | `hypergraph` |
| Clinical guideline | `medicine/treatment_map` | `hypergraph` |
| Custom domain schema | Author YAML from `general/base_*` | Any |

Match `type` to document structure: records use `model`/`list`/`set`; relationships use `graph`/`hypergraph`; time- or location-anchored relations use `temporal_graph`, `spatial_graph`, or `spatio_temporal_graph`. Temporal and spatio-temporal templates accept runtime kwargs such as `observation_time` and `observation_location`.
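The matching rule above can be written as a small helper that narrows the eight autotypes to a shortlist. `candidate_autotypes` is a hypothetical function for illustration; the final choice within a shortlist still depends on the document.

```python
def candidate_autotypes(relational: bool, temporal: bool = False,
                        spatial: bool = False) -> list[str]:
    """Shortlist autotypes from coarse properties of the target structure."""
    if temporal and spatial:
        return ["spatio_temporal_graph"]
    if temporal:
        return ["temporal_graph"]
    if spatial:
        return ["spatial_graph"]
    if relational:
        return ["graph", "hypergraph"]
    return ["model", "list", "set"]  # plain records


print(candidate_autotypes(relational=True, temporal=True))
print(candidate_autotypes(relational=False))
```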

### Method template selection

| Priority | Recommended method |
|----------|-------------------|
| Fast general extraction | `method/light_rag` |
| Best triple quality | `method/itext2kg_star` |
| Large documents (10K+ words) | `method/graph_rag` |
| N-ary / multi-entity relations | `method/hyper_rag` |
| Temporal facts with evidence | `method/atom` |
| Reasoning-focused RAG | `method/cog_rag` |

<Warning>
Do not pass `--lang zh` expecting Chinese prompts from method templates. Methods always run with English prompts regardless of the flag value.
</Warning>
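For scripted pipelines, the method selection table can be encoded as a lookup. The priority keys and the `pick_method` helper below are made-up names, and falling back to `method/light_rag` for unknown priorities is an assumption, not documented behavior.

```python
# Keys are hypothetical labels for the rows of the method selection table.
METHOD_BY_PRIORITY = {
    "fast": "method/light_rag",
    "triple_quality": "method/itext2kg_star",
    "large_documents": "method/graph_rag",
    "nary_relations": "method/hyper_rag",
    "temporal_evidence": "method/atom",
    "reasoning": "method/cog_rag",
}


def pick_method(priority: str, default: str = "method/light_rag") -> str:
    """Map a priority label to a method ID, with an assumed default."""
    return METHOD_BY_PRIORITY.get(priority, default)


print(pick_method("temporal_evidence"))
```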

## Unified metadata and downstream commands

Regardless of path, the resulting Knowledge Abstract stores `template` and `lang` in `metadata.json`. Downstream CLI commands (`he feed`, `he search`, `he talk`, `he show`, `he build-index`) reload the KA via `Template.create(template, lang)` using those stored values. When feeding new documents, `he feed` inherits template and language from existing metadata unless overridden.
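The inherit-unless-overridden behavior can be sketched with plain JSON handling. This assumes `template` and `lang` are top-level keys in `metadata.json`; the helper names are hypothetical and the real commands go through `Template.create` rather than this code.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def stored_settings(ka_dir: Path) -> tuple[str, str]:
    """Read the template ID and language recorded with a Knowledge Abstract."""
    meta = json.loads((ka_dir / "metadata.json").read_text(encoding="utf-8"))
    return meta["template"], meta["lang"]


def feed_settings(ka_dir: Path, template: Optional[str] = None,
                  lang: Optional[str] = None) -> tuple[str, str]:
    """Use explicit overrides when given, otherwise the stored values."""
    stored_template, stored_lang = stored_settings(ka_dir)
    return template or stored_template, lang or stored_lang


# Write a throwaway metadata.json so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    ka_dir = Path(tmp)
    (ka_dir / "metadata.json").write_text(
        json.dumps({"template": "general/biography_graph", "lang": "en"}),
        encoding="utf-8",
    )
    print(feed_settings(ka_dir))
    print(feed_settings(ka_dir, lang="zh"))
```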

## Related pages

<CardGroup>
<Card title="Auto-Types" href="/auto-types">
Eight extraction primitives, merge behavior, and autotype selection for YAML `type` fields.
</Card>
<Card title="Create custom templates" href="/create-custom-templates">
Author domain YAML templates with multilingual blocks, identifiers, and validation.
</Card>
<Card title="Use extraction methods" href="/use-extraction-methods">
Invoke methods via CLI, `Template.create`, or direct class instantiation with kwargs.
</Card>
<Card title="Extraction methods reference" href="/extraction-methods-reference">
Full registry of nine methods with autotypes, descriptions, and constructor parameters.
</Card>
<Card title="Template schema reference" href="/template-schema-reference">
YAML field definitions for knowledge template authoring.
</Card>
<Card title="Troubleshooting" href="/troubleshooting">
Missing `--lang`, template resolution errors, and method-specific failure modes.
</Card>
</CardGroup>
