memcity

Features

RLM Enrichment Pipeline

Automatically analyze every document during ingestion with LLM-powered enrichment for dramatically better search results.

What is Enrichment?

Standard search embeds your raw text and hopes the right chunks surface. But raw text has problems:

  • Pronouns and vague references -- A chunk says "those customers" but the search query says "Partner Reseller Program." No match.
  • Implicit knowledge -- A chunk describes a feature without naming it. The right keywords never appear in the text.
  • Missing context -- A chunk makes sense in the document but is meaningless in isolation.

The RLM Enrichment Pipeline fixes this by analyzing every document during ingestion. An LLM reads your content, extracts entities, resolves pronouns, identifies key terms, and produces enriched metadata that gets baked into embeddings, BM25 indexes, and reranker inputs.

The result: your search finds things that raw text search would miss.

How It Works

The enrichment pipeline runs automatically during ingestText, ingestUrl, and processUploadedFile. It adds three phases to the ingestion flow:

text
Phase 1: Pre-Chunk Structuring
   Full document -> LLM analyzes structure, extracts entities & relationships
   |
Phase 2: Guided Chunking
   Section hints from Phase 1 guide chunk boundaries
   |
Phase 3: Chunk-Level Enrichment
   Each chunk -> LLM resolves pronouns, extracts key terms, summarizes
   |
   -> Enriched embeddings (semantic search)
   -> Enriched BM25 index (keyword search)
   -> Enriched reranker input (relevance scoring)

All three phases run inside Daytona cloud sandboxes using the same RLM engine described in the RLM docs.

Phase 1: Pre-Chunk Structuring

Before chunking, the full document is sent to an LLM that produces a structural analysis:

| Output | Description |
| --- | --- |
| Summary | Document-level summary (up to 2,000 characters) |
| Sections | Logical section boundaries with titles and start/end hints -- used to guide chunking |
| Entities | Canonical entity list with names, types, aliases, and descriptions |
| Relationships | Entity-to-entity relationships with type and evidence |
| Normalized text | Optionally rewritten text with ambiguities resolved (used as the chunking input) |

This phase is most valuable for long documents where the LLM can identify structure that the chunker would otherwise miss.
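To make the outputs above concrete, here is a hypothetical TypeScript shape for the Phase 1 analysis. The interface and field names (`sections`, `startHint`, `normalizedText`, etc.) are illustrative assumptions, not the actual output schema:

```typescript
// Hypothetical shape of the Phase 1 structural analysis.
// Field names are assumed for illustration; the real schema may differ.
interface DocumentStructure {
  summary: string; // document-level summary, up to 2,000 characters
  sections: { title: string; startHint: string; endHint: string }[];
  entities: { name: string; type: string; aliases: string[]; description: string }[];
  relationships: { from: string; to: string; type: string; evidence: string }[];
  normalizedText?: string; // optional rewrite used as the chunking input
}

const example: DocumentStructure = {
  summary: "Describes the Partner Reseller Program and its self-service portal.",
  sections: [
    { title: "Overview", startHint: "The Partner Reseller", endHint: "eligibility." },
  ],
  entities: [
    {
      name: "Partner Reseller Program",
      type: "program",
      aliases: ["those customers"],
      description: "Reseller partner tier",
    },
  ],
  relationships: [
    {
      from: "Partner Reseller Program",
      to: "Self-Service Portal",
      type: "uses",
      evidence: "customers can use the portal",
    },
  ],
};
```

Note how the `aliases` field is what later lets Phase 3 map "those customers" back to the canonical entity name.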

Phase 2: Guided Chunking

When chunking.useRlmSectionHints is enabled and Phase 1 produced section boundaries, the chunker uses those hints to split text at logical section boundaries instead of arbitrary token counts.

For example, a document with three sections ("Overview", "Installation", "API Reference") gets chunks that respect those boundaries rather than splitting mid-section.
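The idea can be sketched in a few lines. This is not the actual chunker, just an illustration of the fallback behavior, assuming section hints arrive as character offsets:

```typescript
// Illustrative sketch of section-hint guided chunking (not the real
// chunker): split at Phase 1 section offsets when hints are available,
// otherwise fall back to arbitrary fixed-size windows.
function chunkWithHints(
  text: string,
  sectionStarts: number[],
  fallbackSize = 500,
): string[] {
  if (sectionStarts.length > 0) {
    // Split at logical section boundaries.
    const bounds = [...sectionStarts, text.length];
    const chunks: string[] = [];
    for (let i = 0; i < bounds.length - 1; i++) {
      const piece = text.slice(bounds[i], bounds[i + 1]).trim();
      if (piece) chunks.push(piece);
    }
    return chunks;
  }
  // No hints: split every `fallbackSize` characters regardless of content.
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += fallbackSize) {
    chunks.push(text.slice(i, i + fallbackSize));
  }
  return chunks;
}
```

With hints for the three-section document above, each chunk is a whole section; without hints, splits can land mid-section.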

Phase 3: Chunk-Level Enrichment

After chunking, each chunk is analyzed by the LLM with access to the document-wide entity map from Phase 1. This enables coreference resolution -- the LLM knows that "those customers" in chunk 7 refers to "Partner Reseller Program customers" because it saw the entity definition in Phase 1.

Each chunk gets the following enrichment data:

| Field | Type | Description |
| --- | --- | --- |
| summary | string | 1-2 sentence summary of what this chunk covers |
| topics | string[] | Topic labels (e.g., "authentication", "error handling") |
| entities | Array<{name, type}> | Named entities mentioned in this chunk |
| keyTerms | string[] | Key terms and phrases for keyword matching |
| semanticTags | string[] | Content classification tags |
| resolvedText | string | Full chunk text with all pronouns and vague references resolved to canonical names |
| coreferences | Array<{original, resolved}> | Specific resolutions applied (e.g., "they" -> "Team tier users") |
| qualityScore | number | 0-1 information density score |
| crossReferences | string[] | References to related chunks or documents |
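Putting the fields together, an enriched chunk for the running example might look like this (values and the `doc:` cross-reference format are made up for illustration; the stored representation may differ):

```typescript
// Illustrative per-chunk enrichment record, shaped after the field table.
// All values are invented example data.
const enrichedChunk = {
  summary: "Explains how partner customers manage subscriptions via the portal.",
  topics: ["subscription management"],
  entities: [{ name: "Partner Reseller Program", type: "program" }],
  keyTerms: ["reseller portal", "subscription management"],
  semanticTags: ["how-to"],
  resolvedText:
    "Partner Reseller Program customers can use the self-service portal " +
    "to manage their SaaS subscriptions.",
  coreferences: [
    { original: "Those customers", resolved: "Partner Reseller Program customers" },
  ],
  qualityScore: 0.82, // information density, 0-1
  crossReferences: ["doc:partner-program#overview"], // hypothetical format
};
```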

Enrichment data flows into three search touchpoints:

1. Enriched Embeddings

When embedding.includeEnrichment is true, the embedding input for each chunk includes the resolved text, entity names, and key terms. This means the vector embedding captures the full meaning -- not just the raw text.

text
Raw chunk:    "Those customers can use the portal to manage their subscriptions."
Enriched:     "Those customers can use the portal to manage their subscriptions.
              Context: Partner Reseller Program customers can use the self-service
              portal to manage their SaaS subscriptions.
              Entities: Partner Reseller Program, Self-Service Portal
              Keywords: subscription management, reseller portal, partner access"

The enriched embedding matches queries about "partner portal" or "reseller subscription management" that the raw text would miss.
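One plausible way to assemble that enriched input, honoring the `enrichmentFields` setting, is sketched below. The function name and concatenation format are assumptions for illustration, not the library's actual implementation:

```typescript
// Sketch: build the embedding input from raw text plus the configured
// enrichment fields. Format and function name are illustrative.
function buildEmbeddingInput(
  rawText: string,
  enrichment: {
    resolvedText: string;
    entities: { name: string }[];
    keyTerms: string[];
  },
  fields: string[] = ["resolvedText", "entities", "keyTerms"],
): string {
  const parts = [rawText];
  if (fields.includes("resolvedText")) {
    parts.push(`Context: ${enrichment.resolvedText}`);
  }
  if (fields.includes("entities")) {
    parts.push(`Entities: ${enrichment.entities.map((e) => e.name).join(", ")}`);
  }
  if (fields.includes("keyTerms")) {
    parts.push(`Keywords: ${enrichment.keyTerms.join(", ")}`);
  }
  return parts.join("\n");
}
```

Because the entity names and key terms are embedded alongside the raw text, a query vector for "reseller portal" lands near this chunk even though the raw text never says "reseller".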

2. Enriched BM25 Index

When search.bm25IndexEnrichment is true, the BM25 keyword index includes enrichment terms alongside raw content. A search for "Partner Reseller Program" finds chunks whose raw text only says "those customers" because the enriched index contains the resolved entity name.
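Conceptually, the index simply treats enrichment terms as extra tokens of the document. A minimal sketch (not the actual indexer, and with a naive tokenizer assumed):

```typescript
// Sketch: union raw-text tokens with enrichment terms so entity names
// are findable even when the raw text never states them.
function bm25IndexTerms(
  rawText: string,
  entityNames: string[],
  keyTerms: string[],
): string[] {
  const tokenize = (s: string) => s.toLowerCase().split(/\W+/).filter(Boolean);
  return [
    ...new Set([
      ...tokenize(rawText),
      ...entityNames.flatMap(tokenize),
      ...keyTerms.flatMap(tokenize),
    ]),
  ];
}
```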

3. Enriched Reranking

When search.rerankerUsesEnrichment is true, the text sent to Jina Reranker v3 includes the resolved text, entity names, and key terms. The reranker has more signal for relevance scoring because it sees resolved references alongside raw text.

Configuration

Enabling Enrichment

Enrichment requires the Team tier and a Daytona API key:

ts
const memory = new Memory(components.memory, {
  tier: "team",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.0-flash-001",
  },
  // RLM enrichment pipeline
  rlm: {
    enrichOnIngest: true,
    preChunkStructuringOnIngest: true,
    chunkLevelEnrichment: true,
    consumeDocContext: true,
  },
  // Use enrichment data in embeddings
  embedding: {
    includeEnrichment: true,
    enrichmentFields: ["resolvedText", "entities", "keyTerms"],
  },
  // Use enrichment data in search
  search: {
    bm25IndexEnrichment: true,
    rerankerUsesEnrichment: true,
  },
  // Section-aware chunking
  chunking: {
    useRlmSectionHints: true,
  },
  apiKeys: {
    openrouter: process.env.OPENROUTER_API_KEY,
    jina: process.env.JINA_API_KEY,
    daytona: process.env.DAYTONA_API_KEY,
  },
});

Full Configuration Reference

RLM Enrichment (rlm.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enrichOnIngest | boolean | false | Master switch for the enrichment pipeline |
| enrichmentModel | string | falls back to ai.model | LLM model for enrichment calls |
| enrichmentMaxIterations | number | 5 | Max REPL iterations per enrichment batch |
| preChunkStructuringOnIngest | boolean | false | Enable Phase 1: document-wide structural analysis |
| preChunkStructuringModel | string | falls back to ai.model | LLM model for pre-chunk structuring |
| preChunkStructuringMaxIterations | number | 6 | Max REPL iterations for structuring |
| preChunkStructuringMaxChars | number | 300,000 | Max document characters sent to structuring |
| preChunkStructuringMinChars | number | 1,000 | Min document size to trigger structuring (smaller docs skip Phase 1) |
| chunkLevelEnrichment | boolean | false | Enable Phase 3: per-chunk enrichment |
| chunkLevelEnrichmentModel | string | falls back to ai.model | LLM model for chunk enrichment |
| chunkLevelEnrichmentBatchSize | number | 5 | Number of chunks processed per LLM call |
| consumeDocContext | boolean | true | Feed Phase 1 entity map to Phase 3 for coreference resolution |

Enrichment-Aware Embedding (embedding.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| includeEnrichment | boolean | false | Include enrichment data in embedding input |
| enrichmentFields | string[] | ["resolvedText", "entities", "keyTerms"] | Which enrichment fields to include in embeddings |

Enrichment-Aware Search (search.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| bm25IndexEnrichment | boolean | false | Index enrichment terms in BM25 keyword search |
| rerankerUsesEnrichment | boolean | false | Feed enrichment metadata to Jina Reranker |

Enrichment-Aware Chunking (chunking.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| useRlmSectionHints | boolean | false | Use Phase 1 section boundaries to guide chunk splits |

Cost and Performance

Enrichment increases ingestion time and cost because it adds LLM calls during ingestion. Search latency is not affected -- enrichment data is computed once during ingestion and stored.

Ingestion Time Impact

| Without enrichment | With enrichment |
| --- | --- |
| ~3-8 seconds per document | ~25-55 seconds per document |

The increase comes from Phase 1 (1 LLM call for the full document) and Phase 3 (1 LLM call per batch of 5 chunks).
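Given those two sources of calls, the per-document call count is easy to estimate (the helper below is illustrative, not part of the library):

```typescript
// Rough per-document LLM call estimate: one structuring call (Phase 1)
// plus one enrichment call per batch of chunks (Phase 3).
function enrichmentCalls(
  chunkCount: number,
  batchSize = 5, // chunkLevelEnrichmentBatchSize default
  usePhase1 = true,
): number {
  const phase1 = usePhase1 ? 1 : 0;
  const phase3 = Math.ceil(chunkCount / batchSize);
  return phase1 + phase3;
}

// A 40-chunk document: 1 + ceil(40 / 5) = 9 calls.
```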

Cost-Saving Tips

  1. Use a fast, cheap model -- google/gemini-2.0-flash-001 is an excellent choice for enrichment. It's fast, inexpensive, and produces high-quality enrichment data.
  2. Adjust batch size -- Increasing chunkLevelEnrichmentBatchSize from 5 to 10 reduces the number of LLM calls but may reduce enrichment quality for complex documents.
  3. Skip Phase 1 for short documents -- The preChunkStructuringMinChars threshold (default: 1,000) already skips Phase 1 for short documents. Increase it if most of your docs are brief.
  4. Enrichment runs once -- Unlike search-time LLM calls, enrichment only runs during ingestion. Once a document is enriched, every search benefits at zero additional cost.
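Tips 1-3 translate directly into `rlm.*` settings. The snippet below uses the option names from the configuration reference; the specific values are illustrative, not recommendations for every workload:

```typescript
// Illustrative cost-tuned rlm.* settings: a cheap model, larger batches
// (fewer Phase 3 calls), and a higher minimum-size threshold so brief
// documents skip Phase 1 entirely.
const costTunedRlm = {
  enrichOnIngest: true,
  enrichmentModel: "google/gemini-2.0-flash-001", // fast, inexpensive
  chunkLevelEnrichment: true,
  chunkLevelEnrichmentBatchSize: 10, // default is 5; fewer LLM calls
  preChunkStructuringOnIngest: true,
  preChunkStructuringMinChars: 3_000, // default is 1,000; skip more short docs
};
```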

When to Use Enrichment

Enable enrichment when:

  • Your documents contain pronouns and vague references that would confuse keyword search
  • You need cross-document entity consistency -- the same entity should be recognized by the same canonical name across all documents
  • Your search queries use different terminology than your documents (enrichment bridges the vocabulary gap)
  • You're building a high-quality search experience where precision matters more than ingestion speed

Skip enrichment when:

  • Your documents are simple and self-contained (e.g., FAQ pages where each chunk is a complete answer)
  • Ingestion speed is more important than search quality
  • Your corpus is very small (fewer than 10 documents) -- the quality improvement may not be noticeable
  • You're on the Community or Pro tier (enrichment requires Team)

Availability

The RLM Enrichment Pipeline is available on the Team tier only. It requires a Daytona API key for sandbox execution.

| Feature | Community | Pro | Team |
| --- | --- | --- | --- |
| Pre-chunk structuring | - | - | Yes |
| Guided chunking (section hints) | - | - | Yes |
| Chunk-level enrichment | - | - | Yes |
| Enrichment-aware embeddings | - | - | Yes |
| Enrichment-aware BM25 | - | - | Yes |
| Enrichment-aware reranking | - | - | Yes |