memcity

Features

RLM Enrichment Pipeline

Automatically analyze every document during ingestion with LLM-powered enrichment for dramatically better search results.

What is Enrichment?

Standard search embeds your raw text and hopes the right chunks surface. But raw text has problems:

  • Pronouns and vague references -- A chunk says "those customers" but the search query says "Partner Reseller Program." No match.
  • Implicit knowledge -- A chunk describes a feature without naming it. The right keywords never appear in the text.
  • Missing context -- A chunk makes sense in the document but is meaningless in isolation.

The RLM Enrichment Pipeline fixes this by analyzing every document during ingestion. An LLM reads your content, extracts entities, resolves pronouns, identifies key terms, and produces enriched metadata that gets baked into embeddings, BM25 indexes, and reranker inputs.

The result: your search finds things that raw text search would miss.

How It Works

The enrichment pipeline runs automatically during ingestText, ingestUrl, and processUploadedFile. It adds three phases to the ingestion flow:

text
Phase 1: Pre-Chunk Structuring
   Full document -> LLM analyzes structure, extracts entities & relationships
   |
Phase 2: Guided Chunking
   Section hints from Phase 1 guide chunk boundaries
   |
Phase 3: Chunk-Level Enrichment
   Each chunk -> LLM resolves pronouns, extracts key terms, summarizes
   |
   -> Enriched embeddings (semantic search)
   -> Enriched BM25 index (keyword search)
   -> Enriched reranker input (relevance scoring)

All three phases run inside Daytona cloud sandboxes using the same RLM engine described in the RLM docs.

Phase 1: Pre-Chunk Structuring

Before chunking, the full document is sent to an LLM that produces a structural analysis:

| Output | Description |
| --- | --- |
| Summary | Document-level summary (up to 2,000 characters) |
| Sections | Logical section boundaries with titles and start/end hints -- used to guide chunking |
| Entities | Canonical entity list with names, types, aliases, and descriptions |
| Relationships | Entity-to-entity relationships with type and evidence |
| Normalized text | Optionally rewritten text with ambiguities resolved (used as the chunking input) |

This phase is most valuable for long documents where the LLM can identify structure that the chunker would otherwise miss.
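To make the outputs above concrete, here is a hypothetical TypeScript shape for the Phase 1 analysis. The interface and field names (`sections`, `startHint`, `normalizedText`, etc.) are illustrative assumptions, not the actual output schema:

```typescript
// Hypothetical shape of the Phase 1 structural analysis.
// Field names are assumed for illustration; the real schema may differ.
interface DocumentStructure {
  summary: string; // document-level summary, up to 2,000 characters
  sections: { title: string; startHint: string; endHint: string }[];
  entities: { name: string; type: string; aliases: string[]; description: string }[];
  relationships: { from: string; to: string; type: string; evidence: string }[];
  normalizedText?: string; // optional rewrite used as the chunking input
}

const example: DocumentStructure = {
  summary: "Describes the Partner Reseller Program and its self-service portal.",
  sections: [
    { title: "Overview", startHint: "The Partner Reseller", endHint: "eligibility." },
  ],
  entities: [
    {
      name: "Partner Reseller Program",
      type: "program",
      aliases: ["those customers"],
      description: "Reseller partner tier",
    },
  ],
  relationships: [
    {
      from: "Partner Reseller Program",
      to: "Self-Service Portal",
      type: "uses",
      evidence: "customers can use the portal",
    },
  ],
};
```

Note how the `aliases` field is what later lets Phase 3 map "those customers" back to the canonical entity name.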

Phase 2: Guided Chunking

When chunking.useRlmSectionHints is enabled and Phase 1 produced section boundaries, the chunker uses those hints to split text at logical section boundaries instead of arbitrary token counts.

For example, a document with three sections ("Overview", "Installation", "API Reference") gets chunks that respect those boundaries rather than splitting mid-section.
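The idea can be sketched in a few lines. This is not the actual chunker, just an illustration of the fallback behavior, assuming section hints arrive as character offsets:

```typescript
// Illustrative sketch of section-hint guided chunking (not the real
// chunker): split at Phase 1 section offsets when hints are available,
// otherwise fall back to arbitrary fixed-size windows.
function chunkWithHints(
  text: string,
  sectionStarts: number[],
  fallbackSize = 500,
): string[] {
  if (sectionStarts.length > 0) {
    // Split at logical section boundaries.
    const bounds = [...sectionStarts, text.length];
    const chunks: string[] = [];
    for (let i = 0; i < bounds.length - 1; i++) {
      const piece = text.slice(bounds[i], bounds[i + 1]).trim();
      if (piece) chunks.push(piece);
    }
    return chunks;
  }
  // No hints: split every `fallbackSize` characters regardless of content.
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += fallbackSize) {
    chunks.push(text.slice(i, i + fallbackSize));
  }
  return chunks;
}
```

With hints for the three-section document above, each chunk is a whole section; without hints, splits can land mid-section.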

Phase 3: Chunk-Level Enrichment

After chunking, each chunk is analyzed by the LLM with access to the document-wide entity map from Phase 1. This enables coreference resolution -- the LLM knows that "those customers" in chunk 7 refers to "Partner Reseller Program customers" because it saw the entity definition in Phase 1.

Each chunk gets the following enrichment data:

| Field | Type | Description |
| --- | --- | --- |
| summary | string | 1-2 sentence summary of what this chunk covers |
| topics | string[] | Topic labels (e.g., "authentication", "error handling") |
| entities | Array<{name, type}> | Named entities mentioned in this chunk |
| keyTerms | string[] | Key terms and phrases for keyword matching |
| semanticTags | string[] | Content classification tags |
| resolvedText | string | Full chunk text with all pronouns and vague references resolved to canonical names |
| coreferences | Array<{original, resolved}> | Specific resolutions applied (e.g., "they" -> "Team tier users") |
| qualityScore | number | 0-1 information density score |
| crossReferences | string[] | References to related chunks or documents |
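Putting the fields together, an enriched chunk for the running example might look like this (values and the `doc:` cross-reference format are made up for illustration; the stored representation may differ):

```typescript
// Illustrative per-chunk enrichment record, shaped after the field table.
// All values are invented example data.
const enrichedChunk = {
  summary: "Explains how partner customers manage subscriptions via the portal.",
  topics: ["subscription management"],
  entities: [{ name: "Partner Reseller Program", type: "program" }],
  keyTerms: ["reseller portal", "subscription management"],
  semanticTags: ["how-to"],
  resolvedText:
    "Partner Reseller Program customers can use the self-service portal " +
    "to manage their SaaS subscriptions.",
  coreferences: [
    { original: "Those customers", resolved: "Partner Reseller Program customers" },
  ],
  qualityScore: 0.82, // information density, 0-1
  crossReferences: ["doc:partner-program#overview"], // hypothetical format
};
```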

Enrichment data flows into three search touchpoints:

1. Enriched Embeddings

When embedding.includeEnrichment is true, the embedding input for each chunk includes the resolved text, entity names, and key terms. This means the vector embedding captures the full meaning -- not just the raw text.

text
Raw chunk:    "Those customers can use the portal to manage their subscriptions."
Enriched:     "Those customers can use the portal to manage their subscriptions.
              Context: Partner Reseller Program customers can use the self-service
              portal to manage their SaaS subscriptions.
              Entities: Partner Reseller Program, Self-Service Portal
              Keywords: subscription management, reseller portal, partner access"

The enriched embedding matches queries about "partner portal" or "reseller subscription management" that the raw text would miss.
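One plausible way to assemble that enriched input, honoring the `enrichmentFields` setting, is sketched below. The function name and concatenation format are assumptions for illustration, not the library's actual implementation:

```typescript
// Sketch: build the embedding input from raw text plus the configured
// enrichment fields. Format and function name are illustrative.
function buildEmbeddingInput(
  rawText: string,
  enrichment: {
    resolvedText: string;
    entities: { name: string }[];
    keyTerms: string[];
  },
  fields: string[] = ["resolvedText", "entities", "keyTerms"],
): string {
  const parts = [rawText];
  if (fields.includes("resolvedText")) {
    parts.push(`Context: ${enrichment.resolvedText}`);
  }
  if (fields.includes("entities")) {
    parts.push(`Entities: ${enrichment.entities.map((e) => e.name).join(", ")}`);
  }
  if (fields.includes("keyTerms")) {
    parts.push(`Keywords: ${enrichment.keyTerms.join(", ")}`);
  }
  return parts.join("\n");
}
```

Because the entity names and key terms are embedded alongside the raw text, a query vector for "reseller portal" lands near this chunk even though the raw text never says "reseller".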

2. Enriched BM25 Index

When search.bm25IndexEnrichment is true, the BM25 keyword index includes enrichment terms alongside raw content. A search for "Partner Reseller Program" finds chunks whose raw text only says "those customers" because the enriched index contains the resolved entity name.
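Conceptually, the index simply treats enrichment terms as extra tokens of the document. A minimal sketch (not the actual indexer, and with a naive tokenizer assumed):

```typescript
// Sketch: union raw-text tokens with enrichment terms so entity names
// are findable even when the raw text never states them.
function bm25IndexTerms(
  rawText: string,
  entityNames: string[],
  keyTerms: string[],
): string[] {
  const tokenize = (s: string) => s.toLowerCase().split(/\W+/).filter(Boolean);
  return [
    ...new Set([
      ...tokenize(rawText),
      ...entityNames.flatMap(tokenize),
      ...keyTerms.flatMap(tokenize),
    ]),
  ];
}
```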

3. Enriched Reranking

When search.rerankerUsesEnrichment is true, the text sent to Jina Reranker v3 includes the resolved text, entity names, and key terms. The reranker has more signal for relevance scoring because it sees resolved references alongside raw text.

Configuration

Enabling Enrichment

Enrichment requires the Team tier and a Daytona API key:

ts
const memory = new Memory(components.memory, {
  tier: "team",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.0-flash-001",
  },
  // RLM enrichment pipeline
  rlm: {
    enrichOnIngest: true,
    preChunkStructuringOnIngest: true,
    chunkLevelEnrichment: true,
    consumeDocContext: true,
  },
  // Use enrichment data in embeddings
  embedding: {
    includeEnrichment: true,
    enrichmentFields: ["resolvedText", "entities", "keyTerms"],
  },
  // Use enrichment data in search
  search: {
    bm25IndexEnrichment: true,
    rerankerUsesEnrichment: true,
  },
  // Section-aware chunking
  chunking: {
    useRlmSectionHints: true,
  },
  apiKeys: {
    openrouter: process.env.OPENROUTER_API_KEY,
    jina: process.env.JINA_API_KEY,
    daytona: process.env.DAYTONA_API_KEY,
  },
});

Full Configuration Reference

RLM Enrichment (rlm.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enrichOnIngest | boolean | false | Master switch for the enrichment pipeline |
| enrichmentModel | string | falls back to ai.model | LLM model for enrichment calls |
| enrichmentMaxIterations | number | 5 | Max REPL iterations per enrichment batch |
| preChunkStructuringOnIngest | boolean | false | Enable Phase 1: document-wide structural analysis |
| preChunkStructuringModel | string | falls back to ai.model | LLM model for pre-chunk structuring |
| preChunkStructuringMaxIterations | number | 6 | Max REPL iterations for structuring |
| preChunkStructuringMaxChars | number | 300,000 | Max document characters sent to structuring |
| preChunkStructuringMinChars | number | 1,000 | Min document size to trigger structuring (smaller docs skip Phase 1) |
| chunkLevelEnrichment | boolean | false | Enable Phase 3: per-chunk enrichment |
| chunkLevelEnrichmentModel | string | falls back to ai.model | LLM model for chunk enrichment |
| chunkLevelEnrichmentBatchSize | number | 5 | Number of chunks processed per LLM call |
| consumeDocContext | boolean | true | Feed Phase 1 entity map to Phase 3 for coreference resolution |

Enrichment-Aware Embedding (embedding.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| includeEnrichment | boolean | false | Include enrichment data in embedding input |
| enrichmentFields | string[] | ["resolvedText", "entities", "keyTerms"] | Which enrichment fields to include in embeddings |

Enrichment-Aware Search (search.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| bm25IndexEnrichment | boolean | false | Index enrichment terms in BM25 keyword search |
| rerankerUsesEnrichment | boolean | false | Feed enrichment metadata to Jina Reranker |

Enrichment-Aware Chunking (chunking.*)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| useRlmSectionHints | boolean | false | Use Phase 1 section boundaries to guide chunk splits |

Cost and Performance

Enrichment increases ingestion time and cost because it adds LLM calls during ingestion. Search latency is not affected -- enrichment data is computed once during ingestion and stored.

Ingestion Time Impact

| Without enrichment | With enrichment |
| --- | --- |
| ~3-8 seconds per document | ~25-55 seconds per document |

The increase comes from Phase 1 (1 LLM call for the full document) and Phase 3 (1 LLM call per batch of 5 chunks).
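Given those two sources of calls, the per-document call count is easy to estimate (the helper below is illustrative, not part of the library):

```typescript
// Rough per-document LLM call estimate: one structuring call (Phase 1)
// plus one enrichment call per batch of chunks (Phase 3).
function enrichmentCalls(
  chunkCount: number,
  batchSize = 5, // chunkLevelEnrichmentBatchSize default
  usePhase1 = true,
): number {
  const phase1 = usePhase1 ? 1 : 0;
  const phase3 = Math.ceil(chunkCount / batchSize);
  return phase1 + phase3;
}

// A 40-chunk document: 1 + ceil(40 / 5) = 9 calls.
```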

Cost-Saving Tips

  1. Use a fast, cheap model -- google/gemini-2.0-flash-001 is an excellent choice for enrichment. It's fast, inexpensive, and produces high-quality enrichment data.
  2. Adjust batch size -- Increasing chunkLevelEnrichmentBatchSize from 5 to 10 reduces the number of LLM calls but may reduce enrichment quality for complex documents.
  3. Skip Phase 1 for short documents -- The preChunkStructuringMinChars threshold (default: 1,000) already skips Phase 1 for short documents. Increase it if most of your docs are brief.
  4. Enrichment runs once -- Unlike search-time LLM calls, enrichment only runs during ingestion. Once a document is enriched, every search benefits at zero additional cost.
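Tips 1-3 translate directly into `rlm.*` settings. The snippet below uses the option names from the configuration reference; the specific values are illustrative, not recommendations for every workload:

```typescript
// Illustrative cost-tuned rlm.* settings: a cheap model, larger batches
// (fewer Phase 3 calls), and a higher minimum-size threshold so brief
// documents skip Phase 1 entirely.
const costTunedRlm = {
  enrichOnIngest: true,
  enrichmentModel: "google/gemini-2.0-flash-001", // fast, inexpensive
  chunkLevelEnrichment: true,
  chunkLevelEnrichmentBatchSize: 10, // default is 5; fewer LLM calls
  preChunkStructuringOnIngest: true,
  preChunkStructuringMinChars: 3_000, // default is 1,000; skip more short docs
};
```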

When to Use Enrichment

Enable enrichment when:

  • Your documents contain pronouns and vague references that would confuse keyword search
  • You need cross-document entity consistency -- the same entity should be recognized by the same canonical name across all documents
  • Your search queries use different terminology than your documents (enrichment bridges the vocabulary gap)
  • You're building a high-quality search experience where precision matters more than ingestion speed

Skip enrichment when:

  • Your documents are simple and self-contained (e.g., FAQ pages where each chunk is a complete answer)
  • Ingestion speed is more important than search quality
  • Your corpus is very small (fewer than 10 documents) -- the quality improvement may not be noticeable
  • You're on the Community or Pro tier (enrichment requires Team)

Availability

The RLM Enrichment Pipeline is available on the Team tier only. It requires a Daytona API key for sandbox execution.

| Feature | Community | Pro | Team |
| --- | --- | --- | --- |
| Pre-chunk structuring | - | - | Yes |
| Guided chunking (section hints) | - | - | Yes |
| Chunk-level enrichment | - | - | Yes |
| Enrichment-aware embeddings | - | - | Yes |
| Enrichment-aware BM25 | - | - | Yes |
| Enrichment-aware reranking | - | - | Yes |