Configuration

Every configuration option explained with examples, defaults, and tier requirements.

How Configuration Works

When you create a Memory instance, you pass a configuration object. Memcity deep-merges your config with sensible defaults — you only need to specify what you want to change.

ts
const memory = new Memory(components.memcity, {
  tier: "pro",
  ai: { gateway: "openrouter" },
  // Everything else uses defaults
});
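The merge is recursive: nested objects are combined key by key, so a partial `search` block overrides only the keys you set and keeps the rest of the defaults. A minimal sketch of that behavior (illustrative only, not Memcity's actual implementation):

```ts
// Illustrative deep merge: user values win, missing keys fall back to defaults.
function deepMerge(defaults: any, overrides: any): any {
  const out = { ...defaults };
  for (const key of Object.keys(overrides)) {
    const value = overrides[key];
    out[key] =
      value && typeof value === "object" && !Array.isArray(value)
        ? deepMerge(defaults?.[key] ?? {}, value) // recurse into nested objects
        : value;                                  // primitives replace outright
  }
  return out;
}

const defaultConfig = { search: { maxResults: 10, minScore: 0.1 } };
const merged = deepMerge(defaultConfig, { search: { maxResults: 5 } });
// merged is { search: { maxResults: 5, minScore: 0.1 } }
```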

Tier enforcement is automatic. If you're on the Community tier and try to enable a Pro feature, Memcity silently overrides it to the default value. No errors, no crashes — it just uses what your tier supports.

ts
// On Community tier, this config:
const memory = new Memory(components.memcity, {
  tier: "community",
  search: { reranking: true }, // Pro+ feature
});
// Behaves exactly like this:
const memory = new Memory(components.memcity, {
  tier: "community",
  search: { reranking: false }, // Silently overridden
});

The Full MemoryConfig Interface

Here's the complete TypeScript interface showing every option:

ts
interface MemoryConfig {
  tier: "community" | "pro" | "team";
 
  ai: {
    gateway: "openrouter" | "vercel";
    model: string;
  };
 
  search: {
    maxResults: number;
    minScore: number;
    weights: {
      semantic: number;
      bm25: number;
    };
    enableQueryRouting: boolean;      // Pro+
    enableQueryDecomposition: boolean; // Pro+
    enableHyde: boolean;              // Pro+
    reranking: boolean;               // Pro+
    maxQueryExpansions: number;       // Pro+
    maxChunkExpansions: number;       // Pro+
  };
 
  chunking: {
    strategy: "recursive" | "fixed";
    chunkSize: number;
    chunkOverlap: number;
  };
 
  graph: {                            // Pro+
    enabled: boolean;
    traversalStrategy: "breadth_first" | "best_first" | "hybrid";
    maxDepth: number;
    maxNodes: number;
  };
 
  enterprise: {                       // Team only
    acl: boolean;
    auditLog: boolean;
    quotas: boolean;
  };
}

AI Configuration

Gateway: OpenRouter vs Vercel

The ai.gateway option controls how Memcity accesses language models:

| | OpenRouter | Vercel AI Gateway |
|---|---|---|
| Setup | Set `OPENROUTER_API_KEY` env var | Uses Vercel's built-in credentials |
| Models | 200+ models from all providers | OpenAI, Anthropic, Google |
| Pricing | Pay-per-token via OpenRouter | Pay-per-token via Vercel |
| Best for | Most users, widest model selection | Vercel-deployed apps wanting simplicity |
| Fallbacks | Automatic model fallbacks | Limited fallback support |

ts
// OpenRouter (recommended for most users)
ai: {
  gateway: "openrouter",
  model: "google/gemini-2.0-flash-001",
}
 
// Vercel AI Gateway
ai: {
  gateway: "vercel",
  model: "gpt-4o-mini",
}

Model Selection

The ai.model option selects the model used for reasoning tasks — query routing, entity extraction, HyDE generation, query decomposition. It is not used for embeddings (those always use Jina v4).

| Model | Cost | Quality | Speed | Best For |
|---|---|---|---|---|
| google/gemini-2.0-flash-001 | Low | Good | Fast | Default choice, good balance |
| gpt-4o-mini | Low | Good | Fast | If you prefer OpenAI |
| anthropic/claude-3.5-haiku | Low | Good | Fast | If you prefer Anthropic |
| google/gemini-2.5-pro-preview | High | Excellent | Slow | Maximum quality entity extraction |
| anthropic/claude-sonnet-4 | High | Excellent | Medium | Complex reasoning tasks |

Recommendation: Start with google/gemini-2.0-flash-001. It's fast, cheap, and good enough for most use cases. Only upgrade if you need better entity extraction or query understanding.

Search Configuration

maxResults

How many results to return from a search. Default: 10.

When to change: If you're building a chat interface, 3-5 results is usually enough context. If you're building a search results page, 10-20 gives users more to browse.

ts
search: {
  maxResults: 5,  // For chat: fewer but more focused results
}

minScore

The minimum relevance score (0-1) a result must have to be included. Default: 0.1.

When to change: If you're getting too many low-quality results, raise this to 0.3 or 0.5. If you're getting too few results, lower it to 0.05.

ts
search: {
  minScore: 0.3,  // Only return results with a relevance score of 0.3 or higher
}

weights: semantic vs bm25

These control how much weight to give semantic (meaning-based) search vs BM25 (keyword-based) search. They must sum to 1.0. Default: 0.7 semantic, 0.3 BM25.

What's the difference?

  • Semantic search understands meaning. "How do I cancel my subscription?" matches "To terminate your plan, visit account settings" even though the words are different.
  • BM25 search matches keywords. "error code 4012" matches documents containing exactly "error code 4012". It's precise but doesn't understand synonyms.

| Use Case | Semantic | BM25 | Why |
|---|---|---|---|
| Natural language Q&A | 0.8 | 0.2 | Users ask in their own words |
| Technical documentation | 0.6 | 0.4 | Function names and codes matter |
| Code search | 0.3 | 0.7 | Exact identifiers are critical |
| Legal/compliance docs | 0.5 | 0.5 | Both exact terms and concepts matter |

ts
search: {
  weights: {
    semantic: 0.6,  // Understanding matters
    bm25: 0.4,      // But exact terms also matter
  },
}
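Conceptually, the combined relevance score is a weighted sum of the two component scores (a simplified sketch — Memcity's exact fusion step isn't shown here):

```ts
// Simplified hybrid scoring: a weighted sum of the two scores.
// Because the weights sum to 1.0, the result stays on the same 0-1 scale.
const weights = { semantic: 0.6, bm25: 0.4 };

function hybridScore(semanticScore: number, bm25Score: number): number {
  return weights.semantic * semanticScore + weights.bm25 * bm25Score;
}

hybridScore(0.9, 0.2); // strong semantic match, weak keyword match ≈ 0.62
```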

enableQueryRouting (Pro+)

When enabled, Memcity classifies each query as simple, moderate, or complex before processing. This determines which pipeline steps activate:

  • Simple queries ("What is X?") skip decomposition and HyDE — they're fast.
  • Moderate queries ("How does X compare to Y?") use query expansion but skip decomposition.
  • Complex queries ("What are the implications of X on Y and Z?") use the full pipeline.

Default: false. When to enable: When your users ask a mix of simple and complex questions and you want to optimize for both speed and quality.

ts
search: {
  enableQueryRouting: true,
}
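The classification itself is done by the LLM, but its effect on the pipeline can be sketched as a simple mapping (illustrative only — the step names here are placeholders, not Memcity internals):

```ts
// Illustrative mapping from query complexity to pipeline steps.
type Complexity = "simple" | "moderate" | "complex";

function pipelineFor(complexity: Complexity): string[] {
  switch (complexity) {
    case "simple":
      return ["search"];                              // fast path, no extras
    case "moderate":
      return ["expand", "search"];                    // expansion, no decomposition
    case "complex":
      return ["decompose", "expand", "hyde", "search"]; // full pipeline
  }
}
```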

enableQueryDecomposition (Pro+)

Breaks complex queries into simpler sub-queries that are searched independently, then results are merged.

Before decomposition:

"Compare the vacation policy with the sick leave policy and explain which is more generous"

After decomposition:

  1. "What is the vacation policy?"
  2. "What is the sick leave policy?"
  3. "How do vacation days compare to sick days in terms of quantity?"

Each sub-query gets its own search, and results are merged. This dramatically improves recall for complex questions.

Default: false. When to enable: When users ask multi-part or comparative questions.
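The merge step is conceptually a dedupe-and-best-score pass over the sub-query results (a hypothetical sketch — the result shape and merge rule here are assumptions, not Memcity's API):

```ts
// Hypothetical result shape; Memcity's actual types may differ.
interface Hit { chunkId: string; score: number; }

// Merge per-sub-query results: dedupe by chunk, keep each chunk's best score,
// and return the combined list sorted by score.
function mergeResults(perSubQuery: Hit[][]): Hit[] {
  const best = new Map<string, number>();
  for (const hits of perSubQuery) {
    for (const { chunkId, score } of hits) {
      best.set(chunkId, Math.max(score, best.get(chunkId) ?? 0));
    }
  }
  return [...best.entries()]
    .map(([chunkId, score]) => ({ chunkId, score }))
    .sort((a, b) => b.score - a.score);
}
```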

enableHyde (Pro+)

HyDE stands for Hypothetical Document Embeddings. Instead of just embedding the query, Memcity asks the LLM: "If a document existed that perfectly answered this question, what would it say?" Then it embeds that hypothetical answer and searches for real documents similar to it.

Why it works: Queries are short ("refund policy?") but answers are long and detailed. A hypothetical answer is more similar to the actual document than the short query is.

Example:

  • Query: "refund policy"
  • HyDE generates: "Our refund policy allows customers to return products within 30 days of purchase for a full refund. Items must be unused and in original packaging..."
  • This hypothetical text matches the real refund policy document much better than the two-word query would.

Default: false. When to enable: When users ask short questions about topics with detailed documentation.

ts
search: {
  enableHyde: true,
}

reranking (Pro+)

After the initial search retrieves candidates, a reranker (Jina Reranker v3) re-scores them using a cross-encoder model that looks at the query and each candidate together.

Why initial ranking isn't enough: The initial search uses separate embeddings for the query and documents. A reranker directly compares each pair, which is more accurate but slower (you can't rerank thousands of results, only the top candidates).

Think of it like a hiring process: the initial search is the resume screening (fast, approximate), and the reranker is the interview (slower, more accurate).

Default: false. When to enable: Almost always. This is the single most impactful quality improvement for most use cases. The latency cost (~100ms) is usually worth it.

ts
search: {
  reranking: true,
}

maxQueryExpansions (Pro+)

How many semantic variations of the query to generate. Default: 3.

Example: For the query "Python web frameworks", expansions might be:

  1. "Django Flask FastAPI web development Python"
  2. "Building web applications with Python"
  3. "Python HTTP server frameworks comparison"

More expansions improve recall but increase latency and cost. Range: 1-5.
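As with the other search options, you set this in the config:

```ts
search: {
  maxQueryExpansions: 2,  // Fewer expansions: lower latency and cost
}
```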

maxChunkExpansions (Pro+)

For top results, how many surrounding chunks to fetch for additional context. Default: 2.

Think of it like reading a book — if a sentence matches your query, you probably want to read the paragraph (or page) around it. Chunk expansion gives you that context.

ts
search: {
  maxChunkExpansions: 3, // Fetch 3 chunks before and after each result
}

Chunking Configuration

What is Chunking?

When you ingest a document, Memcity splits it into "chunks" — smaller pieces of text. Each chunk gets its own embedding and can be retrieved independently.

Why not just embed the whole document? Because embeddings work best on focused pieces of text. A 50-page document embedded as one vector loses detail. But a 512-token chunk about "refund policy" creates a precise, searchable vector.
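The sliding-window arithmetic behind chunkSize and chunkOverlap is simple: each chunk starts `chunkSize - chunkOverlap` tokens after the previous one. A sketch of fixed-size splitting (the recursive strategy additionally respects paragraph and sentence boundaries):

```ts
// Fixed-size chunking sketch: consecutive chunks share `chunkOverlap` tokens,
// so information at a boundary always appears whole in at least one chunk.
function chunkBounds(totalTokens: number, chunkSize = 512, chunkOverlap = 50) {
  const stride = chunkSize - chunkOverlap;
  const bounds: Array<[number, number]> = [];
  for (let start = 0; start < totalTokens; start += stride) {
    bounds.push([start, Math.min(start + chunkSize, totalTokens)]);
    if (start + chunkSize >= totalTokens) break; // last chunk reached the end
  }
  return bounds;
}

chunkBounds(1000); // [[0, 512], [462, 974], [924, 1000]]
```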

Strategy

| Strategy | Description | Tier |
|---|---|---|
| recursive | Splits on paragraph → sentence → word boundaries, preserving structure | All |
| fixed | Splits at a fixed token count regardless of structure | All |

Use recursive (the default) in almost all cases. It produces more natural chunks that respect paragraph boundaries.

ts
chunking: {
  strategy: "recursive",
  chunkSize: 512,       // Target tokens per chunk
  chunkOverlap: 50,     // Overlap between consecutive chunks
}

chunkSize and chunkOverlap

  • chunkSize (default: 512): How many tokens per chunk. Smaller chunks (256) are more precise but lose context. Larger chunks (1024) have more context but are less focused.
  • chunkOverlap (default: 50): How many tokens overlap between consecutive chunks. This prevents information at chunk boundaries from being lost.

| Content Type | Chunk Size | Overlap | Why |
|---|---|---|---|
| FAQ / short answers | 256 | 25 | Each Q&A pair should be one chunk |
| Technical docs | 512 | 50 | Good balance for most content |
| Long-form articles | 1024 | 100 | Preserve more narrative context |
| Legal documents | 512 | 100 | Higher overlap prevents clause splitting |

Graph Configuration (Pro+)

The knowledge graph automatically extracts entities and relationships from your documents. See Knowledge Graph for a deep dive.

traversalStrategy

How the graph is traversed when searching for related entities:

  • breadth_first — Explore all neighbors at each depth level before going deeper. Like exploring a building floor by floor.
  • best_first — Always follow the highest-scoring connection. Like a detective following the hottest lead.
  • hybrid (default) — BFS for the first hop, then best-first. Gets the best of both strategies.
ts
graph: {
  enabled: true,
  traversalStrategy: "hybrid",
  maxDepth: 3,     // How many hops to traverse (default: 3)
  maxNodes: 50,    // Max nodes to visit (default: 50)
}
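The maxDepth and maxNodes caps bound the traversal whichever strategy is chosen. A breadth-first sketch (illustrative only — the adjacency-map shape is an assumption, not Memcity's graph API):

```ts
// Illustrative BFS over an adjacency map, honoring maxDepth and maxNodes.
function traverse(
  graph: Map<string, string[]>, // entity -> related entities
  start: string,
  maxDepth = 3,
  maxNodes = 50,
): string[] {
  const visited = new Set<string>([start]);
  let frontier = [start];
  for (let depth = 0; depth < maxDepth && visited.size < maxNodes; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of graph.get(node) ?? []) {
        if (visited.size >= maxNodes) break; // hard cap on nodes visited
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next; // one hop deeper per iteration
  }
  return [...visited];
}
```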

Enterprise Configuration (Team)

These features are available on the Team tier only. Each has a dedicated documentation page:

ts
enterprise: {
  acl: true,        // Enable per-document access control
  auditLog: true,   // Enable immutable audit logging
  quotas: true,     // Enable usage quotas and rate limiting
}

Configuration Recipes

"Fast and Cheap" — Minimize Costs

Best for: prototypes, low-traffic apps, simple Q&A.

ts
const memory = new Memory(components.memcity, {
  tier: "community",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.0-flash-001",
  },
  search: {
    maxResults: 5,
    weights: { semantic: 0.7, bm25: 0.3 },
    // All advanced features disabled by default on Community
  },
  chunking: {
    strategy: "recursive",
    chunkSize: 512,
    chunkOverlap: 50,
  },
});

"Maximum Quality" — Best Possible Results

Best for: customer-facing search, support bots, enterprise apps.

ts
const memory = new Memory(components.memcity, {
  tier: "pro",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.5-pro-preview",
  },
  search: {
    maxResults: 10,
    minScore: 0.2,
    weights: { semantic: 0.7, bm25: 0.3 },
    enableQueryRouting: true,
    enableQueryDecomposition: true,
    enableHyde: true,
    reranking: true,
    maxQueryExpansions: 5,
    maxChunkExpansions: 3,
  },
  graph: {
    enabled: true,
    traversalStrategy: "hybrid",
    maxDepth: 3,
    maxNodes: 50,
  },
});

"Enterprise Secure" — Full Compliance

Best for: regulated industries, multi-tenant SaaS, enterprise deployments.

ts
const memory = new Memory(components.memcity, {
  tier: "team",
  ai: {
    gateway: "openrouter",
    model: "google/gemini-2.0-flash-001",
  },
  search: {
    maxResults: 10,
    enableQueryRouting: true,
    reranking: true,
  },
  enterprise: {
    acl: true,        // Per-document access control
    auditLog: true,   // Immutable operation logging
    quotas: true,     // Usage limits per organization
  },
});