Enterprise
Usage Quotas
Control costs and ensure fair usage with per-organization rate limiting and usage tracking.
What are Usage Quotas?
Usage quotas let you set limits on how much each organization (or user) can consume. Think of it like a cell phone plan: you get X searches per day, Y document ingestions per day, and Z bytes of storage. Go over a rate limit and requests are rejected until the period resets; go over a storage limit and they are rejected until you free up space.
Why You Need Them
Cost Control
Every search costs money — embedding generation, reranker calls, LLM calls for query routing. Without quotas, one runaway customer could generate a $500 API bill overnight.
Fair Usage
In a multi-tenant application, you want to ensure one customer's heavy usage doesn't degrade performance for others. Quotas enforce fair sharing.
Tiered Pricing
If you're building a SaaS with free, starter, and enterprise plans, quotas let you enforce plan limits: free users get 100 searches/day, starter gets 1,000, enterprise gets unlimited.
How It Works
Enabling Quotas
const memory = new Memory(components.memcity, {
  tier: "team",
  enterprise: {
    quotas: true, // Enable usage quotas
  },
});
Pipeline Integration
Quota checking is Step 1 of the 16-step search pipeline. Before Memcity does any expensive work (embedding, searching, reranking), it checks:
- Does this organization have quotas set?
- If yes, are they within their limits?
- If over quota, reject immediately with a clear error.
This means you never pay for a search that's going to be rejected — the check happens before any API calls.
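The ordering described above can be sketched as a guard that runs before any paid work. This is a minimal illustration, not Memcity's internal code: `Quotas`, `Usage`, `QuotaExceededError`, `assertSearchAllowed`, and `runExpensivePipeline` are all hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration; Memcity's internal types may differ.
interface Quotas {
  searchesPerDay?: number;
}

interface Usage {
  searchesToday: number;
}

class QuotaExceededError extends Error {
  constructor(limitName: string) {
    super(`Quota exceeded: ${limitName}`);
    this.name = "QuotaExceededError";
  }
}

// Step 1: reject before any paid work happens.
function assertSearchAllowed(quotas: Quotas | undefined, usage: Usage): void {
  // No quotas set for this organization: nothing to enforce.
  if (quotas === undefined || quotas.searchesPerDay === undefined) return;
  if (usage.searchesToday >= quotas.searchesPerDay) {
    throw new QuotaExceededError("searchesPerDay");
  }
}

// Stand-in for the expensive steps (embedding, searching, reranking).
let paidCalls = 0;
function runExpensivePipeline(query: string): string {
  paidCalls += 1;
  return `results for ${query}`;
}

function search(quotas: Quotas | undefined, usage: Usage, query: string): string {
  assertSearchAllowed(quotas, usage); // throws before paidCalls is touched
  return runExpensivePipeline(query);
}
```

Because the guard throws before `runExpensivePipeline` is called, a rejected search never increments `paidCalls`, which is the property the pipeline ordering is meant to guarantee.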
Setting Quotas
await memory.setQuota(ctx, {
  orgId,
  quotas: {
    // Rate limits (per time period)
    searchesPerDay: 1000, // Max searches per day
    ingestionsPerDay: 100, // Max document ingestions per day
    // Storage limits
    maxDocuments: 500, // Max documents in all KBs combined
    maxStorageBytes: 1073741824, // 1GB max storage
  },
});
What Can Be Limited
| Quota | Description | What Counts |
|---|---|---|
| searchesPerDay | Searches per 24-hour rolling period | Each getContext call |
| ingestionsPerDay | Ingestions per 24-hour rolling period | Each ingestText, ingestUrl, processUploadedFile call |
| maxDocuments | Total documents across all KBs | Total live (non-deleted) documents |
| maxStorageBytes | Total storage consumption | Chunks + embeddings + metadata |
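Since maxStorageBytes is given in raw bytes, small unit helpers can make quota definitions easier to read and review. The sketch below is only a convenience layer: `mib`, `gib`, and the `OrgQuotas` type are hypothetical names that mirror the four fields in the table, not part of Memcity's API. It assumes binary units (1 GB = 1024³ bytes), which matches the byte values used on this page.

```typescript
// Binary units: 1 KiB = 1024 bytes.
const KIB = 1024;
const MIB = 1024 * KIB;
const GIB = 1024 * MIB;

const mib = (n: number): number => n * MIB;
const gib = (n: number): number => n * GIB;

// Hypothetical type mirroring the four quota fields in the table above.
interface OrgQuotas {
  searchesPerDay: number;
  ingestionsPerDay: number;
  maxDocuments: number;
  maxStorageBytes: number;
}

const starterQuotas: OrgQuotas = {
  searchesPerDay: 1000,
  ingestionsPerDay: 100,
  maxDocuments: 500,
  maxStorageBytes: gib(1), // 1073741824 bytes
};
```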
Checking Quota Status
Before hitting the limit, you might want to show users their consumption:
const status = await memory.getQuotaStatus(ctx, {
  orgId,
});
console.log(status);
// {
//   searches: { used: 742, limit: 1000, remaining: 258 },
//   ingestions: { used: 45, limit: 100, remaining: 55 },
//   documents: { used: 312, limit: 500, remaining: 188 },
//   storage: { usedBytes: 524288000, limitBytes: 1073741824, remainingBytes: 549453824 },
//   periodResetsAt: "2024-03-16T00:00:00.000Z"
// }
You can use this to show a usage bar in your UI:
function UsageBar({ used, limit, label }: { used: number; limit: number; label: string }) {
  // Clamp to 100 so the bar never overflows its track when usage exceeds the limit
  const pct = Math.min(100, Math.round((used / limit) * 100));
  return (
    <div>
      <p>{label}: {used} / {limit} ({pct}%)</p>
      <div style={{ width: "100%", background: "#eee", height: 8 }}>
        <div style={{
          width: `${pct}%`,
          background: pct > 90 ? "red" : pct > 70 ? "orange" : "green",
          height: 8
        }} />
      </div>
    </div>
  );
}
Handling Quota Exceeded
When a quota is exceeded, Memcity throws a clear error:
try {
  const results = await memory.getContext(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    query: "...",
  });
} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow it before reading `message`
  if (error instanceof Error && error.message.includes("Quota exceeded")) {
    // Handle gracefully: show the user a message
    console.log("You've reached your daily search limit.");
    console.log("Upgrade your plan or wait for the period to reset.");
  }
}
Important: When a quota is exceeded, no partial processing occurs. The request is rejected at Step 1 before any API calls are made. You're never charged for a rejected request.
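If you check for quota errors in several places, the message test can live in one small type guard. This is a sketch under one assumption taken from the example above: that the thrown error's message contains "Quota exceeded". `isQuotaExceeded` and `withQuotaFallback` are hypothetical helpers, not Memcity exports.

```typescript
// Narrows an unknown catch value to an Error whose message signals a quota rejection.
// Assumes the message contains "Quota exceeded", as in the example above.
function isQuotaExceeded(error: unknown): error is Error {
  return error instanceof Error && error.message.includes("Quota exceeded");
}

// Runs `operation`; returns `fallback` if it was rejected for quota reasons,
// and rethrows any other error unchanged.
async function withQuotaFallback<T>(
  operation: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await operation();
  } catch (error) {
    if (isQuotaExceeded(error)) return fallback;
    throw error;
  }
}
```

A call such as `withQuotaFallback(() => memory.getContext(ctx, { orgId, knowledgeBaseId: kbId, query }), null)` would then let the caller render an "upgrade your plan" state when the result is `null`, while unrelated failures still surface as exceptions.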
Quota Exceeded for Ingestion
try {
  await memory.ingestText(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    text: "...",
    source: "doc.md",
  });
} catch (error) {
  // Narrow the unknown catch value before reading `message`
  if (error instanceof Error && error.message.includes("Quota exceeded")) {
    // Check which limit was hit; `<= 0` also covers a limit lowered below current usage
    const status = await memory.getQuotaStatus(ctx, { orgId });
    if (status.ingestions.remaining <= 0) {
      console.log("Daily ingestion limit reached.");
    } else if (status.documents.remaining <= 0) {
      console.log("Maximum document count reached.");
    } else if (status.storage.remainingBytes <= 0) {
      console.log("Storage limit reached.");
    }
  }
}
Adjusting Quotas
You can update quotas at any time:
// Upgrade a customer to a higher plan
await memory.setQuota(ctx, {
  orgId,
  quotas: {
    searchesPerDay: 5000, // was 1000
    ingestionsPerDay: 500, // was 100
    maxDocuments: 2000, // was 500
    maxStorageBytes: 5368709120, // 5GB, was 1GB
  },
});
Removing Quotas
Set quotas to null or to very high values (the example below uses Infinity) to effectively remove limits:
// Remove all quotas (unlimited usage)
await memory.setQuota(ctx, {
  orgId,
  quotas: {
    searchesPerDay: Infinity,
    ingestionsPerDay: Infinity,
    maxDocuments: Infinity,
    maxStorageBytes: Infinity,
  },
});
Period Reset
Rate limits (searchesPerDay, ingestionsPerDay) use a 24-hour rolling window. The counter resets automatically — no cron jobs or manual intervention needed.
Storage limits (maxDocuments, maxStorageBytes) are absolute — they don't reset. But they do decrease when documents are deleted.
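To make the idea of a rolling window concrete, here is one common way to count events over the last 24 hours: keep a timestamp per event and discard those that have aged out. This is an illustrative sketch only. `RollingWindowCounter` is a hypothetical class, and nothing here is confirmed to be how Memcity tracks usage internally.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

class RollingWindowCounter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = DAY_MS) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Drops events that fell out of the window, then returns how many remain.
  used(now: number): number {
    const cutoff = now - this.windowMs;
    for (;;) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || oldest > cutoff) break;
      this.timestamps.shift();
    }
    return this.timestamps.length;
  }

  // Records the event and returns true if under the limit; otherwise returns false.
  tryConsume(now: number): boolean {
    if (this.used(now) >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

With this approach capacity frees up gradually as individual events age past the window, so no scheduled job is needed to reset the counter. Storing one timestamp per event is the simplest form; bucketed counters trade some precision for less memory.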
Organizations Without Quotas
If an organization has no quotas set, all operations are allowed without limits. Quotas are opt-in per organization — you only need to set them for organizations you want to limit.
// Org A: has quotas → limited to 1000 searches/day
// Org B: no quotas set → unlimited
This lets you run a freemium model: set strict quotas for free-tier orgs, and leave enterprise orgs unlimited.
Example: SaaS Plan Enforcement
// When a customer signs up, set quotas based on their plan
async function onCustomerSignup(orgId: string, plan: "free" | "starter" | "enterprise") {
  const quotasByPlan = {
    free: {
      searchesPerDay: 100,
      ingestionsPerDay: 10,
      maxDocuments: 50,
      maxStorageBytes: 104857600, // 100MB
    },
    starter: {
      searchesPerDay: 1000,
      ingestionsPerDay: 100,
      maxDocuments: 500,
      maxStorageBytes: 1073741824, // 1GB
    },
    enterprise: {
      searchesPerDay: Infinity,
      ingestionsPerDay: Infinity,
      maxDocuments: Infinity,
      maxStorageBytes: Infinity,
    },
  };
  await memory.setQuota(ctx, {
    orgId,
    quotas: quotasByPlan[plan],
  });
}
Availability
Usage Quotas are available on the Team tier only ($179 one-time).