File Ingestion
Process 25+ file types including PDFs, images, audio, and video with automatic text extraction and AI analysis.
Overview
Memcity can process almost any file you throw at it. Upload a PDF, a PowerPoint presentation, a photo of a whiteboard, or even a video recording — Memcity extracts the text, chunks it, embeds it, and makes it searchable through the same RAG pipeline used for plain text.
The processing is automatic. You upload a file, tell Memcity to process it, and minutes later it's searchable via natural language queries.
Supported File Types
Documents
| Format | Extensions | How It's Processed |
|---|---|---|
| PDF | .pdf | Text extracted via Jina Reader. If text extraction fails (scanned PDFs), falls back to the AI gateway model for OCR. |
| Word | .docx, .doc | Processed through the AI gateway model which reads the document structure, headings, tables, and content. |
Plain Text
| Format | Extensions | How It's Processed |
|---|---|---|
| Text | .txt | Direct text extraction — no AI needed. |
| Markdown | .md | Direct extraction. Heading structure is preserved for citations. |
| HTML | .html | HTML tags are stripped, text content is extracted. |
| JSON | .json | Stringified and chunked. Useful for API responses, config files. |
| CSV | .csv | Parsed row by row. Each row or group of rows becomes a chunk. |
Spreadsheets
| Format | Extensions | How It's Processed |
|---|---|---|
| Excel | .xlsx, .xls | The AI gateway model reads each sheet, extracting tabular data with headers and cell values. Formulas are evaluated to their results. |
Presentations
| Format | Extensions | How It's Processed |
|---|---|---|
| PowerPoint | .pptx | Each slide is processed individually. The AI model reads slide titles, bullet points, and any text in shapes. Speaker notes are included. |
Images
| Format | Extensions | How It's Processed |
|---|---|---|
| Images | .png, .jpg, .webp, .gif, .heic, .heif | The AI gateway model's vision capability performs OCR (text extraction from images) AND generates a description of the visual content. A whiteboard photo gets both the text on the board and a description of the diagrams. |
Audio
| Format | Extensions | How It's Processed |
|---|---|---|
| Audio | .mp3, .wav, .m4a, .ogg, .flac, .aac, .webm | The AI gateway model transcribes the audio to text. Speaker diarization (who said what) is included when the model supports it. |
Video
| Format | Extensions | How It's Processed |
|---|---|---|
| Video | .mp4, .webm, .mov, .avi, .mkv | The AI gateway model extracts both the audio transcript and visual content descriptions. For a presentation recording, you get the spoken words plus the slide content. |
File Upload Flow
Processing a file takes three steps: generate an upload URL, upload the file, then trigger processing. Here's the complete flow:
Step 1: Generate an Upload URL
Convex uses presigned URLs for file uploads. This gives you a temporary URL that your frontend can upload directly to — no need to proxy through your server.
```typescript
// convex/files.ts
import { action } from "./_generated/server";
import { v } from "convex/values"; // used by processFile below
import { Memory } from "./memcity/client";
import { components } from "./_generated/api";

const memory = new Memory(components.memory, {
  tier: "pro",
  ai: { gateway: "openrouter", model: "google/gemini-2.0-flash-001" },
});

export const getUploadUrl = action({
  args: {},
  handler: async (ctx) => {
    // Returns a temporary URL valid for ~1 hour
    const uploadUrl = await memory.getUploadUrl(ctx);
    return uploadUrl;
  },
});
```
Step 2: Upload the File from Your Frontend
```typescript
// In your React component
async function handleFileUpload(file: File) {
  // Get the presigned upload URL from Convex
  const uploadUrl = await getUploadUrl();

  // Upload directly to Convex storage
  const response = await fetch(uploadUrl, {
    method: "POST",
    headers: { "Content-Type": file.type },
    body: file,
  });
  const { storageId } = await response.json();
  return storageId;
}
```
Step 3: Process the Uploaded File
```typescript
// convex/files.ts (continued)
export const processFile = action({
  args: {
    storageId: v.id("_storage"),
    fileName: v.string(),
    orgId: v.string(),
    knowledgeBaseId: v.string(),
  },
  handler: async (ctx, args) => {
    // This triggers the full processing pipeline:
    // 1. Detect file type from extension/MIME type
    // 2. Extract text using the appropriate processor
    // 3. Chunk the extracted text
    // 4. Generate embeddings for each chunk
    // 5. Index for search
    // 6. Extract entities and relationships
    const result = await memory.processUploadedFile(ctx, {
      orgId: args.orgId,
      knowledgeBaseId: args.knowledgeBaseId,
      storageId: args.storageId,
      fileName: args.fileName,
    });
    return result;
    // { success: true, chunkCount: 47, documentId: "..." }
  },
});
```
Complete Frontend Example
Putting it all together with a drag-and-drop upload:
```tsx
import { useState } from "react";
import { useAction } from "convex/react";
import { api } from "../convex/_generated/api"; // adjust to your project layout

function FileUploader({ orgId, kbId }: { orgId: string; kbId: string }) {
  const [uploading, setUploading] = useState(false);
  const getUploadUrl = useAction(api.files.getUploadUrl);
  const processFile = useAction(api.files.processFile);

  async function onDrop(files: File[]) {
    setUploading(true);
    for (const file of files) {
      // 1. Get upload URL
      const uploadUrl = await getUploadUrl();
      // 2. Upload file
      const res = await fetch(uploadUrl, {
        method: "POST",
        headers: { "Content-Type": file.type },
        body: file,
      });
      const { storageId } = await res.json();
      // 3. Process file
      await processFile({
        storageId,
        fileName: file.name,
        orgId,
        knowledgeBaseId: kbId,
      });
      console.log(`Processed ${file.name}`);
    }
    setUploading(false);
  }

  return (
    <div
      // onDragOver must prevent the default or the drop event never fires
      onDragOver={(e) => e.preventDefault()}
      onDrop={(e) => {
        e.preventDefault();
        onDrop(Array.from(e.dataTransfer.files));
      }}
    >
      {uploading ? "Processing..." : "Drop files here"}
    </div>
  );
}
```
Batch Ingestion
For multiple text documents, use batchIngest to process them all in one call:
```typescript
await memory.batchIngest(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  documents: [
    { text: "First document content...", source: "doc1.md" },
    { text: "Second document content...", source: "doc2.md" },
    { text: "Third document content...", source: "doc3.md" },
  ],
});
```
This is more efficient than calling ingestText in a loop because it batches the embedding API calls.
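For very large corpora, one pattern is to split the document list into smaller groups so no single call runs too long. This is a sketch: the group size of 25 is an arbitrary choice, not a documented Memcity limit.
```typescript
// Split a large document list into groups of 25 per batchIngest call.
// The group size is arbitrary, not a documented limit; tune it to keep
// each action's runtime bounded.
const BATCH_SIZE = 25;

for (let i = 0; i < documents.length; i += BATCH_SIZE) {
  await memory.batchIngest(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    documents: documents.slice(i, i + BATCH_SIZE),
  });
}
```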
URL Ingestion
Ingest web pages directly from a URL:
```typescript
await memory.ingestUrl(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  url: "https://docs.example.com/getting-started",
});
```
SSRF Protection: Memcity validates URLs before fetching to prevent Server-Side Request Forgery attacks. Internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x) and private hostnames are blocked.
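To give a sense of what this validation covers, here's a simplified sketch of an SSRF guard. It is illustrative only: isPrivateUrl is a hypothetical helper, and Memcity's actual checks are internal and may be more thorough.
```typescript
// Simplified sketch of an SSRF guard, illustrative only; Memcity's
// real validation is internal and may differ.
function isPrivateUrl(rawUrl: string): boolean {
  const { hostname, protocol } = new URL(rawUrl);
  if (protocol !== "http:" && protocol !== "https:") return true;
  // Block loopback, link-local, and RFC 1918 private ranges
  const privatePatterns = [
    /^localhost$/i,
    /^127\./,
    /^10\./,
    /^192\.168\./,
    /^172\.(1[6-9]|2\d|3[01])\./,
    /^169\.254\./,
  ];
  return privatePatterns.some((p) => p.test(hostname));
}
```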
What Happens Inside the Processing Pipeline
When you process a PDF, here's what happens step by step:
1. File detection — Memcity reads the file extension and MIME type to determine the processor.
2. Text extraction — For PDF, Jina Reader extracts the text content. If it fails (e.g., the PDF is a scanned image), the AI gateway model performs OCR.
3. Text cleaning — Extra whitespace, headers/footers, and formatting artifacts are removed.
4. Pre-chunk structuring (Team, optional) — The full document is analyzed by the RLM enrichment pipeline to extract entities, relationships, section boundaries, and an optional normalized text. Section hints guide the chunker in step 5.
5. Chunking — The cleaned (or normalized) text is split into chunks using the configured strategy (recursive by default, ~512 tokens each; see the sketch after this list). If RLM section hints are available, chunk boundaries align with logical sections.
6. Chunk-level enrichment (Team, optional) — Each chunk is analyzed by the RLM enrichment pipeline to produce summaries, key terms, resolved text (pronouns replaced with canonical names), and entity extraction. This runs before embedding so enrichment data is baked into vectors.
7. Embedding — Each chunk is sent to Gemini to generate a 1,024-dimensional vector embedding. If enrichment data is available, it's included in the embedding input for richer semantic representation.
8. Indexing — Vectors are stored in Convex's vector index. Raw text (or enriched text) is indexed for BM25 keyword search.
9. Entity extraction (Pro+) — The LLM identifies entities and relationships in each chunk, adding them to the knowledge graph.
10. Metadata — Source file name, page numbers, and heading structure are stored for citation generation.
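To make the chunking step concrete, here's a minimal sketch of a paragraph-based splitter with a ~512-token target. It is illustrative only: countTokens is a hypothetical stand-in for a real tokenizer, and Memcity's actual chunker is internal and configurable.
```typescript
// Minimal sketch of the chunking step, not Memcity's implementation.
// countTokens is a hypothetical stand-in for a real tokenizer,
// using the rough heuristic of ~4 characters per token.
const countTokens = (s: string) => Math.ceil(s.length / 4);

function chunkText(text: string, maxTokens = 512): string[] {
  const chunks: string[] = [];
  let current = "";
  // Split on paragraphs first; a real recursive splitter falls back
  // to sentences and words when a single paragraph is oversized.
  for (const para of text.split("\n\n")) {
    const candidate = current ? current + "\n\n" + para : para;
    if (countTokens(candidate) > maxTokens && current) {
      chunks.push(current);
      current = para;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```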
File Size Limits
- Maximum file size: 100MB per file
- Maximum files per batch: No hard limit, but processing is sequential within a batch
- Recommended: For very large files (over 50MB), consider splitting them into smaller pieces before upload
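A simple client-side guard can catch oversized files before the upload starts. This sketch just applies the 100MB limit above; validateFileSize is a hypothetical helper, not part of the Memcity API.
```typescript
// Reject files over the 100MB limit before uploading
const MAX_FILE_SIZE = 100 * 1024 * 1024; // 100MB

function validateFileSize(file: File): void {
  if (file.size > MAX_FILE_SIZE) {
    throw new Error(
      `${file.name} is ${(file.size / 1024 / 1024).toFixed(1)}MB, over the 100MB limit`
    );
  }
}
```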
Error Handling
File processing can fail for various reasons. Always handle errors:
```typescript
try {
  const result = await memory.processUploadedFile(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    storageId,
    fileName: "report.pdf",
  });
  console.log(`Processed: ${result.chunkCount} chunks created`);
} catch (error) {
  // In strict TypeScript the catch variable is `unknown`, so narrow first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Unsupported file type")) {
    // File format not supported
  } else if (message.includes("File too large")) {
    // Over 100MB limit
  } else if (message.includes("Text extraction failed")) {
    // Could not extract text — corrupted file?
  } else {
    // Unexpected error
    throw error;
  }
}
```
Availability
| Feature | Community | Pro | Team |
|---|---|---|---|
| Text ingestion (ingestText) | Yes | Yes | Yes |
| URL ingestion (ingestUrl) | - | Yes | Yes |
| File upload + processing | - | Yes | Yes |
| Batch ingestion | - | Yes | Yes |
| All 25+ file types | - | Yes | Yes |