File Ingestion
Process 25+ file types including PDFs, images, audio, and video with automatic text extraction and AI analysis.
Overview
Memcity can process almost any file you throw at it. Upload a PDF, a PowerPoint presentation, a photo of a whiteboard, or even a video recording — Memcity extracts the text, chunks it, embeds it, and makes it searchable through the same RAG pipeline used for plain text.
The processing is automatic. You upload a file, tell Memcity to process it, and minutes later it's searchable via natural language queries.
Supported File Types
Documents
| Format | Extensions | How It's Processed |
|---|---|---|
| PDF | .pdf | Text extracted via Jina Reader. If text extraction fails (scanned PDFs), falls back to the AI gateway model for OCR. |
| Word | .docx, .doc | Processed through the AI gateway model which reads the document structure, headings, tables, and content. |
Plain Text
| Format | Extensions | How It's Processed |
|---|---|---|
| Text | .txt | Direct text extraction — no AI needed. |
| Markdown | .md | Direct extraction. Heading structure is preserved for citations. |
| HTML | .html | HTML tags are stripped, text content is extracted. |
| JSON | .json | Stringified and chunked. Useful for API responses, config files. |
| CSV | .csv | Parsed row by row. Each row or group of rows becomes a chunk. |
Spreadsheets
| Format | Extensions | How It's Processed |
|---|---|---|
| Excel | .xlsx, .xls | The AI gateway model reads each sheet, extracting tabular data with headers and cell values. Formulas are evaluated to their results. |
Presentations
| Format | Extensions | How It's Processed |
|---|---|---|
| PowerPoint | .pptx | Each slide is processed individually. The AI model reads slide titles, bullet points, and any text in shapes. Speaker notes are included. |
Images
| Format | Extensions | How It's Processed |
|---|---|---|
| Images | .png, .jpg, .webp, .gif, .heic, .heif | The AI gateway model's vision capability performs OCR (text extraction from images) AND generates a description of the visual content. A whiteboard photo gets both the text on the board and a description of the diagrams. |
Audio
| Format | Extensions | How It's Processed |
|---|---|---|
| Audio | .mp3, .wav, .m4a, .ogg, .flac, .aac, .webm | The AI gateway model transcribes the audio to text. Speaker diarization (who said what) is included when the model supports it. |
Video
| Format | Extensions | How It's Processed |
|---|---|---|
| Video | .mp4, .webm, .mov, .avi, .mkv | The AI gateway model extracts both the audio transcript and visual content descriptions. For a presentation recording, you get the spoken words plus the slide content. |
File Upload Flow
Processing a file takes three steps: generate an upload URL, upload the file, then trigger processing. Here's the complete flow:
Step 1: Generate an Upload URL
Convex uses presigned URLs for file uploads. This gives you a temporary URL that your frontend can upload directly to — no need to proxy through your server.
```typescript
// convex/files.ts
import { action } from "./_generated/server";
import { v } from "convex/values"; // used by processFile below
import { Memory } from "./memcity/client";
import { components } from "./_generated/api";

const memory = new Memory(components.memory, {
  tier: "pro",
  ai: { gateway: "openrouter", model: "google/gemini-2.0-flash-001" },
});

export const getUploadUrl = action({
  args: {},
  handler: async (ctx) => {
    // Returns a temporary URL valid for ~1 hour
    const uploadUrl = await memory.getUploadUrl(ctx);
    return uploadUrl;
  },
});
```
Step 2: Upload the File from Your Frontend
```typescript
// In your React component
async function handleFileUpload(file: File) {
  // Get the presigned upload URL from Convex
  const uploadUrl = await getUploadUrl();

  // Upload directly to Convex storage
  const response = await fetch(uploadUrl, {
    method: "POST",
    headers: { "Content-Type": file.type },
    body: file,
  });
  const { storageId } = await response.json();
  return storageId;
}
```
Step 3: Process the Uploaded File
```typescript
// convex/files.ts (continued)
export const processFile = action({
  args: {
    storageId: v.id("_storage"),
    fileName: v.string(),
    orgId: v.string(),
    knowledgeBaseId: v.string(),
  },
  handler: async (ctx, args) => {
    // This triggers the full processing pipeline:
    // 1. Detect file type from extension/MIME type
    // 2. Extract text using the appropriate processor
    // 3. Chunk the extracted text
    // 4. Generate embeddings for each chunk
    // 5. Index for search
    // 6. Extract entities and relationships
    const result = await memory.processUploadedFile(ctx, {
      orgId: args.orgId,
      knowledgeBaseId: args.knowledgeBaseId,
      storageId: args.storageId,
      fileName: args.fileName,
    });
    return result;
    // { success: true, chunkCount: 47, documentId: "..." }
  },
});
```
Complete Frontend Example
Putting it all together with a drag-and-drop upload:
```tsx
import { useState } from "react";
import { useAction } from "convex/react";
import { api } from "../convex/_generated/api"; // adjust to your project layout

function FileUploader({ orgId, kbId }: { orgId: string; kbId: string }) {
  const [uploading, setUploading] = useState(false);
  const getUploadUrl = useAction(api.files.getUploadUrl);
  const processFile = useAction(api.files.processFile);

  async function onDrop(files: File[]) {
    setUploading(true);
    for (const file of files) {
      // 1. Get upload URL
      const uploadUrl = await getUploadUrl();
      // 2. Upload file
      const res = await fetch(uploadUrl, {
        method: "POST",
        headers: { "Content-Type": file.type },
        body: file,
      });
      const { storageId } = await res.json();
      // 3. Process file
      await processFile({
        storageId,
        fileName: file.name,
        orgId,
        knowledgeBaseId: kbId,
      });
      console.log(`Processed ${file.name}`);
    }
    setUploading(false);
  }

  return (
    <div
      // onDragOver must prevent the default or the drop event never fires
      onDragOver={(e) => e.preventDefault()}
      onDrop={(e) => {
        e.preventDefault();
        onDrop(Array.from(e.dataTransfer.files));
      }}
    >
      {uploading ? "Processing..." : "Drop files here"}
    </div>
  );
}
```
Batch Ingestion
For multiple text documents, use batchIngest to process them all in one call:
```typescript
await memory.batchIngest(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  documents: [
    { text: "First document content...", source: "doc1.md" },
    { text: "Second document content...", source: "doc2.md" },
    { text: "Third document content...", source: "doc3.md" },
  ],
});
```
This is more efficient than calling ingestText in a loop because it batches the embedding API calls.
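For very large corpora, one pattern is to split the document list into smaller groups so no single call runs too long. This is a sketch: the group size of 25 is an arbitrary choice, not a documented Memcity limit.
```typescript
// Split a large document list into groups of 25 per batchIngest call.
// The group size is arbitrary, not a documented limit; tune it to keep
// each action's runtime bounded.
const BATCH_SIZE = 25;

for (let i = 0; i < documents.length; i += BATCH_SIZE) {
  await memory.batchIngest(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    documents: documents.slice(i, i + BATCH_SIZE),
  });
}
```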
URL Ingestion
Ingest web pages directly from a URL:
```typescript
await memory.ingestUrl(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  url: "https://docs.example.com/getting-started",
});
```
SSRF Protection: Memcity validates URLs before fetching to prevent Server-Side Request Forgery attacks. Internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x) and private hostnames are blocked.
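To give a sense of what this validation covers, here's a simplified sketch of an SSRF guard. It is illustrative only: isPrivateUrl is a hypothetical helper, and Memcity's actual checks are internal and may be more thorough.
```typescript
// Simplified sketch of an SSRF guard, illustrative only; Memcity's
// real validation is internal and may differ.
function isPrivateUrl(rawUrl: string): boolean {
  const { hostname, protocol } = new URL(rawUrl);
  if (protocol !== "http:" && protocol !== "https:") return true;
  // Block loopback, link-local, and RFC 1918 private ranges
  const privatePatterns = [
    /^localhost$/i,
    /^127\./,
    /^10\./,
    /^192\.168\./,
    /^172\.(1[6-9]|2\d|3[01])\./,
    /^169\.254\./,
  ];
  return privatePatterns.some((p) => p.test(hostname));
}
```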
What Happens Inside the Processing Pipeline
When you process a PDF, here's what happens step by step:
1. File detection — Memcity reads the file extension and MIME type to determine the processor.
2. Text extraction — For PDF, Jina Reader extracts the text content. If it fails (e.g., the PDF is a scanned image), the AI gateway model performs OCR.
3. Text cleaning — Extra whitespace, headers/footers, and formatting artifacts are removed.
4. Pre-chunk structuring (Team, optional) — The full document is analyzed by the RLM enrichment pipeline to extract entities, relationships, section boundaries, and an optional normalized text. Section hints guide the chunker in step 5.
5. Chunking — The cleaned (or normalized) text is split into chunks using the configured strategy (recursive by default, ~512 tokens each; see the sketch after this list). If RLM section hints are available, chunk boundaries align with logical sections.
6. Chunk-level enrichment (Team, optional) — Each chunk is analyzed by the RLM enrichment pipeline to produce summaries, key terms, resolved text (pronouns replaced with canonical names), and entity extraction. This runs before embedding so enrichment data is baked into vectors.
7. Embedding — Each chunk is sent to Gemini to generate a 1,024-dimensional vector embedding. If enrichment data is available, it's included in the embedding input for richer semantic representation.
8. Indexing — Vectors are stored in Convex's vector index. Raw text (or enriched text) is indexed for BM25 keyword search.
9. Entity extraction (Pro+) — The LLM identifies entities and relationships in each chunk, adding them to the knowledge graph.
10. Metadata — Source file name, page numbers, and heading structure are stored for citation generation.
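To make the chunking step concrete, here's a minimal sketch of a paragraph-based splitter with a ~512-token target. It is illustrative only: countTokens is a hypothetical stand-in for a real tokenizer, and Memcity's actual chunker is internal and configurable.
```typescript
// Minimal sketch of the chunking step, not Memcity's implementation.
// countTokens is a hypothetical stand-in for a real tokenizer,
// using the rough heuristic of ~4 characters per token.
const countTokens = (s: string) => Math.ceil(s.length / 4);

function chunkText(text: string, maxTokens = 512): string[] {
  const chunks: string[] = [];
  let current = "";
  // Split on paragraphs first; a real recursive splitter falls back
  // to sentences and words when a single paragraph is oversized.
  for (const para of text.split("\n\n")) {
    const candidate = current ? current + "\n\n" + para : para;
    if (countTokens(candidate) > maxTokens && current) {
      chunks.push(current);
      current = para;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```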
File Size Limits
- Maximum file size: 100MB per file
- Maximum files per batch: No hard limit, but processing is sequential within a batch
- Recommended: For very large files (over 50MB), consider splitting them into smaller pieces before upload
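A simple client-side guard can catch oversized files before the upload starts. This sketch just applies the 100MB limit above; validateFileSize is a hypothetical helper, not part of the Memcity API.
```typescript
// Reject files over the 100MB limit before uploading
const MAX_FILE_SIZE = 100 * 1024 * 1024; // 100MB

function validateFileSize(file: File): void {
  if (file.size > MAX_FILE_SIZE) {
    throw new Error(
      `${file.name} is ${(file.size / 1024 / 1024).toFixed(1)}MB, over the 100MB limit`
    );
  }
}
```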
Error Handling
File processing can fail for various reasons. Always handle errors:
```typescript
try {
  const result = await memory.processUploadedFile(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    storageId,
    fileName: "report.pdf",
  });
  console.log(`Processed: ${result.chunkCount} chunks created`);
} catch (error) {
  // In strict TypeScript the catch variable is `unknown`, so narrow first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Unsupported file type")) {
    // File format not supported
  } else if (message.includes("File too large")) {
    // Over 100MB limit
  } else if (message.includes("Text extraction failed")) {
    // Could not extract text — corrupted file?
  } else {
    // Unexpected error
    throw error;
  }
}
```
Availability
| Feature | Community | Pro | Team |
|---|---|---|---|
| Text ingestion (ingestText) | Yes | Yes | Yes |
| URL ingestion (ingestUrl) | - | Yes | Yes |
| File upload + processing | - | Yes | Yes |
| Batch ingestion | - | Yes | Yes |
| All 25+ file types | - | Yes | Yes |