# File Ingestion

Process 25+ file types including PDFs, images, audio, and video with automatic text extraction and AI analysis.

## Overview

Memcity can process almost any file you throw at it. Upload a PDF, a PowerPoint presentation, a photo of a whiteboard, or even a video recording — Memcity extracts the text, chunks it, embeds it, and makes it searchable through the same RAG pipeline used for plain text.

The processing is automatic. You upload a file, tell Memcity to process it, and minutes later it's searchable via natural language queries.

## Supported File Types

### Documents

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| PDF | `.pdf` | Text extracted via Jina Reader. If text extraction fails (scanned PDFs), falls back to the AI gateway model for OCR. |
| Word | `.docx`, `.doc` | Processed through the AI gateway model, which reads the document structure, headings, tables, and content. |

### Plain Text

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| Text | `.txt` | Direct text extraction — no AI needed. |
| Markdown | `.md` | Direct extraction. Heading structure is preserved for citations. |
| HTML | `.html` | HTML tags are stripped; text content is extracted. |
| JSON | `.json` | Stringified and chunked. Useful for API responses and config files. |
| CSV | `.csv` | Parsed row by row. Each row or group of rows becomes a chunk (see the sketch below). |
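
To make the CSV behavior concrete, here's a minimal sketch of row-group chunking. This is illustrative, not memcity's internal code: the `rowsPerChunk` value and the naive newline split are assumptions (a real parser must handle quoted fields).

```ts
// Illustrative sketch of row-group chunking for CSV, not memcity's internal code.
// Repeats the header in every chunk so each chunk is self-describing
// when it is embedded and later retrieved on its own.
function chunkCsv(csv: string, rowsPerChunk = 20): string[] {
  const [header, ...rows] = csv.trim().split("\n");
  const chunks: string[] = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    chunks.push([header, ...rows.slice(i, i + rowsPerChunk)].join("\n"));
  }
  return chunks;
}
```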

### Spreadsheets

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| Excel | `.xlsx`, `.xls` | The AI gateway model reads each sheet, extracting tabular data with headers and cell values. Formulas are evaluated to their results. |

### Presentations

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| PowerPoint | `.pptx` | Each slide is processed individually. The AI model reads slide titles, bullet points, and any text in shapes. Speaker notes are included. |

### Images

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| Images | `.png`, `.jpg`, `.webp`, `.gif`, `.heic`, `.heif` | The AI gateway model's vision capability performs OCR (text extraction from images) and generates a description of the visual content. A whiteboard photo gets both the text on the board and a description of the diagrams. |

### Audio

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| Audio | `.mp3`, `.wav`, `.m4a`, `.ogg`, `.flac`, `.aac`, `.webm` | The AI gateway model transcribes the audio to text. Speaker diarization (who said what) is included when the model supports it. |

### Video

| Format | Extensions | How It's Processed |
| --- | --- | --- |
| Video | `.mp4`, `.webm`, `.mov`, `.avi`, `.mkv` | The AI gateway model extracts both the audio transcript and visual content descriptions. For a presentation recording, you get the spoken words plus the slide content. |

## File Upload Flow

Processing a file takes three steps: generate an upload URL, upload the file, then trigger processing. Here's the complete flow:

### Step 1: Generate an Upload URL

Convex uses presigned URLs for file uploads. This gives you a temporary URL that your frontend can upload directly to — no need to proxy through your server.

```ts
// convex/files.ts
import { action } from "./_generated/server";
import { Memory } from "memcity";
import { components } from "./_generated/api";
 
const memory = new Memory(components.memcity, {
  tier: "pro",
  ai: { gateway: "openrouter", model: "google/gemini-2.0-flash-001" },
});
 
export const getUploadUrl = action({
  args: {},
  handler: async (ctx) => {
    // Returns a temporary URL valid for ~1 hour
    const uploadUrl = await memory.getUploadUrl(ctx);
    return uploadUrl;
  },
});
```

### Step 2: Upload the File from Your Frontend

```ts
// In your React component
import { useAction } from "convex/react";
import { api } from "../convex/_generated/api";

// Inside the component, wire up the Convex action:
const getUploadUrl = useAction(api.files.getUploadUrl);

async function handleFileUpload(file: File) {
  // Get the presigned upload URL from Convex
  const uploadUrl = await getUploadUrl();

  // Upload directly to Convex storage
  const response = await fetch(uploadUrl, {
    method: "POST",
    headers: { "Content-Type": file.type },
    body: file,
  });

  const { storageId } = await response.json();
  return storageId;
}
```

### Step 3: Process the Uploaded File

```ts
// convex/files.ts
import { v } from "convex/values";

export const processFile = action({
  args: {
    storageId: v.id("_storage"),
    fileName: v.string(),
    orgId: v.string(),
    knowledgeBaseId: v.string(),
  },
  handler: async (ctx, args) => {
    // This triggers the full processing pipeline:
    // 1. Detect file type from extension/MIME type
    // 2. Extract text using the appropriate processor
    // 3. Chunk the extracted text
    // 4. Generate embeddings for each chunk
    // 5. Index for search
    // 6. Extract entities and relationships
    const result = await memory.processUploadedFile(ctx, {
      orgId: args.orgId,
      knowledgeBaseId: args.knowledgeBaseId,
      storageId: args.storageId,
      fileName: args.fileName,
    });

    return result;
    // { success: true, chunkCount: 47, documentId: "..." }
  },
});
```

### Complete Frontend Example

Putting it all together with a drag-and-drop upload:

```tsx
import { useState } from "react";
import { useAction } from "convex/react";
import { api } from "../convex/_generated/api";

function FileUploader({ orgId, kbId }: { orgId: string; kbId: string }) {
  const getUploadUrl = useAction(api.files.getUploadUrl);
  const processFile = useAction(api.files.processFile);
  const [uploading, setUploading] = useState(false);

  async function onDrop(files: File[]) {
    setUploading(true);

    for (const file of files) {
      // 1. Get upload URL
      const uploadUrl = await getUploadUrl();

      // 2. Upload file
      const res = await fetch(uploadUrl, {
        method: "POST",
        headers: { "Content-Type": file.type },
        body: file,
      });
      const { storageId } = await res.json();

      // 3. Process file
      await processFile({
        storageId,
        fileName: file.name,
        orgId,
        knowledgeBaseId: kbId,
      });

      console.log(`Processed ${file.name}`);
    }

    setUploading(false);
  }

  return (
    // onDragOver must call preventDefault, or the drop event never fires
    <div
      onDragOver={(e) => e.preventDefault()}
      onDrop={(e) => {
        e.preventDefault();
        onDrop(Array.from(e.dataTransfer.files));
      }}
    >
      {uploading ? "Processing..." : "Drop files here"}
    </div>
  );
}
```

## Batch Ingestion

For multiple text documents, use `batchIngest` to process them all in one call:

```ts
await memory.batchIngest(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  documents: [
    { text: "First document content...", source: "doc1.md" },
    { text: "Second document content...", source: "doc2.md" },
    { text: "Third document content...", source: "doc3.md" },
  ],
});
```

This is more efficient than calling `ingestText` in a loop because it batches the embedding API calls.
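
For contrast, here's the loop this replaces. The sketch assumes a `documents` array like the one above and that `ingestText` accepts the same per-document fields (`text`, `source`); each iteration pays its own embedding round trip.

```ts
// Works, but makes one embedding API call per document.
for (const doc of documents) {
  await memory.ingestText(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    text: doc.text,
    source: doc.source,
  });
}
```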

## URL Ingestion

Ingest web pages directly from a URL:

```ts
await memory.ingestUrl(ctx, {
  orgId,
  knowledgeBaseId: kbId,
  url: "https://docs.example.com/getting-started",
});
```

**SSRF Protection:** Memcity validates URLs before fetching to prevent Server-Side Request Forgery attacks. Internal IPs (127.0.0.1, 10.x.x.x, 192.168.x.x) and private hostnames are blocked.
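
To make the blocklist concrete, a hypothetical validation along these lines rejects loopback, RFC 1918 private, and link-local addresses before any fetch happens. This is illustrative only, not memcity's actual implementation:

```ts
// Illustrative SSRF guard, not memcity's actual code.
function isBlockedUrl(raw: string): boolean {
  const { hostname } = new URL(raw);
  if (hostname === "localhost" || hostname.endsWith(".internal")) return true;
  return (
    /^127\./.test(hostname) || // loopback
    /^10\./.test(hostname) || // RFC 1918 private range
    /^192\.168\./.test(hostname) || // RFC 1918 private range
    /^172\.(1[6-9]|2\d|3[01])\./.test(hostname) || // RFC 1918 private range
    /^169\.254\./.test(hostname) // link-local
  );
}
```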

## What Happens Inside the Processing Pipeline

When you process a PDF, here's what happens step by step:

1. File detection — Memcity reads the file extension and MIME type to determine the processor.
2. Text extraction — For PDF, Jina Reader extracts the text content. If it fails (e.g., the PDF is a scanned image), the AI gateway model performs OCR.
3. Text cleaning — Extra whitespace, headers/footers, and formatting artifacts are removed.
4. Chunking — The cleaned text is split into chunks using the configured strategy (recursive by default, ~512 tokens each); see the sketch after this list.
5. Embedding — Each chunk is sent to Jina v4 to generate a 1,024-dimensional vector embedding.
6. Indexing — Vectors are stored in Convex's vector index. Raw text is indexed for BM25 keyword search.
7. Entity extraction (Pro+) — The LLM identifies entities and relationships in each chunk, adding them to the knowledge graph.
8. Metadata — Source file name, page numbers, and heading structure are stored for citation generation.
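
As a rough illustration of step 4, a recursive splitter walks down a separator hierarchy (paragraphs, then lines, then sentences, then words) until every piece fits the target size. The sketch below is not memcity's implementation; it approximates the ~512-token budget as ~2,048 characters, assuming roughly 4 characters per token.

```ts
// Illustrative recursive chunker, not memcity's internal implementation.
const SEPARATORS = ["\n\n", "\n", ". ", " "];

function recursiveChunk(text: string, maxChars = 2048, depth = 0): string[] {
  if (text.length <= maxChars) return [text];
  const sep = SEPARATORS[depth];
  if (sep === undefined) {
    // No separators left: hard-split as a last resort.
    const out: string[] = [];
    for (let i = 0; i < text.length; i += maxChars) out.push(text.slice(i, i + maxChars));
    return out;
  }
  // Split on the current separator, then greedily merge pieces back
  // together while they stay under the limit.
  const chunks: string[] = [];
  let current = "";
  for (const part of text.split(sep)) {
    const candidate = current ? current + sep + part : part;
    if (candidate.length > maxChars && current) {
      chunks.push(...recursiveChunk(current, maxChars, depth + 1));
      current = part;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(...recursiveChunk(current, maxChars, depth + 1));
  return chunks;
}
```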

## File Size Limits

- Maximum file size: 100MB per file (a client-side guard is sketched below)
- Maximum files per batch: No hard limit, but processing is sequential within a batch
- Recommended: For very large files (over 50MB), consider splitting them into smaller pieces before upload
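
Since an oversized upload will fail anyway, it's cheap to guard on the client before requesting an upload URL. A minimal check against the 100MB limit above:

```ts
// Client-side guard against the 100MB per-file limit.
const MAX_FILE_BYTES = 100 * 1024 * 1024;

function assertUploadable(file: File) {
  if (file.size > MAX_FILE_BYTES) {
    const mb = (file.size / 1024 / 1024).toFixed(1);
    throw new Error(`${file.name} is ${mb}MB, over the 100MB limit. Split it before uploading.`);
  }
}
```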

## Error Handling

File processing can fail for various reasons. Always handle errors:

```ts
try {
  const result = await memory.processUploadedFile(ctx, {
    orgId,
    knowledgeBaseId: kbId,
    storageId,
    fileName: "report.pdf",
  });
  console.log(`Processed: ${result.chunkCount} chunks created`);
} catch (error) {
  // `error` is `unknown` in TypeScript, so narrow before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Unsupported file type")) {
    // File format not supported
  } else if (message.includes("File too large")) {
    // Over the 100MB limit
  } else if (message.includes("Text extraction failed")) {
    // Could not extract text — corrupted file?
  } else {
    // Unexpected error
    throw error;
  }
}
```

## Availability

| Feature | Community | Pro | Team |
| --- | --- | --- | --- |
| Text ingestion (`ingestText`) | Yes | Yes | Yes |
| URL ingestion (`ingestUrl`) | - | Yes | Yes |
| File upload + processing | - | Yes | Yes |
| Batch ingestion | - | Yes | Yes |
| All 25+ file types | - | Yes | Yes |