How we built a PDF toolkit that never sees your files

Karan Kacha

Building a PDF editor usually means spinning up a cluster of servers running Ghostscript, Poppler, or a commercial PDF SDK. When a user uploads a file, the server queues the job, processes it, and serves a download link.

We didn't want to build that. We wanted a toolkit that respects user privacy by design, one that never sees your files.

Here is how we did it.

The Power of WebAssembly (Wasm)

Historically, JavaScript in the browser was too slow and memory-constrained for heavy binary work such as parsing and rebuilding a 50 MB PDF.

Enter WebAssembly (Wasm).

WebAssembly allows us to compile high-performance libraries written in C, C++, or Rust and run them directly in the browser at near-native speeds.
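
To make the mechanism concrete, here is a minimal sketch of loading and calling a Wasm module from TypeScript. The module is a hand-assembled toy that exports a single `add` function; it stands in for a real compiled library and is not part of SAMAST. The names `wasmBytes`, `wasmModule`, `wasmInstance`, and `add` are chosen for illustration.

```typescript
// Hand-assembled bytes for a minimal Wasm module exporting add(a, b): i32.
// It is a toy stand-in for a real compiled C, C++, or Rust library.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local.get 0, local.get 1, i32.add, end
]);

// Synchronous compile and instantiate is acceptable for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);

// Exports are untyped at the JS/Wasm boundary, so the signature is asserted here.
const add = wasmInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 40)); // prints 42
```

A real module would be far larger and is normally fetched and compiled asynchronously, for example with `WebAssembly.instantiateStreaming`, rather than embedded as a byte array.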

For SAMAST, we use libraries such as pdf-lib (a pure TypeScript library) alongside optimized Wasm modules for the heavier lifting. Together they let us parse the PDF structure, extract pages, and generate new PDF documents entirely in the browser's memory, so the file is never uploaded.
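
As a rough illustration of working on a file purely in memory, the sketch below checks that a byte buffer looks like a PDF and reads its version from the header. `readPdfVersion` is a hypothetical helper written for this post, not SAMAST's code, and it assumes the `%PDF-` header sits at the very first byte.

```typescript
// Hypothetical helper: returns the version from a "%PDF-M.N" header,
// or null when the buffer does not start with a PDF header.
function readPdfVersion(bytes: Uint8Array): string | null {
  // Only the first few bytes are needed, so the whole file is never decoded.
  const headerLength = Math.min(bytes.length, 16);
  let header = "";
  for (let i = 0; i < headerLength; i++) {
    header += String.fromCharCode(bytes[i]);
  }
  const match = /^%PDF-(\d\.\d)/.exec(header);
  return match ? match[1] : null;
}

// In the browser, the bytes would come from a user-selected File, e.g.:
//   const bytes = new Uint8Array(await file.arrayBuffer());
//   const version = readPdfVersion(bytes);
```

Nothing here touches the network: the bytes come from `file.arrayBuffer()` and stay in the tab. A buffer that passes this check could then be handed to a parser, for example pdf-lib's `PDFDocument.load(bytes)`.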

Managing Browser Memory

The biggest challenge with client-side PDF processing is memory management. Browsers strictly limit how much RAM a single tab can consume. If a user tries to merge fifty 10 MB PDFs (roughly 500 MB of input before any parsing overhead), a naive implementation will crash the tab.

We address this in two ways:

  1. Incremental Parsing: We don't load the entire file into memory at once if we don't have to.
  2. Garbage Collection Optimization: We carefully manage references to large ArrayBuffers so the JavaScript engine (V8 in Chrome, for example) can reclaim them as soon as a page is processed.
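
The reference-management idea in the second point can be sketched as follows. This is a simplified illustration, not SAMAST's implementation: `LazyInput`, `Consume`, and `processSequentially` are hypothetical names, and the sketch assumes the consumer does not keep its own reference to the raw bytes.

```typescript
// A lazy source: bytes are only read when load() is called.
interface LazyInput {
  name: string;
  load: () => Promise<Uint8Array>;
}

// Hypothetical sink that handles one input at a time (for example, copying
// its pages into an output document) without retaining the raw bytes.
type Consume = (name: string, bytes: Uint8Array) => Promise<void>;

// Processes inputs one by one and returns the largest single input buffer
// that was held at any moment during the run.
async function processSequentially(
  inputs: LazyInput[],
  consume: Consume,
): Promise<number> {
  let peakBytesHeld = 0;
  for (const input of inputs) {
    // Only this iteration's buffer is reachable from this function.
    let bytes: Uint8Array | null = await input.load();
    peakBytesHeld = Math.max(peakBytesHeld, bytes.byteLength);
    await consume(input.name, bytes);
    // Drop the reference so the engine is free to reclaim the buffer
    // before the next input is loaded.
    bytes = null;
  }
  return peakBytesHeld;
}
```

In a browser, `load` could be `() => file.arrayBuffer().then((b) => new Uint8Array(b))`. The contrast is with something like `Promise.all(files.map((f) => f.arrayBuffer()))`, which keeps every input buffer alive at once. Dropping a reference only makes a buffer eligible for collection; when the memory is actually reclaimed is up to the engine.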
