Server-Side ZIP Extraction — Security and Performance
Users upload ZIPs for bulk imports, bundled assets, or backups. Extracting them on the server is mostly straightforward — until you hit zip-slip, zip-bombs, or OOM. This guide covers safe, performant extraction.
The zip-slip vulnerability
A malicious ZIP can contain entries with paths like ../../../etc/passwd. Naively extracting writes outside your target directory, potentially overwriting system files. Always normalize and validate paths:
import path from 'path';

function isSafePath(targetDir, entryName) {
  const resolved = path.resolve(targetDir, entryName);
  return resolved.startsWith(path.resolve(targetDir) + path.sep);
}

// Reject any entry that escapes targetDir
for (const entry of zip.files) {
  if (!isSafePath('/uploads', entry.name)) {
    throw new Error('Zip-slip attempted: ' + entry.name);
  }
}
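As a quick sanity check, the helper can be exercised standalone (duplicated here so the snippet runs on its own):

```javascript
import path from 'path';

// Same check as above: resolve the entry against the target
// directory and require the result to stay inside it.
function isSafePath(targetDir, entryName) {
  const resolved = path.resolve(targetDir, entryName);
  return resolved.startsWith(path.resolve(targetDir) + path.sep);
}

console.log(isSafePath('/uploads', 'images/logo.png'));     // true
console.log(isSafePath('/uploads', '../../../etc/passwd')); // false
console.log(isSafePath('/uploads', '/etc/passwd'));         // false — absolute entry names escape too
```

Note that `path.resolve` also neutralizes absolute entry names, which a plain `'..'` substring check would miss.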
Safe extraction in Node.js with yauzl
import fs from 'fs';
import path from 'path';
import yauzl from 'yauzl';

yauzl.open('upload.zip', { lazyEntries: true }, (err, zip) => {
  if (err) throw err;
  zip.readEntry();
  zip.on('entry', (entry) => {
    if (!isSafePath('/uploads', entry.fileName)) {
      return zip.readEntry(); // skip entries that escape the target dir
    }
    if (/\/$/.test(entry.fileName)) {
      // Directory entry: create it and move on
      fs.mkdirSync(path.join('/uploads', entry.fileName), { recursive: true });
      zip.readEntry();
    } else {
      zip.openReadStream(entry, (err, read) => {
        if (err) throw err;
        const target = path.join('/uploads', entry.fileName);
        // Some ZIPs omit directory entries, so ensure the parent exists
        fs.mkdirSync(path.dirname(target), { recursive: true });
        read.pipe(fs.createWriteStream(target)).on('finish', () => zip.readEntry());
      });
    }
  });
});
Protecting against zip-bombs
A 42 KB ZIP can expand to 4.5 petabytes when recursively extracted (the well-known 42.zip bomb). Always check the declared uncompressed size before writing:
const MAX_TOTAL = 1 * 1024 * 1024 * 1024; // 1 GB total uncompressed
const MAX_RATIO = 100; // reject anything compressed more than 100x

let totalUncompressed = 0;
for (const entry of zip.files) {
  totalUncompressed += entry.uncompressedSize;
  const ratio = entry.uncompressedSize / Math.max(entry.compressedSize, 1);
  if (totalUncompressed > MAX_TOTAL) throw new Error('Zip too large when extracted');
  if (ratio > MAX_RATIO) throw new Error('Suspicious compression ratio');
}
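The sizes declared in ZIP headers can be forged, so the cap should also be enforced on the bytes you actually decompress. A minimal sketch using Node's built-in Transform stream (the `ByteLimit` name is illustrative, not from any library) that errors out once a hard cap is exceeded:

```javascript
import { Transform } from 'stream';

// Counts bytes as they actually flow through the pipeline and
// aborts the stream once the cap is exceeded, regardless of what
// the ZIP headers claimed.
class ByteLimit extends Transform {
  constructor(maxBytes) {
    super();
    this.maxBytes = maxBytes;
    this.seen = 0;
  }
  _transform(chunk, _enc, cb) {
    this.seen += chunk.length;
    if (this.seen > this.maxBytes) {
      cb(new Error('Uncompressed output exceeded ' + this.maxBytes + ' bytes'));
    } else {
      cb(null, chunk);
    }
  }
}

// Usage sketch: readStream.pipe(new ByteLimit(MAX_TOTAL)).pipe(writeStream)
```

Insert one limiter per entry (or share one across the archive to enforce the total) between the read stream and the write stream; when it errors, destroy the pipeline and delete any partially written files.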
Python extraction
import zipfile, os

base = os.path.realpath('/uploads')
with zipfile.ZipFile('upload.zip') as zf:
    # Validate every member first, then extract once
    for member in zf.infolist():
        target = os.path.realpath(os.path.join(base, member.filename))
        if target != base and not target.startswith(base + os.sep):
            raise ValueError('Zip-slip: ' + member.filename)
    zf.extractall(base)
Streaming extraction for large ZIPs
For multi-GB archives, don't buffer the whole file in memory. Use a streaming parser that processes each entry as it arrives and pipes it to disk or to S3 (note that unzipper.Parse reads local file headers as it streams; to work from the central directory instead, use unzipper.Open):
import fs from 'fs';
import path from 'path';
import unzipper from 'unzipper';

fs.createReadStream('upload.zip')
  .pipe(unzipper.Parse())
  .on('entry', (entry) => {
    if (!isSafePath('/uploads', entry.path)) {
      return entry.autodrain(); // skipped entries must be drained or the stream stalls
    }
    entry.pipe(fs.createWriteStream(path.join('/uploads', entry.path)));
  });
Testing with sample ZIPs
- Standard ZIPs (1KB-1GB) — cover size tiers
- Nested ZIPs — zip-in-zip, test your recursion limits
- ZIPs with long filenames (>255 chars)
- ZIPs with Unicode filenames
- ZIPs with 1000+ entries (stress test your entry iteration)
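For the entry-count stress test, you can cheaply reject absurd archives before iterating at all by reading the entry count from the End of Central Directory (EOCD) record. A sketch that assumes the archive has no trailing comment, so the EOCD is the last 22 bytes; production code should scan backwards for the signature, and note the 16-bit field reads 0xFFFF for ZIP64 archives:

```javascript
// Read the total entry count from the EOCD record without
// parsing the whole archive. Assumes no trailing archive comment.
function entryCount(buf) {
  const eocd = buf.subarray(buf.length - 22);
  if (eocd.readUInt32LE(0) !== 0x06054b50) {
    throw new Error('EOCD not found (archive comment or corrupt zip?)');
  }
  return eocd.readUInt16LE(10); // total entries in the central directory
}
```

Run this against the last 22 bytes of the upload (a ranged read is enough) and bail out before spending any CPU on decompression.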
Related
For encrypted archives, see handling password-protected archives. For other compression formats, compare with RAR and 7Z.