Seeding Test Databases with Sample Data — SQL, JSON, CSV
Every development and staging environment needs realistic seed data. This guide shows how to seed databases using sample SQL, JSON, and CSV files from TrueFileSize — with examples for PostgreSQL, MySQL, SQLite, MongoDB, and ORMs.
Step 1: Download Sample Data
mkdir -p seed-data
CDN="https://cdn.truefilesize.com"
# SQL dump (CREATE TABLE + INSERT statements)
curl -sL -o seed-data/users.sql "$CDN/sql/sample-users.sql"
# JSON data (realistic user records)
curl -sL -o seed-data/users.json "$CDN/json/sample-users.json"
curl -sL -o seed-data/products.json "$CDN/json/sample-products.json"
# CSV data (tabular import)
curl -sL -o seed-data/users.csv "$CDN/csv/sample-users.csv"
# SQLite database (pre-seeded)
curl -sL -o seed-data/dev.sqlite "$CDN/sqlite/sample-1mb.sqlite"
Available: SQL · JSON · CSV · SQLite
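Before wiring these files into a database, a quick sanity check helps catch failed or truncated downloads. A minimal Python sketch (the `verify_seed_files` helper and the file list are ours, matching the paths used above):

```python
import os

# Paths from the download step above
SEED_FILES = [
    "seed-data/users.sql",
    "seed-data/users.json",
    "seed-data/products.json",
    "seed-data/users.csv",
    "seed-data/dev.sqlite",
]

def verify_seed_files(paths):
    """Return the paths that are missing or zero bytes (a truncated curl
    download with -o still creates the file, so size matters too)."""
    return [p for p in paths if not os.path.isfile(p) or os.path.getsize(p) == 0]

# verify_seed_files(SEED_FILES) returns [] once every file downloaded cleanly
```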
PostgreSQL
# Import SQL dump directly
psql -U postgres -d myapp_dev < seed-data/users.sql
# Import CSV into an existing table (\copy reads the file client-side,
# so it works without superuser rights or a server-local path)
psql -U postgres -d myapp_dev \
  -c "\copy users(name, email, city, date) FROM 'seed-data/users.csv' WITH (FORMAT csv, HEADER)"
# Import JSON using psql + jq (dollar-quoting survives apostrophes in the data)
jq -c '.[]' seed-data/users.json | while IFS= read -r row; do
  psql -U postgres -d myapp_dev -c "INSERT INTO users (data) VALUES (\$j\$${row}\$j\$::jsonb);"
done
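The shell loop above spawns one psql process per record, which gets slow for large files. A sketch of a bulk alternative in Python (stdlib only; `json_array_to_copy_lines` is a hypothetical helper, and it assumes a `users(data jsonb)` table): convert the array to one escaped JSON document per line and stream the lot through a single `COPY ... FROM STDIN`.

```python
import json

def json_array_to_copy_lines(path):
    """Turn a JSON array file into COPY-friendly lines, one document per line.

    COPY's text format treats backslash as an escape character, so any
    backslashes emitted by json.dumps (e.g. for escaped quotes or newlines
    inside strings) are doubled.
    """
    with open(path) as f:
        records = json.load(f)
    return [json.dumps(r).replace("\\", "\\\\") for r in records]
```

Saved as, say, `to_copy.py` (hypothetical name) and printed line by line, the output can be piped into `psql -d myapp_dev -c "COPY users(data) FROM STDIN;"`.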
MySQL
# Import SQL dump
mysql -u root -p myapp_dev < seed-data/users.sql
# Import CSV (requires local_infile enabled on both client and server)
mysql --local-infile=1 -u root -p myapp_dev -e "
LOAD DATA LOCAL INFILE 'seed-data/users.csv'
INTO TABLE users
FIELDS TERMINATED BY ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;"
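If `LOAD DATA LOCAL INFILE` is disabled on your server, plain INSERT statements still work. A hedged Python sketch that turns the CSV into one INSERT per row, escaping quotes and backslashes for MySQL string literals (`csv_to_inserts` is our own helper, not part of any library):

```python
import csv

def csv_to_inserts(path, table):
    """Generate one INSERT statement per CSV row.

    Single quotes are doubled and backslashes escaped, since MySQL treats
    backslash as an escape character in string literals by default.
    """
    stmts = []
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        cols = ", ".join(reader.fieldnames)
        for row in reader:
            vals = ", ".join(
                "'" + v.replace("\\", "\\\\").replace("'", "''") + "'"
                for v in row.values()
            )
            stmts.append(f"INSERT INTO {table} ({cols}) VALUES ({vals});")
    return stmts
```

Writing the statements to a file and feeding it to `mysql myapp_dev < inserts.sql` avoids the local-infile setting entirely.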
SQLite
# Use pre-seeded SQLite database directly
cp seed-data/dev.sqlite ./prisma/dev.db
# Or import SQL
sqlite3 dev.db < seed-data/users.sql
# Import CSV
sqlite3 dev.db <<EOF
.mode csv
.import seed-data/users.csv users
EOF
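Python's standard library can do the same CSV import without shelling out to the sqlite3 CLI, which is handy inside a seed script. A minimal sketch (the `import_csv` helper and the created table layout are our assumptions; every column is stored as text):

```python
import csv
import sqlite3

def import_csv(db_path, csv_path, table):
    """Import a CSV file into SQLite using only the standard library.

    Creates the table from the CSV header if it does not exist, then bulk
    inserts with executemany. Returns the row count after import.
    """
    conn = sqlite3.connect(db_path)
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        cols = reader.fieldnames
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)})")
        placeholders = ", ".join("?" for _ in cols)
        conn.executemany(
            f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})",
            ([row[c] for c in cols] for row in reader),
        )
    conn.commit()
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    conn.close()
    return count
```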
Download sample SQLite databases with multi-table schemas already populated.
MongoDB
# Import JSON array as collection
mongoimport --db myapp_dev --collection users \
--jsonArray --file seed-data/users.json
# Import CSV
mongoimport --db myapp_dev --collection users \
--type csv --headerline --file seed-data/users.csv
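mongoimport handles both formats directly, but keeping a single import path can simplify seed scripts. A small stdlib sketch that converts CSV rows into a JSON array suitable for `--jsonArray` (the `csv_to_json_array` helper is ours; note every value stays a string, since CSV carries no types):

```python
import csv
import json

def csv_to_json_array(csv_path, json_path):
    """Convert a CSV file to a JSON array file so it can be imported with
    `mongoimport --jsonArray`. Returns the number of records written."""
    with open(csv_path, newline="") as f:
        records = list(csv.DictReader(f))
    with open(json_path, "w") as f:
        json.dump(records, f)
    return len(records)
```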
Node.js with Prisma
// prisma/seed.ts
// JSON imports require "resolveJsonModule": true in tsconfig.json
import { PrismaClient } from '@prisma/client';
import users from '../seed-data/users.json';
import products from '../seed-data/products.json';

const prisma = new PrismaClient();

async function seed() {
  // Clear existing data so reseeding is idempotent
  await prisma.user.deleteMany();
  await prisma.product.deleteMany();

  // Seed users
  for (const user of users) {
    await prisma.user.create({
      data: {
        name: user.name,
        email: user.email,
        age: user.age,
        city: user.city,
        active: user.active,
      },
    });
  }

  // Seed products
  for (const product of products) {
    await prisma.product.create({
      data: {
        name: product.name,
        price: product.price,
        category: product.category,
        inStock: product.inStock,
      },
    });
  }

  console.log(`Seeded ${users.length} users, ${products.length} products`);
}

seed()
  .catch((e) => {
    // Exit non-zero so CI fails on a broken seed
    console.error(e);
    process.exit(1);
  })
  .finally(() => prisma.$disconnect());
Add to package.json:
{
  "prisma": {
    "seed": "tsx prisma/seed.ts"
  }
}
Run: npx prisma db seed
Python with SQLAlchemy
import json

from sqlalchemy import create_engine, text

engine = create_engine('postgresql://localhost/myapp_dev')

# Seed from JSON
with open('seed-data/users.json') as f:
    users = json.load(f)

with engine.begin() as conn:
    for user in users:
        conn.execute(text("""
            INSERT INTO users (name, email, age, city, active)
            VALUES (:name, :email, :age, :city, :active)
        """), user)

print(f"Seeded {len(users)} users")
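CSV needs one extra step that JSON does not: every value arrives as a string. A sketch of a loader that coerces numeric and boolean columns before handing rows to `conn.execute` (the `load_csv_params` helper and its field names are assumptions based on the sample data):

```python
import csv

def load_csv_params(path, int_fields=(), bool_fields=()):
    """Read CSV rows into dicts suitable as bound parameters for
    SQLAlchemy's text() statements, coercing the named columns."""
    rows = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for k in int_fields:
                row[k] = int(row[k])
            for k in bool_fields:
                row[k] = row[k].strip().lower() in ("true", "1", "yes")
            rows.append(row)
    return rows
```

The result plugs into the same loop as the JSON path above, e.g. `for user in load_csv_params('seed-data/users.csv', int_fields=('age',), bool_fields=('active',))`.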
CI/CD Integration
Add seeding to your test pipeline:
# .github/workflows/test.yml
- name: Download seed data
  run: |
    mkdir -p seed-data
    curl -sL -o seed-data/users.json https://cdn.truefilesize.com/json/sample-users.json
    curl -sL -o seed-data/products.json https://cdn.truefilesize.com/json/sample-products.json
- name: Seed database
  run: npx prisma db seed
- name: Run tests
  run: npm test
See our full CI/CD guide for caching and parallel jobs.
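A cheap guard between the download and seed steps is to validate the files, so a truncated download fails the pipeline early instead of producing a half-seeded database. A minimal sketch (the `validate_seed` helper is hypothetical):

```python
import json

def validate_seed(path, min_records=1, required_keys=()):
    """Fail fast in CI if a JSON seed file is truncated or missing fields.

    Raises AssertionError on a bad file; returns the record count otherwise.
    """
    with open(path) as f:
        records = json.load(f)
    assert isinstance(records, list) and len(records) >= min_records, path
    for key in required_keys:
        assert all(key in r for r in records), f"{path}: missing key {key}"
    return len(records)
```

Run as an extra workflow step, e.g. `python -c "from validate import validate_seed; validate_seed('seed-data/users.json', 500, ('name', 'email'))"` (module name hypothetical), a non-zero exit stops the job before seeding.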
Sample Data Available
| Data Type | File | Records | Size |
|-----------|------|---------|------|
| Users (JSON) | sample-users.json | 500 | 77 KB |
| Products (JSON) | sample-products.json | 500 | 111 KB |
| Users (CSV) | sample-users.csv | 1,000+ | 50 KB |
| SQL Dump | sample-users.sql | Various | 50 KB |
| SQLite (pre-seeded) | sample-1mb.sqlite | 500+ | 1 MB |
| Large dataset | sample-10000-records.json | 10,000 | 1.5 MB |
All data is generated with faker.js — realistic names, emails, cities, and dates. No real personal information.