Database Export Formats: SQL, CSV, JSON, and Parquet

Choose the right export format for database migrations, analytics, and data sharing. Understand the tradeoffs between portability, performance, schema preservation, and human readability.

Key Takeaways

  • Database exports serve different purposes — migrations require schema and data fidelity, analytics need columnar efficiency, and sharing demands universal readability.
  • SQL dumps preserve everything — schemas, constraints, indexes, sequences, and data.
  • Apache Parquet stores data in columnar format with per-column compression and type encoding.
  • Same-engine migration: SQL dump (pg_dump, mysqldump)

Export Format Tradeoffs

Database exports serve different purposes — migrations require schema and data fidelity, analytics need columnar efficiency, and sharing demands universal readability. No single format excels at everything.

Format Comparison

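| Feature              | SQL Dump   | CSV          | JSON            | Parquet             |
|----------------------|------------|--------------|-----------------|---------------------|
| Schema included      | Yes        | No           | Flexible        | Yes (typed)         |
| Data types           | Full       | Strings only | Basic (6 types) | Rich                |
| Relationships        | Yes (FK)   | No           | Nested          | No                  |
| Compression          | No (text)  | No (text)    | No (text)       | Built-in (columnar) |
| Human readable       | Yes        | Yes          | Yes             | No (binary)         |
| Query without import | No         | Limited      | Limited         | Yes (DuckDB, Polars) |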
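
The "Data types" row is easiest to see with a round trip. The sketch below uses only the Python standard library and a made-up sample row (the field names and values are illustrative assumptions, not from any real dataset). It writes the row to CSV and to JSON, then reads it back to show which types survive.

```python
import csv
import io
import json

# Hypothetical sample row, used only for illustration.
row = {"id": 42, "price": 19.99, "active": True, "note": None}


def roundtrip_csv(record):
    """Write one record to CSV text and read it back as a dict."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(record))
    writer.writeheader()
    writer.writerow(record)
    buffer.seek(0)
    return next(csv.DictReader(buffer))


def roundtrip_json(record):
    """Serialize one record to JSON text and parse it back."""
    return json.loads(json.dumps(record))


csv_row = roundtrip_csv(row)
json_row = roundtrip_json(row)

print(csv_row)   # every value comes back as a string; None becomes ""
print(json_row)  # numbers, booleans, and null keep their types
```

After the CSV round trip, the importer has to re-infer or be told every column type, which is why CSV exports are usually paired with a separate schema definition.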

SQL Dump: Full Fidelity

SQL dumps preserve the full database definition along with its contents: schemas, constraints, indexes, sequences, and data. They are the standard for database-to-database migration within the same engine. Cross-engine compatibility is limited; a PostgreSQL dump does not import cleanly into MySQL without modification, because data types and DDL syntax differ between engines.
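
As a small illustration of a same-engine dump and restore, the sketch below uses SQLite through Python's built-in sqlite3 module as a stand-in for pg_dump or mysqldump. That substitution is an assumption made so the example runs without a database server, and the tables and rows are hypothetical. The dump text carries the table definitions, the foreign key, the index, and the data.

```python
import sqlite3

# Source database with a schema, a foreign key, an index, and some rows.
source = sqlite3.connect(":memory:")
source.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
    CREATE INDEX idx_books_author ON books(author_id);
    INSERT INTO authors VALUES (1, 'Ursula K. Le Guin');
    INSERT INTO books VALUES (1, 'The Dispossessed', 1);
    """
)
source.commit()

# iterdump() yields the SQL statements needed to rebuild the database.
dump_sql = "\n".join(source.iterdump())

# Restore into a fresh database of the same engine by replaying the dump.
target = sqlite3.connect(":memory:")
target.executescript(dump_sql)

restored = target.execute(
    "SELECT title, name FROM books "
    "JOIN authors ON authors.id = books.author_id"
).fetchall()
print(dump_sql)
print(restored)
```

The same dump text would likely need edits before another engine accepted it, which is the cross-engine limitation described above.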

Parquet: The Analytics Standard

Apache Parquet stores data in columnar format with per-column compression and type encoding. A 1 GB CSV file often shrinks to roughly 100-200 MB as Parquet, depending on the data, and analytical queries run faster because a reader can load only the columns it needs. Tools like DuckDB, Polars, and pandas can read Parquet files directly without importing them into a database.
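
To give a feel for why storing a column's values together compresses well, the sketch below implements two ideas that Parquet's encodings build on: dictionary encoding and run-length encoding. It is a simplified standard-library illustration, not Parquet's actual on-disk format, and the "country" column is a hypothetical example.

```python
from itertools import groupby


def dictionary_encode(column):
    """Replace each value with a small integer code plus a lookup table."""
    dictionary = []  # code -> value
    index = {}       # value -> code
    codes = []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def run_length_encode(codes):
    """Collapse consecutive repeats into (code, run_length) pairs."""
    return [(code, len(list(group))) for code, group in groupby(codes)]


def decode(dictionary, runs):
    """Expand (code, run_length) pairs back into the original values."""
    return [dictionary[code] for code, count in runs for _ in range(count)]


# Hypothetical low-cardinality column, as often found in analytics tables.
country = ["DE", "DE", "DE", "US", "US", "DE", "FR", "FR", "FR", "FR"]

dictionary, codes = dictionary_encode(country)
runs = run_length_encode(codes)

print(dictionary)  # ['DE', 'US', 'FR']
print(runs)        # [(0, 3), (1, 2), (0, 1), (2, 4)]
```

In a row-oriented text format the same values are interleaved with every other field, so long runs like these rarely appear and there is less for a compressor to exploit.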

Choosing Your Format

  • Same-engine migration: SQL dump (pg_dump, mysqldump)
  • Cross-engine migration: CSV + separate schema DDL
  • Analytics/data science: Parquet
  • API data exchange: JSON
  • Human review: CSV or formatted SQL
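
The "CSV + separate schema DDL" option can be sketched as follows. SQLite and the `export_table` helper are assumptions chosen for illustration so the example runs with only the standard library. A real cross-engine migration would read from the actual source engine and adapt the DDL to the target's dialect.

```python
import csv
import io
import sqlite3


def export_table(conn, table):
    """Return (ddl, csv_text) for one table: schema and data kept separate."""
    ddl_row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if ddl_row is None:
        raise ValueError(f"unknown table: {table}")

    # The table name was checked against sqlite_master above, so it is
    # safe to interpolate it here as a quoted identifier.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # header
    writer.writerows(cursor)                                       # data rows
    return ddl_row[0], buffer.getvalue()


# Hypothetical source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)

ddl, csv_text = export_table(conn, "users")
print(ddl)
print(csv_text)
```

Keeping the DDL in its own file lets it be rewritten for the target engine's types and syntax, while the CSV data loads unchanged through that engine's bulk import tool.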