
Data Serialization Formats: JSON, YAML, TOML, XML, and Protocol Buffers

Compare data serialization formats for configuration files, API responses, and inter-service communication. Understand parsing performance, human readability, and schema validation capabilities.

Key Takeaways

  • Serialization formats divide into human-readable text formats (JSON, YAML, TOML, XML) and binary formats (Protocol Buffers, MessagePack, CBOR).
  • JSON's simplicity and universal language support make it the default for APIs and data interchange.
  • YAML supports comments, anchors, multi-line strings, and rich data types.
  • JSON fits APIs, data interchange, and structured storage.

Human-Readable vs Binary Formats

Serialization formats divide into human-readable text formats (JSON, YAML, TOML, XML) and binary formats (Protocol Buffers, MessagePack, CBOR). Text formats prioritize debuggability and editability. Binary formats prioritize parsing speed and compact wire size.
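To make the wire-size difference concrete, the sketch below encodes the same one-field record as compact JSON and as a hand-rolled Protocol Buffers style varint field. The helpers `encode_varint` and `encode_int_field` are hypothetical names written for this illustration; they cover only non-negative integer fields and are not a substitute for the real protobuf library.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)  # high bit clear: last byte
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field as a key (field number plus wire type) and a value."""
    key = (field_number << 3) | 0  # wire type 0 means "varint"
    return encode_varint(key) + encode_varint(value)


record = {"id": 150}

# Text form: compact JSON with no optional whitespace.
text_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Binary form: field number 1 is an arbitrary choice for this example.
binary_bytes = encode_int_field(1, record["id"])

print(text_bytes, len(text_bytes))            # b'{"id":150}' 10
print(binary_bytes.hex(), len(binary_bytes))  # 089601 3
```

The JSON form carries the field name and punctuation in every message, which is what makes it readable without a schema. The binary form replaces the name with a numeric tag, so a reader needs the schema to interpret it.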

Text Format Comparison

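| Feature            | JSON         | YAML        | TOML            | XML          |
|--------------------|--------------|-------------|-----------------|--------------|
| Comments           | No           | Yes         | Yes             | Yes          |
| Multi-line strings | No (escaped) | Yes         | Yes             | CDATA        |
| Data types         | 6 types      | Rich        | Typed           | String-based |
| Nesting            | Braces       | Indentation | Tables/sections | Tags         |
| Trailing commas    | No           | N/A         | Arrays only     | N/A          |
| Parse speed        | Fast         | Slow        | Fast            | Medium       |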
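The "Data types" row is easiest to see by parsing the same record from two formats. This is a minimal sketch using only Python's standard library; the `port` and `debug` fields are made up for the example.

```python
import json
import xml.etree.ElementTree as ET

json_doc = '{"port": 8080, "debug": true}'
xml_doc = "<config><port>8080</port><debug>true</debug></config>"

# JSON carries type information in its syntax.
from_json = json.loads(json_doc)

# Plain XML element content is text, so every value arrives as a string.
root = ET.fromstring(xml_doc)
from_xml = {child.tag: child.text for child in root}

print(from_json)  # {'port': 8080, 'debug': True}
print(from_xml)   # {'port': '8080', 'debug': 'true'}

# The XML consumer has to convert values itself (or rely on a schema layer).
port = int(from_xml["port"])
debug = from_xml["debug"] == "true"
```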

JSON: The API Standard

JSON's simplicity and universal language support make it the default for APIs and data interchange. Its limitations (no comments, no trailing commas, and no raw multi-line strings, since newlines must be escaped as \n) make it a poor fit for human-edited configuration files but an excellent one for machine-generated data.
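The limitations above are easy to confirm with a strict parser. The sketch below feeds a few hand-written documents to Python's standard `json` module; the sample keys and values are illustrative only.

```python
import json

candidates = {
    "plain": '{"name": "api", "retries": 3}',
    "comment": '{"name": "api" // service name\n}',
    "trailing_comma": '{"name": "api", "retries": 3,}',
    # A literal newline inside a JSON string is not allowed.
    "raw_newline": '{"note": "line one\nline two"}',
}

results = {}
for label, text in candidates.items():
    try:
        json.loads(text)
        results[label] = "ok"
    except json.JSONDecodeError:
        results[label] = "rejected"

print(results)

# Multi-line text survives a round trip only in escaped form.
escaped = json.dumps({"note": "line one\nline two"})
print(escaped)  # {"note": "line one\nline two"}
```

Only the `plain` document parses. A generator never trips over these rules, which is why they matter far more for files that people edit by hand.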

YAML: Powerful but Dangerous

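YAML supports comments, anchors, multi-line strings, and rich data types. However, its flexibility creates foot-guns. Under YAML 1.1, which many parsers still implement, unquoted on and off are booleans. An unquoted 1.0 is a float and only a string when quoted, so a version such as 1.10 silently becomes 1.1. A mis-indented line often still parses as valid YAML with a different structure, so the mistake changes the data silently instead of raising a parse error.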
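The typing surprises come from how a YAML parser guesses the type of an unquoted scalar. The sketch below is a deliberately simplified imitation of YAML 1.1 style resolution, written with the standard library only. `resolve_scalar` is a hypothetical helper, not a YAML parser: it ignores octal, hex, exponents, nulls, and everything structural, and parsers that follow the YAML 1.2 core schema do not treat on/off or yes/no as booleans.

```python
import re

# A subset of the boolean spellings recognised by YAML 1.1.
YAML11_TRUE = {"yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

# Simplified numeric patterns (assumption: decimal notation only).
INT_RE = re.compile(r"[-+]?(0|[1-9][0-9]*)")
FLOAT_RE = re.compile(r"[-+]?[0-9]+\.[0-9]*")


def resolve_scalar(token: str):
    """Guess the type of a single scalar token the way a YAML 1.1 loader might."""
    # Quoted scalars are always strings.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    if token in YAML11_TRUE:
        return True
    if token in YAML11_FALSE:
        return False
    if INT_RE.fullmatch(token):
        return int(token)
    if FLOAT_RE.fullmatch(token):
        return float(token)
    return token  # anything else stays a string


for token in ["on", "NO", "1.0", '"1.0"', "1.10", "1.2.3"]:
    value = resolve_scalar(token)
    print(f"{token!r:>8} -> {value!r} ({type(value).__name__})")
```

Quoting every value that is meant to be a string (country codes, version numbers, switch names) sidesteps the guessing entirely.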

Choosing Your Format

  • JSON: APIs, data interchange, structured storage
  • TOML: Application configuration files
  • YAML: Kubernetes manifests, CI/CD pipelines (where the ecosystem demands it)
  • XML: Document markup, legacy systems, SOAP APIs
  • Protocol Buffers: High-performance microservice communication