Data Serialization Formats: JSON, YAML, TOML, XML, and Protocol Buffers
Compare data serialization formats for configuration files, API responses, and inter-service communication. Understand parsing performance, human readability, and schema validation capabilities.
Key Takeaways
- Serialization formats divide into human-readable text formats (JSON, YAML, TOML, XML) and binary formats (Protocol Buffers, MessagePack, CBOR).
- JSON's simplicity and universal language support make it the default for APIs and data interchange.
- YAML supports comments, anchors, multi-line strings, and rich data types.
- JSON: APIs, data interchange, structured storage
Human-Readable vs Binary Formats
Serialization formats divide into human-readable text formats (JSON, YAML, TOML, XML) and binary formats (Protocol Buffers, MessagePack, CBOR). Text formats prioritize debuggability and editability. Binary formats prioritize parsing speed and compact wire size.
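The size trade-off can be seen with a minimal sketch using only the Python standard library. The sensor reading and its field layout are hypothetical, and `struct` stands in for a schema-driven binary format here; it is not Protocol Buffers, MessagePack, or CBOR.

```python
import json
import struct

# Hypothetical record, used only for illustration.
reading = {"sensor_id": 4021, "temperature": 21.5, "ok": True}

# Text encoding: self-describing and readable in any editor or log file.
text_bytes = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Binary encoding: both sides must agree on the layout in advance.
# "<Id?" = little-endian, unsigned 32-bit int, 64-bit float, bool.
LAYOUT = "<Id?"
binary_bytes = struct.pack(
    LAYOUT, reading["sensor_id"], reading["temperature"], reading["ok"]
)

# Decoding the binary form requires the same layout; field names are not on the wire.
sensor_id, temperature, ok = struct.unpack(LAYOUT, binary_bytes)

print(text_bytes)                          # readable as-is
print(binary_bytes.hex())                  # opaque without the layout
print(len(text_bytes), len(binary_bytes))  # text is several times larger here
```

The binary form drops the field names and punctuation, which is where the savings come from, at the cost of needing the layout to interpret a single byte.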
Text Format Comparison
| Feature | JSON | YAML | TOML | XML |
|---|---|---|---|---|
| Comments | No | Yes | Yes | Yes |
| Multi-line strings | No (escaped) | Yes | Yes | CDATA |
| Data types | 6 types | Rich | Typed | String-based |
| Nesting | Braces | Indentation | Tables/sections | Tags |
| Trailing commas | No | N/A | No | N/A |
| Parse speed | Fast | Slow | Fast | Medium |
JSON: The API Standard
JSON's simplicity and universal language support make it the default for APIs and data interchange. Its limitations (no comments, no trailing commas, no unescaped multi-line strings) make it a poor fit for human-edited configuration files but an excellent one for machine-generated data.
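These limitations are easy to confirm with a strict parser. The sketch below uses Python's standard `json` module; the helper name `is_valid_json` and the sample documents are illustrative assumptions.

```python
import json

def is_valid_json(text):
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

plain = '{"name": "api", "port": 8080}'
with_comment = '{"name": "api" /* service name */, "port": 8080}'
with_trailing_comma = '{"name": "api", "port": 8080,}'
# A raw newline inside a string is rejected; it must be written as \n.
with_raw_newline = '{"motd": "line one\nline two"}'
with_escaped_newline = '{"motd": "line one\\nline two"}'

for label, doc in [
    ("plain", plain),
    ("comment", with_comment),
    ("trailing comma", with_trailing_comma),
    ("raw newline", with_raw_newline),
    ("escaped newline", with_escaped_newline),
]:
    print(f"{label}: {is_valid_json(doc)}")
```

Only the plain and escaped-newline documents parse. The strictness that frustrates people editing files by hand is the same property that makes machine-generated JSON predictable across languages.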
YAML: Powerful but Dangerous
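YAML supports comments, anchors, multi-line strings, and rich data types. However, its flexibility creates foot-guns. Under YAML 1.1 rules, which many parsers still follow, unquoted on and off are booleans. An unquoted 1.0 is a float while a quoted "1.0" is a string, so a version number like 1.10 silently becomes 1.1. A mis-indented line can also produce a valid document with a different structure instead of a parse failure, so the mistake surfaces later as wrong data.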
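The implicit typing behind these surprises can be modeled in a few lines. This is a deliberately simplified sketch of YAML 1.1 scalar resolution, not a YAML parser: the helper `resolve_scalar` is hypothetical, and it covers only booleans, integers, and simple decimal floats.

```python
import re

# Simplified model of YAML 1.1 implicit typing (illustration only).
_TRUE = {"yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
_FALSE = {"no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
_INT = re.compile(r"^[-+]?\d+$")
_FLOAT = re.compile(r"^[-+]?\d+\.\d*$")

def resolve_scalar(token):
    """Guess the type of a scalar the way a YAML 1.1 style resolver would."""
    # Quoted scalars are always strings, whatever they look like.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    # Unquoted scalars are matched against type patterns in turn.
    if token in _TRUE:
        return True
    if token in _FALSE:
        return False
    if _INT.match(token):
        return int(token)
    if _FLOAT.match(token):
        return float(token)
    return token

for token in ["on", "NO", "1.0", '"1.0"', "1.10", "eu-west-1"]:
    print(f"{token!r} -> {resolve_scalar(token)!r}")
```

In this model, the country code NO becomes a boolean and the version 1.10 becomes the float 1.1, while the quoted value survives unchanged. Quoting any value that must stay a string is the usual defense.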
Choosing Your Format
- JSON: APIs, data interchange, structured storage
- TOML: Application configuration files
- YAML: Kubernetes manifests, CI/CD pipelines (where the ecosystem demands it)
- XML: Document markup, legacy systems, SOAP APIs
- Protocol Buffers: High-performance microservice communication