Comparison of data serialization formats
This is a comparison of data serialization formats: the various ways of converting complex objects into sequences of bits. It does not include markup languages that are used exclusively as document file formats.
Overview
| Name | Creator/Maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Includes reference support?[e] | Schema/IDL? | Standard APIs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ASN.1 | ISO, IEC, ITU-T | N/A | Yes | ISO/IEC 8824; X.680 series of ITU-T Recommendations | Yes (BER, DER, PER, or custom via ECN) | Yes (XER, GSER, or custom via ECN) | Partial[f] | Yes (built-in) | N/A |
| Bencode | BitTorrent, Inc. | N/A | Yes | Part of BitTorrent protocol specification | Partially (numbers are ASCII-based) | No | No | No | No |
| BSON | MongoDB | JSON | Yes | BSON Specification | Yes | No | No | No | No |
| Comma-separated values (CSV) | RFC author: Yakov Shafranovich | N/A | Partial (myriad informal variants used) | RFC 4180 (among others) | No | Yes | No | No | No |
| D-Bus Message Protocol | freedesktop.org | N/A | Yes | D-Bus Specification | Yes | Yes (Type Signatures) | No | No | Yes (see D-Bus) |
| JSON | Douglas Crockford | JavaScript syntax | Yes | RFC 4627 | No, but see BSON | Yes | No | Partial (Kwalify, Rx, JSON Schema Proposal) | No |
| MessagePack | Sadayuki Furuhashi | JSON (loosely) | Yes | MessagePack format specification | Yes | No | No | No | No |
| Netstrings | Dan Bernstein | N/A | Yes | netstrings.txt | Yes | Yes | No | No | No |
| OGDL | Rolf Veen | ? | Yes | 1.0 Working draft | Yes (Binary 1.0 Working draft) | Yes | Yes (Path 1.0 Working draft) | Yes (Schema WD) | |
| Property list | NeXT (creator), Apple (maintainer) | ? | Partial | Public DTD for XML format | Yes[a] | Yes[b] | No | ? | Cocoa, CoreFoundation, OpenStep, GnuStep |
| Protocol Buffers | Google | N/A | Partial | Developer Guide: Encoding | Yes | Partial[d] | No | Yes (built-in) | |
| S-expressions | Internet Draft author: Ron Rivest | Lisp, Netstrings | Partial (largely de facto) | "S-Expressions" Internet Draft | Yes ("Canonical representation") | Yes ("Advanced transport representation") | No | No | |
| Structured Data eXchange Formats | IETF | N/A | Yes | RFC 3072 | Yes | No | No | No | |
| Thrift | Facebook (creator), Apache (maintainer) | N/A | No | Original whitepaper | Yes | Partial[c] | No | Yes (built-in) | |
| eXternal Data Representation | IETF | N/A | Yes | RFC 4506 | Yes | No | Yes | Yes | Yes |
| XML | W3C | SGML | Yes | W3C Recommendations: 1.0 (Fifth Edition), 1.1 (Second Edition) | Partial (Binary XML) | Yes | Yes (XPointer, XPath) | Yes (XML schema) | DOM, SAX, XQuery, XPath |
| YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | XML, C, Python, Perl, Email | Yes | Version 1.2 | No | Yes | Yes | Partial (Kwalify, Rx, built-in language type-defs) | No |

- a. The current default format is binary.
- b. The "classic" format is plain text, and an XML format is also supported.
- c. Theoretically possible due to abstraction, but no implementation is included.
- d. The primary format is binary, but a text format is available.[1]
- e. Reference support means that generic tools and libraries know how to encode, decode, and dereference a reference to another piece of data in the same document. A tool may require the IDL file, but nothing more. Custom, non-standardized referencing techniques are excluded.
- f. ASN.1 does offer OIDs, a standard format for globally unique identifiers. However, there is no standard for "marking" or "tagging" an arbitrary piece of data in a document with an OID, and there is no standard format for locally unique identifiers within a document. A generic ASN.1 tool or library therefore cannot automatically encode, decode, or resolve references within a document without help from custom-written program code.
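What a "No" in the reference-support column means in practice can be illustrated with JSON. The following is a minimal sketch using Python's standard json module; the variable names and the sample data are illustrative assumptions and do not come from any specification.

```python
import json

# Two keys point at the same list object in memory.
shared = [1, 2, 3]
document = {"first": shared, "second": shared}

# JSON has no reference syntax, so the encoder writes the list out twice.
encoded = json.dumps(document)
print(encoded)  # {"first": [1, 2, 3], "second": [1, 2, 3]}

# After decoding, the two values are equal but are no longer the same object:
# the sharing relationship was lost in the round trip.
decoded = json.loads(encoded)
print(decoded["first"] == decoded["second"])  # True
print(decoded["first"] is decoded["second"])  # False
```

A format with reference support in the sense of note e would let a generic decoder restore the shared object without any application-specific code.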
Syntax comparison of human-readable formats
ASN.1 (XML Encoding Rules)
- Null: <foo />
- Boolean true: <foo>true</foo>
- Boolean false: <foo>false</foo>
- Integer: <foo>685230</foo>
- Floating-point: <foo>6.8523015e+5</foo>
- String: <foo>A to Z</foo>
- Array: <SeqOfUnrelatedDatatypes> <isMarried>true</isMarried> <hobby /> <velocity>-42.1e7</velocity> <bookname>A to Z</bookname> <bookname>We said, "no".</bookname> </SeqOfUnrelatedDatatypes>
- Associative array/Object, as an object (the key is a field name): <person> <isMarried>true</isMarried> <hobby /> <height>1.85</height> <name>Bob Peterson</name> </person>
- Associative array/Object, as a data mapping (the key is a data value): <competition> <measurement> <name>John</name> <height>3.14</height> </measurement> <measurement> <name>Jane</name> <height>2.718</height> </measurement> </competition>

CSV[b]
- Null: null[a] (or an empty element in the row)[a]
- Boolean true: 1[a] or true[a]
- Boolean false: 0[a] or false[a]
- Integer: 685230 or -685230[a]
- Floating-point: 6.8523015e+5[a]
- String: A to Z or "We said, ""no""."
- Array: true,,-42.1e7,"A to Z"
- Associative array/Object (one row per entry):
    42,1
    A to Z,1,2,3

Netstrings[c]
- Null: 0:,[a] or 4:null,[a]
- Boolean true: 1:1,[a] or 4:true,[a]
- Boolean false: 1:0,[a] or 5:false,[a]
- Integer: 6:685230,[a]
- Floating-point: 9:6.8523e+5,[a]
- String: 6:A to Z,
- Array: 29:4:true,0:,7:-42.1e7,6:A to Z,,
- Associative array/Object: 41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,,[a]

JSON
- Null: null
- Boolean true: true
- Boolean false: false
- Integer: 685230 or -685230
- Floating-point: 6.8523015e+5
- String: "A to Z"
- Array: [true, null, -42.1e7, "A to Z"]
- Associative array/Object: {"42": true, "A to Z": [1, 2, 3]}

OGDL[verification needed]
- Null: null[a]
- Boolean true: true[a]
- Boolean false: false[a]
- Integer: 685230[a]
- Floating-point: 6.8523015e+5[a]
- String: "A to Z" or 'A to Z' or NoSpaces
- Array: true null -42.1e7 "A to Z" or (true, null, -42.1e7, "A to Z")
- Associative array/Object: 42 true "A to Z" 1 2 3 or 42 true "A to Z", (1, 2, 3)

Property list (plain text format)[2]
- Null: N/A
- Boolean true: <*BY>
- Boolean false: <*BN>
- Integer: <*I685230>
- Floating-point: <*R6.8523015e+5>
- String: "A to Z"
- Array: ( <*BY>, <*R-42.1e7>, "A to Z" )
- Associative array/Object: { "42" = <*BY>; "A to Z" = ( <*I1>, <*I2>, <*I3> ); }

Property list (XML format)[3][4]
- Null: N/A
- Boolean true: <true />
- Boolean false: <false />
- Integer: <integer>685230</integer>
- Floating-point: <real>6.8523015e+5</real>
- String: <string>A to Z</string>
- Array: <array> <true /> <real>-42.1e7</real> <string>A to Z</string> </array>
- Associative array/Object: <dict> <key>42</key> <true /> <key>A to Z</key> <array> <integer>1</integer> <integer>2</integer> <integer>3</integer> </array> </dict>

S-expressions
- Null: NIL or nil
- Boolean true: T or #t[e] or true
- Boolean false: NIL or #f[e] or false
- Integer: 685230
- Floating-point: 6.8523015e+5
- String: abc or "abc" or #616263# or 3:abc or {MzphYmM=} or |YWJj|
- Array: (T NIL -42.1e7 "A to Z")
- Associative array/Object: ((42 T) ("A to Z" (1 2 3)))

YAML
- Null: ~, null, Null, NULL [5]
- Boolean true: y, Y, yes, Yes, YES, on, On, ON, true, True, TRUE [6]
- Boolean false: n, N, no, No, NO, off, Off, OFF, false, False, FALSE [6]
- Integer: 685230, +685_230, -685230, 02472256, 0x_0A_74_AE, 0b1010_0111_0100_1010_1110, 190:20:30 [7]
- Floating-point: 6.8523015e+5, 685.230_15e+03, 685_230.15, 190:20:30.15, .inf, -.inf, .Inf, .INF, .NaN, .nan, .NAN [8]
- String: A to Z or "A to Z" or 'A to Z'
- Array: [y, ~, -42.1e7, "A to Z"] or, in block style:
    - y
    -
    - -42.1e7
    - A to Z
- Associative array/Object: {"John":3.14, "Jane":2.718} or, in block style:
    42: y
    A to Z: [1, 2, 3]

XML[d]
- Null: <null />[a]
- Boolean true: <boolean val="true"/>[a] or <true />[a]
- Boolean false: <boolean val="false"/>[a] or <false />[a]
- Integer: <integer>685230</integer>[a]
- Floating-point: <float>6.8523015e+5</float>[a]
- String: A to Z[a]
- Array: <array> <element type="boolean">true</element> <element type="null"/> <element type="float">-42.1e7</element> <element type="string">A to Z</element> </array>[a]
- Associative array/Object: <associative-array> <entry> <key type="integer">42</key> <value type="boolean">true</value> </entry> <entry> <key type="string">A to Z</key> <value> <array> <element type="integer" val="1"/> <element type="integer" val="2"/> <element type="integer" val="3"/> </array> </value> </entry> </associative-array>

- a. One possible encoding; the specification document does not specifically give an encoding for this datatype.
- b. The RFC CSV specification only deals with delimiters, newlines, and quote characters; it does not directly deal with serializing programming data structures.
- c. The netstrings specification only deals with nested byte strings; anything else is outside the scope of the specification.
- d. XML in and of itself is not a data serialization language, but many data serialization formats have been derived from it; as such, there are many different ways, in addition to those shown, to serialize programming data structures into XML.
- e. This syntax is not compatible with the Internet-Draft, but is used by some dialects of Lisp.
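The nested Netstrings entries above can be reproduced with a short recursive encoder. This is a minimal sketch in Python: the function names netstring and encode are hypothetical, and the mapping of lists onto nested netstrings is only the "one possible encoding" of note a, since the netstrings specification itself covers byte strings only. Scalars are passed in as already-formatted text.

```python
def netstring(payload: bytes) -> bytes:
    """Wrap a byte string as <decimal length>:<bytes>, (the netstring framing)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def encode(value) -> bytes:
    """Encode a str as a netstring, or a list as a netstring of its encoded items.

    Assumption: nesting lists this way is an illustrative convention,
    not something the netstrings specification defines.
    """
    if isinstance(value, str):
        return netstring(value.encode("utf-8"))
    return netstring(b"".join(encode(item) for item in value))


# Array example: true, null (empty string), -42.1e7, "A to Z"
print(encode(["true", "", "-42.1e7", "A to Z"]))
# b'29:4:true,0:,7:-42.1e7,6:A to Z,,'

# Associative array example: each entry is a [key, value] pair
print(encode([["42", "1"], ["A to Z", ["1", "2", "3"]]]))
# b'41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,,'
```

Because every length prefix counts bytes rather than characters, the payload is encoded to bytes before its length is taken.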
Comparison of binary formats
| Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/Object |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ASN.1 (BER or PER encoding) | NULL type | BOOLEAN; BER as 1 byte in binary form | INTEGER; variable-length big-endian binary representation up to 2^2^1024 bits | REAL; representation as IEEE double or as three integers (mantissa + base + exponent) | Multiple valid types (VisibleString, PrintableString, GeneralString, UniversalString) | Data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order) | User-definable type |
| BSON[9] | Null type: 0 bytes for value | True: one byte \x01; False: one byte \x00 | int32: 32-bit little-endian 2's complement, or int64: 64-bit little-endian 2's complement | double: little-endian binary64 | UTF-8 encoded, preceded by int32-encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document |
| MessagePack | \xc0 | True: \xc3; False: \xc2 | Single byte "fixnum" (values -32..127) or typecode (one byte) + big-endian (u)int8/16/32/64 | Typecode (one byte) + IEEE single/double | As "fixraw" (single-byte prefix + up to 31 raw bytes) or typecode (one byte) + 2- or 4-byte length + raw bytes | As "fixarray" (single-byte prefix + up to 15 array items) or typecode (one byte) + 2- or 4-byte length + array items | As "fixmap" (single-byte prefix + up to 15 key-value pairs) or typecode (one byte) + 2- or 4-byte length + key-value pairs |
| Netstrings | 0:, | True: 1:1, False: 1:0, | | | | | |
| OGDL Binary | | | | | | | |
| Property list (binary format) | | | | | | | |
| Protocol Buffers[10] | | | Variable-length signed 32-bit: varint encoding of the "ZigZag"-encoded value (n << 1) XOR (n >> 31). Variable-length signed 64-bit: varint encoding of the "ZigZag"-encoded value (n << 1) XOR (n >> 63). Fixed-length 32-bit: 32 bits in little-endian 2's complement. Fixed-length 64-bit: 64 bits in little-endian 2's complement | floats: little-endian binary32; doubles: little-endian binary64 | UTF-8 encoded, preceded by varint-encoded integer length of string in bytes | | |
| Thrift | | | | | | | |

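The Protocol Buffers integer and string encodings described above can be sketched in a few lines of Python. The function names zigzag32, varint, and length_prefixed are hypothetical, and the sketch produces only the value bytes: the field key that precedes each value in a real message is omitted.

```python
def zigzag32(n: int) -> int:
    """ZigZag-encode a signed 32-bit integer: (n << 1) XOR (n >> 31).

    Python's >> on a negative int is an arithmetic shift, so n >> 31 is
    -1 for negative inputs and 0 otherwise; the mask keeps 32 bits.
    """
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def varint(value: int) -> bytes:
    """Encode a non-negative integer 7 bits per byte, least significant group first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def length_prefixed(text: str) -> bytes:
    """UTF-8 bytes preceded by their varint-encoded length in bytes."""
    data = text.encode("utf-8")
    return varint(len(data)) + data


print([zigzag32(n) for n in (0, -1, 1, -2)])  # [0, 1, 2, 3]
print(varint(300))                            # b'\xac\x02'
print(length_prefixed("testing"))             # b'\x07testing'
```

ZigZag encoding maps small negative numbers to small unsigned numbers, so that the varint step can store them in one or two bytes instead of the maximum length.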
References
- [1] http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.text_format.html
- [2] http://www.gnustep.org/resources/documentation/Developer/Base/Reference/NSPropertyList.html
- [3] http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man5/plist.5.html
- [4] http://developer.apple.com/mac/library/documentation/CoreFoundation/Conceptual/CFPropertyLists/Articles/XMLTags.html#//apple_ref/doc/uid/20001172-CJBEJBHH
- [5] "Null Language-Independent Type for YAML™ Version 1.1". YAML.org. 2005-01-18. http://yaml.org/type/null.html. Retrieved 2009-09-12.
- [6] "Boolean Language-Independent Type for YAML™ Version 1.1". YAML.org. Clark C. Evans. 2005-01-18. http://yaml.org/type/bool.html. Retrieved 2009-09-12.
- [7] "Integer Language-Independent Type for YAML Version 1.1". YAML.org. Clark C. Evans. 2005-02-11. http://yaml.org/type/int.html. Retrieved 2009-09-12.
- [8] "Floating-Point Language-Independent Type for YAML™ Version 1.1". YAML.org. Clark C. Evans. 2005-01-18. http://yaml.org/type/float.html. Retrieved 2009-09-12.
- [9] http://bsonspec.org
- [10] http://code.google.com/apis/protocolbuffers/docs/encoding.html
Wikimedia Foundation. 2010.