Notice
This page describes the MessagePack data format, which is required for developing the MessagePack language bindings.
MessagePack format specification
MessagePack stores type information in the serialized data itself. Each value is therefore stored in either *type-data* or *type-length-data* style.
MessagePack supports the following types:
- Fixed length types
  - Integers
  - nil
  - boolean
  - Floating point
- Variable length types
  - Raw bytes
- Container types
  - Arrays
  - Maps
Each type has one or more serialization formats:
- Fixed length types
  - Integers
    - positive fixnum
    - negative fixnum
    - uint 8
    - uint 16
    - uint 32
    - uint 64
    - int 8
    - int 16
    - int 32
    - int 64
  - Nil
    - nil
  - Boolean
    - true
    - false
  - Floating point
    - float
    - double
- Variable length types
  - Raw bytes
    - fix raw
    - raw 16
    - raw 32
- Container types
  - Arrays
    - fix array
    - array 16
    - array 32
  - Maps
    - fix map
    - map 16
    - map 32
To serialize a string, encode it as UTF-8 and store the result using the Raw type.
See this thread to understand why msgpack doesn't have a string type: https://github.com/msgpack/msgpack/issues/121
Integers
positive fixnum
save an integer within the range [0, 127] in 1 byte.
negative fixnum
save an integer within the range [-32, -1] in 1 byte.
uint 8
save an unsigned 8-bit integer in 2 bytes.
uint 16
save an unsigned 16-bit big-endian integer in 3 bytes.
uint 32
save an unsigned 32-bit big-endian integer in 5 bytes.
uint 64
save an unsigned 64-bit big-endian integer in 9 bytes.
int 8
save a signed 8-bit integer in 2 bytes.
int 16
save a signed 16-bit big-endian integer in 3 bytes.
int 32
save a signed 32-bit big-endian integer in 5 bytes.
int 64
save a signed 64-bit big-endian integer in 9 bytes.
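The integer formats above can be sketched as a small encoder. This is a minimal illustration rather than a reference implementation. The function name `pack_int` is hypothetical, and the tag bytes are taken from the Type Chart. Two choices are assumptions of this sketch and are not required by the text above: picking the smallest format that fits, and preferring the unsigned formats for non-negative values.

```python
import struct


def pack_int(n: int) -> bytes:
    """Encode n using the smallest integer format that can hold it (hypothetical helper)."""
    if 0 <= n <= 0x7F:
        return struct.pack("B", n)              # positive fixnum: 0xxxxxxx
    if -32 <= n < 0:
        return struct.pack("b", n)              # negative fixnum: 111xxxxx
    if n > 0:
        if n <= 0xFF:
            return struct.pack(">BB", 0xCC, n)  # uint 8, 2 bytes
        if n <= 0xFFFF:
            return struct.pack(">BH", 0xCD, n)  # uint 16, 3 bytes
        if n <= 0xFFFFFFFF:
            return struct.pack(">BI", 0xCE, n)  # uint 32, 5 bytes
        if n <= 0xFFFFFFFFFFFFFFFF:
            return struct.pack(">BQ", 0xCF, n)  # uint 64, 9 bytes
    else:
        if n >= -0x80:
            return struct.pack(">Bb", 0xD0, n)  # int 8, 2 bytes
        if n >= -0x8000:
            return struct.pack(">Bh", 0xD1, n)  # int 16, 3 bytes
        if n >= -0x80000000:
            return struct.pack(">Bi", 0xD2, n)  # int 32, 5 bytes
        if n >= -0x8000000000000000:
            return struct.pack(">Bq", 0xD3, n)  # int 64, 9 bytes
    raise OverflowError("integer does not fit in any MessagePack integer format")
```

For example, `pack_int(5)` yields the single byte `05`, while `pack_int(300)` yields `cd 01 2c` (the uint 16 tag followed by the big-endian value).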
Nil
nil
save a nil.
Boolean
true
save a true.
false
save a false.
Floating point
float
save a big-endian IEEE 754 single precision floating point number in 5 bytes.
double
save a big-endian IEEE 754 double precision floating point number in 9 bytes.
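As a rough sketch of the two floating point formats, the helpers below write the tag byte from the Type Chart followed by the big-endian IEEE 754 payload. The function names are hypothetical. Note that a Python `float` is a double, so `pack_float` rounds the value to single precision.

```python
import struct


def pack_float(x: float) -> bytes:
    """float: tag 0xca + 4-byte big-endian IEEE 754 single (5 bytes total)."""
    return struct.pack(">Bf", 0xCA, x)


def pack_double(x: float) -> bytes:
    """double: tag 0xcb + 8-byte big-endian IEEE 754 double (9 bytes total)."""
    return struct.pack(">Bd", 0xCB, x)


def unpack_floating(data: bytes) -> float:
    """Decode a float or double from the start of data, based on its tag byte."""
    tag = data[0]
    if tag == 0xCA:
        return struct.unpack(">f", data[1:5])[0]
    if tag == 0xCB:
        return struct.unpack(">d", data[1:9])[0]
    raise ValueError("not a float or double tag: 0x%02x" % tag)
```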
Raw bytes
fix raw
save raw bytes up to 31 bytes.
raw 16
save raw bytes up to (2^16)-1 bytes. Length is stored in unsigned 16-bit big-endian integer.
raw 32
save raw bytes up to (2^32)-1 bytes. Length is stored in unsigned 32-bit big-endian integer.
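The three raw formats differ only in how the length is written before the payload. The sketch below illustrates this under the assumption that the shortest header is preferred. `pack_raw` and `pack_str` are hypothetical names, and `pack_str` simply applies the UTF-8 convention for strings described earlier on this page.

```python
import struct


def pack_raw(data: bytes) -> bytes:
    """Prefix data with the shortest raw header that can hold its length."""
    n = len(data)
    if n <= 31:
        header = struct.pack("B", 0xA0 | n)     # fix raw: 101xxxxx, length in low 5 bits
    elif n <= 0xFFFF:
        header = struct.pack(">BH", 0xDA, n)    # raw 16: unsigned 16-bit big-endian length
    elif n <= 0xFFFFFFFF:
        header = struct.pack(">BI", 0xDB, n)    # raw 32: unsigned 32-bit big-endian length
    else:
        raise ValueError("raw data longer than (2^32)-1 bytes")
    return header + data


def pack_str(text: str) -> bytes:
    """Serialize a string as UTF-8 bytes stored in the Raw type."""
    return pack_raw(text.encode("utf-8"))
```

With this sketch, `pack_str("abc")` produces `a3 61 62 63`: the fix raw tag with length 3, then the three UTF-8 bytes.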
Arrays
fix array
save an array up to 15 elements.
array 16
save an array up to (2^16)-1 elements. Number of elements is stored in unsigned 16-bit big-endian integer.
array 32
save an array up to (2^32)-1 elements. Number of elements is stored in unsigned 32-bit big-endian integer.
Maps
fix map
save a map up to 15 elements.
map 16
save a map up to (2^16)-1 elements. Number of elements is stored in unsigned 16-bit big-endian integer.
map 32
save a map up to (2^32)-1 elements. Number of elements is stored in unsigned 32-bit big-endian integer.
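Arrays and maps follow the same pattern: a header carrying the element count, followed by the already-serialized elements. The sketch below uses hypothetical helper names and takes pre-encoded elements as `bytes`, so that it stays independent of how individual values are encoded. It also assumes that a map's count is the number of key-value pairs, with each key written immediately before its value. The text above does not spell that out.

```python
import struct


def _container_header(count, fix_base, tag16, tag32):
    """Build a container header: fix form up to 15, then 16-bit, then 32-bit count."""
    if count <= 15:
        return struct.pack("B", fix_base | count)   # count in the low 4 bits
    if count <= 0xFFFF:
        return struct.pack(">BH", tag16, count)
    if count <= 0xFFFFFFFF:
        return struct.pack(">BI", tag32, count)
    raise ValueError("too many elements")


def array_header(count):
    return _container_header(count, 0x90, 0xDC, 0xDD)   # fix array / array 16 / array 32


def map_header(count):
    return _container_header(count, 0x80, 0xDE, 0xDF)   # fix map / map 16 / map 32


def pack_array(encoded_items):
    """encoded_items: sequence of already-serialized elements (bytes)."""
    return array_header(len(encoded_items)) + b"".join(encoded_items)


def pack_map(encoded_pairs):
    """encoded_pairs: sequence of (key_bytes, value_bytes), both already serialized."""
    body = b"".join(key + value for key, value in encoded_pairs)
    return map_header(len(encoded_pairs)) + body
```

For instance, the array `[1, 2, 3]` with each element as a positive fixnum becomes `93 01 02 03`, and a map from the fixnum 1 to `true` becomes `81 01 c3`.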
Type Chart
| Type | Binary | Hex |
|---|---|---|
| Positive FixNum | 0xxxxxxx | 0x00 - 0x7f |
| FixMap | 1000xxxx | 0x80 - 0x8f |
| FixArray | 1001xxxx | 0x90 - 0x9f |
| FixRaw | 101xxxxx | 0xa0 - 0xbf |
| nil | 11000000 | 0xc0 |
| reserved | 11000001 | 0xc1 |
| false | 11000010 | 0xc2 |
| true | 11000011 | 0xc3 |
| reserved | 11000100 | 0xc4 |
| reserved | 11000101 | 0xc5 |
| reserved | 11000110 | 0xc6 |
| reserved | 11000111 | 0xc7 |
| reserved | 11001000 | 0xc8 |
| reserved | 11001001 | 0xc9 |
| float | 11001010 | 0xca |
| double | 11001011 | 0xcb |
| uint 8 | 11001100 | 0xcc |
| uint 16 | 11001101 | 0xcd |
| uint 32 | 11001110 | 0xce |
| uint 64 | 11001111 | 0xcf |
| int 8 | 11010000 | 0xd0 |
| int 16 | 11010001 | 0xd1 |
| int 32 | 11010010 | 0xd2 |
| int 64 | 11010011 | 0xd3 |
| reserved | 11010100 | 0xd4 |
| reserved | 11010101 | 0xd5 |
| reserved | 11010110 | 0xd6 |
| reserved | 11010111 | 0xd7 |
| reserved | 11011000 | 0xd8 |
| reserved | 11011001 | 0xd9 |
| raw 16 | 11011010 | 0xda |
| raw 32 | 11011011 | 0xdb |
| array 16 | 11011100 | 0xdc |
| array 32 | 11011101 | 0xdd |
| map 16 | 11011110 | 0xde |
| map 32 | 11011111 | 0xdf |
| Negative FixNum | 111xxxxx | 0xe0 - 0xff |
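
Read in the other direction, the chart tells a decoder which format it is looking at from the first byte alone. The following is a minimal sketch of that lookup. The function name `type_name` and the table name are hypothetical, and the ranges and tags are copied from the chart above.

```python
# Single-byte tags from the chart; anything in 0xc0-0xdf not listed here is reserved.
SINGLE_BYTE_TAGS = {
    0xC0: "nil", 0xC2: "false", 0xC3: "true",
    0xCA: "float", 0xCB: "double",
    0xCC: "uint 8", 0xCD: "uint 16", 0xCE: "uint 32", 0xCF: "uint 64",
    0xD0: "int 8", 0xD1: "int 16", 0xD2: "int 32", 0xD3: "int 64",
    0xDA: "raw 16", 0xDB: "raw 32",
    0xDC: "array 16", 0xDD: "array 32",
    0xDE: "map 16", 0xDF: "map 32",
}


def type_name(first_byte: int) -> str:
    """Return the format name for the first byte of a serialized value."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("not a byte value")
    if first_byte <= 0x7F:
        return "positive fixnum"    # 0xxxxxxx
    if first_byte <= 0x8F:
        return "fix map"            # 1000xxxx
    if first_byte <= 0x9F:
        return "fix array"          # 1001xxxx
    if first_byte <= 0xBF:
        return "fix raw"            # 101xxxxx
    if first_byte >= 0xE0:
        return "negative fixnum"    # 111xxxxx
    return SINGLE_BYTE_TAGS.get(first_byte, "reserved")
```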

8 Comments
Jun 11, 2011
David Souther
With the proliferation of 128-bit UUIDs (IPv6, ext4/btrfs, MS COM objects, etc.), I could argue that it makes sense to include a dedicated UINT128 type. While it could be handled with prefix 0xB0 (16-byte FixRaw), the format then loses the type specification that makes it otherwise so nice, forcing the application to understand the low-level transport mechanism (a bad thing). Because 128-bit numbers are used so often as identifiers (I'm having trouble thinking of numerical applications needing 128-bit integers), I would tentatively propose using 0xC1 as "128-bit UUID", packing to 17 bytes, while retaining type information.
Is there a place for this, or a reason against it?
Oct 04, 2011
Anders Dalvander
On the other hand, there is no mention of a "string" type or a "date/time" type, either of which would perhaps be more useful than a 128-bit integer.
If a "string" type were added, the encoding would also have to be specified in order to keep interoperability. In that case I would prefer that UTF-8 be chosen.
Jan 07, 2012
Javier Gonel
Msgpack, protocol buffers, even json are basic serialization formats. You build on top of them.
I've modified the .NET implementation of message buffers to output the string representation, as my code speaks with ruby clients that love strings everywhere. But you could just write your own definition of what a GUID is.
As I see it, this is a two-layer process: the first layer uses messagepack primitives, and the second adapts them to your language/API.
Dates are another example. I just use strings and write ISO dates there.
Javier.
Jan 20, 2012
Nico Poggi
I am missing a way to easily detect whether a string is msgpack-serialized. With other formats, e.g. PHP's serialize, you can easily detect serialized data by looking at the second character (: or ;).
I can receive a string in my code that might be serialized with PHP, igbinary, or msgpack. Furthermore, the msgpack_unserialize() extension for PHP does not return false or an error on arbitrary strings.
Any suggestions that avoid performing the unserialize, apart from checking the other formats first?
Apr 20, 2012
Egil Möller
I'm missing an extension point. That is, a way to encode the usage of more complex abstract types built from the basic ones. What I'd like is something like
(type, value)
where both type and value can be values of any msgpack type. The interpretation above parsing those values individually would be outside of the scope of msgpack, and other specifications may define an interpretation for value given a certain type. Parsers would by default decode / encode this to / from an object of a specific type with two properties, "type" and "value", but could allow registration of encoding/decoding functions in whatever fashion makes sense in the implementation language / framework.
Implementation would be very much like FixArray, but the length is always exactly two. I propose using 11000001 followed by exactly two objects.
Note: It generally makes very little sense for the type to be anything but a string, array or map. I can however not see any good gains by artificially restricting this.
Example usages (outside of the scope of the msgpack specification):
Jul 03, 2012
Reini Urban
I'm missing an optional CRC tag for basic security and detection of data corruption. If pack provides a CRC checksum, unpack should check the result against the given CRC and report an error if they differ. The CRC tag must be the last tag in the buffer. The CRC checksum does not cover itself or its type tag.
There are several reserved bytes free for this. For example 0xc6 is similar to 0xce, only one bit apart.
0xc6 crc with uint32 (optional)
I implemented this for c and perl here:
https://github.com/msgpack/msgpack/pull/114
https://github.com/msgpack/msgpack-perl/pull/7
Our main reason for using the optional CRC is to be sure that the writer has already finished writing, by checking that the fifth-last byte is 0xc6. Omit parsing and unpacking then. Some writers buffer their output, and the reader optimistically tries to parse whenever the ctime changes.
Jun 07, 2013
Christopher Dunn
MsgPack is great, and I completely agree about strings.
But could we get this wiki into GitHub? This site was down recently, and that's a major problem. I cannot find the spec anywhere else on the web.
I've copied the bare minimum here: https://github.com/cdunn2001/msgpack-spec
That's only a README. With GitHub, the wiki attached to a repo is itself available via git. That would be perfect.
Jun 07, 2013
Christopher Dunn
See wiki:
https://github.com/cdunn2001/msgpack-spec/wiki/Spec
and README:
https://github.com/cdunn2001/msgpack-spec/blob/master/README.md
The wiki looks a little better because of the surprising syntax-highlighting. Feel free to clone.