Added docs to the control byte.
Signed-off-by: Adam Rocska <adam.rocska@adams.solutions>
parent cc4305f2d2
commit 6d98c2d5ff

@@ -1,5 +1,83 @@
import Foundation

/**
# Control Byte
Primary source of truth: [MaxMind DB Spec](https://maxmind.github.io/MaxMind-DB/)
## Data Field Format
Each field starts with a control byte. This control byte provides information
about the field's data type and payload size.

The first three bits of the control byte tell you what type the field is. If
these bits are all 0, then this is an "extended" type, which means that the
*next* byte contains the actual type. Otherwise, the first three bits will
contain a number from 1 to 7, the actual type for the field.

We've tried to assign the most commonly used types as numbers 1-7 as an
optimization.

With an extended type, the second byte stores the type number minus 7. In
other words, an array (type 11) will be stored with a 0 for the type in the
first byte and a 4 in the second.

Here is an example of how the control byte may combine with the next byte to
tell us the type:

    001XXXXX          pointer
    010XXXXX          UTF-8 string
    110XXXXX          unsigned 32-bit int (ASCII)
    000XXXXX 00000011 unsigned 128-bit int (binary)
    000XXXXX 00000100 array
    000XXXXX 00000110 end marker

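As a rough illustration of the type rules above, the sketch below reads the
type number from a control byte and, for an extended type, from the byte that
follows it. The function name `fieldTypeNumber` is made up for this example;
it is not part of this file or of any MaxMind library.

```swift
// Hypothetical helper for illustration only.
// Returns the numeric type of a field, given its control byte and the byte
// that follows it. The second byte is only consulted for extended types.
func fieldTypeNumber(controlByte: UInt8, nextByte: UInt8) -> Int {
    // The type lives in the three most significant bits.
    let typeBits = Int(controlByte >> 5)
    if typeBits != 0 {
        return typeBits          // regular type, 1 through 7
    }
    // Extended type: the next byte stores the type number minus 7.
    return Int(nextByte) + 7
}

print(fieldTypeNumber(controlByte: 0b0010_0000, nextByte: 0))            // 1  (pointer)
print(fieldTypeNumber(controlByte: 0b0100_0000, nextByte: 0))            // 2  (UTF-8 string)
print(fieldTypeNumber(controlByte: 0b0000_0000, nextByte: 0b0000_0100))  // 11 (array)
```

The five size bits are ignored here because the right shift discards them, so
any value of `XXXXX` yields the same type.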
### Payload Size

The next five bits in the control byte tell you how long the data field's
payload is, except for maps and pointers. Maps and pointers use this size
information a bit differently, as described in the spec linked above.

If the value of the five bits is less than 29, then that value is the payload
size in bytes. For example:

    01000010          UTF-8 string - 2 bytes long
    01011100          UTF-8 string - 28 bytes long
    11000001          unsigned 32-bit int - 1 byte long
    00000011 00000011 unsigned 128-bit int - 3 bytes long

If the value of the five bits is 29, 30, or 31, then use the following
algorithm to calculate the payload size.

If the value is 29, then the size is 29 + *the next byte after the
type-specifying bytes as an unsigned integer*.

If the value is 30, then the size is 285 + *the next two bytes after the
type-specifying bytes as a single unsigned integer*.

If the value is 31, then the size is 65,821 + *the next three bytes after the
type-specifying bytes as a single unsigned integer*.

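The size rules above can be sketched as follows. `payloadSize(of:)` is a
hypothetical helper, not part of this file. It takes the bytes of a field
starting at the control byte and skips the extra type byte of an extended
type. It also assumes the extra size bytes are big-endian: that is how the
MaxMind DB spec stores integers, but the examples in this comment use
identical bytes and so do not show the byte order. Maps and pointers are not
handled.

```swift
// Hypothetical helper for illustration only.
// `bytes` must start at the control byte of a field.
// Returns nil when `bytes` is too short to hold the size information.
func payloadSize(of bytes: [UInt8]) -> Int? {
    guard let controlByte = bytes.first else { return nil }
    let sizeBits = Int(controlByte & 0b0001_1111)   // low five bits

    // How many extra size bytes follow, and the base added to their value.
    let extraCount: Int
    let base: Int
    switch sizeBits {
    case 29: extraCount = 1; base = 29
    case 30: extraCount = 2; base = 285
    case 31: extraCount = 3; base = 65_821
    default: return sizeBits                        // 0 through 28: the size itself
    }

    // An extended type (type bits all 0) has one more type byte to skip.
    let start = (controlByte >> 5) == 0 ? 2 : 1
    guard bytes.count >= start + extraCount else { return nil }

    // Assumption: the extra bytes form one big-endian unsigned integer.
    var extra = 0
    for byte in bytes[start..<(start + extraCount)] {
        extra = (extra << 8) | Int(byte)
    }
    return base + extra
}

let small = payloadSize(of: [0b0100_0010])                    // 2
let medium = payloadSize(of: [0b0101_1101, 0b0011_0011])      // 29 + 51 = 80
let truncated = payloadSize(of: [0b0101_1110, 0b0011_0011])   // nil, one size byte is missing
```

Returning an optional rather than trapping is a design choice of this sketch,
so that a truncated buffer can be reported to the caller.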
Some examples:

    01011101 00110011 UTF-8 string - 80 bytes long

In this case, the last five bits of the control byte equal 29. We treat the
next byte as an unsigned integer. The next byte is 51, so the total size is
(29 + 51) = 80.

    01011110 00110011 00110011 UTF-8 string - 13,392 bytes long

The last five bits of the control byte equal 30. We treat the next two bytes
as a single unsigned integer. The next two bytes equal 13,107, so the total
size is (285 + 13,107) = 13,392.

    01011111 00110011 00110011 00110011 UTF-8 string - 3,421,264 bytes long

The last five bits of the control byte equal 31. We treat the next three bytes
as a single unsigned integer. The next three bytes equal 3,355,443, so the
total size is (65,821 + 3,355,443) = 3,421,264.

This means that the maximum payload size for a single field is 16,843,036
bytes: 65,821 plus the largest three-byte value, 16,777,215.
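
The constants 29, 285, and 65,821 are not arbitrary: each one is the first
size that the previous encoding cannot express. A minimal sketch of that
arithmetic, with constant names invented for this example:

```swift
// Illustrative constants only; the names are not part of this file.
let oneByteBase = 29                                 // sizes 0...28 fit in the five bits
let twoByteBase = oneByteBase + (1 << 8)             // 29 + 256 = 285
let threeByteBase = twoByteBase + (1 << 16)          // 285 + 65,536 = 65,821
let maxPayloadSize = threeByteBase + (1 << 24) - 1   // 65,821 + 16,777,215 = 16,843,036
```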
*/
struct ControlByte {

    let type: DataType