#database #arrow
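- Logically identical to Utf8, but represented internally by a view struct that contains:
  - the length of the string
  - for small strings:
    - the entire string data (inlined)
  - for non-small strings:
    - the string prefix (inlined)
    - an index (pointing to another buffer)
    - an offset (the position within that buffer)
- view buffer layout: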
```
* Short strings, length <= 12
| Bytes 0-3  | Bytes 4-15                            |
|------------|---------------------------------------|
| length     | data (padded with 0)                  |

* Long strings, length > 12
| Bytes 0-3  | Bytes 4-7  | Bytes 8-11 | Bytes 12-15 |
|------------|------------|------------|-------------|
| length     | prefix     | buf. index | offset      |
```
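
The layout above can be sketched as a round trip in plain Python. This is only an illustration, not Arrow's implementation: `encode_view` and `decode_view` are hypothetical helper names, and the sketch assumes the length, buffer index, and offset are little-endian signed 32-bit integers.

```python
import struct

INLINE_MAX = 12  # strings of up to 12 bytes live entirely inside the view


def encode_view(data: bytes, buffer_index: int = 0, offset: int = 0) -> bytes:
    """Pack one 16-byte view for `data` (hypothetical helper)."""
    length = len(data)
    if length <= INLINE_MAX:
        # Bytes 0-3: length; bytes 4-15: data, zero-padded by the "12s" format.
        return struct.pack("<i12s", length, data)
    # Bytes 0-3: length; 4-7: prefix; 8-11: buffer index; 12-15: offset.
    return struct.pack("<i4sii", length, data[:4], buffer_index, offset)


def decode_view(view: bytes, buffers: list) -> bytes:
    """Recover the string bytes that a 16-byte view refers to."""
    (length,) = struct.unpack_from("<i", view, 0)
    if length <= INLINE_MAX:
        # Short string: the data sits right after the length.
        return view[4:4 + length]
    # Long string: follow the buffer index and offset into a data buffer.
    _prefix, buffer_index, offset = struct.unpack_from("<4sii", view, 4)
    return buffers[buffer_index][offset:offset + length]


# One data buffer holding a single long string at offset 0.
buffers = [b"hello, arrow string view"]
short_view = encode_view(b"arrow")
long_view = encode_view(buffers[0], buffer_index=0, offset=0)
```

Both views are exactly 16 bytes. Only the long one needs `buffers` to be decoded, since the short one carries its data inline.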