This repository has been archived by the owner on Feb 6, 2024. It is now read-only.

Add indices and stats file format specification
Add a specification for a container file format to store indices and
stats for Iceberg tables.
findepi committed May 18, 2022
1 parent 4f8dd64 commit 3f6a2ce
Showing 1 changed file with 144 additions and 0 deletions.
144 changes: 144 additions & 0 deletions landing-page/content/common/index-and-statistics-format.md
@@ -0,0 +1,144 @@
---
url: index-and-statistics-format
toc: false
---
<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-->

# Index and statistics file format

This is a specification for the Plain Format for Iceberg Statistics, a file
format designed to store information such as statistics about data managed in an
Iceberg table that cannot be stored directly within the Iceberg manifest. A
statistics file contains arbitrary pieces of information (here called "blobs"),
along with metadata necessary to interpret them. The blobs supported by Iceberg
are documented at [Blob types](#blob-types).

## Format specification

A file conforming to this specification has the structure described below.

### Versions

Currently, there is a single version of the file format, described below.

### File structure

The file has the following structure:

```
Magic Blob₁ Blob₂ ... Blobₙ Footer
```

where

- `Magic` is four bytes 0x50, 0x46, 0x42, 0x31 (short for: Plain Format for
Blobs, version 1),
- `Blobᵢ` is the i-th blob contained in the file, to be interpreted by the
application according to the footer,
- `Footer` is defined below.
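
As a rough illustration of this layout, the following Python sketch concatenates the magic and a list of blobs while recording where each blob starts. The helper name `layout_blobs` is hypothetical, and counting offsets from the first byte of the file (so that the first blob starts at offset 4) is an assumption about how blob offsets are measured.

```python
MAGIC = bytes([0x50, 0x46, 0x42, 0x31])  # the four magic bytes, ASCII "PFB1"


def layout_blobs(blobs):
    """Return (file_prefix, positions) for the part of the file before the footer.

    positions[i] is the (offset, length) of blob i, with the offset counted
    from the first byte of the file (an assumption, see the text above).
    """
    out = bytearray(MAGIC)
    positions = []
    for blob in blobs:
        positions.append((len(out), len(blob)))
        out += blob
    return bytes(out), positions


# Two placeholder blobs; real blobs would hold statistics or sketches.
prefix, positions = layout_blobs([b"abc", b"de"])
```

The recorded positions are what a writer would later place in the footer so that a reader can locate each blob.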

### Footer structure

The footer has the following structure:

```
Magic FooterPayload FooterPayloadSize Flags Magic
```

where

- `Magic`: four bytes, the same as at the beginning of the file,
- `FooterPayload`: optionally compressed, UTF-8 encoded JSON payload describing the
blobs in the file, with the structure described below,
- `FooterPayloadSize`: the length in bytes of the `FooterPayload` as stored in the
file (that is, after compression, if any), stored as a 4-byte integer,
- `Flags`: 4 bytes for boolean flags:
  - byte 0 (first):
    - bit 0 (lowest bit): whether `FooterPayload` is compressed,
    - all other bits are reserved for future use and should be set to 0 on write,
  - all other bytes are reserved for future use and should be set to 0 on write.

A 4-byte integer is always signed, in two's complement representation, and stored
little-endian.
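
The footer layout can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `build_footer` and `parse_footer` are hypothetical, locating the footer by reading backwards from the end of the file is an assumption about how a reader would proceed, and the payload is returned as stored, so any decompression is left to the caller.

```python
import struct

MAGIC = bytes([0x50, 0x46, 0x42, 0x31])
FLAG_COMPRESSED = 0x01  # byte 0, bit 0 of Flags


def build_footer(payload: bytes, compressed: bool = False) -> bytes:
    """Assemble Magic FooterPayload FooterPayloadSize Flags Magic."""
    # All reserved bits and bytes are written as 0.
    flags = bytes([FLAG_COMPRESSED if compressed else 0, 0, 0, 0])
    size = struct.pack("<i", len(payload))  # 4-byte signed, little-endian
    return MAGIC + payload + size + flags + MAGIC


def parse_footer(data: bytes):
    """Return (payload_as_stored, compressed) for the footer ending `data`."""
    if len(data) < 16 or data[-4:] != MAGIC:
        raise ValueError("missing trailing magic")
    flags = data[-8:-4]
    (size,) = struct.unpack("<i", data[-12:-8])
    start = len(data) - 12 - size  # first byte of FooterPayload
    if size < 0 or start < 4 or data[start - 4:start] != MAGIC:
        raise ValueError("malformed footer")
    return data[start:len(data) - 12], bool(flags[0] & FLAG_COMPRESSED)
```

Because the payload size and flags sit at fixed distances from the end of the file, the footer can be found without scanning the blobs.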

### Footer Payload

The footer payload is stored either uncompressed or LZ4-compressed (as a single
[LZ4 compression frame](https://github.com/lz4/lz4/blob/77d1b93f72628af7bbde0243b4bba9205c3138d9/doc/lz4_Frame_format.md)
with content size present). Once decompressed, it is a UTF-8 encoded JSON payload
representing a single `FileMetadata` object.

#### FileMetadata

`FileMetadata` has the following fields


| Field Name | Field Type                              | Required | Description |
| ---------- | --------------------------------------- | -------- | ----------- |
| blobs      | list of BlobMetadata objects            | yes      |             |
| properties | JSON object with string property values | no       | storage for arbitrary meta-information, like writer identification/version. See [Common properties](#common-properties) for properties that are recommended to be set by a writer. |

#### BlobMetadata

`BlobMetadata` has the following fields

| Field Name        | Field Type        | Required | Description |
|-------------------|-------------------| -------- | ----------- |
| type              | JSON string       | yes      | See [Blob types](#blob-types) |
| fields            | list of JSON long | yes      | List of field IDs the blob was computed for; the order of items is used to compute sketches stored in the blob. |
| offset            | JSON long         | yes      | The offset in the file where the blob contents start |
| length            | JSON long         | yes      | The length of the blob stored in the file |
| compression-codec | JSON string       | no       | See [Compression codecs](#compression-codecs). If omitted, the data is assumed to be uncompressed. |
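
To make the two tables concrete, here is a small Python sketch that builds a `FileMetadata` object for a single uncompressed blob, serializes it as UTF-8 JSON, and uses `offset` and `length` to read the blob back. The field ID `1`, the writer name, the placeholder blob contents, and the helper `read_blob` are all made up for illustration; measuring `offset` from the first byte of the file is likewise an assumption.

```python
import json

blob = b"\x2a" + b"\x00" * 7  # placeholder blob contents (8 bytes)

file_metadata = {
    "blobs": [
        {
            "type": "ndv-long-little-endian",
            "fields": [1],        # hypothetical field ID
            "offset": 4,          # right after the 4-byte magic (assumption)
            "length": len(blob),
            # "compression-codec" omitted: the blob is uncompressed
        }
    ],
    "properties": {"created-by": "example-writer version 0"},
}

# The footer payload is this object serialized as UTF-8 encoded JSON.
payload = json.dumps(file_metadata).encode("utf-8")


def read_blob(file_bytes: bytes, blob_metadata: dict) -> bytes:
    """Slice one blob out of the file using its BlobMetadata."""
    if "compression-codec" in blob_metadata:
        raise NotImplementedError("decompression is out of scope for this sketch")
    start = blob_metadata["offset"]
    return file_bytes[start:start + blob_metadata["length"]]
```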

### Blob types

A blob can be of any of the types listed below.

#### `ndv-long-little-endian` blob type

An 8-byte unsigned integer, stored little-endian, representing the number of
distinct values of a single field.
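
A minimal Python sketch of this encoding, using the standard `struct` module (the function names `encode_ndv` and `decode_ndv` are hypothetical):

```python
import struct


def encode_ndv(ndv: int) -> bytes:
    """Encode a distinct-value count as an 8-byte unsigned little-endian integer."""
    return struct.pack("<Q", ndv)


def decode_ndv(blob: bytes) -> int:
    """Decode the contents of an `ndv-long-little-endian` blob."""
    if len(blob) != 8:
        raise ValueError("expected exactly 8 bytes")
    (ndv,) = struct.unpack("<Q", blob)
    return ndv
```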

#### `apache-datasketches-theta-v1` blob type

A serialized form of a "compact" Theta sketch produced by the [Apache
DataSketches](https://datasketches.apache.org/) library. The sketch is obtained by
constructing an Alpha family sketch with the default seed and feeding it individual
distinct values converted to bytes using Iceberg's single-value serialization.

### Compression codecs

Blob data may be stored uncompressed. If it is compressed, the codec should be one
of the codecs listed below. For maximal interoperability, other codecs are not supported.

| Codec name | Description |
|------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| lz4 | Single [LZ4 compression frame](https://github.com/lz4/lz4/blob/77d1b93f72628af7bbde0243b4bba9205c3138d9/doc/lz4_Frame_format.md), with content size present |
| zstd | Single [Zstandard compression frame](https://github.com/facebook/zstd/blob/8af64f41161f6c2e0ba842006fe238c664a6a437/doc/zstd_compression_format.md#zstandard-frames), with content size present |

### Common properties

When writing a file, it is recommended to set the following properties in the
`properties` field of [FileMetadata](#filemetadata).

- `created-by` - human-readable identification of the application writing the stats file,
along with its version. Example: "Trino version 381".
- `source-snapshot-id` - the table snapshot that was used to calculate blob contents
- `source-sequence-number` - sequence number of the table snapshot used to calculate blob contents
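
Since `properties` is a JSON object with string property values, numeric values such as the snapshot ID would need to be written as strings. A minimal Python sketch, where the helper name `common_properties` and the "NAME version VERSION" layout of `created-by` (modeled on the example above) are assumptions:

```python
def common_properties(application: str, version: str,
                      snapshot_id: int, sequence_number: int) -> dict:
    """Build the recommended properties, with every value stored as a string."""
    return {
        "created-by": f"{application} version {version}",
        "source-snapshot-id": str(snapshot_id),
        "source-sequence-number": str(sequence_number),
    }


# The snapshot ID and sequence number here are made-up example values.
props = common_properties("Trino", "381", snapshot_id=123456789, sequence_number=7)
```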
