New headers and chunked checksums #111

Open
shaitan opened this issue Apr 27, 2015 · 1 comment

Comments

shaitan commented Apr 27, 2015

Introduction

Each record has a header that is stored both in the blob and in the index. The header is a binary dump of an eblob_disk_control object; it has a fixed size and, in a blob, is placed at the beginning of the record. The header contains the record's meta information: sizes, position, etc. Unless the footer is disabled, each record also has a footer, which is a binary dump of eblob_disk_footer; it also has a fixed size and, in a blob, is placed at the end of the record. The footer contains the checksum of the record.
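To make the layout concrete, here is a minimal Python sketch of a record as fixed-size header + data + fixed-size footer. The field lists are assumptions made for illustration only: this issue does not list the members of eblob_disk_control or eblob_disk_footer, so `HEADER`, `FOOTER`, `pack_record` and `unpack_record` are hypothetical names, and SHA-512 is just a placeholder checksum.

```python
import hashlib
import struct

# Hypothetical fixed-size layouts (NOT the real eblob structures):
# header = key, flags, data_size, position; footer = SHA-512 of the data.
HEADER = struct.Struct("<64sQQQ")
FOOTER = struct.Struct("<64s")


def pack_record(key: bytes, flags: int, position: int, data: bytes) -> bytes:
    """Serialize a record: header at the beginning, footer at the end."""
    header = HEADER.pack(key, flags, len(data), position)
    footer = FOOTER.pack(hashlib.sha512(data).digest())
    return header + data + footer


def unpack_record(buf: bytes):
    """Parse a record and verify the footer checksum over the whole data."""
    key, flags, data_size, position = HEADER.unpack_from(buf, 0)
    data = buf[HEADER.size:HEADER.size + data_size]
    (csum,) = FOOTER.unpack_from(buf, HEADER.size + data_size)
    if hashlib.sha512(data).digest() != csum:
        raise ValueError("checksum mismatch")
    return key, flags, position, data
```

Note that in this sketch verification has to hash the whole data, which is the cost described in problem 2 below.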

Problems

  1. If we decide to extend the header, we will have to convert all blobs to the new header format.
  2. Checksumming a record takes time proportional to the record size, so it is slow for huge records.

Solutions

1. extendable headers

We can use msgpack with fixed fields for header serialization. If the header is extended, blobs with the old header remain readable, but all new writes go to new blobs with the new header. Defragmentation can also convert blobs with old headers to the new format.
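A rough sketch of the compatibility idea, in Python. To stay dependency-free it does not use msgpack; a version number plus a length prefix stands in for it. All names and fields here (`PREFIX`, `V1`, `V2`, `chunk_size`, `read_header`) are hypothetical and only show how a reader could accept both old and new headers.

```python
import struct

PREFIX = struct.Struct("<HH")  # header version, length of the header body
V1 = struct.Struct("<QQ")      # old body: flags, data_size
V2 = struct.Struct("<QQQ")     # new body: flags, data_size, chunk_size


def pack_header_v1(flags: int, data_size: int) -> bytes:
    body = V1.pack(flags, data_size)
    return PREFIX.pack(1, len(body)) + body


def pack_header_v2(flags: int, data_size: int, chunk_size: int) -> bytes:
    body = V2.pack(flags, data_size, chunk_size)
    return PREFIX.pack(2, len(body)) + body


def read_header(buf: bytes) -> dict:
    """Read either header version; fields missing in old headers get defaults."""
    version, length = PREFIX.unpack_from(buf, 0)
    body = buf[PREFIX.size:PREFIX.size + length]
    flags, data_size = V1.unpack_from(body, 0)
    chunk_size = 0  # default for headers written before the extension
    if version >= 2:
        (chunk_size,) = struct.unpack_from("<Q", body, V1.size)
    return {
        "version": version,
        "flags": flags,
        "data_size": data_size,
        "chunk_size": chunk_size,
        # The length prefix tells the reader where the data starts.
        "header_size": PREFIX.size + length,
    }
```

Because the reader takes the header size from the prefix rather than from a compile-time constant, old blobs stay readable without conversion.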

2. checksumming of huge files

We can split a file into chunks and checksum each chunk. We can also add a new record flag for records that are checksummed by chunk. This avoids having to convert existing blobs up front; they can be converted during defragmentation.
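As an illustration, a chunked checksum could look roughly like the Python sketch below. The chunk size, the flag bit and the hash function are assumptions; the issue does not specify any of them, and `FLAG_CHUNKED_CSUM`, `chunk_checksums` and `verify_range` are hypothetical names.

```python
import hashlib

CHUNK_SIZE = 1 << 20        # hypothetical chunk size (1 MiB)
FLAG_CHUNKED_CSUM = 1 << 0  # hypothetical record flag bit


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return one SHA-512 digest per chunk of the data."""
    return [
        hashlib.sha512(data[off:off + chunk_size]).digest()
        for off in range(0, len(data), chunk_size)
    ]


def verify_range(data: bytes, csums: list, offset: int, size: int,
                 chunk_size: int = CHUNK_SIZE) -> bool:
    """Verify only the chunks that overlap [offset, offset + size)."""
    if size <= 0:
        return True
    first = offset // chunk_size
    last = (offset + size - 1) // chunk_size
    for i in range(first, last + 1):
        chunk = data[i * chunk_size:(i + 1) * chunk_size]
        if hashlib.sha512(chunk).digest() != csums[i]:
            return False
    return True
```

With this scheme a partial read only hashes the chunks it touches, so the verification cost follows the size of the read and not the size of the record. Records without the flag would keep using the single whole-record checksum.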

shaitan self-assigned this May 18, 2015

shaitan commented Jun 29, 2015

Chunked checksums are implemented in #131 and reverbrain/elliptics#629
