ZSTD codec #679

Closed
wants to merge 1 commit into from
vthacker
Contributor

Summary

Describe the goal of this PR. Mention any related Issue numbers.

Requirements

@bryanlb
Contributor

bryanlb commented Oct 25, 2023

I looked a bit more into this, and I'm not convinced that ZSTD is the direction we want to go. The discussion from OpenSearch here specifically raises some concerns:

> One of those reasons is the hard dependency on native compiled code, which often leads to portability issues due to glibc differences.

There are also currently issues with memory leaks, and aside from that, the performance (especially at query time) is a mixed bag. If we were already running zlib this might be more interesting, but we currently use the default of BEST_SPEED.

apache/lucene#9784 (comment)

| Codec | Indexing time (ms) | Disk usage (MB) | Retrieval time per 10k docs (ms) |
| --- | --- | --- | --- |
| BEST_SPEED (LZ4 with small blocks) | 35383 | 90.175 | 190.17524 |
| BEST_COMPRESSION (vanilla zlib, DEFLATE level=6) | 76671 | 58.682 | 1910.42106 |
| BEST_COMPRESSION (Cloudflare zlib, DEFLATE level=6) | 54791 | 58.601 | 1395.53593 |
| ZSTD dict (level=1) | 24687 | 63.324 | 928.73997 |
| ZSTD dict (level=2) | 24934 | 63.722 | 977.29911 |
| ZSTD dict (level=3) | 28285 | 62.072 | 938.10886 |
| ZSTD dict (level=4) | 37863 | 60.427 | 969.18655 |
| ZSTD dict (level=5) | 45479 | 59.317 | 941.20922 |
| ZSTD dict (level=6) | 57842 | 58.481 | 881.69049 |
| ZSTD dict (level=7) | 65796 | 58.107 | 886.42249 |
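
To make the "mixed bag" point easier to see, here is a minimal sketch that normalizes each row of the table above against the BEST_SPEED row, since that is the default we run today. The numbers are copied straight from the table; the names `ROWS` and `relative_to_baseline` are made up for this illustration and are not part of any codebase.

```python
# Rows copied from the benchmark table in this comment:
# (codec, indexing time in ms, disk usage in MB, retrieval time per 10k docs in ms)
ROWS = [
    ("BEST_SPEED (LZ4 with small blocks)", 35383, 90.175, 190.17524),
    ("BEST_COMPRESSION (vanilla zlib, DEFLATE level=6)", 76671, 58.682, 1910.42106),
    ("BEST_COMPRESSION (Cloudflare zlib, DEFLATE level=6)", 54791, 58.601, 1395.53593),
    ("ZSTD dict (level=1)", 24687, 63.324, 928.73997),
    ("ZSTD dict (level=2)", 24934, 63.722, 977.29911),
    ("ZSTD dict (level=3)", 28285, 62.072, 938.10886),
    ("ZSTD dict (level=4)", 37863, 60.427, 969.18655),
    ("ZSTD dict (level=5)", 45479, 59.317, 941.20922),
    ("ZSTD dict (level=6)", 57842, 58.481, 881.69049),
    ("ZSTD dict (level=7)", 65796, 58.107, 886.42249),
]

BASELINE = "BEST_SPEED (LZ4 with small blocks)"


def relative_to_baseline(rows, baseline_name):
    """Return {codec: (indexing, disk, retrieval)} as ratios to the baseline row.

    A ratio below 1.0 means the codec is better than the baseline on that
    metric (faster or smaller); above 1.0 means it is worse.
    """
    base = next(row for row in rows if row[0] == baseline_name)
    return {
        name: (index_ms / base[1], disk_mb / base[2], retrieval_ms / base[3])
        for name, index_ms, disk_mb, retrieval_ms in rows
    }


if __name__ == "__main__":
    ratios = relative_to_baseline(ROWS, BASELINE)
    print(f"{'codec':<52} {'index':>7} {'disk':>7} {'retrieve':>9}")
    for name, (index_r, disk_r, retrieval_r) in ratios.items():
        print(f"{name:<52} {index_r:>6.2f}x {disk_r:>6.2f}x {retrieval_r:>8.2f}x")
```

Read this way, every ZSTD level in the table saves roughly 30 to 35 percent of disk relative to BEST_SPEED, but retrieval takes more than four times as long, which is the trade-off behind the concern about query-time performance. This only restates the linked benchmark; it says nothing about how our own workload would behave.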

I think this is something we can table for now, and potentially revisit in the future.

bryanlb closed this Nov 6, 2023
bryanlb deleted the vthacker/zstd_codec branch August 26, 2024 17:35