Dk/recursively compress fsst array #916

Closed
wants to merge 3 commits into from

Conversation

Member

@danking danking commented Sep 23, 2024

No description provided.

@danking danking added the benchmark Run benchmarks on this branch label Sep 23, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 23, 2024
Contributor

@github-actions github-actions bot left a comment

Vortex bytes_at

| Benchmark suite | Current: 09d174e | Previous: 28205de | Ratio |
|---|---|---|---|
| `bytes_at/array_data` | 606.3691958481068 ns (0.3571374764276811) | 618.0013064135177 ns (1.0449867774760833) | 0.98 |
| `bytes_at/array_view` | 877.7695751589636 ns (0.9743392772022617) | 949.5617171805501 ns (0.8549548221537862) | 0.92 |

This comment was automatically generated by workflow using github-action-benchmark.

@danking danking added the benchmark Run benchmarks on this branch label Sep 23, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 23, 2024
@danking danking added the benchmark Run benchmarks on this branch label Sep 23, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 23, 2024
Contributor

@github-actions github-actions bot left a comment

Random Access

| Benchmark suite | Current: 09d174e | Previous: 28205de | Ratio |
|---|---|---|---|
| `random-access/vortex-tokio-local-disk` | 1169637.3685300692 ns (7730.747630377533) | 1105904.097260631 ns (5042.062221880187) | 1.06 |
| `random-access/vortex-local-fs` | 1298057.27770458 ns (3832.353846318554) | 1245276.4040165455 ns (3430.5443588237977) | 1.04 |
| `random-access/parquet-tokio-local-disk` | 213260981.43333334 ns (4907358.816666678) | 206864975.56666666 ns (9792563.880833313) | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.

@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
Contributor

@github-actions github-actions bot left a comment

DataFusion

| Benchmark suite | Current: 09d174e | Previous: 28205de | Ratio |
|---|---|---|---|
| `arrow/planning` | 824121.5461043704 ns (708.2505997676053) | 843470.1933541672 ns (17769.357566822844) | 0.98 |
| `arrow/exec` | 1776681.5068155513 ns (1843.0842648033286) | 1781809.6246483063 ns (6226.273213787703) | 1.00 |
| `vortex-pushdown-compressed/planning` | 517156.46581309347 ns (479.75897013922804) | 512741.6376662363 ns (879.4589469218627) | 1.01 |
| `vortex-pushdown-compressed/exec` | 3076498.972352939 ns (3618.04533823533) | 3097297.4658823526 ns (5018.832397058373) | 0.99 |
| `vortex-pushdown-uncompressed/planning` | 519446.23343496025 ns (851.6252527793695) | 514192.3171742441 ns (1473.4911523418268) | 1.01 |
| `vortex-pushdown-uncompressed/exec` | 2932822.609444444 ns (2331.4410347226076) | 3336001.50375 ns (4194.211453124881) | 0.88 |
| `vortex-nopushdown-compressed/planning` | 839333.2493234345 ns (2052.720870503399) | 830691.9516991806 ns (1936.4639848648803) | 1.01 |
| `vortex-nopushdown-compressed/exec` | 8578559.071666665 ns (21456.21279166732) | 10067534.141999997 ns (273521.05794999935) | 0.85 |
| `vortex-nopushdown-uncompressed/planning` | 838326.1364466916 ns (2190.607326558733) | 830539.616209093 ns (1194.4622335811728) | 1.01 |
| `vortex-nopushdown-uncompressed/exec` | 1801618.6094841103 ns (2166.398876905092) | 1790342.131500228 ns (6261.061493467889) | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
@danking danking added the benchmark Run benchmarks on this branch label Sep 24, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Sep 24, 2024
Contributor

@github-actions github-actions bot left a comment

Vortex Compression

Benchmark suite Current: 09d174e Previous: 28205de Ratio
Yellow Taxi Trip Data Compression Time/taxi compression 2502562487.7 ns (3421260.0500001907) 2512150622.8 ns (4715814.649999857) 1.00
Yellow Taxi Trip Data Compression Time/taxi compression throughput 470808924 bytes 470808924 bytes 1
Yellow Taxi Trip Data Vortex-to-ParquetZstd Ratio/taxi 0.9547439678479189 ratio 0.9580620880073182 ratio 1.00
Yellow Taxi Trip Data Vortex-to-ParquetUncompressed Ratio/taxi 0.6128693203794547 ratio 0.6149992883242769 ratio 1.00
Yellow Taxi Trip Data Compression Ratio/taxi 0.10756843682937496 ratio 0.10810682934293743 ratio 1.00
Yellow Taxi Trip Data Compression Size/taxi 50644180 bytes 50897660 bytes 1.00
Public BI Compression Time/AirlineSentiment compression 413400.45715536975 ns (917.8784058165038) 407517.73455317103 ns (386.95686009235214) 1.01
Public BI Compression Time/AirlineSentiment compression throughput 2020 bytes 2020 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/AirlineSentiment 6.4672897196261685 ratio 6.4672897196261685 ratio 1
Public BI Vortex-to-ParquetUncompressed Ratio/AirlineSentiment 4.398305084745763 ratio 4.398305084745763 ratio 1
Public BI Compression Ratio/AirlineSentiment 0.6207920792079208 ratio 0.6207920792079208 ratio 1
Public BI Compression Size/AirlineSentiment 1254 bytes 1254 bytes 1
Public BI Compression Time/Arade compression 3175015164.9 ns (2118661.8375000954) 3177556352 ns (5656746.700000048) 1.00
Public BI Compression Time/Arade compression throughput 787023760 bytes 787023760 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/Arade 0.49269022348110086 ratio 0.4927304613057681 ratio 1.00
Public BI Vortex-to-ParquetUncompressed Ratio/Arade 0.43976587469696543 ratio 0.439801790210035 ratio 1.00
Public BI Compression Ratio/Arade 0.1858696553201901 ratio 0.1858783971147199 ratio 1.00
Public BI Compression Size/Arade 146283835 bytes 146290715 bytes 1.00
Public BI Compression Time/Bimbo compression 22750900624.9 ns (27744772.57250023) 22851736365 ns (15982246.875) 1.00
Public BI Compression Time/Bimbo compression throughput 7121333608 bytes 7121333608 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/Bimbo 1.3132193486869599 ratio 1.313476131950789 ratio 1.00
Public BI Vortex-to-ParquetUncompressed Ratio/Bimbo 0.8904063811955527 ratio 0.8905804887861287 ratio 1.00
Public BI Compression Ratio/Bimbo 0.06543966167748169 ratio 0.06543434441500189 ratio 1.00
Public BI Compression Size/Bimbo 466017662 bytes 465979796 bytes 1.00
Public BI Compression Time/CMSprovider compression 13327033811.6 ns (35601922.44999981) 12975232079.5 ns (19420027.48999977) 1.03
Public BI Compression Time/CMSprovider compression throughput 5149123964 bytes 5149123964 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/CMSprovider 1.205344053822037 ratio 1.2031664729650404 ratio 1.00
Public BI Vortex-to-ParquetUncompressed Ratio/CMSprovider 0.7782820892506759 ratio 0.7768760407672395 ratio 1.00
Public BI Compression Ratio/CMSprovider 0.17647350973739345 ratio 0.17583991594108764 ratio 1.00
Public BI Compression Size/CMSprovider 908683978 bytes 905421525 bytes 1.00
Public BI Compression Time/Euro2016 compression 2218076895.8 ns (3577266.337499857) 2169861879.5 ns (2228555.600000143) 1.02
Public BI Compression Time/Euro2016 compression throughput 393253221 bytes 393253221 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/Euro2016 1.4124944462759428 ratio 1.4644631894097515 ratio 0.96
Public BI Vortex-to-ParquetUncompressed Ratio/Euro2016 0.5992907568946616 ratio 0.6213399674169664 ratio 0.96
Public BI Compression Ratio/Euro2016 0.42258954313816033 ratio 0.4329904115394391 ratio 0.98
Public BI Compression Size/Euro2016 166184699 bytes 170274874 bytes 0.98
Public BI Compression Time/Food compression 1111144076.8 ns (1062264.25) 1080519190.6 ns (3149292.113749981) 1.03
Public BI Compression Time/Food compression throughput 332718229 bytes 332718229 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/Food 1.2314513914749825 ratio 1.2308825406037045 ratio 1.00
Public BI Vortex-to-ParquetUncompressed Ratio/Food 0.6962926216809466 ratio 0.6959709795379847 ratio 1.00
Public BI Compression Ratio/Food 0.1300739611715113 ratio 0.1300739611715113 ratio 1
Public BI Compression Size/Food 43277978 bytes 43277978 bytes 1
Public BI Compression Time/HashTags compression 3033626491.9 ns (2945281.096250057) 2822113791.6 ns (1978730.3575000763) 1.07
Public BI Compression Time/HashTags compression throughput 804495592 bytes 804495592 bytes 1
Public BI Vortex-to-ParquetZstd Ratio/HashTags 1.5710619152399998 ratio 1.6595249653241244 ratio 0.95
Public BI Vortex-to-ParquetUncompressed Ratio/HashTags 0.446656004441462 ratio 0.4718062242437439 ratio 0.95
Public BI Compression Ratio/HashTags 0.2579453859829228 ratio 0.2639225051216937 ratio 0.98
Public BI Compression Size/HashTags 207515926 bytes 212324492 bytes 0.98
TPC-H l_comment Compression Time/chunked-without-fsst compression 191309180.08343256 ns (966712.5560615063) 188067526.41482145 ns (38210.28238841891) 1.02
TPC-H l_comment Compression Time/chunked-without-fsst compression throughput 183010921 bytes 183010921 bytes 1
TPC-H l_comment Vortex-to-ParquetZstd Ratio/chunked-without-fsst 3.215636053515698 ratio 3.215622182710311 ratio 1.00
TPC-H l_comment Vortex-to-ParquetUncompressed Ratio/chunked-without-fsst 0.9983859801926083 ratio 0.9983874372880842 ratio 1.00
TPC-H l_comment Compression Ratio/chunked-without-fsst 0.999965750677797 ratio 0.999965750677797 ratio 1
TPC-H l_comment Compression Size/chunked-without-fsst 183004653 bytes 183004653 bytes 1
TPC-H l_comment Compression Time/chunked-with-fsst compression 1250319147.35 ns (6125345.413125038) 1122703466.85 ns (1928494.515625) 1.11
TPC-H l_comment Compression Time/chunked-with-fsst compression throughput 183010921 bytes 183010921 bytes 1
TPC-H l_comment Vortex-to-ParquetZstd Ratio/chunked-with-fsst 1.1656268830354335 ratio 1.5033668936332807 ratio 0.78
TPC-H l_comment Vortex-to-ParquetUncompressed Ratio/chunked-with-fsst 0.36190213033774377 ratio 0.46676584964132767 ratio 0.78
TPC-H l_comment Compression Ratio/chunked-with-fsst 0.36010120947918733 ratio 0.4423270947857806 ratio 0.81
TPC-H l_comment Compression Size/chunked-with-fsst 65902454 bytes 80950689 bytes 0.81
TPC-H l_comment Compression Time/canonical-with-fsst compression 1242336680.1 ns (835490.064999938) 1122477453.1 ns (193778.6556251049) 1.11
TPC-H l_comment Compression Time/canonical-with-fsst compression throughput 183010937 bytes 183010937 bytes 1
TPC-H l_comment Vortex-to-ParquetZstd Ratio/canonical-with-fsst 1.1656283980968094 ratio 1.503369587037494 ratio 0.78
TPC-H l_comment Vortex-to-ParquetUncompressed Ratio/canonical-with-fsst 0.3619026237406602 ratio 0.46676584964132767 ratio 0.78
TPC-H l_comment Compression Ratio/canonical-with-fsst 0.3600932221881362 ratio 0.4423191003060107 ratio 0.81
TPC-H l_comment Compression Size/canonical-with-fsst 65900998 bytes 80949233 bytes 0.81

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@github-actions github-actions bot left a comment

TPC-H

Benchmark suite Current: 09d174e Previous: 28205de Ratio
tpch_q1/vortex-in-memory-no-pushdown 458278052.55 ns (1527722.074999988) 465180262.9 ns (2649666.1293750107) 0.99
tpch_q1/vortex-in-memory-pushdown 507494993.7 ns (684232.2237499952) 513200347 ns (1041174.4212500155) 0.99
tpch_q1/arrow 444599637.55 ns (959506.8881250024) 476541647.5 ns (36655508.975000024) 0.93
tpch_q1/parquet 651886383.4 ns (1369197.7225000262) 663086970.8 ns (4447213.25) 0.98
tpch_q1/vortex-file-compressed 617120117.1 ns (1462951.362500012) 619079656.1 ns (1942722.5362499356) 1.00
tpch_q1/vortex-file-uncompressed 638108816.3 ns (5913937.050000012) 637637479.8 ns (1726915.8499999642) 1.00
tpch_q2/vortex-in-memory-no-pushdown 121512552.98373015 ns (165223.40841270238) 124296982.77408731 ns (276628.49458334595) 0.98
tpch_q2/vortex-in-memory-pushdown 120649880.80825396 ns (95372.42531746626) 122790559.89567459 ns (177864.12994047254) 0.98
tpch_q2/arrow 119677313.62849204 ns (109929.5858809501) 120943903.19527777 ns (184436.403687492) 0.99
tpch_q2/parquet 155288792.6381746 ns (336419.1023452282) 155485352.4057143 ns (515378.11573810875) 1.00
tpch_q2/vortex-file-compressed 155812194.07936507 ns (727592.3697956353) 153173503.29218253 ns (448269.2861582339) 1.02
tpch_q2/vortex-file-uncompressed 164449707.23 ns (468252.8813749999) 164141701.78281745 ns (468763.4158333391) 1.00
tpch_q3/vortex-in-memory-no-pushdown 155299084.04500002 ns (580750.5700416565) 157548890.20829365 ns (3343486.2258794606) 0.99
tpch_q3/vortex-in-memory-pushdown 178199063.83154765 ns (324678.5629107058) 184884052.5 ns (1118028.217083335) 0.96
tpch_q3/arrow 143624085.7084127 ns (174775.65463788807) 154753455.02226192 ns (9164528.853422627) 0.93
tpch_q3/parquet 328029728.75 ns (879810.474999994) 335529750.75 ns (992076.1456249952) 0.98
tpch_q3/vortex-file-compressed 380957402.6 ns (386001.5424999893) 332762277.05 ns (970064.1750000119) 1.14
tpch_q3/vortex-file-uncompressed 395100743.9 ns (875809.1556250155) 408230824.65 ns (2113463.7974999845) 0.97
tpch_q4/vortex-in-memory-no-pushdown 109380899.77575395 ns (1031799.1887123138) 113379222.9829762 ns (166103.18720384687) 0.96
tpch_q4/vortex-in-memory-pushdown 134508711.51781744 ns (550840.4655937552) 141227089.41186506 ns (1589884.7473764867) 0.95
tpch_q4/arrow 100656541.16416666 ns (351653.4868125096) 105072541.47444443 ns (1037988.133541666) 0.96
tpch_q4/parquet 213635169.96666667 ns (263899.7850000113) 240052672.06666666 ns (1595406.685833335) 0.89
tpch_q4/vortex-file-compressed 340837528.2 ns (328660.5512500107) 317796785.75 ns (1564868.8687499762) 1.07
tpch_q4/vortex-file-uncompressed 309038398.4 ns (1686731.2337500155) 420626387.6 ns (8737218.239374995) 0.73
tpch_q5/vortex-in-memory-no-pushdown 289906931.8 ns (999757.5818749964) 349592962.7 ns (1311505.8318749964) 0.83
tpch_q5/vortex-in-memory-pushdown 301973319.8 ns (1088038.2975000143) 296967574.4 ns (572868.8543750048) 1.02
tpch_q5/arrow 284071667.5 ns (1336682.125) 332469561.4 ns (1700250.4868749678) 0.85
tpch_q5/parquet 432381275.95 ns (631785.0449999869) 514519580.8 ns (2395679) 0.84
tpch_q5/vortex-file-compressed 342338808.35 ns (2044775.5393750072) 406175301.9 ns (4489660.732499987) 0.84
tpch_q5/vortex-file-uncompressed 366865837.3 ns (2705602.657499999) 477626100.75 ns (7591919.666875005) 0.77
tpch_q6/vortex-in-memory-no-pushdown 41103523.703928575 ns (104380.57736359537) 47180262.42391534 ns (1550833.2145826705) 0.87
tpch_q6/vortex-in-memory-pushdown 92441220.38603175 ns (69802.67756745964) 97318888.58940476 ns (108702.55196428299) 0.95
tpch_q6/arrow 34674228.716587305 ns (38765.4185099192) 34914514.88021164 ns (36762.250247683376) 0.99
tpch_q6/parquet 149369843.84825397 ns (154273.1011865139) 157469202.6025397 ns (304364.95429959893) 0.95
tpch_q6/vortex-file-compressed 66629596.28138888 ns (263224.2521006875) 87518909.54099207 ns (2621472.616697423) 0.76
tpch_q6/vortex-file-uncompressed 253960971.3 ns (1972613.0337500125) 368000410.75 ns (12874408.981249988) 0.69
tpch_q7/vortex-in-memory-no-pushdown 563072103.7 ns (3100309.4287499785) 561349712.1 ns (4196312.807500005) 1.00
tpch_q7/vortex-in-memory-pushdown 589174839.6 ns (2326134.879999995) 703634240.2 ns (4544864.6987499595) 0.84
tpch_q7/arrow 532345529.2 ns (1681770.157499969) 657151347.2 ns (7604387.9575000405) 0.81
tpch_q7/parquet 686627725.4 ns (3615901.0900000334) 804264035.9 ns (5836405.662500024) 0.85
tpch_q7/vortex-file-compressed 752896228.6 ns (3409611.1037499905) 806969642.4 ns (7283627.30250001) 0.93
tpch_q7/vortex-file-uncompressed 803109928.2 ns (4630574.186250031) 971583990.2 ns (10099795.362499893) 0.83
tpch_q8/vortex-in-memory-no-pushdown 220145906.3 ns (363748.64999999106) 242095951.20000005 ns (1886233.450000003) 0.91
tpch_q8/vortex-in-memory-pushdown 233649006.9 ns (1006558.4245833457) 257694249.55 ns (738539.375) 0.91
tpch_q8/arrow 209775358.86666664 ns (859480.4050000161) 232112822.13333336 ns (1104172.832916692) 0.90
tpch_q8/parquet 481193562.9 ns (2305910.575625032) 525623589.9 ns (9203249.512499988) 0.92
tpch_q8/vortex-file-compressed 295951855.2 ns (720566.266874969) 326114617.85 ns (3543129.2956249714) 0.91
tpch_q8/vortex-file-uncompressed 331612392.65 ns (2581767.25) 405336256.9 ns (10162204.281874985) 0.82
tpch_q9/vortex-in-memory-no-pushdown 397725506.35 ns (2244245.920625001) 475445847.9 ns (3051276.5443750024) 0.84
tpch_q9/vortex-in-memory-pushdown 408361117.45 ns (3361210.6006249785) 480146804.35 ns (4620386.956874996) 0.85
tpch_q9/arrow 385700914.45 ns (1354885.0668750107) 472564777.25 ns (4678176.905000001) 0.82
tpch_q9/parquet 694648712.2 ns (4475806.538749993) 789859844 ns (5850230.768750012) 0.88
tpch_q9/vortex-file-compressed 459010767.25 ns (4038664.8243750036) 530908802.8 ns (6291757.4537499845) 0.86
tpch_q9/vortex-file-uncompressed 490784848.65 ns (3167568.775000006) 612744644.1 ns (12682721.346249938) 0.80
tpch_q10/vortex-in-memory-no-pushdown 231781429.9333333 ns (662753.1783333272) 229650010.23333335 ns (502687.62583333254) 1.01
tpch_q10/vortex-in-memory-pushdown 259368156.8 ns (1600750.0993749946) 294833594 ns (1618125.8981250226) 0.88
tpch_q10/arrow 224058188.1 ns (1303973.481250003) 249418158.05 ns (2099565.3918750137) 0.90
tpch_q10/parquet 474264348.1 ns (1741075.7212499976) 514893937.2 ns (11542829.632499993) 0.92
tpch_q10/vortex-file-compressed 475155739.45 ns (820450.400000006) 481836640.7 ns (3530849.828125) 0.99
tpch_q10/vortex-file-uncompressed 434625105.75 ns (1795174.098124981) 512894618.8 ns (13401169.712500006) 0.85
tpch_q11/vortex-in-memory-no-pushdown 171603757.6856746 ns (410989.3061706275) 200191315.6666667 ns (779042.5999999791) 0.86
tpch_q11/vortex-in-memory-pushdown 170720413.76833335 ns (178825.71552081406) 200410575.43333334 ns (1911425.4608333409) 0.85
tpch_q11/arrow 169517383.8043254 ns (453682.6363328248) 174617424.58099207 ns (136030.48409722745) 0.97
tpch_q11/parquet 178128766.29 ns (270192.4930833131) 217283900.8666667 ns (2723982.1508333385) 0.82
tpch_q11/vortex-file-compressed 217506636.56666666 ns (642417.6533333361) 264793055.3 ns (5486483.262500003) 0.82
tpch_q11/vortex-file-uncompressed 226294363.4 ns (638592.43583332) 280114492.75 ns (4134265.675000012) 0.81
tpch_q12/vortex-in-memory-no-pushdown 202556211.8333333 ns (279365.5383333266) 210536524.59999996 ns (1231164.1400000155) 0.96
tpch_q12/vortex-in-memory-pushdown 248570565.05 ns (441000.49437500536) 267377784.6 ns (1008874.065624997) 0.93
tpch_q12/arrow 171633096.72623017 ns (301870.0829722285) 176532181.72920635 ns (707289.9101488143) 0.97
tpch_q12/parquet 360892382.05 ns (509030.72562500834) 374767664.7 ns (3994547.949999988) 0.96
tpch_q12/vortex-file-compressed 640585391.7 ns (8248543.209999979) 681342948.5 ns (2726619.6500000358) 0.94
tpch_q12/vortex-file-uncompressed 434142182.45 ns (986386.9662500024) 494714877.15 ns (8373349.75562501) 0.88
tpch_q13/vortex-in-memory-no-pushdown 161256789.14535713 ns (2296980.074691981) 158610700.51436505 ns (441966.9483323395) 1.02
tpch_q13/vortex-in-memory-pushdown 158042314.85511905 ns (220747.30857142806) 172362415.49888888 ns (2298791.8800416887) 0.92
tpch_q13/arrow 155282613.7606746 ns (320651.74938096106) 158394869.07992062 ns (1031960.5836904794) 0.98
tpch_q13/parquet 298552327.25 ns (1113864.375) 301248497.7 ns (1131000.949999988) 0.99
tpch_q13/vortex-file-compressed 207645674.8 ns (1567111.2833333313) 211985084.46666664 ns (8913439.55583331) 0.98
tpch_q13/vortex-file-uncompressed 193550109.4 ns (2983972.8487499803) 197973826.89999998 ns (3642251.505833313) 0.98
tpch_q14/vortex-in-memory-no-pushdown 44298268.77813492 ns (68456.65808233991) 49171851.880595244 ns (1225462.154779762) 0.90
tpch_q14/vortex-in-memory-pushdown 82461775.025 ns (123491.3026458472) 83501758.05988094 ns (131495.03624404967) 0.99
tpch_q14/arrow 36230301.06928571 ns (67402.9645238109) 37660039.08928572 ns (442297.0392797664) 0.96
tpch_q14/parquet 218131175.7666667 ns (588932.2320833504) 218388411.2333333 ns (499757.31750001013) 1.00
tpch_q14/vortex-file-compressed 121433572.46730158 ns (470462.5201994106) 118222387.33134918 ns (427485.88334821165) 1.03
tpch_q14/vortex-file-uncompressed 194523138.99999997 ns (317985.2445833534) 190124163.7666667 ns (805362.6375000328) 1.02
tpch_q15/vortex-in-memory-no-pushdown 72014840.51545635 ns (264456.1323737651) 74768989.57686508 ns (713170.5230208337) 0.96
tpch_q15/vortex-in-memory-pushdown 115001871.94535716 ns (481736.56542262435) 118465890.39960317 ns (450173.35409127176) 0.97
tpch_q15/arrow 62953023.05380952 ns (99847.35607143492) 63432184.8525 ns (69373.41171875596) 0.99
tpch_q15/parquet 291426574.3 ns (523007.7199999988) 290194205.3 ns (474797.34999999404) 1.00
tpch_q15/vortex-file-compressed 221488433.3666667 ns (878694.4920833409) 215258277.93333334 ns (1221975.3237499893) 1.03
tpch_q15/vortex-file-uncompressed 354440764.45 ns (1496501.1549999714) 362840501.25 ns (3137465.3987499774) 0.98
tpch_q16/vortex-in-memory-no-pushdown 104778598.6463492 ns (122841.17430754006) 105906233.89670636 ns (353774.86528968066) 0.99
tpch_q16/vortex-in-memory-pushdown 123620578.83353174 ns (354063.9954107106) 124496906.53892858 ns (406907.3900937438) 0.99
tpch_q16/arrow 105051018.93210319 ns (495988.6335332319) 105172504.17575397 ns (76581.59544591606) 1.00
tpch_q16/parquet 121965760.48761904 ns (151060.2707291618) 121803031.32250002 ns (465298.32744792104) 1.00
tpch_q16/vortex-file-compressed 135711767.91178572 ns (472525.5547827482) 137279525.00119048 ns (315200.2364285737) 0.99
tpch_q16/vortex-file-uncompressed 137670585.2875 ns (365141.8941979408) 141005193.05900794 ns (217046.93390625715) 0.98
tpch_q17/vortex-in-memory-no-pushdown 547512619.3 ns (5770393.446250021) 549633117.7 ns (3897232.6424999833) 1.00
tpch_q17/vortex-in-memory-pushdown 622084806.8 ns (4556423) 635687094.9 ns (4385030.964999974) 0.98
tpch_q17/arrow 559337301.6 ns (2881230.4850000143) 542388623.5 ns (3803355.2350000143) 1.03
tpch_q17/parquet 583560253.3 ns (2846666.75) 587992360.8 ns (1455380.8037499785) 0.99
tpch_q17/vortex-file-compressed 604558669.4 ns (628441.8875000477) 616277637.6 ns (1505033.074999988) 0.98
tpch_q17/vortex-file-uncompressed 664258869.5 ns (2336670.037500024) 679877913 ns (1955727.550000012) 0.98
tpch_q18/vortex-in-memory-no-pushdown 1007092202.3 ns (4302388.58375001) 1002832377.7 ns (2665857.5499999523) 1.00
tpch_q18/vortex-in-memory-pushdown 1006178360.2 ns (3537443.0999999642) 1006758396.7 ns (2763010.7812499404) 1.00
tpch_q18/arrow 1007229012.3 ns (6042515.34375) 990296008.1 ns (3373496.3125000596) 1.02
tpch_q18/parquet 1176120334 ns (5902186.861250043) 1166494597.2 ns (4275139.549999952) 1.01
tpch_q18/vortex-file-compressed 1058511294.2 ns (5553556.882499993) 1033649691.2 ns (1752962.5575000048) 1.02
tpch_q18/vortex-file-uncompressed 1086180770.5 ns (5686312.741249919) 1101853616.5 ns (5525911.863749981) 0.99
tpch_q19/vortex-in-memory-no-pushdown 161652035.00555557 ns (120443.09770831466) 161949641.9519841 ns (580830.1470317543) 1.00
tpch_q19/vortex-in-memory-pushdown 247540805.8333333 ns (213427.13625000417) 253487003.75 ns (2738051.829999998) 0.98
tpch_q19/arrow 149727972.94095236 ns (179890.46564880013) 149145000.48293653 ns (217431.22656349838) 1.00
tpch_q19/parquet 472043060.45 ns (783952.8906250298) 472705565.85 ns (506729.1225000024) 1.00
tpch_q19/vortex-file-compressed 616928464.8 ns (1708923.6800000072) 613866941 ns (871475.8650000095) 1.00
tpch_q19/vortex-file-uncompressed 424195896.75 ns (1109831.5524999797) 423386658.75 ns (3522122.9525000155) 1.00
tpch_q20/vortex-in-memory-no-pushdown 245839811.79999995 ns (1941655.460833326) 245213107.43333334 ns (424988.5499999821) 1.00
tpch_q20/vortex-in-memory-pushdown 258870446.25 ns (1778106.4474999905) 265178880.6 ns (632029.15625) 0.98
tpch_q20/arrow 234965472.5 ns (435853.0666666776) 240191370.4333333 ns (606219.865416646) 0.98
tpch_q20/parquet 351342214.95 ns (771049.8856250048) 358312545.55 ns (571248.6712499857) 0.98
tpch_q20/vortex-file-compressed 351510027.6 ns (1437180.619999975) 350621332.4 ns (945387.5837500095) 1.00
tpch_q20/vortex-file-uncompressed 439532966.05 ns (1753286.2150000036) 452027226.7 ns (1940524.486250013) 0.97
tpch_q21/vortex-in-memory-no-pushdown 834985795.9 ns (1942060.6712499857) 840476950.9 ns (1301525.2549999952) 0.99
tpch_q21/vortex-in-memory-pushdown 874518549 ns (2923152.0499999523) 885787162.3 ns (1234208.6875) 0.99
tpch_q21/arrow 824286109.1 ns (5271417.876249969) 826434233.3 ns (2060745.0400000215) 1.00
tpch_q21/parquet 955814930.4 ns (2056744.1499999762) 960722995.9 ns (1084321.6787499785) 0.99
tpch_q21/vortex-file-compressed 1344373627.5 ns (5563118.75) 1227821431.7 ns (3048760.502500057) 1.09
tpch_q21/vortex-file-uncompressed 1300408024.3 ns (6414159.700000048) 1326921395.8 ns (2001900.821250081) 0.98
tpch_q22/vortex-in-memory-no-pushdown 68948031.00263889 ns (653664.0918749869) 68901159.70801587 ns (434751.78388889134) 1.00
tpch_q22/vortex-in-memory-pushdown 68561959.2867857 ns (215567.5403303504) 68460848.38607143 ns (135838.62076488137) 1.00
tpch_q22/arrow 66548673.89904761 ns (243486.0399508886) 64803303.911845244 ns (196798.55649925396) 1.03
tpch_q22/parquet 92903178.1286508 ns (209354.6758680567) 93014759.08869047 ns (182536.4490476176) 1.00
tpch_q22/vortex-file-compressed 100482155.42269841 ns (421048.56309523433) 100877645.3195635 ns (272764.0753293559) 1.00
tpch_q22/vortex-file-uncompressed 108322515.80142856 ns (635806.2276160717) 108949601.58654761 ns (189022.7831249982) 0.99

This comment was automatically generated by workflow using github-action-benchmark.

More subtle than I expected.

DeltaArray is a sequence of chunks. Every chunk except the last must be "full", i.e. contain exactly
1,024 values. The last chunk may contain as few as one value and is encoded differently from the rest.

In this PR, I introduced an "offset" and a "limit". Together they enable logical (lazy) slicing while
preserving full chunks for later decompression. The offset is a number of values, less than 1,024, to
skip at the start of the first chunk. The limit is either `None` or less than 1,024. `None` means no
limit, which lets callers avoid computing the length of the last chunk [1]. Internally, the limit is
converted to a "right-offset", `trailing_garbage`, which is sliced away during decompression.

[1] Which is, a bit annoyingly, this:

```rust
// Length of the last chunk: a remainder of 0 means the last chunk is full.
match deltas.len() % 1024 {
    0 => 1024,
    n => n,
}
```
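
The offset/limit scheme can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names and the `i64` element type are hypothetical, and it assumes the limit counts the values to keep from the last chunk, which the description above implies but does not state outright.

```rust
const CHUNK_LEN: usize = 1024;

/// Length of the last chunk for a non-empty run of `n` encoded values.
fn last_chunk_len(n: usize) -> usize {
    match n % CHUNK_LEN {
        0 => CHUNK_LEN,
        r => r,
    }
}

/// Convert a `limit` (assumed: values to keep from the last chunk) into the
/// number of trailing values to discard. `None` keeps everything, so the
/// length of the last chunk never has to be computed in that case.
fn trailing_garbage(n: usize, limit: Option<usize>) -> usize {
    match limit {
        None => 0,
        Some(keep) => {
            let last = last_chunk_len(n);
            assert!(keep <= last, "limit exceeds the last chunk");
            last - keep
        }
    }
}

/// Apply the lazy slice to values that were decompressed as whole chunks:
/// skip `offset` values at the front and drop the trailing garbage.
fn apply_slice(decoded: &[i64], offset: usize, limit: Option<usize>) -> &[i64] {
    assert!(offset < CHUNK_LEN, "offset must fall inside the first chunk");
    let garbage = trailing_garbage(decoded.len(), limit);
    assert!(offset + garbage <= decoded.len(), "slice is empty or inverted");
    &decoded[offset..decoded.len() - garbage]
}
```

The point of this shape is that slicing only adjusts two small integers; the chunks themselves stay intact until decompression.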
@danking danking force-pushed the dk/recursively-compress-fsst-array branch from 7d4fc36 to c3125cc Compare September 25, 2024 15:10
The codes of an FSSTArray are a vector of binary strings, each made up of one-byte codes or an escape
code followed by a literal data byte. The offsets into these strings, unexpectedly, grow quite large,
increasing the file size. Delta encoding them shrinks the file notably (for example, the TPC-H
l_comment column with this PR is 78% of its byte size on `develop`) but also inflates the compression
time, seemingly in proportion to the space savings (TPC-H l_comment compresses in 111% of the time it
takes on `develop`).
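
To illustrate why delta encoding helps here, the sketch below is a plain scalar version written for this note. It is not Vortex's DeltaArray, which works on 1,024-value chunks, and all function names are hypothetical. Offsets grow with the total byte length of the column, while the deltas are bounded by the length of a single encoded string, so they fit in far fewer bits.

```rust
/// Delta-encode non-decreasing offsets: each output is the gap to the
/// previous offset (the first gap is measured from zero).
fn delta_encode(offsets: &[u64]) -> Vec<u64> {
    let mut prev = 0u64;
    offsets
        .iter()
        .map(|&o| {
            // Assumes offsets never decrease; otherwise this would underflow.
            let d = o - prev;
            prev = o;
            d
        })
        .collect()
}

/// Invert `delta_encode` with a running sum.
fn delta_decode(deltas: &[u64]) -> Vec<u64> {
    let mut acc = 0u64;
    deltas
        .iter()
        .map(|&d| {
            acc += d;
            acc
        })
        .collect()
}

/// Bits needed to represent the largest value in `values` (0 if empty).
fn bits_needed(values: &[u64]) -> u32 {
    values
        .iter()
        .map(|&v| 64 - v.leading_zeros())
        .max()
        .unwrap_or(0)
}
```

Comparing `bits_needed` on the offsets and on their deltas gives a rough idea of the savings a bit-packing stage could achieve afterwards. The extra encode pass is also where the added compression time would come from.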
@danking danking force-pushed the dk/recursively-compress-fsst-array branch from c3125cc to 155930d Compare September 25, 2024 15:17
@danking danking closed this Sep 25, 2024