qsv 0.134.0 #184711

Merged
merged 2 commits into from
Sep 10, 2024

BrewTestBot
Member

Created with brew bump-formula-pr.

release notes

qsv pro v1 is here! 🎉

If you've been using qsv for a while, even if you're a command-line ninja, you'll find a lot of new capabilities in qsv pro that can make your data wrangling experience even better!

Apart from making qsv easier to use, qsv pro has a multitude of features. You can:

  • view interactive data tables;
  • browse stats/frequency/metadata;
  • run recipes and tools (scripts);
  • run Polars SQL queries;
  • use an interface that applies Retrieval Augmented Generation (RAG) techniques to try to convert natural-language queries into Polars SQL queries;
  • search with regular expressions;
  • export to multiple file formats;
  • download from and upload to compatible CKAN instances;
  • design custom node-based flows and data pipelines;
  • interact with a local API from external programs, including the qsv pro command;
  • run various qsv commands in a graphical user interface;

and the list goes on!

That's just the beginning, and there's more to come! You just have to try it!

Download qsv pro v1 now at qsvpro.dathere.com.

Other highlights include:

  • pro: new command to allow qsv to interact with the qsv pro API to tap into qsv pro exclusive features.
  • lens: new command to interactively view CSVs using the csvlens crate.
  • The ludicrously fast diff command is now easier to use with its --drop-equal-fields option. @janriemer continues to work on his csv-diff crate, and there are more diff UX improvements coming soon!
  • stats adds sum_length and avg_length "streaming" statistics in addition to the existing min_length and max_length metrics. These are especially useful for datasets with a lot of "free text" columns.
  • stats also got "smarter" and "faster" by dog-fooding its own statistics!
    It's a little complicated, but the way stats works is that it first compiles the "streaming" statistics on the fly, and the more expensive advanced statistics are "lazily" computed at the end.
    Since we now track "sort order" in a streaming manner, we use this info when deriving cardinality at the end to see if we can skip sorting. Sorting is otherwise a necessary step, because cardinality is computed by "scanning" all the sorted values of a column: every time two neighboring values differ, the cardinality count is incremented.
    Apart from this "sort order" optimization, we also improved the "cardinality scan" algorithm, halving its memory footprint and making it faster still for larger datasets by parallelizing the computation!
    This, in turn, makes the frequency command faster and more memory efficient!
  • We now also use our own fork of the csv crate, featuring SIMD-accelerated UTF-8 validation and other minor perf tweaks, making the entire qsv suite faster still!
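
To make the stats notes above more concrete, here is a minimal sketch of the idea, not qsv's actual code. It assumes lengths are measured in bytes and that "sort order" means ascending order; the names StreamingStats and cardinality are hypothetical and chosen only for illustration. The sketch is sequential and leaves out the parallelized scan mentioned in the notes.

```rust
/// Hypothetical streaming statistics for a single column.
/// Each value is seen exactly once, so nothing here needs the full column in memory.
#[derive(Debug)]
struct StreamingStats {
    count: u64,
    min_length: usize,
    max_length: usize,
    sum_length: usize,
    /// Stays true while every value is >= the one before it (ascending order).
    is_sorted: bool,
    prev: Option<String>,
}

impl StreamingStats {
    fn new() -> Self {
        StreamingStats {
            count: 0,
            min_length: usize::MAX,
            max_length: 0,
            sum_length: 0,
            is_sorted: true,
            prev: None,
        }
    }

    /// Update all streaming statistics with one more value.
    fn add(&mut self, value: &str) {
        let len = value.len(); // byte length (an assumption of this sketch)
        self.count += 1;
        self.min_length = self.min_length.min(len);
        self.max_length = self.max_length.max(len);
        self.sum_length += len;
        if let Some(prev) = &self.prev {
            if prev.as_str() > value {
                self.is_sorted = false;
            }
        }
        self.prev = Some(value.to_string());
    }

    /// Average length, or None if no values have been seen.
    fn avg_length(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum_length as f64 / self.count as f64)
        }
    }
}

/// "Lazy" cardinality: sort only if the streaming pass did not already
/// establish sort order, then count the positions where neighbors differ.
fn cardinality(mut values: Vec<String>, already_sorted: bool) -> usize {
    if values.is_empty() {
        return 0;
    }
    if !already_sorted {
        values.sort_unstable();
    }
    // The first value always counts; each change between neighbors adds one.
    1 + values.windows(2).filter(|pair| pair[0] != pair[1]).count()
}
```

Feeding the column apple, apple, banana, cherry through add leaves is_sorted set to true, so cardinality can skip the sort and go straight to the neighbor scan. For a column such as b, a, b the flag turns false and the values are sorted before scanning.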

Added

Changed

Fixed

New Contributors

Full Changelog: jqnatividad/qsv@0.133.1...0.134.0

@github-actions bot added the rust (Rust use is a significant feature of the PR or issue) and bump-formula-pr (PR was created using `brew bump-formula-pr`) labels Sep 10, 2024
qsv: update test

Signed-off-by: Rui Chen <[email protected]>

🤖 An automated task has requested bottles to be published to this PR.

@github-actions bot added the CI-published-bottle-commits label (The commits for the built bottles have been pushed to the PR branch.) Sep 10, 2024
@BrewTestBot added this pull request to the merge queue Sep 10, 2024
Merged via the queue into master with commit 7dce3b8 Sep 10, 2024
15 checks passed
@BrewTestBot deleted the bump-qsv-0.134.0 branch September 10, 2024 15:18