
Initial async API and uring usage #4078

Open
wants to merge 1 commit into main
Conversation

@vertexclique vertexclique commented Jan 27, 2024

Rationale of the change

I selected the most frictionless bit to change: fs read/write.
I didn't improve read much; I focused on write. Writes are improved, as shown below. I am certain vectored writes would be even faster.

Tests are passing locally. No public API changes except the async_read module rename and the NucleiReader rename. To be honest, the NucleiReader rename is also poor; I wanted to call it AsyncReader, but that would be confused with the AsyncRead trait.

Some Roadmap Items

  • Slowly move to futures-util and futures-lite.
  • Migrate towards pin-project.
  • Make all interfaces use the futures::Async* traits; drop all Tokio read-alikes.
  • Operate over raw byte types like &[u8] and &mut [u8]; drop Bytes.
  • Slowly phase out all spawn_ code in actual async contexts so that users do this themselves. It is not our job to manage async I/O on behalf of the user; don't incorporate a runtime's hardcoded methods unless necessary. (Some storage APIs actually require it; we keep spawning for those.)
  • Write a Makefile or tasks to easily execute benchmarks.
  • Decide what we want from the uring backend: low latency (with more resource consumption, e.g. CPU), or higher latency with lower resource usage?
  • Other backends like epoll and kqueue should be selectable via features.
  • Migrate tests, benchmarks, and the like to nuclei attributes such as #[nuclei::test], #[nuclei::bench], etc.
  • Add reader benchmarks with more contention. It is already obvious that all of them improved, but the gains will shine on harder reads; we need more than 16 concurrent read tasks.

Performance

Full benchmark results against master:
https://gist.github.com/vertexclique/cd1307d38210bec2d5e3b768502daef0

If the buffers fit in memory, we get immense performance:

exact_buf_write/256 KiB time:   [71.903 ns 72.184 ns 72.523 ns]
                        thrpt:  [3366.4 GiB/s 3382.2 GiB/s 3395.4 GiB/s]
                 change:
                        time:   [-37.945% -37.559% -37.162%] (p = 0.00 < 0.05)
                        thrpt:  [+59.140% +60.151% +61.147%]
                        Performance has improved.

Bench PDF: exact_buf_write_256 KiB - Criterion.rs.pdf

@Xuanwo (Member) commented Apr 30, 2024

We have a discussion thread at https://discord.com/channels/1081052318650339399/1200638279939260436
