Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorize basic_string::find_last_of #4934

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

AlexGuteniev
Copy link
Contributor

🧭 Overview

  • Same pcmpestri approach as in find_first_of for 1 or 2 byte element. Did not bother making anything for larger characters.
  • The existing find_first_of was a bit copypasta reduced. Still the _first_ and the _last_ don't share the implementation, it would have been very confusing to iterate in different directions in a single loop.
  • The new functions have _pos in names to indicate being position based rather than pointer based. They are used in strings only, so pointers would have been extra conversion for no gain. We might add position-based find_first_of or pointer based find_last_of someday, so the naming scheme has provision for this.
  • Also the vectorization threshold was tuned (even for 8-bytes character, you'll hate it).
  • Test coverage was also expanded to cover the bug of having basic_string<unsigned long long>.

⏱️ Benchmark results

Benchmark main this
bm<AlgType::str_member_last, char>/2/3 7.79 ns 5.56 ns
bm<AlgType::str_member_last, char>/7/4 22.1 ns 12.5 ns
bm<AlgType::str_member_last, char>/9/3 23.8 ns 11.2 ns
bm<AlgType::str_member_last, char>/22/5 37.2 ns 11.9 ns
bm<AlgType::str_member_last, char>/58/2 63.3 ns 13.2 ns
bm<AlgType::str_member_last, char>/102/4 54.0 ns 16.3 ns
bm<AlgType::str_member_last, char>/325/1 119 ns 36.8 ns
bm<AlgType::str_member_last, char>/400/50 156 ns 178 ns
bm<AlgType::str_member_last, char>/1011/11 321 ns 100 ns
bm<AlgType::str_member_last, char>/1502/23 605 ns 308 ns
bm<AlgType::str_member_last, char>/3056/7 916 ns 284 ns
bm<AlgType::str_member_last, wchar_t>/2/3 5.82 ns 5.99 ns
bm<AlgType::str_member_last, wchar_t>/7/4 11.0 ns 11.6 ns
bm<AlgType::str_member_last, wchar_t>/9/3 11.2 ns 11.9 ns
bm<AlgType::str_member_last, wchar_t>/22/5 16.1 ns 12.8 ns
bm<AlgType::str_member_last, wchar_t>/58/2 36.9 ns 17.4 ns
bm<AlgType::str_member_last, wchar_t>/102/4 58.5 ns 24.1 ns
bm<AlgType::str_member_last, wchar_t>/325/1 173 ns 63.7 ns
bm<AlgType::str_member_last, wchar_t>/400/50 191 ns 221 ns
bm<AlgType::str_member_last, wchar_t>/1011/11 396 ns 393 ns
bm<AlgType::str_member_last, wchar_t>/1502/23 658 ns 573 ns
bm<AlgType::str_member_last, wchar_t>/3056/7 1285 ns 548 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/2/3 6.61 ns 5.64 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/7/4 22.9 ns 22.5 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/9/3 24.2 ns 12.1 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/22/5 83.5 ns 13.0 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/58/2 66.4 ns 17.7 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/102/4 162 ns 24.6 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/325/1 213 ns 63.0 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/400/50 5418 ns 603 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/1011/11 4257 ns 523 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/1502/23 9358 ns 1073 ns
bm<AlgType::str_member_last, wchar_t, L'\x03B1'>/3056/7 6275 ns 550 ns

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner September 4, 2024 08:30
@CaseyCarter CaseyCarter added the performance Must go faster label Sep 4, 2024
@StephanTLavavej StephanTLavavej self-assigned this Sep 4, 2024
@StephanTLavavej

This comment was marked as resolved.

@StephanTLavavej
Copy link
Member

Conflict-free merge with toolset update, no clang-format regen needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
Status: Initial Review
Development

Successfully merging this pull request may close these issues.

3 participants