Add AVX implementation of graphene_simd4f_madd() #269

Merged
merged 2 commits into from
Aug 15, 2024
@ebassi ebassi commented Aug 12, 2024

AVX introduced a fast multiplication-addition intrinsic.

All supported compilers define `__AVX__` when building with the AVX
instruction set enabled.

@ebassi ebassi force-pushed the madd-avx branch 2 times, most recently from 6e1bd1c to ac3f9a2 August 12, 2024 11:11

AVX-era x86 CPUs provide the `_mm_fmadd_ps()` fused multiply-add intrinsic
(formally part of the FMA extension that accompanies AVX2), so we can use
it if AVX (or an equivalent instruction set) is available when building
Graphene.

There is no functional difference in this commit if AVX is not
available, except that we moved from a generic static inline
implementation to a SIMD-specific one.
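
To illustrate the compile-time selection described above, here is a minimal sketch and not the code from this PR. `example_madd4()` is a hypothetical helper that does not exist in Graphene. It also assumes a guard on `__FMA__` (the macro GCC and Clang define under `-mfma`) rather than the `__AVX__` check the commit message mentions, so that it builds whether or not FMA is enabled.

```c
#include <stddef.h>

#if defined(__FMA__) || defined(__SSE__)
# include <immintrin.h>
#endif

/* Hypothetical helper (not Graphene API): out[i] = a[i] * b[i] + c[i]
 * for four packed floats. */
static inline void
example_madd4 (const float a[4],
               const float b[4],
               const float c[4],
               float       out[4])
{
#if defined(__FMA__)
  /* Fused multiply-add: one instruction, one rounding step. */
  __m128 r = _mm_fmadd_ps (_mm_loadu_ps (a),
                           _mm_loadu_ps (b),
                           _mm_loadu_ps (c));
  _mm_storeu_ps (out, r);
#elif defined(__SSE__)
  /* SSE fallback: separate multiply and add, two rounding steps. */
  __m128 r = _mm_add_ps (_mm_mul_ps (_mm_loadu_ps (a), _mm_loadu_ps (b)),
                         _mm_loadu_ps (c));
  _mm_storeu_ps (out, r);
#else
  /* Scalar fallback for targets without SSE. */
  for (size_t i = 0; i < 4; i++)
    out[i] = a[i] * b[i] + c[i];
#endif
}
```

Because the fused path rounds once and the fallback paths round twice, results can differ in the last bit for inputs whose product is not exactly representable.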
@ebassi ebassi merged commit db2b756 into master Aug 15, 2024
5 checks passed
@ebassi ebassi deleted the madd-avx branch August 15, 2024 17:18