Add AVX implementation of graphene_simd4f_madd() #269

ebassi · 2024-08-12T11:02:20Z

AVX introduced a fast multiplication-addition intrinsic.

All supported compilers define `__AVX__` when building with the AVX instruction set enabled.
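As a rough illustration of that detection, the compile-time check might look like the sketch below. The helper name `sketch_built_with_avx()` is hypothetical and is not part of Graphene. GCC and Clang define `__AVX__` under `-mavx`, and MSVC defines it under `/arch:AVX`.

```c
/* Hypothetical helper, not part of Graphene: reports at run time
 * whether this translation unit was compiled with AVX enabled. */
static inline int
sketch_built_with_avx (void)
{
#if defined(__AVX__)
  /* The compiler was told it may emit AVX instructions */
  return 1;
#else
  /* No AVX: callers should stay on the generic code path */
  return 0;
#endif
}
```

Because the check is resolved by the preprocessor, it describes the build configuration only; it does not probe the CPU the binary eventually runs on.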

AVX introduced the `_mm_fmadd_ps()` intrinsic, so we can use it when Graphene is built with AVX (or an equivalent instruction set) enabled. Without AVX this commit makes no functional difference, except that `graphene_simd4f_madd()` moves from a generic static inline implementation to a SIMD-specific one.
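The shape of such a change could be sketched as follows. This is not the code from the PR: `sketch_vec4` and `sketch_madd()` are hypothetical stand-ins for Graphene's SIMD type and `graphene_simd4f_madd()`, and the sketch assumes the operation computes `m1 * m2 + a` per lane. On GCC and Clang `_mm_fmadd_ps()` is additionally gated behind the FMA feature flag (`-mfma`, which defines `__FMA__`), so the sketch checks both macros; that extra check is an assumption of this example, not something the PR states.

```c
#if defined(__AVX__) && defined(__FMA__)
# include <immintrin.h>
# define SKETCH_HAVE_FMADD 1
#else
# define SKETCH_HAVE_FMADD 0
#endif

/* Hypothetical 4-lane float vector standing in for graphene_simd4f_t */
typedef struct { float v[4]; } sketch_vec4;

/* Per-lane multiply-add: result[i] = m1[i] * m2[i] + a[i] */
static inline sketch_vec4
sketch_madd (sketch_vec4 m1, sketch_vec4 m2, sketch_vec4 a)
{
  sketch_vec4 res;
#if SKETCH_HAVE_FMADD
  /* Fused path: a single instruction with one rounding step */
  __m128 r = _mm_fmadd_ps (_mm_loadu_ps (m1.v),
                           _mm_loadu_ps (m2.v),
                           _mm_loadu_ps (a.v));
  _mm_storeu_ps (res.v, r);
#else
  /* Generic path: separate multiply and add on each lane */
  for (int i = 0; i < 4; i++)
    res.v[i] = m1.v[i] * m2.v[i] + a.v[i];
#endif
  return res;
}
```

With small integer-valued inputs both paths give identical results; for general inputs the fused path rounds once instead of twice, so the last bit of a lane may differ between builds.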

Add AVX detection

df7fa97

All supported compilers define `__AVX__` when building with the AVX instruction set enabled.

ebassi force-pushed the madd-avx branch 2 times, most recently from 6e1bd1c to ac3f9a2 Compare August 12, 2024 11:11

ebassi force-pushed the madd-avx branch from ac3f9a2 to b185f55 Compare August 12, 2024 11:13

ebassi merged commit db2b756 into master Aug 15, 2024
5 checks passed

ebassi deleted the madd-avx branch August 15, 2024 17:18
