Home
Gabrielle Demberck edited this page Aug 6, 2022
The library is developed in C++11. A separate branch that uses C++03, and releases based on it, are provided for compatibility with older compilers.
For older releases please check out this page
The library supports the following architectures and instruction sets:
- x86, x86-64: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA3, FMA4, AVX512F, AVX512BW, AVX512DQ, AVX512VL, XOP
- ARM 32-bit: NEON, NEONv2
- ARM 64-bit: NEON, NEONv2
- PowerPC 32-bit big-endian: Altivec, VSX v2.06, VSX v2.07
- PowerPC 64-bit little-endian: Altivec, VSX v2.06, VSX v2.07
- MIPS 32-bit little-endian: MSA
- MIPS 64-bit little-endian: MSA
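Which of these instruction sets a given build can use depends on the target and on the compiler flags. As a rough sketch, independent of the library itself, the helper below uses the predefined macros of GCC, Clang and MSVC to report which instruction sets the compiler has enabled for the current translation unit. The function name `enabled_instruction_sets` is hypothetical and is not part of the library's API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a library API): lists the instruction sets that
// the compiler reports as enabled through its predefined macros.
std::vector<std::string> enabled_instruction_sets()
{
    std::vector<std::string> sets;
    // x86 and x86-64
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    sets.push_back("SSE2");
#endif
#if defined(__SSE3__)
    sets.push_back("SSE3");
#endif
#if defined(__SSSE3__)
    sets.push_back("SSSE3");
#endif
#if defined(__SSE4_1__)
    sets.push_back("SSE4.1");
#endif
#if defined(__AVX__)
    sets.push_back("AVX");
#endif
#if defined(__AVX2__)
    sets.push_back("AVX2");
#endif
#if defined(__FMA__)
    sets.push_back("FMA3");
#endif
#if defined(__FMA4__)
    sets.push_back("FMA4");
#endif
#if defined(__XOP__)
    sets.push_back("XOP");
#endif
#if defined(__AVX512F__)
    sets.push_back("AVX512F");
#endif
#if defined(__AVX512BW__)
    sets.push_back("AVX512BW");
#endif
#if defined(__AVX512DQ__)
    sets.push_back("AVX512DQ");
#endif
#if defined(__AVX512VL__)
    sets.push_back("AVX512VL");
#endif
    // ARM
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    sets.push_back("NEON");
#endif
    // PowerPC
#if defined(__ALTIVEC__)
    sets.push_back("Altivec");
#endif
#if defined(__VSX__)
    sets.push_back("VSX");
#endif
    // MIPS
#if defined(__mips_msa)
    sets.push_back("MSA");
#endif
    return sets;
}
```

With GCC or Clang, for example, building with `-mavx2` would add "AVX" and "AVX2" to the returned list. The sketch does not distinguish NEON from NEONv2 or the individual VSX revisions.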
Supported compilers:
- C++11 version:
  - GCC: 4.8-7.x
  - Clang: 3.3-4.0
  - Xcode 7.0-9.x
  - MSVC: 2013, 2015, 2017
  - ICC (on both Linux and Windows): 2013, 2015, 2016, 2017
- C++98 version:
  - GCC: 4.4-7.x
  - Clang: 3.3-4.0
  - Xcode 7.0-9.x
  - MSVC: 2013, 2015, 2017
  - ICC (on both Linux and Windows): 2013, 2015, 2016, 2017
Newer versions of the aforementioned compilers will generally work with either the C++11 or the C++98 version of the library. Older versions of these compilers will generally work with the C++98 version of the library.
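When deciding between the two versions, it can help to check what the compiler itself advertises. The following is a minimal sketch under stated assumptions: it relies on the standard `__cplusplus` macro and, for MSVC, on `_MSC_VER` (1800 corresponds to MSVC 2013, the oldest MSVC release listed above for the C++11 version). The helper names are hypothetical and not part of the library.

```cpp
#include <string>

// Hypothetical helper: true if the compiler advertises C++11 support.
// MSVC historically reports __cplusplus as 199711L regardless of its
// actual language support, so its version number is checked instead.
inline bool compiler_advertises_cxx11()
{
#if defined(_MSC_VER)
    return _MSC_VER >= 1800;        // MSVC 2013 or newer
#else
    return __cplusplus >= 201103L;  // value specified by the C++11 standard
#endif
}

// Hypothetical helper: names the library version that this check suggests.
inline std::string suggested_library_version()
{
    return compiler_advertises_cxx11() ? "C++11" : "C++98";
}
```

This only reflects the language mode the compiler reports; the version ranges in the list above remain the authoritative guide.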
Changes since v2.0:
- Various bug fixes
- Documentation has been significantly improved. The public API is now almost fully documented.
- Added support for MIPS MSA instruction set.
- Added support for PowerPC VSX v2.06 and v2.07 instruction sets.
- Added support for x86 AVX512BW, AVX512DQ and AVX512VL instruction sets.
- Added support for 64-bit little-endian PowerPC.
- Added support for arbitrary width vectors in `extract()` and `insert()`.
- Added support for arbitrary source vectors to `to_int8()`, `to_uint8()`, `to_int16()`, `to_uint16()`, `to_int32()`, `to_uint32()`, `to_int64()`, `to_uint64()`, `to_float32()`, `to_float64()`.
- Added support for per-element integer shifts to `shift_r()` and `shift_l()`. Fallback paths are provided for SSE2-AVX instruction sets that lack hardware per-element integer shift support.
- Make `shuffle_bytes16()`, `shuffle_zbytes16()`, `permute_bytes16()` and `permute_zbytes()` more generic.
- New functions: `popcnt`, `reduce_popcnt`, `for_each`, `to_mask()`.
- Xcode is now supported.
- The library has been refactored in such a way that older compilers are able to optimize vector emulation code paths much better than before.
- Deprecation: implicit conversion operators to native vector types have been deprecated and a replacement method has been provided instead. The implicit conversion operators may lead to wrong code being accepted without a compile error on Clang.
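To illustrate what a per-element integer shift means, here is a scalar model in plain C++: lane `i` of the input is shifted by the count in lane `i` of a second vector, rather than every lane being shifted by the same amount. This is only a sketch of the semantics, not the library's implementation or its fallback path. The function names are hypothetical, and treating counts of 32 or more as producing zero is an assumption of this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a per-element logical left shift on 32-bit lanes.
// Assumption of this sketch: a count >= 32 yields 0 for that lane.
template <std::size_t N>
std::array<std::uint32_t, N> shift_l_per_element(
    const std::array<std::uint32_t, N>& values,
    const std::array<std::uint32_t, N>& counts)
{
    std::array<std::uint32_t, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
        // Each lane uses its own shift count.
        result[i] = counts[i] < 32 ? (values[i] << counts[i]) : 0u;
    }
    return result;
}

// Scalar model of a per-element logical right shift on 32-bit lanes,
// with the same assumption about counts >= 32.
template <std::size_t N>
std::array<std::uint32_t, N> shift_r_per_element(
    const std::array<std::uint32_t, N>& values,
    const std::array<std::uint32_t, N>& counts)
{
    std::array<std::uint32_t, N> result{};
    for (std::size_t i = 0; i < N; ++i) {
        result[i] = counts[i] < 32 ? (values[i] >> counts[i]) : 0u;
    }
    return result;
}
```

A loop of this kind shows the behaviour that an instruction set without hardware per-element shifts has to reproduce by other means.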
-
The Hail project