Package archsimd
Package archsimd provides access to architecture-specific SIMD operations.
This is a low-level package that exposes hardware-specific functionality. It currently supports AMD64.
This package is experimental and is not subject to the Go 1 compatibility promise. It exists only when building with the GOEXPERIMENT=simd environment variable set.
Vector types and operations
Vector types are defined as structs, such as Int8x16 and Float64x8, corresponding to the hardware's vector registers. On AMD64, 128-, 256-, and 512-bit vectors are supported.
Mask types, such as Mask8x16, are defined similarly. They are opaque types, which hides the differences in the underlying hardware representations. A mask can be converted to or from the corresponding integer vector type, or to or from a bitmask.
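The two mask conversions can be pictured with a scalar model. The sketch below is plain Go and does not use this package: mask8x16, mask8x16FromBits, toBits, and toVector are hypothetical names introduced for illustration. It assumes that lane i maps to bit i of the bitmask and that a set lane is all ones (-1) in the integer vector form; neither detail is stated above.

```go
package main

import "fmt"

// mask8x16 is a hypothetical scalar stand-in for a 16-lane mask.
// It is not the package's Mask8x16 type.
type mask8x16 [16]bool

// mask8x16FromBits builds a mask from a bitmask.
// Assumption: bit i of b controls lane i.
func mask8x16FromBits(b uint16) mask8x16 {
	var m mask8x16
	for i := range m {
		m[i] = b&(1<<uint(i)) != 0
	}
	return m
}

// toBits packs the mask back into a bitmask, lane i into bit i.
func (m mask8x16) toBits() uint16 {
	var b uint16
	for i, set := range m {
		if set {
			b |= 1 << uint(i)
		}
	}
	return b
}

// toVector returns the integer-vector form of the mask.
// Assumption: set lanes are all ones (-1), clear lanes are 0.
func (m mask8x16) toVector() [16]int8 {
	var v [16]int8
	for i, set := range m {
		if set {
			v[i] = -1
		}
	}
	return v
}

func main() {
	m := mask8x16FromBits(0x8001) // lanes 0 and 15 set
	fmt.Printf("%#04x\n", m.toBits())
	fmt.Println(m.toVector())
}
```

Keeping the mask opaque lets the same source work whether the hardware stores masks in dedicated mask registers or as ordinary vectors.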
Operations are mostly defined as methods on the vector types. Most of them are compiler intrinsics and correspond directly to hardware instructions.
Common operations include:
- Load/Store: Load a vector from memory or store a vector to memory.
- Arithmetic: Add, Sub, Mul, etc.
- Bitwise: And, Or, Xor, etc.
- Comparison: Equal, Greater, etc., which produce a mask.
- Conversion: Convert between different vector types.
- Field selection and rearrangement: GetElem, Permute, etc.
- Masking: Masked, Merge.
The compiler recognizes certain patterns of operations and may optimize them to more efficient instructions. For example, on AVX512, an Add operation followed by Masked may be optimized to a single masked add instruction. For this reason, not every hardware instruction has a dedicated API.
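The equivalence that makes this fusion possible can be shown with a scalar model. The sketch below is plain Go and does not call this package: int32x4, mask32x4, add, masked, and maskedAdd are hypothetical stand-ins. It assumes that Masked zeroes the lanes whose mask bit is clear, which is not spelled out above.

```go
package main

import "fmt"

// int32x4 and mask32x4 are hypothetical scalar stand-ins for a
// 4-lane vector and its mask. They are not the package's types.
type int32x4 [4]int32
type mask32x4 [4]bool

// add models a lane-wise Add.
func (x int32x4) add(y int32x4) int32x4 {
	var r int32x4
	for i := range r {
		r[i] = x[i] + y[i]
	}
	return r
}

// masked models Masked.
// Assumption: lanes with a clear mask bit become zero.
func (x int32x4) masked(m mask32x4) int32x4 {
	var r int32x4
	for i := range r {
		if m[i] {
			r[i] = x[i]
		}
	}
	return r
}

// maskedAdd models a single fused masked add: one pass that
// adds only the selected lanes and zeroes the rest.
func maskedAdd(x, y int32x4, m mask32x4) int32x4 {
	var r int32x4
	for i := range r {
		if m[i] {
			r[i] = x[i] + y[i]
		}
	}
	return r
}

func main() {
	x := int32x4{1, 2, 3, 4}
	y := int32x4{10, 20, 30, 40}
	m := mask32x4{true, false, true, false}

	twoStep := x.add(y).masked(m)
	fused := maskedAdd(x, y, m)
	fmt.Println(twoStep, fused, twoStep == fused) // [11 0 33 0] [11 0 33 0] true
}
```

Because the two forms produce the same lanes, code can be written as Add followed by Masked and left to the compiler to fuse where the hardware allows it.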
CPU feature checks
The package provides global variables to check for CPU features available at runtime. For example, on AMD64, the X86 variable provides methods to check for AVX2, AVX512, etc. It is recommended to check for CPU features before using the corresponding vector operations.
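A common way to follow this recommendation is to select the code path once, based on the feature check, and to keep a scalar fallback. The sketch below shows only the dispatch shape in plain Go: cpuFeatures is a hypothetical stand-in for the X86 variable, and sumLanes8 imitates an 8-lane vector loop with ordinary arithmetic rather than calling any vector operation.

```go
package main

import "fmt"

// cpuFeatures is a hypothetical stand-in for the package's X86
// variable; only the shape of the AVX2 check is modeled here.
type cpuFeatures struct{ avx2 bool }

func (c cpuFeatures) AVX2() bool { return c.avx2 }

// sumScalar is the portable fallback.
func sumScalar(xs []int32) int32 {
	var s int32
	for _, x := range xs {
		s += x
	}
	return s
}

// sumLanes8 imitates a 256-bit vector loop: eight independent
// accumulators, a horizontal reduction, then a scalar tail.
func sumLanes8(xs []int32) int32 {
	var acc [8]int32
	i := 0
	for ; i+8 <= len(xs); i += 8 {
		for lane := 0; lane < 8; lane++ {
			acc[lane] += xs[i+lane]
		}
	}
	var s int32
	for _, a := range acc {
		s += a
	}
	for ; i < len(xs); i++ {
		s += xs[i] // leftover elements that do not fill a vector
	}
	return s
}

// sum picks the wide path only when the feature check passes.
func sum(cpu cpuFeatures, xs []int32) int32 {
	if cpu.AVX2() {
		return sumLanes8(xs)
	}
	return sumScalar(xs)
}

func main() {
	xs := make([]int32, 19)
	for i := range xs {
		xs[i] = int32(i + 1)
	}
	fmt.Println(sum(cpuFeatures{avx2: true}, xs), sum(cpuFeatures{avx2: false}, xs)) // 190 190
}
```

Both paths must return the same result, so the scalar version doubles as a reference when testing the vector version.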
Notes
- This package is not portable, as the available types and operations depend on the target architecture. It is not recommended to expose the SIMD types defined in this package in public APIs.
- For performance reasons, it is recommended to use the vector types directly as values. It is not recommended to take the address of a vector, allocate it on the heap, or put it in an aggregate type.