AVX Initialization Instructions

Instruction	Meaning
VZEROALL	Zero all YMM registers
VZEROUPPER	Zero upper bits of all YMM registers

Data Transfer Instructions

Instruction	Meaning
Integer Operands
VMOVD	Move double word
VMOVQ	Move quad word
VMOVDQA	Move aligned double quad words
VMOVDQA32	Move aligned packed double word integer values using writemask
VMOVDQA64	Move aligned packed quad word integer values using writemask
VMOVDQU	Move unaligned double quad words
VMOVDQU8	Move unaligned packed byte integer values using writemask
VMOVDQU16	Move unaligned packed word integer values using writemask
VMOVDQU32	Move unaligned packed double word integer values using writemask
VMOVDQU64	Move unaligned packed quad word integer values using writemask
VMOVSLDUP	Loads/moves 128 bits duplicating the first and third 32-bit data elements
VMOVSHDUP	Loads/moves 128 bits duplicating the second and fourth 32-bit data elements
VMOVDDUP	Loads/moves 128 bits duplicating the lower 64-bit data elements
VPMASKMOVD	Conditional SIMD integer packed loads and stores of double word values
VPMASKMOVQ	Conditional SIMD integer packed loads and stores of quad word values
VPMOVMSKB	Move byte mask
VPALIGNR	Concatenate destination and source operands, extract byte aligned result shifted to the right by constant value
VALIGND	Shift right and merge vectors with double word granularity using immediate shift value
VALIGNQ	Shift right and merge vectors with quad word granularity using immediate shift value
Single Precision Floating-point Operands
VMOVSS	Move scalar single-precision floating-point value between YMM registers or between a YMM register and memory
VMOVAPS	Move four aligned packed single-precision floating-point values between YMM registers or between a YMM register and memory
VMOVUPS	Move four unaligned packed single-precision floating-point values between YMM registers or between a YMM register and memory
VMOVLPS	Move two packed single-precision floating-point values between the low quad word of a YMM register and memory
VMOVHPS	Move two packed single-precision floating-point values between the high quad word of a YMM register and memory
VMOVLHPS	Move two packed single-precision floating-point values from the low quad word of a YMM register to the high quad word of another YMM register
VMOVHLPS	Move two packed single-precision floating-point values from the high quad word of a YMM register to the low quad word of another YMM register
VMASKMOVPS	Conditional SIMD packed loads and stores of single-precision floating-point values
VMOVMSKPS	Extract sign mask from four packed single-precision floating-point values
Double Precision Floating-point Operands
VMOVSD	Move scalar double-precision floating-point value between YMM registers or between a YMM register and memory
VMOVAPD	Move two aligned packed double-precision floating-point values between YMM registers or between a YMM register and memory
VMOVUPD	Move two unaligned packed double-precision floating-point values between YMM registers or between a YMM register and memory
VMOVLPD	Move low packed double-precision floating-point value between the low quad word of a YMM register and memory
VMOVHPD	Move high packed double-precision floating-point value between the high quad word of a YMM register and memory
VMASKMOVPD	Conditional SIMD packed loads and stores of double-precision floating-point values
VMOVMSKPD	Extract sign mask from two packed double-precision floating-point values
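
The VPALIGNR entry above packs several steps into one sentence, so a scalar model may help. The sketch below models a single 128-bit lane as Python lists of 16 bytes, least-significant byte first. The function name `palignr_lane` and the parameter names `high` and `low` are illustrative assumptions, not part of any real API.

```python
def palignr_lane(high, low, imm8):
    """Scalar model of VPALIGNR on one 128-bit lane (an illustration only).

    `high` and `low` are lists of 16 byte values, least-significant first.
    The two operands are concatenated with `low` in the lower half, the
    32-byte value is shifted right by `imm8` bytes, and the low 16 bytes
    of the result are returned.
    """
    assert len(high) == 16 and len(low) == 16
    combined = list(low) + list(high) + [0] * 16  # zeros shift in from the top
    shift = min(imm8, 32)                         # larger shifts yield all zeros
    return combined[shift:shift + 16]


low = list(range(0, 16))     # bytes 0..15
high = list(range(16, 32))   # bytes 16..31
print(palignr_lane(high, low, 4))  # bytes 4..19 of the concatenation
```

A shift of 0 returns the low operand unchanged and a shift of 16 returns the high operand.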

Broadcast Instructions

Instruction	Meaning
Byte Operands
VPBROADCASTB	Broadcast a byte integer value to all elements of a register
VPBROADCASTMB2Q	Broadcast byte size mask to all elements of a register
Word Operands
VPBROADCASTW	Broadcast a word integer value to all elements of a register
VPBROADCASTMW2D	Broadcast word size mask to all elements of a register
Double Word Operands
VPBROADCASTD	Broadcast a double word integer value to all elements of a register
VBROADCASTI32X2	Broadcast two double word values to all elements of a register
VBROADCASTI32X4	Broadcast four double word values to all elements of a register
VBROADCASTI32X8	Broadcast eight double word values to all elements of a register
Quad Word Operands
VPBROADCASTQ	Broadcast a quad word integer value to all elements of a register
VBROADCASTI64X2	Broadcast two quad word values to all elements of a register
VBROADCASTI64X4	Broadcast four quad word values to all elements of a register
Single Precision Floating-point Operands
VBROADCASTSS	Broadcast a single-precision floating-point value to all elements of a register
VBROADCASTF32X2	Broadcast two single-precision floating-point values to all elements of a register
VBROADCASTF32X4	Broadcast four single-precision floating-point values to all elements of a register
VBROADCASTF32X8	Broadcast eight single-precision floating-point values to all elements of a register
Double Precision Floating-point Operands
VBROADCASTSD	Broadcast a double-precision floating-point value to all elements of a register
VBROADCASTF64X2	Broadcast two double-precision floating-point values to all elements of a register
VBROADCASTF64X4	Broadcast four double-precision floating-point values to all elements of a register
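
The X2, X4 and X8 forms above differ from the single-element forms only in the size of the group that is repeated. A minimal Python sketch, with registers modelled as lists and the name `broadcast` chosen purely for illustration:

```python
def broadcast(group, total_elements):
    """Repeat `group` until the destination holds `total_elements` elements.

    A one-element group models forms such as VPBROADCASTD or VBROADCASTSS;
    a four-element group models forms such as VBROADCASTI32X4.
    """
    if total_elements % len(group) != 0:
        raise ValueError("destination size must be a multiple of the group size")
    return list(group) * (total_elements // len(group))


print(broadcast([7], 8))            # [7, 7, 7, 7, 7, 7, 7, 7]
print(broadcast([1, 2, 3, 4], 16))  # the group 1, 2, 3, 4 repeated four times
```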

Expand Instructions

Instruction	Meaning
VPEXPANDD	Load sparse packed double word integer values from dense memory
VPEXPANDQ	Load sparse packed quad word integer values from dense memory
VEXPANDPS	Load sparse packed single-precision floating-point values from dense memory
VEXPANDPD	Load sparse packed double-precision floating-point values from dense memory
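
"Load sparse from dense" means that consecutive source elements are placed into the destination positions whose mask bit is set. The Python sketch below is a scalar model under that reading. The name `expand`, the list-of-bits mask, and the `passthrough` parameter (standing in for merge masking, with zeroing as the default) are assumptions of this illustration.

```python
def expand(dense, mask, passthrough=None):
    """Scalar model of an expand load.

    `dense` holds contiguous source elements, `mask` holds one 0/1 flag per
    destination element. Elements whose flag is 0 are zeroed, or copied from
    `passthrough` when it is given.
    """
    out = list(passthrough) if passthrough is not None else [0] * len(mask)
    source = iter(dense)
    for i, bit in enumerate(mask):
        if bit:
            out[i] = next(source)  # take the next contiguous source element
    return out


print(expand([10, 20, 30], [1, 0, 1, 1, 0, 0, 0, 0]))
# [10, 0, 20, 30, 0, 0, 0, 0]
```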

Compress Instructions

Instruction	Meaning
VPCOMPRESSD	Store sparse packed double word integer values into dense memory
VPCOMPRESSQ	Store sparse packed quad word integer values into dense memory
VCOMPRESSPS	Store sparse packed single-precision floating-point values into dense memory
VCOMPRESSPD	Store sparse packed double-precision floating-point values into dense memory
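
Compress is the inverse idea: the elements whose mask bit is set are written out contiguously. A scalar Python sketch of the memory-destination form, with the name `compress` and the list-of-bits mask being assumptions of this illustration:

```python
def compress(values, mask):
    """Scalar model of a compress store: keep only the selected elements,
    packed together in their original order."""
    return [value for value, bit in zip(values, mask) if bit]


print(compress([1, 2, 3, 4, 5, 6, 7, 8], [0, 1, 0, 1, 1, 0, 0, 0]))
# [2, 4, 5]
```

The number of elements written equals the number of set mask bits, which is why this pattern suits filtering a stream into a packed output buffer.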

Insert Instructions

Instruction	Meaning
Integer Operands
VPINSRB	Insert a byte value from a register or memory into a YMM register
VPINSRW	Insert a word value from a register or memory into a YMM register
VPINSRD	Insert a double word value from a register or memory into a YMM register
VPINSRQ	Insert a quad word value from a register or memory into a YMM register
VINSERTI128	Insert 128 bits of packed integer values from the source into the destination operand
VINSERTI32X4	Insert 128 bits of packed integer values from the source into the destination operand at a 128-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
VINSERTI64X2	Insert 128 bits of packed integer values from the source into the destination operand at a 128-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
VINSERTI32X8	Insert 256 bits of packed integer values from the source into the destination operand at a 256-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
VINSERTI64X4	Insert 256 bits of packed integer values from the source into the destination operand at a 256-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
Floating-point Operands
VINSERTPS	Insert a single-precision floating-point value, taken either from a 32-bit memory location or from a specified offset in a YMM register, at a specified offset in the destination YMM register. In addition, VINSERTPS can zero selected data elements in the destination using a mask
VINSERTF128	Insert 128 bits of packed floating-point values from the source into the destination operand
VINSERTF32X4	Insert 128 bits of packed floating-point values from the source into the destination operand at a 128-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
VINSERTF64X2	Insert 128 bits of packed floating-point values from the source into the destination operand at a 128-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
VINSERTF32X8	Insert 256 bits of packed floating-point values from the source into the destination operand at a 256-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
VINSERTF64X4	Insert 256 bits of packed floating-point values from the source into the destination operand at a 256-bit granular offset. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand
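
The phrase "granular offset" means the immediate selects a whole 128-bit or 256-bit slot, not an arbitrary element position. The Python sketch below models that with lists. The name `insert_block` and its parameters are illustrative assumptions.

```python
def insert_block(first_source, block, slot):
    """Scalar model of a block insert.

    The result is a copy of `first_source` in which slot number `slot`
    (counted in units of len(block) elements) is replaced by `block`.
    """
    size = len(block)
    if len(first_source) % size != 0:
        raise ValueError("source length must be a multiple of the block size")
    if not 0 <= slot < len(first_source) // size:
        raise ValueError("slot out of range")
    out = list(first_source)
    out[slot * size:(slot + 1) * size] = block
    return out


# Sixteen double words model a 512-bit register; four model a 128-bit block.
print(insert_block(list(range(16)), [100, 101, 102, 103], 2))
# [0, 1, 2, 3, 4, 5, 6, 7, 100, 101, 102, 103, 12, 13, 14, 15]
```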

Extract Instructions

Instruction	Meaning
Integer Operands
VPEXTRB	Extract a byte from a YMM register and store the value in a general-purpose register or memory
VPEXTRW	Extract a word from a YMM register and store the value in a general-purpose register or memory
VPEXTRD	Extract a double word from a YMM register and store the value in a general-purpose register or memory
VPEXTRQ	Extract a quad word from a YMM register and store the value in a general-purpose register or memory
VEXTRACTI128	Extract 128 bits of packed integer values from the source operand and store them in the low 128 bits of the destination operand
VEXTRACTI32X4	Extract 128 bits of packed integer values from the source operand at a 128-bit granular offset and store them in the low 128 bits of the destination operand
VEXTRACTI64X2	Extract 128 bits of packed integer values from the source operand at a 128-bit granular offset and store them in the low 128 bits of the destination operand
VEXTRACTI32X8	Extract 256 bits of packed integer values from the source operand at a 256-bit granular offset and store them in the low 256 bits of the destination operand
VEXTRACTI64X4	Extract 256 bits of packed integer values from the source operand at a 256-bit granular offset and store them in the low 256 bits of the destination operand
Floating-point Operands
VEXTRACTPS	Extract a single-precision floating-point value from a specified offset in a YMM register and store the result in memory or a general-purpose register
VEXTRACTF128	Extract 128 bits of packed floating-point values from the source operand and store them in the low 128 bits of the destination operand
VEXTRACTF32X4	Extract 128 bits of packed floating-point values from the source operand at a 128-bit granular offset and store them in the low 128 bits of the destination operand
VEXTRACTF64X2	Extract 128 bits of packed floating-point values from the source operand at a 128-bit granular offset and store them in the low 128 bits of the destination operand
VEXTRACTF32X8	Extract 256 bits of packed floating-point values from the source operand at a 256-bit granular offset and store them in the low 256 bits of the destination operand
VEXTRACTF64X4	Extract 256 bits of packed floating-point values from the source operand at a 256-bit granular offset and store them in the low 256 bits of the destination operand
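
For the block extracts, the immediate picks which 128-bit or 256-bit slot of the source is copied out. A short Python sketch, assuming registers are modelled as lists and using the hypothetical name `extract_block`:

```python
def extract_block(source, block_elements, slot):
    """Scalar model of a block extract: return slot number `slot` of the
    source, where each slot holds `block_elements` elements."""
    if not 0 <= slot < len(source) // block_elements:
        raise ValueError("slot out of range")
    start = slot * block_elements
    return list(source[start:start + block_elements])


# Sixteen double words model a 512-bit register; slot 3 is its top 128 bits.
print(extract_block(list(range(16)), 4, 3))  # [12, 13, 14, 15]
```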

Gather Instructions

Instruction	Meaning
Double Word Operands
VPGATHERDD	Gather packed double word values using signed double word indices
VPGATHERQD	Gather packed double word values using signed quad word indices
Quad Word Operands
VPGATHERDQ	Gather packed quad word values using signed double word indices
VPGATHERQQ	Gather packed quad word values using signed quad word indices
Single Precision Floating-point Operands
VGATHERDPS	Gather packed single-precision floating-point values using signed double word indices
VGATHERQPS	Gather packed single-precision floating-point values using signed quad word indices
VGATHERPF0DPS	Sparse prefetch of packed single-precision floating-point values with signed double word indices using T0 hint
VGATHERPF1DPS	Sparse prefetch of packed single-precision floating-point values with signed double word indices using T1 hint
VGATHERPF0QPS	Sparse prefetch of packed single-precision floating-point values with signed quad word indices using T0 hint
VGATHERPF1QPS	Sparse prefetch of packed single-precision floating-point values with signed quad word indices using T1 hint
Double Precision Floating-point Operands
VGATHERDPD	Gather packed double-precision floating-point values using signed double word indices
VGATHERQPD	Gather packed double-precision floating-point values using signed quad word indices
VGATHERPF0DPD	Sparse prefetch of packed double-precision floating-point values with signed double word indices using T0 hint
VGATHERPF1DPD	Sparse prefetch of packed double-precision floating-point values with signed double word indices using T1 hint
VGATHERPF0QPD	Sparse prefetch of packed double-precision floating-point values with signed quad word indices using T0 hint
VGATHERPF1QPD	Sparse prefetch of packed double-precision floating-point values with signed quad word indices using T1 hint
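
A gather reads one element per index from non-contiguous locations. The Python sketch below is a scalar model in which memory is a list of elements, so the scale factor is folded into list indexing. The names `gather`, `base`, and the list-of-bits mask are assumptions of this illustration. Elements whose mask bit is clear keep the previous destination value.

```python
def gather(memory, base, indices, mask, dest):
    """Scalar model of a masked gather.

    For every element whose mask flag is set, load memory[base + index].
    Indices are signed, so they may point below `base`.
    """
    out = list(dest)
    for i, (index, bit) in enumerate(zip(indices, mask)):
        if bit:
            address = base + index
            if not 0 <= address < len(memory):
                raise IndexError("gather address outside modelled memory")
            out[i] = memory[address]
    return out


memory = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
print(gather(memory, 5, [-1, 0, 3, -5], [1, 1, 0, 1], [7, 7, 7, 7]))
# [40, 50, 7, 0]
```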

Scatter Instructions

Instruction	Meaning
Double Word Operands
VPSCATTERDD	Using signed double word indices, scatter double word values to memory using writemask
VPSCATTERQD	Using signed quad word indices, scatter double word values to memory using writemask
Quad Word Operands
VPSCATTERDQ	Using signed double word indices, scatter quad word values to memory using writemask
VPSCATTERQQ	Using signed quad word indices, scatter quad word values to memory using writemask
Single Precision Floating-point Operands
VSCATTERDPS	Using signed double word indices, scatter single-precision floating-point values to memory using writemask
VSCATTERQPS	Using signed quad word indices, scatter single-precision floating-point values to memory using writemask
VSCATTERPF0DPS	Using signed double word indices, prefetch sparse single-precision floating-point values using writemask and T0 hint with intent to write
VSCATTERPF1DPS	Using signed double word indices, prefetch sparse single-precision floating-point values using writemask and T1 hint with intent to write
VSCATTERPF0QPS	Using signed quad word indices, prefetch sparse single-precision floating-point values using writemask and T0 hint with intent to write
VSCATTERPF1QPS	Using signed quad word indices, prefetch sparse single-precision floating-point values using writemask and T1 hint with intent to write
Double Precision Floating-point Operands
VSCATTERDPD	Using signed double word indices, scatter double-precision floating-point values to memory using writemask
VSCATTERQPD	Using signed quad word indices, scatter double-precision floating-point values to memory using writemask
VSCATTERPF0DPD	Using signed double word indices, prefetch sparse double-precision floating-point values using writemask and T0 hint with intent to write
VSCATTERPF1DPD	Using signed double word indices, prefetch sparse double-precision floating-point values using writemask and T1 hint with intent to write
VSCATTERPF0QPD	Using signed quad word indices, prefetch sparse double-precision floating-point values using writemask and T0 hint with intent to write
VSCATTERPF1QPD	Using signed quad word indices, prefetch sparse double-precision floating-point values using writemask and T1 hint with intent to write
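
A scatter is the store counterpart of a gather: each selected element is written to its own indexed location. The Python sketch below is a scalar model with memory as a list of elements. The name `scatter` and its parameters are illustrative assumptions, and the model processes elements from lowest to highest, so when two indices coincide the higher-numbered element is the one that remains.

```python
def scatter(memory, base, indices, values, mask):
    """Scalar model of a masked scatter: store values[i] at
    memory[base + indices[i]] for every element whose mask flag is set."""
    for index, value, bit in zip(indices, values, mask):
        if bit:
            address = base + index
            if not 0 <= address < len(memory):
                raise IndexError("scatter address outside modelled memory")
            memory[address] = value
    return memory


memory = [0] * 8
scatter(memory, 2, [0, 3, -2, 0], [11, 22, 33, 44], [1, 1, 1, 1])
print(memory)  # [33, 0, 44, 0, 0, 22, 0, 0]
```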

Blending Instructions

Instruction	Meaning
Byte Operands
VPBLENDVB	Conditionally copies specified byte elements in the source operand to the destination, using an implied mask
VPBLENDMB	Performs blending of byte elements between the first and the second operand (register or memory), using the instruction mask selector
Word Operands
VPBLENDW	Conditionally copies specified word elements in the source operand to the destination, using an immediate byte control
VPBLENDMW	Performs blending of word elements between the first and the second operand (register or memory), using the instruction mask selector
Double Word Operands
VPBLENDD	Conditionally copies specified double word elements in the source operand to the destination, using an immediate byte control
VPBLENDMD	Performs blending of double word elements between the first and the second operand (register or memory), using the instruction mask selector
Quad Word Operands
VPBLENDMQ	Performs blending of quad word elements between the first and the second operand (register or memory), using the instruction mask selector
Single Precision Floating-point Operands
VBLENDPS	Conditionally copies specified data elements in the source operand to the destination, using an immediate byte control
VBLENDVPS	Conditionally copies specified data elements in the source operand to the destination, using an implied mask
VBLENDMPS	Performs blending between single-precision elements in the first operand with the elements in the second operand using an opmask register as select control
Double Precision Floating-point Operands
VBLENDPD	Conditionally copies specified data elements in the source operand to the destination, using an immediate byte control
VBLENDVPD	Conditionally copies specified data elements in the source operand to the destination, using an implied mask
VBLENDMPD	Performs blending between double-precision elements in the first operand with the elements in the second operand using an opmask register as select control
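
All of the blends above choose, element by element, between two sources; they differ in where the selector comes from (an immediate byte, an implied mask register, or an opmask register). A minimal Python sketch of the per-element selection, using an integer bit mask as the selector. The name `blend` is an assumption of this illustration.

```python
def blend(first, second, control):
    """Scalar model of a blend: bit i of `control` set means element i is
    taken from `second`, otherwise from `first`."""
    return [
        s if (control >> i) & 1 else f
        for i, (f, s) in enumerate(zip(first, second))
    ]


print(blend([0, 1, 2, 3], [10, 11, 12, 13], 0b0101))  # [10, 1, 12, 3]
```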

Shuffle Instructions

Instruction	Meaning
Byte Operands
VPSHUFB	Shuffle packed byte values
Word Operands
VPSHUFLW	Shuffle packed low words
VPSHUFHW	Shuffle packed high words
Double Word Operands
VPSHUFD	Shuffle packed double words
VSHUFI32X4	Shuffle 128-bit packed double word values
Quad Word Operands
VSHUFI64X2	Shuffle 128-bit packed quad word values
Single Precision Floating-point Operands
VSHUFPS	Shuffles values in packed single-precision floating-point operands
VSHUFF32X4	Shuffle 128-bit packed single-precision floating-point operands
Double Precision Floating-point Operands
VSHUFPD	Shuffles values in packed double-precision floating-point operands
VSHUFF64X2	Shuffle 128-bit packed double-precision floating-point operands
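
As an example of how an immediate drives a shuffle, the sketch below models VPSHUFD in Python: each 2-bit field of the immediate selects one of the four double words in the same 128-bit lane, and the same selection is applied to every lane. The function name `pshufd` and the list representation are assumptions of this illustration.

```python
def pshufd(source, imm8):
    """Scalar model of VPSHUFD.

    `source` is a list of double words whose length is a multiple of four.
    Destination element i of each lane is source element (imm8 >> 2*i) & 3
    of that lane.
    """
    if len(source) % 4 != 0:
        raise ValueError("source must contain whole 128-bit lanes")
    out = []
    for start in range(0, len(source), 4):
        lane = source[start:start + 4]
        out.extend(lane[(imm8 >> (2 * i)) & 3] for i in range(4))
    return out


print(pshufd([10, 11, 12, 13], 0b00011011))  # [13, 12, 11, 10]
```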

Permute Instructions

Instruction	Meaning
Word Operands
VPERMW	Permute packed word elements
VPERMI2W	Permute packed word elements from two tables using indexes
Double Word Operands
VPERMD	Permute packed double word elements
VPERMI2D	Permute packed double word elements from two tables using indexes
Quad Word Operands
VPERMQ	Permute packed quad word elements
VPERMI2Q	Permute packed quad word elements from two tables using indexes
128-bit Integer Operands
VPERM2I128	Permute 128-bit integer fields using controls
Single Precision Floating-point Operands
VPERMPS	Permute packed single-precision floating-point elements
VPERMILPS	Permute packed single-precision floating-point elements using controls
VPERMI2PS	Permute packed single-precision elements from two tables using indexes
Double Precision Floating-point Operands
VPERMPD	Permute packed double-precision floating-point elements
VPERMILPD	Permute packed double-precision floating-point elements using controls
VPERMI2PD	Permute packed double-precision elements from two tables using indexes
128-bit Floating-point Operands
VPERM2F128	Permute 128-bit floating-point fields using controls
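
The "two tables" permutes treat a pair of registers as one lookup table of twice the size: the low bits of each index select an element and the next bit selects the table. A scalar Python sketch under that reading, with the name `permute2` and the list representation being assumptions of this illustration:

```python
def permute2(table_a, table_b, indexes):
    """Scalar model of a two-table permute.

    With n elements per table (n a power of two), index % n picks the
    element and the next index bit picks table_a (0) or table_b (1).
    Higher index bits are ignored.
    """
    n = len(table_a)
    if len(table_b) != n:
        raise ValueError("tables must have the same length")
    out = []
    for index in indexes:
        table = table_b if (index // n) & 1 else table_a
        out.append(table[index % n])
    return out


print(permute2([0, 1, 2, 3], [10, 11, 12, 13], [7, 0, 4, 3]))
# [13, 0, 10, 3]
```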

Unpack Instructions

Instruction	Meaning
Byte Operands
VPUNPCKLBW	Unpack low-order bytes
VPUNPCKHBW	Unpack high-order bytes
Word Operands
VPUNPCKLWD	Unpack low-order words
VPUNPCKHWD	Unpack high-order words
Double Word Operands
VPUNPCKLDQ	Unpack low-order double words
VPUNPCKHDQ	Unpack high-order double words
Quad Word Operands
VPUNPCKLQDQ	Unpack low quad words
VPUNPCKHQDQ	Unpack high quad words
Single Precision Floating-point Operands
VUNPCKLPS	Unpacks and interleaves the two low-order values from two single-precision floating-point operands
VUNPCKHPS	Unpacks and interleaves the two high-order values from two single-precision floating-point operands
Double Precision Floating-point Operands
VUNPCKLPD	Unpacks and interleaves the low values from two packed double-precision floating-point operands
VUNPCKHPD	Unpacks and interleaves the high values from two packed double-precision floating-point operands
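
"Unpack" here means interleave: elements from the same half of two sources are alternated in the result. The Python sketch below models one 128-bit lane as a list. The names `unpack_low` and `unpack_high` are assumptions of this illustration.

```python
def _interleave(xs, ys):
    """Alternate elements: xs[0], ys[0], xs[1], ys[1], ..."""
    out = []
    for x, y in zip(xs, ys):
        out.extend((x, y))
    return out


def unpack_low(first, second):
    """Interleave the low halves of two equally sized lanes."""
    half = len(first) // 2
    return _interleave(first[:half], second[:half])


def unpack_high(first, second):
    """Interleave the high halves of two equally sized lanes."""
    half = len(first) // 2
    return _interleave(first[half:], second[half:])


print(unpack_low([0, 1, 2, 3], [10, 11, 12, 13]))   # [0, 10, 1, 11]
print(unpack_high([0, 1, 2, 3], [10, 11, 12, 13]))  # [2, 12, 3, 13]
```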

Pack Instructions

Instruction	Meaning
Words into Bytes
VPACKSSWB	Pack words into bytes with signed saturation
VPACKUSWB	Pack words into bytes with unsigned saturation
Double Words into Words
VPACKSSDW	Pack double words into words with signed saturation
VPACKUSDW	Pack double words into words with unsigned saturation
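
Saturation means that a value too large or too small for the narrower type is clamped to the nearest representable value instead of wrapping around. The Python sketch below models packing signed words into bytes for one lane. The function names are assumptions of this illustration.

```python
def _clamp(value, low, high):
    return max(low, min(high, value))


def pack_words_signed(first, second):
    """Model of word-to-byte packing with signed saturation:
    each signed word is clamped to the range -128..127."""
    return [_clamp(v, -128, 127) for v in list(first) + list(second)]


def pack_words_unsigned(first, second):
    """Model of word-to-byte packing with unsigned saturation:
    each signed word is clamped to the range 0..255."""
    return [_clamp(v, 0, 255) for v in list(first) + list(second)]


print(pack_words_signed([300, -300], [5, -5]))    # [127, -128, 5, -5]
print(pack_words_unsigned([300, -300], [5, -5]))  # [255, 0, 5, 0]
```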

Conversion Instructions

Instruction	Meaning
Byte to Word
VPMOVSXBW	Sign extend the lower 8-bit integer of each packed word element into packed signed word integers
VPMOVZXBW	Zero extend the lower 8-bit integer of each packed word element into packed signed word integers
Byte to Double Word
VPMOVSXBD	Sign extend the lower 8-bit integer of each packed double word element into packed signed double word integers
VPMOVZXBD	Zero extend the lower 8-bit integer of each packed double word element into packed signed double word integers
Byte to Quad Word
VPMOVSXBQ	Sign extend the lower 8-bit integer of each packed quad word element into packed signed quad word integers
VPMOVZXBQ	Zero extend the lower 8-bit integer of each packed quad word element into packed signed quad word integers
Word to Byte
VPMOVWB	Converts packed word integers into packed bytes with truncation
VPMOVSWB	Converts packed signed word integers into packed signed bytes using signed saturation
VPMOVUSWB	Converts packed unsigned word integers into packed unsigned bytes using unsigned saturation
Word to Double Word
VPMOVSXWD	Sign extend the lower 16-bit integer of each packed double word element into packed signed double word integers
VPMOVZXWD	Zero extend the lower 16-bit integer of each packed double word element into packed signed double word integers
Word to Quad Word
VPMOVSXWQ	Sign extend the lower 16-bit integer of each packed quad word element into packed signed quad word integers
VPMOVZXWQ	Zero extend the lower 16-bit integer of each packed quad word element into packed signed quad word integers
Double Word to Byte
VPMOVDB	Converts packed double word integers into packed bytes with truncation
VPMOVSDB	Converts packed signed double word integers into packed signed bytes using signed saturation
VPMOVUSDB	Converts packed unsigned double word integers into packed unsigned bytes using unsigned saturation
Double Word to Word
VPMOVDW	Converts packed double word integers into packed words with truncation
VPMOVSDW	Converts packed signed double word integers into packed signed words using signed saturation
VPMOVUSDW	Converts packed unsigned double word integers into packed unsigned words using unsigned saturation
Double Word to Quad Word
VPMOVSXDQ	Sign extend the lower 32-bit integer of each packed quad word element into packed signed quad word integers
VPMOVZXDQ	Zero extend the lower 32-bit integer of each packed quad word element into packed signed quad word integers
Quad Word to Byte
VPMOVQB	Converts packed quad word integers into packed bytes with truncation
VPMOVSQB	Converts packed signed quad word integers into packed signed bytes using signed saturation
VPMOVUSQB	Converts packed unsigned quad word integers into packed unsigned bytes using unsigned saturation
Quad Word to Word
VPMOVQW	Converts packed quad word integers into packed words with truncation
VPMOVSQW	Converts packed signed quad word integers into packed signed words using signed saturation
VPMOVUSQW	Converts packed unsigned quad word integers into packed unsigned words using unsigned saturation
Quad Word to Double Word
VPMOVQD	Converts packed quad word integers into packed double words with truncation
VPMOVSQD	Converts packed signed quad word integers into packed signed double words using signed saturation
VPMOVUSQD	Converts packed unsigned quad word integers into packed unsigned double words using unsigned saturation
Double Word to Single Precision Floating-point
VCVTSI2SS	Convert scalar signed double word integer to scalar single-precision floating-point value
VCVTUSI2SS	Convert scalar unsigned double word integer to scalar single-precision floating-point value
VCVTDQ2PS	Convert packed signed double word integers to packed single-precision floating-point values
VCVTUDQ2PS	Convert packed unsigned double word integers to packed single-precision floating-point values
Double Word to Double Precision Floating-point
VCVTSI2SD	Convert scalar signed double word integer to scalar double-precision floating-point value
VCVTUSI2SD	Convert scalar unsigned double word integer to scalar double-precision floating-point value
VCVTDQ2PD	Convert packed signed double word integers to packed double-precision floating-point values
VCVTUDQ2PD	Convert packed unsigned double word integers to packed double-precision floating-point values
Quad Word to Single Precision Floating-point
VCVTSI2SS	Convert scalar signed quad word integer to scalar single-precision floating-point value
VCVTUSI2SS	Convert scalar unsigned quad word integer to scalar single-precision floating-point value
VCVTQQ2PS	Convert packed signed quad word integers to packed single-precision floating-point values
VCVTUQQ2PS	Convert packed unsigned quad word integers to packed single-precision floating-point values
Quad Word to Double Precision Floating-point
VCVTSI2SD	Convert scalar signed quad word integer to scalar double-precision floating-point value
VCVTUSI2SD	Convert scalar unsigned quad word integer to scalar double-precision floating-point value
VCVTQQ2PD	Convert packed signed quad word integers to packed double-precision floating-point values
VCVTUQQ2PD	Convert packed unsigned quad word integers to packed double-precision floating-point values
Half Precision Floating-point to Single Precision Floating-point
VCVTPH2PS	Convert eight/four data elements containing 16-bit floating-point data into eight/four single-precision floating-point data
Single Precision Floating-point to Double Word
VCVTSS2SI	Convert scalar single-precision floating-point value to scalar signed double word integer
VCVTSS2USI	Convert scalar single-precision floating-point value to scalar unsigned double word integer
VCVTPS2DQ	Convert packed single-precision floating-point values to packed signed double word integers
VCVTPS2UDQ	Convert packed single-precision floating-point values to packed unsigned double word integers
VCVTTSS2SI	Convert with truncation scalar single-precision floating-point value to scalar signed double word integer
VCVTTSS2USI	Convert with truncation scalar single-precision floating-point value to scalar unsigned double word integer
VCVTTPS2DQ	Convert with truncation packed single-precision floating-point values to packed signed double word integers
VCVTTPS2UDQ	Convert with truncation packed single-precision floating-point values to packed unsigned double word integers
Single Precision Floating-point to Quad Word
VCVTSS2SI	Convert scalar single-precision floating-point value to scalar signed quad word integer
VCVTSS2USI	Convert scalar single-precision floating-point value to scalar unsigned quad word integer
VCVTPS2QQ	Convert packed single-precision floating-point values to packed signed quad word integers
VCVTPS2UQQ	Convert packed single-precision floating-point values to packed unsigned quad word integers
VCVTTSS2SI	Convert with truncation scalar single-precision floating-point value to scalar signed quad word integer
VCVTTSS2USI	Convert with truncation scalar single-precision floating-point value to scalar unsigned quad word integer
VCVTTPS2QQ	Convert with truncation packed single-precision floating-point values to packed signed quad word integers
VCVTTPS2UQQ	Convert with truncation packed single-precision floating-point values to packed unsigned quad word integers
Single Precision Floating-point to Half Precision Floating-point
VCVTPS2PH	Convert eight/four data elements containing single-precision floating-point data into eight/four 16-bit floating-point data
Single Precision Floating-point to Double Precision Floating-point
VCVTSS2SD	Convert scalar single-precision floating-point value to scalar double-precision floating-point value
VCVTPS2PD	Convert packed single-precision floating-point values to packed double-precision floating-point values
Double Precision Floating-point to Double Word
VCVTSD2SI	Convert scalar double-precision floating-point value to scalar signed double word integer
VCVTSD2USI	Convert scalar double-precision floating-point value to scalar unsigned double word integer
VCVTPD2DQ	Convert packed double-precision floating-point values to packed signed double word integers
VCVTPD2UDQ	Convert packed double-precision floating-point values to packed unsigned double word integers
VCVTTSD2SI	Convert with truncation scalar double-precision floating-point value to scalar signed double word integer
VCVTTSD2USI	Convert with truncation scalar double-precision floating-point value to scalar unsigned double word integer
VCVTTPD2DQ	Convert with truncation packed double-precision floating-point values to packed signed double word integers
VCVTTPD2UDQ	Convert with truncation packed double-precision floating-point values to packed unsigned double word integers
Double Precision Floating-point to Quad Word
VCVTSD2SI	Convert scalar double-precision floating-point value to scalar signed quad word integer
VCVTSD2USI	Convert scalar double-precision floating-point value to scalar unsigned quad word integer
VCVTPD2QQ	Convert packed double-precision floating-point values to packed signed quad word integers
VCVTPD2UQQ	Convert packed double-precision floating-point values to packed unsigned quad word integers
VCVTTSD2SI	Convert with truncation scalar double-precision floating-point value to scalar signed quad word integer
VCVTTSD2USI	Convert with truncation scalar double-precision floating-point value to scalar unsigned quad word integer
VCVTTPD2QQ	Convert with truncation packed double-precision floating-point values to packed signed quad word integers
VCVTTPD2UQQ	Convert with truncation packed double-precision floating-point values to packed unsigned quad word integers
Double Precision Floating-point to Single Precision Floating-point
VCVTSD2SS	Convert scalar double-precision floating-point value to scalar single-precision floating-point value
VCVTPD2PS	Convert packed double-precision floating-point values to packed single-precision floating-point values
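
The "with truncation" conversions round toward zero, while the plain floating-point-to-integer conversions follow the current rounding mode, which by default is round to nearest with ties going to the even value. The Python sketch below imitates both behaviours for in-range inputs only. The function names are assumptions of this illustration, and out-of-range or NaN inputs are not modelled.

```python
def convert_with_truncation(x):
    """Round toward zero, as the VCVTT* forms do."""
    return int(x)


def convert_nearest_even(x):
    """Round to nearest, ties to even, matching the default rounding mode.
    Python's built-in round() uses the same tie rule for floats."""
    return round(x)


for x in (2.7, -2.7, 2.5, 3.5):
    print(x, convert_with_truncation(x), convert_nearest_even(x))
```

For 2.5 both functions return 2, but for 3.5 truncation gives 3 while round-to-nearest-even gives 4.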

Logical Instructions

Instruction	Meaning
Byte Operands
VPTESTMB	Performs a bitwise logical AND of packed byte integers and set mask
VPTESTNMB	Performs a bitwise logical NOT AND of packed byte integers and set mask
Word Operands
VPTESTMW	Performs a bitwise logical AND of packed word integers and set mask
VPTESTNMW	Performs a bitwise logical NOT AND of packed word integers and set mask
Double Word Operands
VPTESTMD	Performs a bitwise logical AND of packed double word integers and set mask
VPTESTNMD	Performs a bitwise logical NOT AND of packed double word integers and set mask
VPANDD	Bitwise logical AND of packed double word integers
VPANDND	Bitwise logical AND NOT of packed double word integers
VPORD	Bitwise logical OR of packed double word integers
VPXORD	Bitwise logical exclusive OR of packed double word integers
VPTERNLOGD	Bitwise ternary logic with double word granularity. The immediate value determines the specific binary function being implemented
Quad Word Operands
VPTESTMQ	Performs a bitwise logical AND of packed quad word integers and set mask
VPTESTNMQ	Performs a bitwise logical NOT AND of packed quad word integers and set mask
VPANDQ	Bitwise logical AND of packed quad word integers
VPANDNQ	Bitwise logical AND NOT of packed quad word integers
VPORQ	Bitwise logical OR of packed quad word integers
VPXORQ	Bitwise logical exclusive OR of packed quad word integers
VPTERNLOGQ	Bitwise ternary logic with quad word granularity. The immediate value determines the specific binary function being implemented
Integer Operands
VPTEST	Performs a bitwise logical AND of the two operands and sets the ZF flag if the result is all zeros. The CF flag is set if the bitwise AND of the second operand with the inverted first operand is all zeros
VPAND	Bitwise logical AND
VPANDN	Bitwise logical AND NOT
VPOR	Bitwise logical OR
VPXOR	Bitwise logical exclusive OR
Single Precision Floating-point Operands
VTESTPS	Packed bit test of single-precision floating-point elements
VANDPS	Perform bitwise logical AND of packed single-precision floating-point values
VANDNPS	Perform bitwise logical AND NOT of packed single-precision floating-point values
VORPS	Perform bitwise logical OR of packed single-precision floating-point values
VXORPS	Perform bitwise logical XOR of packed single-precision floating-point values
Double Precision Floating-point Operands
VTESTPD	Packed bit test of double-precision floating-point elements
VANDPD	Perform bitwise logical AND of packed double-precision floating-point values
VANDNPD	Perform bitwise logical AND NOT of packed double-precision floating-point values
VORPD	Perform bitwise logical OR of packed double-precision floating-point values
VXORPD	Perform bitwise logical XOR of packed double-precision floating-point values
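
The truth-table role of the immediate in VPTERNLOGD/VPTERNLOGQ is easier to see in a scalar model. The following Python sketch is only an illustration for a single element, not how the hardware is implemented; the function name `ternlog` and the sample constants are hypothetical. It assumes the bit from the first (destination) operand is the most significant bit of the 3-bit index into the immediate.

```python
def ternlog(a: int, b: int, c: int, imm8: int, width: int = 32) -> int:
    """Model one VPTERNLOGD/Q element: apply the 3-input function encoded in imm8."""
    result = 0
    for bit in range(width):
        # The bits of a, b and c at this position form a 3-bit index
        # (a supplies the most significant index bit).
        index = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        # The indexed bit of imm8 becomes the output bit at this position.
        result |= ((imm8 >> index) & 1) << bit
    return result


a, b, c = 0xF0F0F0F0, 0xCCCCCCCC, 0xAAAAAAAA
xor3 = ternlog(a, b, c, 0x96)    # 0x96 encodes a XOR b XOR c
select = ternlog(a, b, c, 0xCA)  # 0xCA encodes "a ? b : c" bit by bit
```

With this encoding, 0x80 gives the three-way AND and 0xFE the three-way OR, so one instruction can replace a pair of two-input logical operations.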

Shift and Rotate Instructions

Instruction	Meaning
Word Operands
VPSLLW	Shift packed words left logical
VPSRLW	Shift packed words right logical
VPSRAW	Shift packed words right arithmetic
VPSLLVW	Variable bit shift left logical
VPSRLVW	Variable bit shift right logical
VPSRAVW	Variable bit shift right arithmetic
Double Word Operands
VPSLLD	Shift packed double words left logical
VPSRLD	Shift packed double words right logical
VPSRAD	Shift packed double words right arithmetic
VPSLLVD	Variable bit shift left logical
VPSRLVD	Variable bit shift right logical
VPSRAVD	Variable bit shift right arithmetic
VPROLD	Rotate double words left using immediate bits count
VPRORD	Rotate double words right using immediate bits count
VPROLVD	Rotate double words left using variable bits count
VPRORVD	Rotate double words right using variable bits count
Quad Word Operands
VPSLLQ	Shift packed quad words left logical
VPSRLQ	Shift packed quad words right logical
VPSRAQ	Shift packed quad words right arithmetic
VPSLLVQ	Variable bit shift left logical
VPSRLVQ	Variable bit shift right logical
VPSRAVQ	Variable bit shift right arithmetic
VPROLQ	Rotate quad words left using immediate bits count
VPRORQ	Rotate quad words right using immediate bits count
VPROLVQ	Rotate quad words left using variable bits count
VPRORVQ	Rotate quad words right using variable bits count
Double Quad Word Operands
VPSLLDQ	Shift double quad word left logical
VPSRLDQ	Shift double quad word right logical
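
The difference between rotates and shifts shows up mainly for large counts. The Python sketch below models a single 32-bit element under the following assumptions: rotate counts are taken modulo the element width, logical shifts by more than 31 bits produce zero, and arithmetic right shifts by more than 31 bits fill the element with the sign bit. The helper names are hypothetical and the counts are treated as unsigned.

```python
MASK32 = 0xFFFFFFFF


def rol32(x: int, count: int) -> int:
    """Rotate left; the count wraps modulo 32 (VPROLD/VPROLVD style)."""
    count %= 32
    return ((x << count) | (x >> (32 - count))) & MASK32


def ror32(x: int, count: int) -> int:
    """Rotate right, expressed as a left rotate by the complementary count."""
    return rol32(x, 32 - count % 32)


def sll32(x: int, count: int) -> int:
    """Logical left shift; counts above 31 clear the element."""
    return 0 if count > 31 else (x << count) & MASK32


def srl32(x: int, count: int) -> int:
    """Logical right shift; counts above 31 clear the element."""
    return 0 if count > 31 else x >> count


def sra32(x: int, count: int) -> int:
    """Arithmetic right shift; counts above 31 replicate the sign bit."""
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> min(count, 31)) & MASK32
```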

Comparison Instructions

Instruction	Meaning
Byte Operands
VPCMPEQB	Compare packed bytes for equality
VPCMPGTB	Compare packed signed byte integers for greater than
VPCMPB	Compare packed signed byte values into mask
VPCMPUB	Compare packed unsigned byte values into mask
Word Operands
VPCMPEQW	Compare packed words for equality
VPCMPGTW	Compare packed signed word integers for greater than
VPCMPW	Compare packed signed word values into mask
VPCMPUW	Compare packed unsigned word values into mask
Double Word Operands
VPCMPEQD	Compare packed double words for equality
VPCMPGTD	Compare packed signed double word integers for greater than
VPCMPD	Compare packed signed double word values into mask
VPCMPUD	Compare packed unsigned double word values into mask
Quad Word Operands
VPCMPEQQ	Compare packed quad words for equality
VPCMPGTQ	Compare packed signed quad word integers for greater than
VPCMPQ	Compare packed signed quad word values into mask
VPCMPUQ	Compare packed unsigned quad word values into mask
Single Precision Floating-point Operands
VCMPEQPS	Compare packed single-precision floating-point values and set mask if destination value is equal to source value
VCMPLTPS	Compare packed single-precision floating-point values and set mask if destination value is less than source value
VCMPLEPS	Compare packed single-precision floating-point values and set mask if destination value is less than or equal to source value
VCMPGTPS	Compare packed single-precision floating-point values and set mask if destination value is greater than source value
VCMPGEPS	Compare packed single-precision floating-point values and set mask if destination value is greater than or equal to source value
VCMPUNORDPS	Compare packed single-precision floating-point values and set mask if at least one of the two source operands is a NaN
VCMPNEQPS	Compare packed single-precision floating-point values and set mask if destination value is not equal to source value
VCMPNLTPS	Compare packed single-precision floating-point values and set mask if destination value is not less than source value
VCMPNLEPS	Compare packed single-precision floating-point values and set mask if destination value is not less than or equal to source value
VCMPNGTPS	Compare packed single-precision floating-point values and set mask if destination value is not greater than source value
VCMPNGEPS	Compare packed single-precision floating-point values and set mask if destination value is not greater than or equal to source value
VCMPORDPS	Compare packed single-precision floating-point values and set mask if neither source operand is a NaN
VCMPEQSS	Compare scalar single-precision floating-point values and set mask if destination value is equal to source value
VCMPLTSS	Compare scalar single-precision floating-point values and set mask if destination value is less than source value
VCMPLESS	Compare scalar single-precision floating-point values and set mask if destination value is less than or equal to source value
VCMPGTSS	Compare scalar single-precision floating-point values and set mask if destination value is greater than source value
VCMPGESS	Compare scalar single-precision floating-point values and set mask if destination value is greater than or equal to source value
VCMPUNORDSS	Compare scalar single-precision floating-point values and set mask if at least one of the two source operands is a NaN
VCMPNEQSS	Compare scalar single-precision floating-point values and set mask if destination value is not equal to source value
VCMPNLTSS	Compare scalar single-precision floating-point values and set mask if destination value is not less than source value
VCMPNLESS	Compare scalar single-precision floating-point values and set mask if destination value is not less than or equal to source value
VCMPNGTSS	Compare scalar single-precision floating-point values and set mask if destination value is not greater than source value
VCMPNGESS	Compare scalar single-precision floating-point values and set mask if destination value is not greater than or equal to source value
VCMPORDSS	Compare scalar single-precision floating-point values and set mask if neither source operand is a NaN
VCOMISS	Perform ordered comparison of scalar single-precision floating-point value and set flags in EFLAGS register
VUCOMISS	Perform unordered comparison of scalar single-precision floating-point value and set flags in EFLAGS register
Double Precision Floating-point Operands
VCMPEQPD	Compare packed double-precision floating-point values and set mask if destination value is equal to source value
VCMPLTPD	Compare packed double-precision floating-point values and set mask if destination value is less than source value
VCMPLEPD	Compare packed double-precision floating-point values and set mask if destination value is less than or equal to source value
VCMPGTPD	Compare packed double-precision floating-point values and set mask if destination value is greater than source value
VCMPGEPD	Compare packed double-precision floating-point values and set mask if destination value is greater than or equal to source value
VCMPUNORDPD	Compare packed double-precision floating-point values and set mask if at least one of the two source operands is a NaN
VCMPNEQPD	Compare packed double-precision floating-point values and set mask if destination value is not equal to source value
VCMPNLTPD	Compare packed double-precision floating-point values and set mask if destination value is not less than source value
VCMPNLEPD	Compare packed double-precision floating-point values and set mask if destination value is not less than or equal to source value
VCMPNGTPD	Compare packed double-precision floating-point values and set mask if destination value is not greater than source value
VCMPNGEPD	Compare packed double-precision floating-point values and set mask if destination value is not greater than or equal to source value
VCMPORDPD	Compare packed double-precision floating-point values and set mask if neither source operand is a NaN
VCMPEQSD	Compare scalar double-precision floating-point values and set mask if destination value is equal to source value
VCMPLTSD	Compare scalar double-precision floating-point values and set mask if destination value is less than source value
VCMPLESD	Compare scalar double-precision floating-point values and set mask if destination value is less than or equal to source value
VCMPGTSD	Compare scalar double-precision floating-point values and set mask if destination value is greater than source value
VCMPGESD	Compare scalar double-precision floating-point values and set mask if destination value is greater than or equal to source value
VCMPUNORDSD	Compare scalar double-precision floating-point values and set mask if at least one of the two source operands is a NaN
VCMPNEQSD	Compare scalar double-precision floating-point values and set mask if destination value is not equal to source value
VCMPNLTSD	Compare scalar double-precision floating-point values and set mask if destination value is not less than source value
VCMPNLESD	Compare scalar double-precision floating-point values and set mask if destination value is not less than or equal to source value
VCMPNGTSD	Compare scalar double-precision floating-point values and set mask if destination value is not greater than source value
VCMPNGESD	Compare scalar double-precision floating-point values and set mask if destination value is not greater than or equal to source value
VCMPORDSD	Compare scalar double-precision floating-point values and set mask if neither source operand is a NaN
VCOMISD	Perform ordered comparison of scalar double-precision floating-point value and set flags in EFLAGS register
VUCOMISD	Perform unordered comparison of scalar double-precision floating-point value and set flags in EFLAGS register
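
The "not" predicates are not simple synonyms for their positive counterparts: they differ when an operand is a NaN. The Python sketch below models the truth value of each predicate for one pair of elements; the function name `cmp_predicate` and the short predicate names are hypothetical labels taken from the mnemonic suffixes above. It does not model exception signalling or how the result is written to the destination.

```python
import math


def cmp_predicate(name: str, a: float, b: float) -> bool:
    """Return the truth value of one comparison predicate for a single element pair."""
    unordered = math.isnan(a) or math.isnan(b)
    table = {
        # Positive predicates are false whenever an operand is a NaN.
        "EQ": a == b, "LT": a < b, "LE": a <= b, "GT": a > b, "GE": a >= b,
        # Ordered/unordered tests look only at whether a NaN is present.
        "UNORD": unordered, "ORD": not unordered,
        # Negated predicates are true whenever an operand is a NaN.
        "NEQ": not (a == b), "NLT": not (a < b), "NLE": not (a <= b),
        "NGT": not (a > b), "NGE": not (a >= b),
    }
    return table[name]


nan = float("nan")
ge_with_nan = cmp_predicate("GE", nan, 1.0)    # False
nlt_with_nan = cmp_predicate("NLT", nan, 1.0)  # True
```

For ordinary numbers GE and NLT agree, so the choice between them only matters when NaNs can appear in the data.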

Packed Arithmetic Instructions

Instruction	Meaning
Byte Operands
VPADDB	Add packed byte integers
VPADDUSB	Add packed unsigned byte integers with unsigned saturation
VPADDSB	Add packed signed byte integers with signed saturation
VPSUBB	Subtract packed byte integers
VPSUBUSB	Subtract packed unsigned byte integers with unsigned saturation
VPSUBSB	Subtract packed signed byte integers with signed saturation
Word Operands
VPADDW	Add packed word integers
VPADDUSW	Add packed unsigned word integers with unsigned saturation
VPADDSW	Add packed signed word integers with signed saturation
VPHADDW	Adds two adjacent, signed 16-bit integers horizontally from the source and destination operands and packs the signed 16-bit results to the destination operand
VPHADDSW	Adds two adjacent, signed 16-bit integers horizontally from the source and destination operands and packs the signed, saturated 16-bit results to the destination operand
VPSUBW	Subtract packed word integers
VPSUBUSW	Subtract packed unsigned word integers with unsigned saturation
VPSUBSW	Subtract packed signed word integers with signed saturation
VPHSUBW	Performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands. The signed 16-bit results are packed and written to the destination operand
VPHSUBSW	Performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands. The signed, saturated 16-bit results are packed and written to the destination operand
VPMULHUW	Multiply packed unsigned word integers and store high result
VPMULLW	Multiply packed signed word integers and store low result
VPMULHW	Multiply packed signed word integers and store high result
VPMULHRSW	Multiplies vertically each signed 16-bit integer from the destination operand with the corresponding signed 16-bit integer of the source operand, producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packed to the destination operand
VPMADDUBSW	Multiplies each unsigned byte value with the corresponding signed byte value to produce an intermediate, 16-bit signed integer. Each adjacent pair of 16-bit signed values are added horizontally. The signed, saturated 16-bit results are packed to the destination operand
Double Word Operands
VPADDD	Add packed double word integers
VPHADDD	Adds two adjacent, signed 32-bit integers horizontally from the source and destination operands and packs the signed 32-bit results to the destination operand
VPSUBD	Subtract packed double word integers
VPHSUBD	Performs horizontal subtraction on each adjacent pair of 32-bit signed integers by subtracting the most significant double word from the least significant double word of each pair in the source and destination operands. The signed 32-bit results are packed and written to the destination operand
VPMULLD	Multiply packed signed double word integers and store the low 32 bits of each 64-bit product
VPMADDWD	Multiply and add packed word integers
Quad Word Operands
VPADDQ	Add packed quad word integers
VPSUBQ	Subtract packed quad word integers
VPMULUDQ	Multiply packed unsigned double word integers
VPMULDQ	Multiply the signed low double words of each packed quad word and store the signed 64-bit products
VPMULLQ	Multiply packed quad word integers and store the low 64 bits of each 128-bit product
Single Precision Floating-point Operands
VADDSS	Add scalar single-precision floating-point value
VADDPS	Add packed single-precision floating-point values
VHADDPS	Performs a single-precision addition on contiguous data elements. The first data element of the result is obtained by adding the first and second elements of the first operand; the second element by adding the third and fourth elements of the first operand; the third by adding the first and second elements of the second operand; and the fourth by adding the third and fourth elements of the second operand
VSUBSS	Subtract scalar single-precision floating-point value
VSUBPS	Subtract packed single-precision floating-point values
VHSUBPS	Performs a single-precision subtraction on contiguous data elements. The first data element of the result is obtained by subtracting the second element of the first operand from the first element of the first operand; the second element by subtracting the fourth element of the first operand from the third element of the first operand; the third by subtracting the second element of the second operand from the first element of the second operand; and the fourth by subtracting the fourth element of the second operand from the third element of the second operand
VADDSUBPS	Performs single-precision addition on the second and fourth pairs of 32-bit data elements within the operands; single-precision subtraction on the first and third pairs
VMULSS	Multiply scalar single-precision floating-point value
VMULPS	Multiply packed single-precision floating-point values
VDIVSS	Divide scalar single-precision floating-point value
VDIVPS	Divide packed single-precision floating-point values
Double Precision Floating-point Operands
VADDSD	Add scalar double-precision floating-point value
VADDPD	Add packed double-precision floating-point values
VHADDPD	Performs a double-precision addition on contiguous data elements. The first data element of the result is obtained by adding the first and second elements of the first operand; the second element by adding the first and second elements of the second operand
VSUBSD	Subtract scalar double-precision floating-point value
VSUBPD	Subtract packed double-precision floating-point values
VHSUBPD	Performs a double-precision subtraction on contiguous data elements. The first data element of the result is obtained by subtracting the second element of the first operand from the first element of the first operand; the second element by subtracting the second element of the second operand from the first element of the second operand
VADDSUBPD	Performs double-precision addition on the second pair of quad words, and double-precision subtraction on the first pair
VMULSD	Multiply scalar double-precision floating-point value
VMULPD	Multiply packed double-precision floating-point values
VDIVSD	Divide scalar double-precision floating-point value
VDIVPD	Divide packed double-precision floating-point values
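
Some of the longer descriptions above are easier to follow as arithmetic. The Python sketch below models VPMULHRSW for one pair of 16-bit elements, and VHADDPS and VADDSUBPS for one 128-bit lane of four elements. The function names are hypothetical, and Python floats are double precision, so single-precision rounding is not modeled.

```python
def to_int16(value: int) -> int:
    """Wrap an integer to a signed 16-bit value."""
    return ((value & 0xFFFF) ^ 0x8000) - 0x8000


def pmulhrsw(a: int, b: int) -> int:
    """Multiply two signed 16-bit values, round, and keep the scaled high part."""
    product = a * b                  # signed 32-bit intermediate
    rounded = (product >> 14) + 1    # keep the 18 top bits, add 1 to the lowest of them
    return to_int16(rounded >> 1)    # drop the rounding bit, keep 16 bits


def hadd_ps(x, y):
    """Horizontal add: adjacent pairs of x, then adjacent pairs of y."""
    return [x[0] + x[1], x[2] + x[3], y[0] + y[1], y[2] + y[3]]


def addsub_ps(x, y):
    """Subtract in the first and third positions, add in the second and fourth."""
    return [x[0] - y[0], x[1] + y[1], x[2] - y[2], x[3] + y[3]]


# 0.5 * 0.25 in Q15 fixed point (16384 and 8192) gives 0.125 (4096).
q15_product = pmulhrsw(16384, 8192)
```

Treating the inputs as Q15 fixed-point fractions is the usual reason to use VPMULHRSW; note that multiplying -32768 by -32768 wraps back to -32768 in this model.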

Fused Arithmetic Instructions

Instruction	Meaning
Single Precision Floating-point Operands
VFMADD132SS	Fused multiply-add of scalar single-precision floating-point values
VFMADD213SS	Fused multiply-add of scalar single-precision floating-point values
VFMADD231SS	Fused multiply-add of scalar single-precision floating-point values
VFMADD132PS	Fused multiply-add of packed single-precision floating-point values
VFMADD213PS	Fused multiply-add of packed single-precision floating-point values
VFMADD231PS	Fused multiply-add of packed single-precision floating-point values
VFNMADD132SS	Fused negative multiply-add of scalar single-precision floating-point values
VFNMADD213SS	Fused negative multiply-add of scalar single-precision floating-point values
VFNMADD231SS	Fused negative multiply-add of scalar single-precision floating-point values
VFNMADD132PS	Fused negative multiply-add of packed single-precision floating-point values
VFNMADD213PS	Fused negative multiply-add of packed single-precision floating-point values
VFNMADD231PS	Fused negative multiply-add of packed single-precision floating-point values
VFMSUB132SS	Fused multiply-subtract of scalar single-precision floating-point values
VFMSUB213SS	Fused multiply-subtract of scalar single-precision floating-point values
VFMSUB231SS	Fused multiply-subtract of scalar single-precision floating-point values
VFMSUB132PS	Fused multiply-subtract of packed single-precision floating-point values
VFMSUB213PS	Fused multiply-subtract of packed single-precision floating-point values
VFMSUB231PS	Fused multiply-subtract of packed single-precision floating-point values
VFNMSUB132SS	Fused negative multiply-subtract of scalar single-precision floating-point values
VFNMSUB213SS	Fused negative multiply-subtract of scalar single-precision floating-point values
VFNMSUB231SS	Fused negative multiply-subtract of scalar single-precision floating-point values
VFNMSUB132PS	Fused negative multiply-subtract of packed single-precision floating-point values
VFNMSUB213PS	Fused negative multiply-subtract of packed single-precision floating-point values
VFNMSUB231PS	Fused negative multiply-subtract of packed single-precision floating-point values
VFMADDSUB132PS	Fused multiply-alternating add/subtract of packed single-precision floating-point values
VFMADDSUB213PS	Fused multiply-alternating add/subtract of packed single-precision floating-point values
VFMADDSUB231PS	Fused multiply-alternating add/subtract of packed single-precision floating-point values
VFMSUBADD132PS	Fused multiply-alternating subtract/add of packed single-precision floating-point values
VFMSUBADD213PS	Fused multiply-alternating subtract/add of packed single-precision floating-point values
VFMSUBADD231PS	Fused multiply-alternating subtract/add of packed single-precision floating-point values
Double Precision Floating-point Operands
VFMADD132SD	Fused multiply-add of scalar double-precision floating-point values
VFMADD213SD	Fused multiply-add of scalar double-precision floating-point values
VFMADD231SD	Fused multiply-add of scalar double-precision floating-point values
VFMADD132PD	Fused multiply-add of packed double-precision floating-point values
VFMADD213PD	Fused multiply-add of packed double-precision floating-point values
VFMADD231PD	Fused multiply-add of packed double-precision floating-point values
VFNMADD132SD	Fused negative multiply-add of scalar double-precision floating-point values
VFNMADD213SD	Fused negative multiply-add of scalar double-precision floating-point values
VFNMADD231SD	Fused negative multiply-add of scalar double-precision floating-point values
VFNMADD132PD	Fused negative multiply-add of packed double-precision floating-point values
VFNMADD213PD	Fused negative multiply-add of packed double-precision floating-point values
VFNMADD231PD	Fused negative multiply-add of packed double-precision floating-point values
VFMSUB132SD	Fused multiply-subtract of scalar double-precision floating-point values
VFMSUB213SD	Fused multiply-subtract of scalar double-precision floating-point values
VFMSUB231SD	Fused multiply-subtract of scalar double-precision floating-point values
VFMSUB132PD	Fused multiply-subtract of packed double-precision floating-point values
VFMSUB213PD	Fused multiply-subtract of packed double-precision floating-point values
VFMSUB231PD	Fused multiply-subtract of packed double-precision floating-point values
VFNMSUB132SD	Fused negative multiply-subtract of scalar double-precision floating-point values
VFNMSUB213SD	Fused negative multiply-subtract of scalar double-precision floating-point values
VFNMSUB231SD	Fused negative multiply-subtract of scalar double-precision floating-point values
VFNMSUB132PD	Fused negative multiply-subtract of packed double-precision floating-point values
VFNMSUB213PD	Fused negative multiply-subtract of packed double-precision floating-point values
VFNMSUB231PD	Fused negative multiply-subtract of packed double-precision floating-point values
VFMADDSUB132PD	Fused multiply-alternating add/subtract of packed double-precision floating-point values
VFMADDSUB213PD	Fused multiply-alternating add/subtract of packed double-precision floating-point values
VFMADDSUB231PD	Fused multiply-alternating add/subtract of packed double-precision floating-point values
VFMSUBADD132PD	Fused multiply-alternating subtract/add of packed double-precision floating-point values
VFMSUBADD213PD	Fused multiply-alternating subtract/add of packed double-precision floating-point values
VFMSUBADD231PD	Fused multiply-alternating subtract/add of packed double-precision floating-point values
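
The 132, 213 and 231 forms share one description in the table because they differ only in which operands are multiplied and which is added. The Python sketch below illustrates that ordering, assuming the usual reading of the digits: the first two name the operands that are multiplied, the third names the operand that is added or subtracted, and the result replaces operand 1. The function names are hypothetical, and the single rounding step that makes the hardware operation "fused" is not modeled here.

```python
def fma_form(form, op1, op2, op3, negate_product=False, subtract=False):
    """Evaluate one element of a 132/213/231 fused operation (rounding not modeled)."""
    operands = {"1": op1, "2": op2, "3": op3}
    x, y, z = (operands[digit] for digit in form)
    product = -(x * y) if negate_product else x * y
    return product - z if subtract else product + z


def fmaddsub(a, b, c):
    """Alternating form: subtract in even positions, add in odd positions."""
    return [p * q - r if i % 2 == 0 else p * q + r
            for i, (p, q, r) in enumerate(zip(a, b, c))]


r132 = fma_form("132", 2, 3, 5)   # op1 * op3 + op2
r213 = fma_form("213", 2, 3, 5)   # op2 * op1 + op3
r231 = fma_form("231", 2, 3, 5)   # op2 * op3 + op1
```

In this model the VFNMADD, VFMSUB and VFNMSUB families correspond to `negate_product=True`, `subtract=True`, and both flags together; VFMSUBADD swaps the even and odd roles of `fmaddsub`.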

Primitives of Functions

Instruction	Meaning
Byte Operands
VPABSB	Computes the absolute value of each signed byte data element
VPSIGNB	Negates each signed integer element of the destination operand if the corresponding element in the source operand is less than zero, and zeroes it if the source element is zero
VPAVGB	Compute average of packed unsigned byte integers
VPMINUB	Minimum of packed unsigned byte integers
VPMINSB	Minimum of packed signed byte integers
VPMAXUB	Maximum of packed unsigned byte integers
VPMAXSB	Maximum of packed signed byte integers
VPSADBW	Compute sum of absolute differences
VMPSADBW	Performs eight 4-byte wide Sum of Absolute Differences operations to produce eight word integers
VDBPSADBW	Double block packed Sum of Absolute Differences on unsigned bytes
Word Operands
VPABSW	Computes the absolute value of each signed 16-bit data element
VPSIGNW	Negates each signed integer element of the destination operand if the corresponding element in the source operand is less than zero, and zeroes it if the source element is zero
VPAVGW	Compute average of packed unsigned word integers
VPMINUW	Minimum of packed unsigned word integers
VPMINSW	Minimum of packed signed word integers
VPMAXUW	Maximum of packed unsigned word integers
VPMAXSW	Maximum of packed signed word integers
VPHMINPOSUW	Finds the value and location of the minimum unsigned word from one of 8 horizontally packed unsigned words. The resulting value and location (offset within the source) are packed into the low double word of the destination YMM register
Double Word Operands
VPABSD	Computes the absolute value of each signed 32-bit data element
VPSIGND	Negates each signed integer element of the destination operand if the corresponding element in the source operand is less than zero, and zeroes it if the source element is zero
VPMINUD	Minimum of packed unsigned double word integers
VPMINSD	Minimum of packed signed double word integers
VPMAXUD	Maximum of packed unsigned double word integers
VPMAXSD	Maximum of packed signed double word integers
VPLZCNTD	Count the number of leading zero bits in each packed double word element
VPCONFLICTD	Detect conflicts within a vector of packed double word values. Each result element is a bit vector marking the preceding elements that hold the same value
Quad Word Operands
VPABSQ	Computes the absolute value of each signed 64-bit data element
VPMINUQ	Minimum of packed unsigned quad word integers
VPMINSQ	Minimum of packed signed quad word integers
VPMAXUQ	Maximum of packed unsigned quad word integers
VPMAXSQ	Maximum of packed signed quad word integers
VPLZCNTQ	Count the number of leading zero bits in each packed quad word element
VPCONFLICTQ	Detect conflicts within a vector of packed quad word values. Each result element is a bit vector marking the preceding elements that hold the same value
Single Precision Floating-point Operands
VSQRTSS	Compute square root of scalar single-precision floating-point value
VSQRTPS	Compute square roots of packed single-precision floating-point values
VMINSS	Return minimum scalar single-precision floating-point value
VMINPS	Return minimum packed single-precision floating-point values
VMAXSS	Return maximum scalar single-precision floating-point value
VMAXPS	Return maximum packed single-precision floating-point values
VROUNDSS	Round the low packed single-precision floating-point value into an integer value and return a rounded floating-point value
VROUNDPS	Round packed single-precision floating-point values into integer values and return rounded floating-point values
VRNDSCALESS	Round scalar single-precision floating-point value to include a given number of fraction bits
VRNDSCALEPS	Round packed single-precision floating-point values to include a given number of fraction bits
VDPPS	Perform single-precision dot products for up to 4 elements and broadcast
VRANGESS	Range restriction calculation for pairs of scalar single-precision floating-point values
VRANGEPS	Range restriction calculation for packed pairs of single-precision floating-point values
VREDUCESS	Perform a reduction transformation on a scalar single-precision floating-point value by subtracting a number of fraction bits
VREDUCEPS	Perform reduction transformation on packed single-precision floating-point values by subtracting a number of fraction bits
VGETEXPSS	Convert the biased exponent of scalar single-precision floating-point value to floating-point value representing unbiased integer exponent
VGETEXPPS	Convert the biased exponent of packed single-precision floating-point values to floating-point values representing unbiased integer exponent
VGETMANTSS	Extract the normalized mantissa from scalar single-precision floating-point value
VGETMANTPS	Extract the normalized mantissa from packed single-precision floating-point values
VSCALEFSS	Scale scalar single-precision floating-point value
VSCALEFPS	Scale packed single-precision floating-point values
VEXP2PS	Approximation to the exponential 2^x of packed single-precision floating-point values with less than 2^-23 relative error
VFPCLASSSS	Tests scalar single-precision floating-point value for the following categories: NaN, +0, -0, +Inf, -Inf, denormal, finite negative
VFPCLASSPS	Tests packed single-precision floating-point values for the following categories: NaN, +0, -0, +Inf, -Inf, denormal, finite negative
VFIXUPIMMSS	Fix up special scalar single-precision floating-point value
VFIXUPIMMPS	Fix up special packed single-precision floating-point values
VRCP14SS	Computes the approximate reciprocal of the scalar single-precision floating-point value. The max relative error is less than 2^-14
VRCP14PS	Computes the approximate reciprocals of the packed single-precision floating-point values. The max relative error is less than 2^-14
VRCP28SS	Computes the approximate reciprocal of the scalar single-precision floating-point value. The max relative error is less than 2^-28
VRCP28PS	Computes the approximate reciprocals of the packed single-precision floating-point values. The max relative error is less than 2^-28
VRSQRT14SS	Computes the approximate reciprocal square root of the scalar single-precision floating-point value. The max relative error is less than 2^-14
VRSQRT14PS	Computes the approximate reciprocal square roots of the packed single-precision floating-point values. The max relative error is less than 2^-14
VRSQRT28SS	Computes the approximate reciprocal square root of the scalar single-precision floating-point value. The max relative error is less than 2^-28
VRSQRT28PS	Computes the approximate reciprocal square roots of the packed single-precision floating-point values. The max relative error is less than 2^-28
VRCPPS	Compute approximate reciprocals of packed single-precision floating-point values
VRCPSS	Compute approximate reciprocal of scalar single-precision floating-point value
VRSQRTPS	Compute approximate reciprocals of square roots of packed single-precision floating-point values
VRSQRTSS	Compute approximate reciprocal of square root of scalar single-precision floating-point value
Double Precision Floating-point Operands
VSQRTSD	Compute square root of scalar double-precision floating-point value
VSQRTPD	Compute square roots of packed double-precision floating-point values
VMINSD	Return minimum scalar double-precision floating-point value
VMINPD	Return minimum packed double-precision floating-point values
VMAXSD	Return maximum scalar double-precision floating-point value
VMAXPD	Return maximum packed double-precision floating-point values
VROUNDSD	Round the low packed double-precision floating-point value into an integer value and return a rounded floating-point value
VROUNDPD	Round packed double-precision floating-point values into integer values and return rounded floating-point values
VRNDSCALESD	Round scalar double-precision floating-point value to include a given number of fraction bits
VRNDSCALEPD	Round packed double-precision floating-point values to include a given number of fraction bits
VDPPD	Perform double-precision dot product for up to 2 elements and broadcast
VRANGESD	Range restriction calculation for pairs of scalar double-precision floating-point values
VRANGEPD	Range restriction calculation for packed pairs of double-precision floating-point values
VREDUCESD	Perform a reduction transformation on a scalar double-precision floating-point value by subtracting a number of fraction bits
VREDUCEPD	Perform reduction transformation on packed double-precision floating-point values by subtracting a number of fraction bits
VGETEXPSD	Convert the biased exponent of scalar double-precision floating-point value to floating-point value representing unbiased integer exponent
VGETEXPPD	Convert the biased exponent of packed double-precision floating-point values to floating-point values representing unbiased integer exponent
VGETMANTSD	Extract the normalized mantissa from scalar double-precision floating-point value
VGETMANTPD	Extract the normalized mantissa from packed double-precision floating-point values
VSCALEFSD	Scale scalar double-precision floating-point value
VSCALEFPD	Scale packed double-precision floating-point values
VEXP2PD	Approximation to the exponential 2^x of packed double-precision floating-point values with less than 2^-23 relative error
VFPCLASSSD	Tests scalar double-precision floating-point value for the following categories: NaN, +0, -0, +Inf, -Inf, denormal, finite negative
VFPCLASSPD	Tests packed double-precision floating-point values for the following categories: NaN, +0, -0, +Inf, -Inf, denormal, finite negative
VFIXUPIMMSD	Fix up special scalar double-precision floating-point value
VFIXUPIMMPD	Fix up special packed double-precision floating-point values
VRCP14SD	Computes the approximate reciprocal of the scalar double-precision floating-point value. The max relative error is less than 2^-14
VRCP14PD	Computes the approximate reciprocals of the packed double-precision floating-point values. The max relative error is less than 2^-14
VRCP28SD	Computes the approximate reciprocal of the scalar double-precision floating-point value. The max relative error is less than 2^-28
VRCP28PD	Computes the approximate reciprocals of the packed double-precision floating-point values. The max relative error is less than 2^-28
VRSQRT14SD	Computes the approximate reciprocal square root of the scalar double-precision floating-point value. The max relative error is less than 2^-14
VRSQRT14PD	Computes the approximate reciprocal square roots of the packed double-precision floating-point values. The max relative error is less than 2^-14
VRSQRT28SD	Computes the approximate reciprocal square root of the scalar double-precision floating-point value. The max relative error is less than 2^-28
VRSQRT28PD	Computes the approximate reciprocal square roots of the packed double-precision floating-point values. The max relative error is less than 2^-28
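
The relationship between VGETEXPSD, VGETMANTSD and VSCALEFSD can be illustrated with a small portable C model. This is only a sketch of the arithmetic for finite, nonzero inputs, assuming the [1, 2) mantissa interval and source-sign settings of VGETMANTSD; it does not execute the instructions, and the model_* names are hypothetical helpers introduced for illustration.

```c
#include <math.h>

/* Model of VGETEXPSD for finite, nonzero x:
   floor(log2(|x|)) returned as a floating-point value. */
static double model_getexp(double x)
{
    int e;
    (void)frexp(x, &e);      /* x = m * 2^e with 0.5 <= |m| < 1 */
    return (double)(e - 1);  /* shift to the 1 <= |m| < 2 convention */
}

/* Model of VGETMANTSD, assuming the [1, 2) interval and the sign of the source. */
static double model_getmant(double x)
{
    int e;
    double m = frexp(x, &e); /* 0.5 <= |m| < 1 */
    return m * 2.0;          /* 1 <= |result| < 2 */
}

/* Model of VSCALEFSD: a * 2^floor(b), for finite b that fits in an int. */
static double model_scalef(double a, double b)
{
    int e = (int)b;          /* truncates toward zero */
    if ((double)e > b)
        e--;                 /* adjust to floor for negative non-integers */
    return ldexp(a, e);
}
```

Under these assumptions, model_scalef(model_getmant(x), model_getexp(x)) reproduces x, which is the usual reason the three instructions are used together when splitting a value into exponent and mantissa and recombining it.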

Opmask Instructions

Instruction	Meaning
8-bit Operands
KMOVB	Move 8-bit from and to mask registers
KTESTB	Set ZF and CF according to the bitwise AND and ANDN of two 8-bit masks
KORTESTB	Bitwise logical OR of two 8-bit masks, setting ZF and CF accordingly
KNOTB	Bitwise NOT of an 8-bit mask
KANDB	Bitwise logical AND of two 8-bit masks
KANDNB	Bitwise logical AND NOT of two 8-bit masks
KORB	Bitwise logical OR of two 8-bit masks
KXORB	Bitwise logical XOR of two 8-bit masks
KXNORB	Bitwise logical XNOR of two 8-bit masks
KADDB	Add two 8-bit masks
KSHIFTLB	Shift left 8-bit mask register
KSHIFTRB	Shift right 8-bit mask register
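KUNPCKBW	Concatenate two 8-bit masks into a 16-bit mask
VPMOVM2B	Convert a mask register to a vector register, setting each byte element to all ones or all zeros according to the corresponding mask bit
VPMOVB2M	Convert a vector register to a mask register, taking each mask bit from the most significant bit of the corresponding byte element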
16-bit Operands
KMOVW	Move 16-bit from and to mask registers
KTESTW	Set ZF and CF according to the bitwise AND and ANDN of two 16-bit masks
KORTESTW	Bitwise logical OR of two 16-bit masks, setting ZF and CF accordingly
KNOTW	Bitwise NOT of a 16-bit mask
KANDW	Bitwise logical AND of two 16-bit masks
KANDNW	Bitwise logical AND NOT of two 16-bit masks
KORW	Bitwise logical OR of two 16-bit masks
KXORW	Bitwise logical XOR of two 16-bit masks
KXNORW	Bitwise logical XNOR of two 16-bit masks
KADDW	Add two 16-bit masks
KSHIFTLW	Shift left 16-bit mask register
KSHIFTRW	Shift right 16-bit mask register
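KUNPCKWD	Concatenate two 16-bit masks into a 32-bit mask
VPMOVM2W	Convert a mask register to a vector register, setting each word element to all ones or all zeros according to the corresponding mask bit
VPMOVW2M	Convert a vector register to a mask register, taking each mask bit from the most significant bit of the corresponding word element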
32-bit Operands
KMOVD	Move 32-bit from and to mask registers
KTESTD	Set ZF and CF according to the bitwise AND and ANDN of two 32-bit masks
KORTESTD	Bitwise logical OR of two 32-bit masks, setting ZF and CF accordingly
KNOTD	Bitwise NOT of a 32-bit mask
KANDD	Bitwise logical AND of two 32-bit masks
KANDND	Bitwise logical AND NOT of two 32-bit masks
KORD	Bitwise logical OR of two 32-bit masks
KXORD	Bitwise logical XOR of two 32-bit masks
KXNORD	Bitwise logical XNOR of two 32-bit masks
KADDD	Add two 32-bit masks
KSHIFTLD	Shift left 32-bit mask register
KSHIFTRD	Shift right 32-bit mask register
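KUNPCKDQ	Concatenate two 32-bit masks into a 64-bit mask
VPMOVM2D	Convert a mask register to a vector register, setting each double word element to all ones or all zeros according to the corresponding mask bit
VPMOVD2M	Convert a vector register to a mask register, taking each mask bit from the most significant bit of the corresponding double word element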
64-bit Operands
KMOVQ	Move 64-bit from and to mask registers
KTESTQ	Set ZF and CF according to the bitwise AND and ANDN of two 64-bit masks
KORTESTQ	Bitwise logical OR of two 64-bit masks, setting ZF and CF accordingly
KNOTQ	Bitwise NOT of a 64-bit mask
KANDQ	Bitwise logical AND of two 64-bit masks
KANDNQ	Bitwise logical AND NOT of two 64-bit masks
KORQ	Bitwise logical OR of two 64-bit masks
KXORQ	Bitwise logical XOR of two 64-bit masks
KXNORQ	Bitwise logical XNOR of two 64-bit masks
KADDQ	Add two 64-bit masks
KSHIFTLQ	Shift left 64-bit mask register
KSHIFTRQ	Shift right 64-bit mask register
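VPMOVM2Q	Convert a mask register to a vector register, setting each quad word element to all ones or all zeros according to the corresponding mask bit
VPMOVQ2M	Convert a vector register to a mask register, taking each mask bit from the most significant bit of the corresponding quad word element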
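
The flag and operand conventions of the opmask instructions are easy to misread from one-line summaries, so the following portable C sketch models a few of the 8-bit forms on plain integers. It is an illustration of the semantics only, not code that issues the instructions; the kflags type and the model_* function names are hypothetical, and the first argument stands for the first source mask.

```c
#include <stdint.h>

/* Flags produced by the KTEST/KORTEST models below. */
typedef struct {
    int zf;
    int cf;
} kflags;

/* KANDNB: the FIRST source is inverted, then ANDed with the second. */
static uint8_t model_kandnb(uint8_t a, uint8_t b)
{
    return (uint8_t)(~a & b);
}

/* KXNORB: bitwise NOT of the XOR of the two masks. */
static uint8_t model_kxnorb(uint8_t a, uint8_t b)
{
    return (uint8_t)~(a ^ b);
}

/* KTESTB: ZF is set when (a AND b) is zero, CF when (NOT a AND b) is zero. */
static kflags model_ktestb(uint8_t a, uint8_t b)
{
    kflags f;
    f.zf = ((a & b) == 0);
    f.cf = ((uint8_t)(~a & b) == 0);
    return f;
}

/* KORTESTB: ZF is set when (a OR b) is all zeros, CF when it is all ones. */
static kflags model_kortestb(uint8_t a, uint8_t b)
{
    uint8_t r = (uint8_t)(a | b);
    kflags f;
    f.zf = (r == 0x00);
    f.cf = (r == 0xFF);
    return f;
}

/* KSHIFTLB: shift counts above 7 clear the whole mask. */
static uint8_t model_kshiftlb(uint8_t a, unsigned count)
{
    return (count > 7) ? 0 : (uint8_t)(a << count);
}

/* KUNPCKBW: the first source becomes the high byte, the second the low byte. */
static uint16_t model_kunpckbw(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((unsigned)hi << 8) | lo);
}
```

For example, with masks 0xF0 and 0x0F the KORTESTB model sets CF because the OR is all ones, while the KTESTB model sets ZF because the AND is zero. The wider W, D and Q forms follow the same pattern with 16, 32 and 64 bits.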

String and Text Processing Instructions

Instruction	Meaning
VPCMPESTRI	Packed compare explicit-length strings, return index in ECX/RCX
VPCMPESTRM	Packed compare explicit-length strings, return mask in XMM0
VPCMPISTRI	Packed compare implicit-length strings, return index in ECX/RCX
VPCMPISTRM	Packed compare implicit-length strings, return mask in XMM0

Secure Hash Algorithm Instructions

Instruction	Meaning
SHA1RNDS4	Perform four rounds of SHA1 operation
SHA1MSG1	Perform an intermediate calculation for the next four SHA1 message double words
SHA1MSG2	Perform a final calculation for the next four SHA1 message double words
SHA1NEXTE	Calculate SHA1 state variable E after four rounds
SHA256RNDS2	Perform two rounds of SHA256 operation
SHA256MSG1	Perform an intermediate calculation for the next four SHA256 message double words
SHA256MSG2	Perform a final calculation for the next four SHA256 message double words

State Management Instructions

Instruction	Meaning
VLDMXCSR	Load MXCSR register
VSTMXCSR	Save MXCSR register state

Agent Synchronization Instructions

Instruction	Meaning
MONITOR	Sets up an address range used to monitor write-back stores
MWAIT	Enables a processor to enter into an optimized state while waiting for a write-back store to the address range set up by the MONITOR instruction

Cacheability Control, Prefetch and Ordering Instructions

Instruction	Meaning
VLDDQU	Special 128-bit unaligned load designed to avoid cache line splits
PREFETCHNTA	Load 32 or more bytes from memory to a selected level of the processor’s cache hierarchy using NTA hint
PREFETCHT0	Load 32 or more bytes from memory to a selected level of the processor’s cache hierarchy using T0 hint
PREFETCHT1	Load 32 or more bytes from memory to a selected level of the processor’s cache hierarchy using T1 hint
PREFETCHT2	Load 32 or more bytes from memory to a selected level of the processor’s cache hierarchy using T2 hint
PREFETCHW	Prefetch data into caches in anticipation of a write
PREFETCHWT1	Prefetch data into caches with intent to write and T1 hint
CLFLUSH	Flushes and invalidates a memory operand and its associated cache line from all levels of the processor’s cache hierarchy
CLFLUSHOPT	Flushes and invalidates a memory operand and its associated cache line from all levels of the processor’s cache hierarchy with optimized memory system throughput
SFENCE	Serializes store operations
LFENCE	Serializes load operations
MFENCE	Serializes load and store operations
VMASKMOVDQU	Non-temporal store of selected bytes from an XMM register into memory
VMOVNTPS	Non-temporal store of packed single-precision floating-point values from an XMM or YMM register into memory
VMOVNTPD	Non-temporal store of packed double-precision floating-point values from an XMM or YMM register into memory
VMOVNTDQ	Non-temporal store of packed integer values from an XMM or YMM register into memory
VMOVNTDQA	Provides a non-temporal hint that can cause adjacent 16-byte items within an aligned 64-byte region (a streaming line) to be fetched and held in a small set of temporary buffers ("streaming load buffers"). Subsequent streaming loads to other aligned 16-byte items in the same streaming line may be supplied from the streaming load buffer and can improve throughput
MOVNTI	Non-temporal store of a double word from a general-purpose register into memory
PAUSE	Improves the performance of "spin-wait loops"
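
As a rough illustration of how a non-temporal store is typically paired with a store fence, the C sketch below fills a buffer with MOVNTI through the _mm_stream_si32 intrinsic and then issues SFENCE through _mm_sfence. It assumes a compiler that defines __SSE2__ and provides <immintrin.h>; on other targets it falls back to ordinary stores. The function name fill_nontemporal and the HAVE_SSE2_STREAM macro are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <immintrin.h>
#define HAVE_SSE2_STREAM 1
#else
#define HAVE_SSE2_STREAM 0
#endif

/* Fill dst[0..n) with value, using non-temporal stores where available. */
static void fill_nontemporal(int32_t *dst, size_t n, int32_t value)
{
    for (size_t i = 0; i < n; i++) {
#if HAVE_SSE2_STREAM
        /* MOVNTI: store a double word with a non-temporal hint. */
        _mm_stream_si32((int *)&dst[i], value);
#else
        /* Portable fallback: ordinary store. */
        dst[i] = value;
#endif
    }
#if HAVE_SSE2_STREAM
    /* SFENCE: order the streaming stores before any later stores. */
    _mm_sfence();
#endif
}
```

Non-temporal stores tend to pay off only for buffers that will not be read again soon, since they are intended to avoid displacing other data from the cache; whether they help in a given case has to be measured.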

Single Instruction Multiple Data (SIMD) instructions set

Contents

AVX Initialization Instructions
Data Transfer Instructions
Broadcast Instructions
Expand Instructions
Compress Instructions
Insert Instructions
Extract Instructions
Gather Instructions
Scatter Instructions
Blending Instructions
Shuffle Instructions
Permute Instructions
Unpack Instructions
Pack Instructions
Conversion Instructions
Logical Instructions
Shift and Rotate Instructions
Comparison Instructions
Packed Arithmetic Instructions
Fused Arithmetic Instructions
Primitives of Functions
Opmask Instructions
String and Text Processing Instructions
Secure Hash Algorithm Instructions
State Management Instructions
Agent Synchronization Instructions
Cacheability Control, Prefetch and Ordering Instructions