This article introduces the DSP320C6000 (TI TMS320C6000) family and then lists its compiler intrinsics in detail, together with the assembly instruction each one maps to.

DSP320C6000

The TMS320C6000 is a DSP family launched by TI in 1997. It covers both fixed-point and floating-point devices: the fixed-point parts form the TMS320C62xx series and the floating-point parts form the TMS320C67xx series. In March 2000, TI released the new C64xx core, with a clock frequency of up to 1.1 GHz and a processing speed of about 9000 MIPS; it is widely used in image processing and streaming media.

The C6000 core contains 8 parallel functional units, divided into two identical groups. The architecture is a very long instruction word (VLIW) design: each instruction is 32 bits long, an instruction packet holds 8 instructions, and the total packet length is therefore 256 bits. The functional unit for each instruction is assigned at compile time; at run time a dedicated dispatch module distributes each 256-bit packet to the 8 units, so that all 8 can execute simultaneously. The highest clock frequency is 300 MHz (67xx series). With all 8 units running in parallel, the core issues 8 instructions per cycle, which corresponds to 1600 MIPS at a 200 MHz clock.

DSP320C6000 series

DSP320C6000 instruction list collection

Columns: C compiler intrinsic; assembly instruction; brief description

int _abs (int src);

__int40_t _labs (__int40_t src); ABS returns the absolute value of src

int _add2 (int src1, int src2); ADD2 adds the high 16 bits of src1 to the high 16 bits of src2 and the low 16 bits of src1 to the low 16 bits of src2, and places the two sums in the high and low 16 bits of the result
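
The lane-wise behaviour of ADD2 can be sketched in portable C. This is a hypothetical reference model for reading along on a host machine; the name `add2_model` is an assumption and is not part of any TI header. It illustrates that a carry out of the low half is discarded rather than propagated into the high half.

```c
#include <stdint.h>

/* Hypothetical portable model of ADD2 (not a TI API): adds the two
 * 16-bit halves independently and drops any carry between them. */
uint32_t add2_model(uint32_t src1, uint32_t src2)
{
    uint32_t lo = (src1 + src2) & 0xFFFFu;                  /* low halves  */
    uint32_t hi = ((src1 >> 16) + (src2 >> 16)) & 0xFFFFu;  /* high halves */
    return (hi << 16) | lo;
}
```

For instance, `add2_model(0x0001FFFF, 0x00010001)` gives `0x00020000`, whereas an ordinary 32-bit addition would give `0x00030000`.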

ushort & _amem2 (void *ptr); LDHU / STH loads or stores a halfword; the pointer must be 2-byte aligned (read or store)

const ushort & _amem2_const (const void *ptr); LDHU loads a halfword; the pointer must be 2-byte aligned (read only)

unsigned & _amem4 (void *ptr); LDW / STW loads or stores a word; the pointer must be 4-byte aligned (read or store)

const unsigned & _amem4_const (const void *ptr); LDW loads a word; the pointer must be 4-byte aligned (read only)

double & _amemd8 (void *ptr); LDW/LDW / STW/STW loads or stores 8 bytes; the pointer must be 8-byte aligned (read or store)

const double & _amemd8_const (const void *ptr); LDDW loads 8 bytes; the pointer must be 8-byte aligned (read only)

unsigned _clr (unsigned src2, unsigned csta, unsigned cstb); CLR clears a bit field in src2 to 0; csta and cstb give the first and last bit of the field

unsigned _clrr (unsigned src2, int src1); CLR clears a bit field in src2 to 0; the first and last bit of the field are given by the lower 10 bits of src1
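
The field convention used by CLR (and by SET, listed further down) can be illustrated with a small host-side model. The helper names `field_mask`, `clr_model` and `set_model` are hypothetical, and the sketch assumes csta <= cstb <= 31.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with ones in bit positions csta..cstb inclusive. */
static uint32_t field_mask(unsigned csta, unsigned cstb)
{
    assert(csta <= cstb && cstb < 32u);
    unsigned width = cstb - csta + 1u;
    uint32_t ones = (width == 32u) ? 0xFFFFFFFFu
                                   : (((uint32_t)1 << width) - 1u);
    return ones << csta;
}

/* Hypothetical model of CLR: clear bits csta..cstb of src2. */
uint32_t clr_model(uint32_t src2, unsigned csta, unsigned cstb)
{
    return src2 & ~field_mask(csta, cstb);
}

/* Hypothetical model of SET: set bits csta..cstb of src2. */
uint32_t set_model(uint32_t src2, unsigned csta, unsigned cstb)
{
    return src2 | field_mask(csta, cstb);
}
```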

__int40_t _dtol (double src); Reinterpret a double register as a __int40_t

long long _dtoll (double src); reinterpret a double register as a long long

int _ext (int src2, unsigned csta, unsigned cstb); EXT extracts the field specified by csta and cstb from src2 and sign-extends it to 32 bits. The field is extracted by first shifting src2 left by csta and then shifting it right by cstb.

int _extr (int src2, int src1); EXT is the same as above, except that the left and right shift amounts are given by the lower 10 bits of src1

unsigned _extu (unsigned src2, unsigned csta, unsigned cstb); EXTU is the same as _ext, except that the field is zero-extended to 32 bits
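
The shift-left-then-shift-right extraction can be sketched as follows. `ext_model` and `extu_model` are hypothetical names, not TI functions; the sketch assumes both shift amounts are below 32 and a two's-complement host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of EXTU: shift left by csta, then logical shift
 * right by cstb, so the field ends up zero-extended at bit 0. */
uint32_t extu_model(uint32_t src2, unsigned csta, unsigned cstb)
{
    assert(csta < 32u && cstb < 32u);
    return (uint32_t)(src2 << csta) >> cstb;
}

/* Hypothetical model of EXT: same shifts, but the right shift is
 * arithmetic, so the field is sign-extended. */
int32_t ext_model(int32_t src2, unsigned csta, unsigned cstb)
{
    assert(csta < 32u && cstb < 32u);
    uint32_t shifted = (uint32_t)src2 << csta;
    uint32_t result = shifted >> cstb;
    if ((shifted & 0x80000000u) != 0u && cstb > 0u)
        result |= ~(0xFFFFFFFFu >> cstb);   /* replicate the sign bit */
    return (int32_t)result;                  /* two's complement assumed */
}
```

To extract bits 8..11 of a word, for example, csta is 31 - 11 = 20 and cstb is 20 + 8 = 28.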

unsigned _extur (unsigned src2, int src1); EXTU is the same as _extu, except that the left and right shift amounts are given by the lower 10 bits of src1

unsigned _ftoi (float src); reinterprets the bits of a float as an unsigned. Example:

_ftoi (1.0) == 1065353216U

unsigned _hi (double src); returns the high (odd) register of a double register pair

unsigned _hill (long long src); returns the high (odd) register of a long long register pair

double _itod (unsigned src2, unsigned src1); builds a new double register pair from two unsigned values, where src2 is the high (odd) register and src1 is the low (even) register

float _itof (unsigned src); reinterprets the bits of an unsigned as a float. Example:

_itof (0x3f800000) == 1.0
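
On a host compiler, the same bit reinterpretation can be sketched with `memcpy`. The names `ftoi_model` and `itof_model` are hypothetical, and the sketch assumes a 32-bit IEEE-754 `float`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of _ftoi: copy the float's bits into an integer
 * without any numeric conversion. */
uint32_t ftoi_model(float src)
{
    uint32_t bits;
    memcpy(&bits, &src, sizeof bits);
    return bits;
}

/* Hypothetical model of _itof: the reverse reinterpretation. */
float itof_model(uint32_t src)
{
    float value;
    memcpy(&value, &src, sizeof value);
    return value;
}
```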

long long _itoll (unsigned src2, unsigned src1); builds a new long long register pair from two unsigned values, where src2 is the high (odd) register and src1 is the low (even) register

unsigned _lmbd (unsigned src1, unsigned src2); LMBD searches src2 from the most significant bit for the first 1 or 0 (the LSB of src1 selects which) and returns the number of bits skipped before it was found
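
A minimal host-side sketch of the leftmost-bit search follows. `lmbd_model` is a hypothetical name; returning 32 when the bit is never found is the behaviour assumed here.

```c
#include <stdint.h>

/* Hypothetical model of LMBD: scan src2 from bit 31 downwards for the
 * bit value selected by the LSB of src1 and return how many bits were
 * skipped before it was found (32 if it never occurs). */
unsigned lmbd_model(unsigned src1, uint32_t src2)
{
    uint32_t target = src1 & 1u;
    for (unsigned skipped = 0; skipped < 32u; ++skipped) {
        if (((src2 >> (31u - skipped)) & 1u) == target)
            return skipped;
    }
    return 32u;
}
```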

unsigned _lo (double src); returns the low (even) register of a double register pair

unsigned _loll (long long src); returns the low (even) register of a long long register pair

double _ltod (__int40_t src); Interpret a __int40_t register as a double register

double _lltod (long long src); Interpret a longlong register as a double register

int _mpy (int src1, int src2); MPY multiplies the low 16 bits of src1 by the low 16 bits of src2; both operands are signed by default

int _mpyus (unsigned src1, int src2); MPYUS multiplies unsigned src1 by signed src2. The S in the mnemonic marks the signed operand and is needed only when the two operands differ in signedness

int _mpysu (int src1, unsigned src2); MPYSU is the same as above, with signed src1 and unsigned src2

unsigned _mpyu (unsigned src1, unsigned src2); MPYU is the same as above, with both operands unsigned

int _mpyh (int src1, int src2); MPYH is the same as MPY, but multiplies the high 16 bits of src1 by the high 16 bits of src2

int _mpyhus (unsigned src1, int src2); MPYHUS

int _mpyhsu (int src1, unsigned src2); MPYHSU

unsigned _mpyhu (unsigned src1, unsigned src2); MPYHU

int _mpyhl (int src1, int src2); MPYHL is the same as MPY, but multiplies the high 16 bits of src1 by the low 16 bits of src2

int _mpyhuls (unsigned src1, int src2); MPYHULS

int _mpyhslu (int src1, unsigned src2); MPYHSLU

unsigned _mpyhlu (unsigned src1, unsigned src2); MPYHLU

int _mpylh (int src1, int src2); MPYLH multiplies the low 16 bits of src1 by the high 16 bits of src2

int _mpyluhs (unsigned src1, int src2); MPYLUHS

int _mpylshu (int src1, unsigned src2); MPYLSHU

unsigned _mpylhu (unsigned src1, unsigned src2); MPYLHU
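
The four signed 16 x 16 variants differ only in which half of each operand they use. The sketch below models them in portable C; all function names are hypothetical, and a two's-complement host is assumed for the 16-bit casts.

```c
#include <stdint.h>

/* Signed value of the low and high 16-bit halves of a 32-bit word. */
static int32_t lo16s(uint32_t x) { return (int16_t)(x & 0xFFFFu); }
static int32_t hi16s(uint32_t x) { return (int16_t)(x >> 16); }

/* Hypothetical models (not TI APIs) of the signed multiplies. */
int32_t mpy_model(uint32_t src1, uint32_t src2)   /* low  x low  */
{ return lo16s(src1) * lo16s(src2); }

int32_t mpyh_model(uint32_t src1, uint32_t src2)  /* high x high */
{ return hi16s(src1) * hi16s(src2); }

int32_t mpyhl_model(uint32_t src1, uint32_t src2) /* high x low  */
{ return hi16s(src1) * lo16s(src2); }

int32_t mpylh_model(uint32_t src1, uint32_t src2) /* low  x high */
{ return lo16s(src1) * hi16s(src2); }
```

With src1 = 0x0003FFFE (halves 3 and -2) and src2 = 0x00050004 (halves 5 and 4), the four models return -8, 15, 12 and -10 respectively.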

void _nassert (int src); generates no code; tells the optimizer that the expression passed in is true, which hints at which optimizations are valid

unsigned _norm (int src);

unsigned _lnorm (__int40_t src); NORM returns the number of redundant sign bits of src
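
"Redundant sign bits" means the number of bits directly below the sign bit that repeat it. A host-side sketch for the 32-bit case, with the hypothetical name `norm_model`:

```c
#include <stdint.h>

/* Hypothetical model of NORM for a 32-bit operand: count how many bits
 * below bit 31 are equal to the sign bit before the first differing bit. */
unsigned norm_model(int32_t src)
{
    uint32_t bits = (uint32_t)src;
    uint32_t sign = bits >> 31;
    unsigned count = 0;
    for (int bit = 30; bit >= 0; --bit) {
        if (((bits >> bit) & 1u) != sign)
            break;
        ++count;
    }
    return count;
}
```

The result is the left shift that normalizes the value, for example 30 for the input 1 and 0 for 0x7FFFFFFF.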

int _sadd (int src1, int src2);

__int40_t _lsadd (int src1, __int40_t src2); SADD adds src1 and src2 and saturates the result
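
Saturation here means clamping to the representable range instead of wrapping around. A minimal sketch of the 32-bit case, using the hypothetical name `sadd_model`:

```c
#include <stdint.h>

/* Hypothetical model of the 32-bit SADD: add in 64 bits, then clamp
 * the sum to the int32_t range. */
int32_t sadd_model(int32_t src1, int32_t src2)
{
    int64_t sum = (int64_t)src1 + (int64_t)src2;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```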

int _sat (__int40_t src2); SAT converts a 40-bit long to a 32-bit signed int, saturating the result if necessary

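unsigned _set (unsigned src2, unsigned csta, unsigned cstb); SET sets a bit field in src2 to 1; csta and cstb give the first and last bit of the field

unsigned _setr (unsigned src2, int src1); SET is the same as above, except that the field is given by the lower 10 bits of src1

int _smpy (int src1, int src2); SMPY multiplies the low 16 bits of src1 by the low 16 bits of src2 (signed), shifts the product left by 1 bit and saturates it

int _smpyh (int src1, int src2); SMPYH is the same, using the high 16 bits of both operands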

int _smpyhl (int src1, int src2); SMPYHL

int _smpylh (int src1, int src2); SMPYLH
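
The SMPY family is commonly read as a fractional (Q15 x Q15 -> Q31) multiply. The sketch below models the low x low case under the assumption that the product is shifted left by 1 bit and then saturated; `smpy_model` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical model of SMPY: signed low16 x low16, doubled, with the
 * single overflow case (-32768 * -32768) clamped to INT32_MAX. */
int32_t smpy_model(uint32_t src1, uint32_t src2)
{
    int32_t a = (int16_t)(src1 & 0xFFFFu);
    int32_t b = (int16_t)(src2 & 0xFFFFu);
    int64_t doubled = ((int64_t)a * b) * 2;
    return (doubled > INT32_MAX) ? INT32_MAX : (int32_t)doubled;
}
```

In Q15 terms, 0x4000 is 0.5, and `smpy_model(0x4000, 0x4000)` returns 0x20000000, which is 0.25 in Q31.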

int _sshl (int src2, unsigned src1); SSHL shifts src2 to the left by the amount in src1 and saturates the result to 32 bits

int _ssub (int src1, int src2);

__int40_t _lssub (int src1, __int40_t src2); SSUB subtracts src2 from src1 and saturates the result (src1 - src2)

unsigned _subc (unsigned src1, unsigned src2); SUBC conditional subtract and left shift (typically used for division)
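
How a conditional subtract-and-shift step builds a division can be sketched on a host machine. The model below assumes that SUBC yields ((src1 - src2) << 1) + 1 when src1 >= src2 and src1 << 1 otherwise; `subc_model` and `udiv16_subc` are hypothetical names, and the loop is restricted to 16-bit operands to keep the bookkeeping simple.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one SUBC step (unsigned comparison assumed). */
uint32_t subc_model(uint32_t src1, uint32_t src2)
{
    if (src1 >= src2)
        return ((src1 - src2) << 1) + 1u;   /* subtract, shift in a 1 */
    return src1 << 1;                        /* no subtract, shift in a 0 */
}

/* 16-bit unsigned division built from 16 SUBC steps: afterwards the
 * quotient sits in the low 16 bits and the remainder in the high 16. */
void udiv16_subc(uint16_t num, uint16_t den, uint16_t *quot, uint16_t *rem)
{
    assert(den != 0u);
    uint32_t acc = num;
    uint32_t aligned_den = (uint32_t)den << 15;
    for (int step = 0; step < 16; ++step)
        acc = subc_model(acc, aligned_den);
    *quot = (uint16_t)(acc & 0xFFFFu);
    *rem  = (uint16_t)(acc >> 16);
}
```

Each step produces one quotient bit, which is why the loop runs once per bit of the dividend.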

int _sub2 (int src1, int src2); SUB2 subtracts the high and low 16 bits of src2 from the high and low 16 bits of src1. A borrow from the lower 16 bits does not affect the upper 16 bits.

int _abs2 (int src); ABS2 calculates the absolute value of each 16-bit half

int _add4 (int src1, int src2); ADD4 adds the 4 pairs of 8-bit values in src1 and src2. There is no saturation, and a carry does not affect the neighboring 8-bit values

long long & _amem8 (void *ptr); LDDW / STDW loads or stores 8 bytes; the pointer must be 8-byte aligned

const long long & _amem8_const (const void *ptr); LDDW loads 8 bytes; the pointer must be 8-byte aligned

__float2_t & _amem8_f2 (void * ptr); LDDW / STDW loads or stores 8 bytes; the pointer must be 8-byte aligned, and c6x.h must be included

const __float2_t & _amem8_f2_const (void * ptr); LDDW loads 8 bytes; the pointer must be 8-byte aligned, and c6x.h must be included

double & _amemd8 (void *ptr); LDDW / STDW

const double & _amemd8_const (const void *ptr); LDDW

int _avg2 (int src1, int src2); AVG2 calculates the average of each pair of signed 16-bit values

unsigned _avgu4 (unsigned, unsigned); AVGU4 calculates the average of each pair of unsigned 8-bit values

unsigned _bitc4 (unsigned src); BITC4 counts the 1 bits in each 8-bit value and writes each count to the corresponding byte of the result

unsigned _bitr (unsigned src); BITR reverses the order of bits
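
Both bit-manipulation operations above are easy to model on a host. `bitc4_model` and `bitr_model` are hypothetical names used only for this sketch.

```c
#include <stdint.h>

/* Hypothetical model of BITC4: per-byte population count, with each
 * count stored in the byte it was computed from. */
uint32_t bitc4_model(uint32_t src)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t byte = (src >> (8 * lane)) & 0xFFu;
        uint32_t count = 0;
        while (byte != 0u) {
            count += byte & 1u;
            byte >>= 1;
        }
        result |= count << (8 * lane);
    }
    return result;
}

/* Hypothetical model of BITR: bit 0 becomes bit 31, bit 1 becomes
 * bit 30, and so on. */
uint32_t bitr_model(uint32_t src)
{
    uint32_t result = 0;
    for (int i = 0; i < 32; ++i) {
        result = (result << 1) | (src & 1u);
        src >>= 1;
    }
    return result;
}
```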

int _cmpeq2 (int src1, int src2); CMPEQ2 compares each pair of 16-bit values for equality and puts the results into the lowest 2 bits of dst

int _cmpeq4 (int src1, int src2); CMPEQ4 compares each pair of 8-bit values for equality and puts the results into the lowest 4 bits of dst; a bit is 1 if the pair is equal, otherwise 0

int _cmpgt2 (int src1, int src2); CMPGT2 compares each pair of signed 16-bit values; a result bit is 1 if src1 > src2, otherwise 0. The results go into the lowest 2 bits of dst

unsigned _cmpgtu4 (unsigned src1, unsigned src2); CMPGTU4 compares each pair of unsigned 8-bit values; a result bit is 1 if src1 > src2, otherwise 0. The results go into the lowest 4 bits of dst
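
The packed compares return one flag bit per lane, with lane 0 (the least significant byte) mapped to bit 0. A host-side sketch of the two byte-wise variants, using hypothetical names:

```c
#include <stdint.h>

/* Hypothetical model of CMPEQ4: bit n of the result is 1 when byte n
 * of src1 equals byte n of src2. */
uint32_t cmpeq4_model(uint32_t src1, uint32_t src2)
{
    uint32_t flags = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t a = (src1 >> (8 * lane)) & 0xFFu;
        uint32_t b = (src2 >> (8 * lane)) & 0xFFu;
        if (a == b)
            flags |= 1u << lane;
    }
    return flags;
}

/* Hypothetical model of CMPGTU4: bit n is 1 when byte n of src1 is
 * greater than byte n of src2, both taken as unsigned. */
uint32_t cmpgtu4_model(uint32_t src1, uint32_t src2)
{
    uint32_t flags = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t a = (src1 >> (8 * lane)) & 0xFFu;
        uint32_t b = (src2 >> (8 * lane)) & 0xFFu;
        if (a > b)
            flags |= 1u << lane;
    }
    return flags;
}
```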

unsigned _deal (unsigned src); DEAL separates the even and odd bits of src: the even bits are packed into the low 16 bits and the odd bits into the high 16 bits

int _dotp2 (int src1, int src2);

__int40_t _ldotp2 (int src1, int src2); DOTP2 computes the dot product of the signed 16-bit pairs in src1 and src2; the result is written as a signed 32-bit int or sign-extended to 64 bits
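
A dot product of two packed 16-bit pairs is the sum of the high x high and low x low products. The sketch below models the 32-bit form; `dotp2_model` is a hypothetical name, and a two's-complement host is assumed for the casts.

```c
#include <stdint.h>

/* Hypothetical model of DOTP2: treat each operand as two signed
 * 16-bit values and return hi1*hi2 + lo1*lo2. */
int32_t dotp2_model(uint32_t src1, uint32_t src2)
{
    int32_t a_hi = (int16_t)(src1 >> 16);
    int32_t a_lo = (int16_t)(src1 & 0xFFFFu);
    int32_t b_hi = (int16_t)(src2 >> 16);
    int32_t b_lo = (int16_t)(src2 & 0xFFFFu);
    /* Sum in 64 bits: two products of -32768 * -32768 exceed int32_t. */
    int64_t sum = (int64_t)a_hi * b_hi + (int64_t)a_lo * b_lo;
    return (int32_t)sum;
}
```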

int _dotpn2 (int src1, int src2); DOTPN2 subtracts the product of the low signed 16-bit values of src1 and src2 from the product of their high signed 16-bit values

int _dotpnrsu2 (int src1, unsigned src2); DOTPNRSU2 computes the product of the upper 16 bits of src1 and src2 minus the product of the lower 16 bits. The values in src1 are treated as signed and those in src2 as unsigned; 2^15 is added and the result is arithmetically shifted right by 16 bits

int _dotprsu2 (int src1, unsigned src2); DOTPRSU2 computes the product of the upper 16 bits of src1 and src2 plus the product of the lower 16 bits. The values in src1 are treated as signed and those in src2 as unsigned; 2^15 is added and the result is arithmetically shifted right by 16 bits

int _dotpsu4 (int src1, unsigned src2); DOTPSU4 multiplies each pair of 8-bit values in src1 and src2 and sums the products. The 8-bit values of src1 are treated as signed and those of src2 as unsigned.

unsigned _dotpu4 (unsigned src1, unsigned src2); DOTPU4 is the same as above, with both operands treated as unsigned

int _gmpy4 (int src1, int src2); GMPY4 performs a Galois field multiply of the 4 unsigned 8-bit values in src1 by those in src2

int _max2 (int src1, int src2); MAX2 compares two signed 16-bit integers of src1 and src2, and takes the larger value

int _min2 (int src1, int src2); MIN2 compares two signed 16-bit integers of src1 and src2, and takes the smaller value

unsigned _maxu4 (unsigned src1, unsigned src2); MAXU4 compares 4 unsigned 8-bit integers of src1 and src2, and takes the larger value

unsigned _minu4 (unsigned src1, unsigned src2); MINU4 compares 4 unsigned 8-bit integers of src1 and src2, and takes the smaller value

ushort & _mem2 (void * ptr); LDB/LDB / STB/STB loads or stores 2 bytes; no alignment required

const ushort & _mem2_const (const void * ptr); LDB/LDB loads 2 bytes; no alignment required

unsigned & _mem4 (void * ptr); LDNW / STNW loads or stores 4 bytes; no alignment required

const unsigned & _mem4_const (const void * ptr); LDNW loads 4 bytes; no alignment required

long long & _mem8 (void * ptr); LDNDW / STNDW loads or stores 8 bytes; no alignment required

const long long & _mem8_const (const void * ptr); LDNDW loads 8 bytes; no alignment required

double & _memd8 (void * ptr); LDNDW / STNDW loads or stores 8 bytes; no alignment required

const double & _memd8_const (const void * ptr); LDNDW loads 8 bytes; no alignment required
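
On a host compiler, the portable counterpart of a non-aligned access is a `memcpy` through a temporary. The helper names below are hypothetical; the sketch only shows the access pattern and makes no claim about the code a C6000 compiler generates.

```c
#include <stdint.h>
#include <string.h>

/* Read 4 bytes from an address with no alignment guarantee. */
uint32_t load_u32_unaligned(const void *ptr)
{
    uint32_t value;
    memcpy(&value, ptr, sizeof value);
    return value;
}

/* Write 4 bytes to an address with no alignment guarantee. */
void store_u32_unaligned(void *ptr, uint32_t value)
{
    memcpy(ptr, &value, sizeof value);
}
```

Dereferencing a misaligned `uint32_t *` directly is undefined behaviour in standard C, which is why the byte copy is used here.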

long long _mpy2ll (int src1, int src2); MPY2 multiplies the two pairs of signed 16-bit values in src1 and src2 and writes the two 32-bit products into a long long

long long _mpyhill (int src1, int src2); MPYHI multiplies the high 16 bits of src1 (signed) by the signed 32-bit src2 and writes the result into the low 48 bits of a long long

long long _mpylill (int src1, int src2); MPYLI multiplies the low 16 bits of src1 (signed) by the signed 32-bit src2 and writes the result into the low 48 bits of a long long

int _mpyhir (int src1, int src2); MPYHIR multiplies the high 16 bits of src1 (signed) by the signed 32-bit src2. The product is rounded by adding 2^14 and then shifted right by 15 bits to give a 32-bit result

int _mpylir (int src1, int src2); MPYLIR multiplies the low 16 bits of src1 (signed) by the signed 32-bit src2. The product is rounded by adding 2^14 and then shifted right by 15 bits to give a 32-bit result

long long _mpysu4ll (int src1, unsigned src2); MPYSU4 multiplies the 4 signed 8-bit values of src1 by the 4 unsigned 8-bit values of src2 to obtain 4 signed 16-bit products, packed into 64 bits

long long _mpyu4ll (unsigned src1, unsigned src2); MPYU4 multiplies the 4 unsigned 8-bit values of src1 and src2 to obtain 4 unsigned 16-bit products, packed into 64 bits

int _mvd (int src2); MVD moves the data of src2 into the return value, using the multiplication pipeline (delay)

unsigned _pack2 (unsigned src1, unsigned src2); PACK2

unsigned _packh2 (unsigned src1, unsigned src2); PACKH2

unsigned _packh4 (unsigned src1, unsigned src2); PACKH4

unsigned _packl4 (unsigned src1, unsigned src2); PACKL4

unsigned _packhl2 (unsigned src1, unsigned src2); PACKHL2

unsigned _packlh2 (unsigned src1, unsigned src2); PACKLH2

unsigned _rotl (unsigned src1, unsigned src2); ROTL rotates the 32 bits of src2 to the left by the amount in the lowest 5 bits of src1; bits 5-31 of src1 are ignored

int _sadd2 (int src1, int src2); SADD2 adds the two pairs of signed 16-bit values in src1 and src2 and saturates each of the two signed 16-bit results

int _saddus2 (unsigned src1, int src2); SADDUS2 adds the 2 unsigned 16-bit values in src1 to the 2 signed 16-bit values in src2 to give 2 saturated unsigned 16-bit results

unsigned _saddu4 (unsigned src1, unsigned src2); SADDU4 adds the 4 pairs of unsigned 8-bit values in src1 and src2 with saturation

unsigned _shfl (unsigned src2); SHFL interleaves the high 16 and low 16 bits of src2
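
SHFL and DEAL are inverse bit permutations: one interleaves the two halves, the other separates even and odd bits again. The sketch below models both with hypothetical names, assuming the low half supplies the even bit positions.

```c
#include <stdint.h>

/* Hypothetical model of SHFL: bit i of the low half goes to bit 2i,
 * bit i of the high half goes to bit 2i + 1. */
uint32_t shfl_model(uint32_t src)
{
    uint32_t result = 0;
    for (int i = 0; i < 16; ++i) {
        result |= ((src >> i) & 1u) << (2 * i);
        result |= ((src >> (16 + i)) & 1u) << (2 * i + 1);
    }
    return result;
}

/* Hypothetical model of DEAL: even bits are gathered into the low
 * half, odd bits into the high half. */
uint32_t deal_model(uint32_t src)
{
    uint32_t even = 0, odd = 0;
    for (int i = 0; i < 16; ++i) {
        even |= ((src >> (2 * i)) & 1u) << i;
        odd  |= ((src >> (2 * i + 1)) & 1u) << i;
    }
    return (odd << 16) | even;
}
```

Applying one after the other returns the original word, which is a convenient sanity check.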

unsigned _shlmb (unsigned src1, unsigned src2); SHLMB shifts src2 to the left by 1 byte and fills the vacated low byte with the most significant byte of src1

unsigned _shrmb (unsigned src1, unsigned src2); SHRMB shifts src2 to the right by 1 byte and fills the vacated high byte with the least significant byte of src1

int _shr2 (int src1, unsigned src2); SHR2 shifts the two signed 16-bit values of src2 to the right. The shift amount is given by the lower 5 bits of src1, and the vacated bits are filled with the sign bit

unsigned _shru2 (unsigned src1, unsigned src2); SHRU2 shifts the two unsigned 16-bit values of src2 to the right. The shift amount is given by the lower 5 bits of src1, and the vacated bits are filled with 0

long long _smpy2ll (int src1, int src2); SMPY2 multiplies the two signed 16-digit numbers in src1 and src2, and then shifts it to the left by 1 bit, and then saturates.

int _spack2 (int src1, int src2); SPACK2 saturates the signed 32-bit values in src1 and src2 to signed 16 bits, then puts the saturated src1 into the high 16 bits of dst and the saturated src2 into the low 16 bits of dst
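
The saturate-and-pack step can be sketched on a host as follows; `sat16` and `spack2_model` are hypothetical names.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range and return its
 * 16-bit two's-complement pattern. */
static uint32_t sat16(int32_t value)
{
    if (value > 32767)  value = 32767;
    if (value < -32768) value = -32768;
    return (uint32_t)value & 0xFFFFu;
}

/* Hypothetical model of SPACK2: saturated src1 in the high half,
 * saturated src2 in the low half. */
uint32_t spack2_model(int32_t src1, int32_t src2)
{
    return (sat16(src1) << 16) | sat16(src2);
}
```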

unsigned _spacku4 (int src1, int src2); SPACKU4 saturates the 4 signed 16-bit values in src1 and src2 to unsigned 8-bit values and packs them into dst

int _sshvl (int src2, int src1); SSHVL shifts the signed 32-bit number in src2 to the left or right, and the number of shifts is determined by the number of bits specified by src1.

src1 is between [-31, 31], if src1 is positive, src2 is shifted to the left; if src1 is negative, src2 is shifted to the right |src1| and the sign bit is extended

int _sshvr (int src2, int src1); SSHVR shifts the signed 32-bit number in src2 to the right or left, and the number of shifts is determined by the number of bits specified by src1.

src1 is between [-31, 31], if src1 is positive, src2 is shifted to the right and is sign extended; if src1 is negative, src2 is shifted to the left |src1|

int _sub4 (int src1, int src2); SUB4 subtracts the 4 pairs of 8-bit values in src1 and src2 without saturation

int _subabs4 (int src1, int src2); SUBABS4 subtracts the 4 pairs of unsigned 8-bit values in src1 and src2 and takes the absolute value of each difference

unsigned _swap4 (unsigned src); SWAP4 swaps the two bytes within each 16-bit half of src

unsigned _unpkhu4 (unsigned src); UNPKHU4 unpacks the two high unsigned 8-bit values of src into 16-bit values, zero-extended

unsigned _unpklu4 (unsigned src); UNPKLU4 unpacks the two low unsigned 8-bit values of src into 16-bit values, zero-extended

unsigned _xpnd2 (unsigned src); XPND2 expands the lowest 2 bits of src into halfword masks: bit 1 fills the high 16 bits and bit 0 fills the low 16 bits

unsigned _xpnd4 (unsigned src); XPND4 expands the lowest 4 bits of src into byte masks, one bit per byte
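
The expand operations turn per-lane flag bits, such as those produced by the packed compares, into full-width lane masks. A host-side sketch with hypothetical names:

```c
#include <stdint.h>

/* Hypothetical model of XPND4: bit n of src becomes 0xFF or 0x00 in
 * byte n of the result. */
uint32_t xpnd4_model(uint32_t src)
{
    uint32_t mask = 0;
    for (int lane = 0; lane < 4; ++lane) {
        if ((src >> lane) & 1u)
            mask |= 0xFFu << (8 * lane);
    }
    return mask;
}

/* Hypothetical model of XPND2: bit n of src becomes 0xFFFF or 0x0000
 * in halfword n of the result. */
uint32_t xpnd2_model(uint32_t src)
{
    uint32_t mask = 0;
    for (int lane = 0; lane < 2; ++lane) {
        if ((src >> lane) & 1u)
            mask |= 0xFFFFu << (16 * lane);
    }
    return mask;
}
```

A mask built this way can select lanes with plain AND/OR operations, for example `(a & mask) | (b & ~mask)`.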

long long _addsub (int src1, int src2); ADDSUB performs 2 operations in parallel:

1. src1 + src2 -> dst_o

2. src1 - src2 -> dst_e

long long _addsub2 (int src1, int src2); ADDSUB2 performs the signed 16-bit ADD2 and SUB2 in parallel:

ADD2: the high and low 16 bits of src1 + the high and low 16 bits of src2 -> dst_o

SUB2: the high and low 16 bits of src1 - the high and low 16 bits of src2 -> dst_e

long long _cmpy (unsigned src1, unsigned src2); CMPY complex multiply of signed 16-bit pairs:

(upper 16 bits of src1 × upper 16 bits of src2) - (lower 16 bits of src1 × lower 16 bits of src2) -> dst_o

saturate((upper 16 bits of src1 × lower 16 bits of src2) + (lower 16 bits of src1 × upper 16 bits of src2)) -> dst_e

unsigned _cmpyr (unsigned src1, unsigned src2); CMPYR

unsigned _cmpyr1 (unsigned src1, unsigned src2); CMPYR1

long long _ddotp4 (unsigned src1, unsigned src2); DDOTP4 is not saturated

long long _ddotph2 (long long src1, unsigned src2); DDOTPH2

long long _ddotpl2 (long long src1, unsigned src2); DDOTPL2

unsigned _ddotph2r (long long src1, unsigned src2); DDOTPH2R

unsigned _ddotpl2r (long long src1, unsigned src2); DDOTPL2R

long long _dmv (int src1, int src2); DMV moves two registers into one register at a time

long long _dpack2 (unsigned src1, unsigned src2); DPACK2

long long _dpackx2 (unsigned src1, unsigned src2); DPACKX2

__float2_t _fdmv_f2 (float src1, float src2); DMV

unsigned _gmpy (unsigned src1, unsigned src2); GMPY Galois field multiplication

long long _mpy2ir (int src1, int src2); MPY2IR performs two 16-bit by 32-bit multiplies.

The high 16 bits and low 16 bits of src1 are each treated as a signed 16-bit value; src2 is treated as a signed 32-bit value.

Each product is rounded by adding 2^14 and then shifted right by 15 bits.

The lower 32 bits of the 2 results are written to dst_o:dst_e

int _mpy32 (int src1, int src2); MPY32 performs a signed 32-bit by 32-bit multiply; the lower 32 bits of the 64-bit result are written to dst

long long _mpy32ll (int src1, int src2); MPY32 signed 32-bit × signed 32-bit; the signed 64-bit result is written to dst

long long _mpy32su (int src1, int src2); MPY32SU src1 signed 32 bits × src2 unsigned 32 bits = dst signed 64 bits

long long _mpy32us (unsigned src1, int src2); MPY32US src1 unsigned 32 bits × src2 signed 32 bits = dst signed 64 bits

long long _mpy32u (unsigned src1, unsigned src2); MPY32U src1 unsigned 32 bits × src2 unsigned 32 bits = dst unsigned 64 bits

int _rpack2 (int src1, int src2); RPACK2

long long _saddsub (unsigned src1, unsigned src2); SADDSUB performs 2 operations in parallel:

1. saturate(src1 + src2) -> dst_o

2. saturate(src1 - src2) -> dst_e

long long _saddsub2 (unsigned src1, unsigned src2); SADDSUB2 performs the SADD2 and SSUB2 instructions in parallel

long long _shfl3 (unsigned src1, unsigned src2); SHFL3 is shown in the figure, generating a long long

int _smpy32 (int src1, int src2); SMPY32 signed 32-bit × signed 32-bit; the 64-bit product is shifted left by 1 bit and saturated, and the upper 32 bits of the result are written to dst

int _ssub2 (unsigned src1, unsigned src2); SSUB2 subtracts the 2 signed 16-bit values in src2 from the 2 signed 16-bit values in src1; the results are saturated

unsigned _xormpy (unsigned src1, unsigned src2); XORMPY Galois field multiplication

int _dpint (double src); DPINT converts double to int (round)

__int40_t _f2tol (__float2_t src); reinterprets a __float2_t as an __int40_t

long long _f2toll (__float2_t src); reinterprets a __float2_t as a long long

double _fabs (double src); ABSDP puts the absolute value of src into dst.

float _fabsf (float src); ABSSP

__float2_t _lltof2 (long long src); Interpret a longlong as a __float2_t

__float2_t _ltof2 (__int40_t src); reinterprets an __int40_t as a __float2_t

__float2_t & _mem8_f2 (void * ptr); LDNDW / STNDW loads or stores a 64-bit value in memory; no alignment required

const __float2_t & _mem8_f2_const (void * ptr); LDNDW / STNDW

long long _mpyidll (int src1, int src2); MPYID src1 × src2 -> dst

double _mpysp2dp (float src1, float src2); MPYSP2DP src1 × src2 -> dst

double _mpyspdp (float src1, double src2); MPYSPDP src1 × src2 -> dst

double _rcpdp (double src); RCPDP reciprocal approximation of a 64-bit double, written to dst

float _rcpsp (float src); RCPSP reciprocal approximation of a 32-bit float

double _rsqrdp (double src); RSQRDP approximation of the reciprocal square root of a 64-bit double

float _rsqrsp (float src); RSQRSP approximation of the reciprocal square root of a 32-bit float

int _spint (float); SPINT converts a float to int

ADDDP adds 2 doubles

ADDSP adds 2 floats

AND bitwise AND

ANDN bitwise AND with the complement of one operand

MPYSP multiplies 2 floats

OR bitwise OR

SUBDP subtracts 2 doubles

SUBSP subtracts 2 floats

XOR bitwise exclusive OR

__x128_t _ccmatmpy (long long src1, __x128_t src2); CCMATMPY

long long _ccmatmpyr1 (long long src1, __x128_t src2); CCMATMPYR1

long long _ccmpy32r1 (long long src1, long long src2); CCMPY32R1

__x128_t _cmatmpy (long long src1, __x128_t src2); CMATMPY

long long _cmatmpyr1 (long long src1, __x128_t src2); CMATMPYR1

long long _cmpy32r1 (long long src1, long long src2); CMPY32R1

__x128_t _cmpysp (__float2_t src1, __float2_t src2); CMPYSP

double _complex_conjugate_mpysp (double src1, double src2); CMPYSP / DSUBSP

double _complex_mpysp (double src1, double src2); CMPYSP / DADDSP

int _crot90 (int src); CROT90 rotates a complex number by 90 degrees

int _crot270 (int src); CROT270 rotates a complex number by 270 degrees

long long _dadd (long long src1, long long src2); DADD adds the 2 signed 32-bit values of src1 to the 2 signed 32-bit values of src2

long long _dadd2 (long long src1, long long src2); DADD2 4-way signed 16-bit addition

__float2_t _daddsp (__float2_t src1, __float2_t src2); DADDSP

long long _dadd_c (scst5 immediate src1, long long src2); DADD 2-way addition with the signed 5-bit constant src1

long long _dapys2 (long long src1, long long src2); DAPYS2

long long _davg2 (long long src1, long long src2); DAVG2 signed 16 bits

long long _davgnr2 (long long src1, long long src2); DAVGNR2 signed 16-bit, no round mode

long long _davgnru4 (long long src1, long long src2); DAVGNRU4 unsigned 8-bit, no round mode

long long _davgu4 (long long src1, long long src2); DAVGU4 unsigned 8-bit

long long _dccmpyr1 (long long src1, long long src2); DCCMPYR1

unsigned _dcmpeq2 (long long src1, long long src2); DCMPEQ2 16-bit comparison; each result bit is 1 if the pair is equal, 0 if not

unsigned _dcmpeq4 (long long src1, long long src2); DCMPEQ4 8-bit comparison; each result bit is 1 if the pair is equal, 0 if not

unsigned _dcmpgt2 (long long src1, long long src2); DCMPGT2 16-bit comparison; each result bit is 1 if src1 > src2, otherwise 0

unsigned _dcmpgtu4 (long long src1, long long src2); DCMPGTU4 8-bit comparison; each result bit is 1 if src1 > src2, otherwise 0

__x128_t _dccmpy (long long src1, long long src2); DCCMPY

__x128_t _dcmpy (long long src1, long long src2); DCMPY

long long _dcmpyr1 (long long src1, long long src2); DCMPYR1

long long _dcrot90 (long long src); DCROT90

long long _dcrot270 (long long src); DCROT270

long long _ddotp4h (__x128_t src1, __x128_t src2); DDOTP4H executes 2 dotp4h, both are signed

long long _ddotpsu4h (__x128_t src1, __x128_t src2 ); DDOTPSU4H executes 2 dotpsu4h, one signed and one unsigned

__float2_t _dinthsp (int src); DINTHSP converts the signed 16-bit values in src to single-precision floating point and places them in dst_e and dst_o

__float2_t _dinthspu (unsigned src); DINTHSPU converts the unsigned 16-bit values in src to single-precision floating point and places them in dst_e and dst_o

__float2_t _dintsp (long long src); DINTSP converts the signed 32-bit values in src to single-precision floating point and places them in dst_e and dst_o

__float2_t _dintspu (long long src); DINTSPU converts the unsigned 32-bit values in src to single-precision floating point and places them in dst_e and dst_o

long long _dmax2 (long long src1, long long src2); DMAX2 compares each pair of signed 16-bit values in src1 and src2 and puts the larger one in dst

long long _dmaxu4 (long long src1, long long src2); DMAXU4 compares each pair of unsigned 8-bit values in src1 and src2 and puts the larger one in dst

long long _dmin2 (long long src1, long long src2); DMIN2 compares each pair of signed 16-bit values in src1 and src2 and puts the smaller one in dst

long long _dminu4 (long long src1, long long src2); DMINU4 compares each pair of unsigned 8-bit values in src1 and src2 and puts the smaller one in dst

__x128_t _dmpy2 (long long src1, long long src2); DMPY2 multiplies the signed 16-bit values in src1 and src2 to get signed 32-bit products, which are put into a 128-bit register

__float2_t _dmpysp (__float2_t src1, __float2_t src2); DMPYSP

__x128_t _dmpysu4 (long long src1, long long src2); DMPYSU4 multiplies the signed 8-bit values in src1 by the unsigned 8-bit values in src2 to get signed 16-bit products

__x128_t _dmpyu2 (long long src1, long long src2); DMPYU2 multiplies unsigned 16-bit values to get 32-bit products, which are put into a 128-bit register

__x128_t _dmpyu4 (long long src1, long long src2); DMPYU4 multiplies unsigned 8-bit values to get unsigned 16-bit products

long long _dmvd (long long src1, unsigned src2); DMVD moves two registers into one register pair, performing the two moves in sequence. This is useful when handling many double words, because it reduces register pressure

int _dotp4h (long long src1, long long src2); DOTP4H computes the dot product of two vectors of 16-bit values

long long _dotp4hll (long long src1, long long src2); DOTP4H is the same as above, but returns a 64-bit result

int _dotpsu4h (long long src1, long long src2); DOTPSU4H treats src1 as signed 16-bit values and src2 as unsigned 16-bit values and returns a 32-bit result

long long _dotpsu4hll (long long src1, long long src2); DOTPSU4H treats src1 as signed 16-bit values and src2 as unsigned 16-bit values and returns a 64-bit result

long long _dpackh2 (long long src1, long long src2); DPACKH2

long long _dpackh4 (long long src1, long long src2); DPACKH4 executes 2 PACKH4 in parallel

long long _dpacklh2 (long long src1, long long src2); DPACKLH2

long long _dpacklh4 (unsigned src1, unsigned src2); DPACKLH4 executes PACKH4 and PACKL4 in parallel

long long _dpackl2 (long long src1, long long src2); DPACKL2

long long _dpackl4 (long long src1, long long src2); DPACKL4 executes 2 PACKL4 in parallel

long long _dsadd (long long src1, long long src2); DSADD adds two signed 32-bit numbers in src1 to two signed 32-bit numbers in src2, and the result is saturated

long long _dsadd2 (long long src1, long long src2); DSADD2 saturates each result to [-2^15, 2^15 - 1]

long long _dshl (long long src1, unsigned src2); DSHL shifts the two 32-bit values in the long long to the left, filling the vacated bits with 0 (signed 32-bit)

long long _dshl2 (long long src1, unsigned src2); DSHL2 shifts the four 16-bit values in the long long to the left, filling the vacated bits with 0 (signed 16-bit)

long long _dshr (long long src1, unsigned src2); DSHR shifts right, filling the vacated bits with the sign bit (signed 32-bit)

long long _dshr2 (long long src1, unsigned src2); DSHR2 shifts right, filling the vacated bits with the sign bit (signed 16-bit)

long long _dshru (long long src1, unsigned src2); DSHRU shifts right, filling the vacated bits with 0 (unsigned 32-bit)

long long _dshru2 (long long src1, unsigned src2); DSHRU2 shifts right, filling the vacated bits with 0 (unsigned 16-bit)

__x128_t _dsmpy2 (long long src1, long long src2); see figure for DSMPY2

long long _dspacku4 (long long src1, long long src2); DSPACKU4 performs 2 SPACKU4 operations in parallel

long long _dspint (__float2_t src); DSPINT converts the 2 single-precision numbers in src into 2 integers

unsigned _dspinth (__float2_t src); DSPINTH converts the two single-precision floating-point numbers in src_e and src_o to signed 16-bit integers

long long _dssub (long long src1, long long src2); DSSUB subtracts the two signed 32-bit values in src2 from the two signed 32-bit values in src1; each result is saturated to [-2^31, 2^31 - 1]

long long _dssub2 (long long src1, long long src2); DSSUB2 subtracts 4 pairs of signed 16-bit values; each result is saturated to [-2^15, 2^15 - 1]

long long _dsub (long long src1, long long src2); DSUB is not saturated

long long _dsub2 (long long src1, long long src2); DSUB2 is not saturated

__float2_t _dsubsp (__float2_t src1, __float2_t src2); DSUBSP 32-bit single-precision number subtraction

long long _dxpnd2 (unsigned src); DXPND2

long long _dxpnd4 (unsigned src); DXPND4

__float2_t _fdmvd_f2 (float src1, float src2); DMVD (see MVD)

int _land (int src1, int src2); LAND logical and

int _landn (int src1, int src2);LANDN

int _lor (int src1, int src2); LOR logical OR

void _mfence(); MFENCE stalls the instruction fetch pipeline until the busy flag of the memory system is cleared

double _mpysp2dp (float src1, float src2); MPYSP2DP multiplies two floats to get a double result

double _mpyspdp (float src1, double src2); MPYSPDP multiplies a float by a double to get a double

long long _mpyu2 (unsigned src1, unsigned src2); MPYU2 multiplies 2 unsigned 16-bit values by 2 unsigned 16-bit values to get 2 unsigned 32-bit products

__x128_t _qmpy32 (__x128_t src1, __x128_t src2); QMPY32 4-way signed 32-bit × signed 32-bit multiply; the lower 32 bits of each result are put into dst

__x128_t _qmpysp (__x128_t src1, __x128_t src2); QMPYSP

__x128_t _qsmpy32r1 (__x128_t src1, __x128_t src2); QSMPY32R1 4-way signed 32-bit × signed 32-bit multiply giving 32-bit results. Unlike QMPY32, the results are rounded and saturated

unsigned _shl2 (unsigned src1, unsigned src2); SHL2 shifts 2 signed 16-bit values to the left. The lower 4 bits of src2 give the shift amount. The results are also treated as signed 16-bit values

long long _unpkbu4 (unsigned src); UNPKBU4 expands unsigned 8-bit to unsigned 16-bit

long long _unpkh2 (unsigned src); UNPKH2 signed 16-bit sign extension

long long _unpkhu2 (unsigned src); UNPKHU2 unsigned 16 bits for 0 extension

long long _xorll_c (scst5 immediate src1, long long src2); XOR logic exclusive OR

Conclusion

This concludes the introduction to the DSP320C6000. Corrections are welcome if anything here is inaccurate.

Related reading recommendations: Optimized design of Viterbi decoding program based on TMS320C6000 series DSP

Related reading recommendations: tms320c6000 series dsp programming tools and guides
