

Page 1: Blackfin (ADSP-BFXXX) Reference PROGRAM FLOW …people.ucalgary.ca/.../BlackfinReferenceSheet_11by8_1Sept2010.pdf · Blackfin (ADSP-BFXXX) Reference V2.3.1 13th November 2009, smithmr@ucalgary.ca

Blackfin (ADSP-BFXXX) Reference V2.3.1, 13th November 2009, smithmr@ucalgary.ca

PROGRAMMING MODEL
R0 to R7    Data registers (R0, R1, R2, R3 volatile)
P0 to P5    Pointer registers (P0, P1, P2 volatile)
FP          Frame pointer
SP          Stack pointer
A0, A1      Accumulator registers
LC0, LC1    Loop counters

DSP REGISTERS - ALL VOLATILE
I0 to I3    Index registers (Ireg)
M0 to M3    Modify registers (Mreg)
B0 to B3    Base registers (Breg)
L0 to L3    Length registers (Lreg)
Breg holds the start of a circular buffer of length Lreg, accessed through index register Ireg with post-increment register Mreg.

NOTATION CONVENTION
imm         signed immediate
uimm        unsigned immediate
imm3        -4 to +3
uimm3       0 to 7
reg         Any register R0 to R7, P0 to P5
dreg        Any data register R0 to R7
Preg        Any pointer register P0 to P5
statbit     AZ, AN, AC0, AC1, V, VS, AV0, AV0S, AV1, AV1S, AQ
reg_lo      low part of register (R0.L)
reg_hi      high part of register (P0.H)

PARAMETER PASSING EXAMPLE
// Offsets relative to FP
#define INPAR4_ON_STACK          20   // NOT IN R3
#define INPAR3_SPACE_ON_STACK    16   // In R2
#define INPAR2_SPACE_ON_STACK    12   // In R1
#define INPAR1_SPACE_ON_STACK     8   // In R0
#define RETS_LOCATION_ON_STACK    4
#define OLD_FP_LOCATION_ON_STACK  0

// Offsets relative to SP
#define SAVED_P3                 20
#define SAVED_P4                 16
#define OUTPAR4_ON_STACK         12   // NOT IN R3
#define OUTPAR3_SPACE_ON_STACK    8   // In R2
#define OUTPAR2_SPACE_ON_STACK    4   // In R1
#define OUTPAR1_SPACE_ON_STACK    0   // In R0

    .extern _Somewhere;
    .extern _Subroutine;
    .section program;
    .global _Foo;                        // void Foo(INPAR1, INPAR2, INPAR3, INPAR4)
_Foo:
    LINK 24;                             // 16 bytes for outgoing parameters + 2 saved registers (8 bytes)
    [SP + SAVED_P4] = P4;                // Save non-volatile registers on the stack
    P4.L = lo(_Somewhere);               // Point to memory location _Somewhere
    P4.H = hi(_Somewhere);               // Reference resolved by linker since .extern
    [FP + INPAR1_SPACE_ON_STACK] = R0;   // Save for later
    [FP + INPAR3_SPACE_ON_STACK] = R2;   // Save for later
    R0 = [FP + INPAR4_ON_STACK];         // OUTPAR4 = INPAR4
    [SP + OUTPAR4_ON_STACK] = R0;
    R2 = -6 (X);                         // Sign extend: OUTPAR3 value = 0xFFFFFFFA
    // R1 = R1;                          // OUTPAR2 = INPAR2 (already in R1)
    R0 = -6 (Z);                         // Zero extend: OUTPAR1 value = 0x0000FFFA
    CALL _Subroutine;                    // Subroutine(0x0000FFFA, INPAR2, 0xFFFFFFFA, INPAR4)
    W[P4] = R0;                          // Store return value as 16-bit
    P4.L = lo(FIO_FLAG_D);
    P4.H = hi(FIO_FLAG_D);               // Constant from <defsBF533.h> requires hi/lo macros
    P4 = [SP + SAVED_P4];                // Restore the saved register
    UNLINK;
    RTS;

Also see (alternative return sequence):
    P0 = [FP + 4];                       // Get RETS
    UNLINK;
    JUMP (P0);                           // Faster?

PROGRAM FLOW INSTRUCTIONS
JUMP User_Label               PC replaced by address of User_Label
JUMP (Preg)                   PC replaced by value in P-register
IF CC JUMP User_Label         if CC = 1, PC replaced by address of User_Label
IF !CC JUMP User_Label        if CC = 0, PC replaced by address of User_Label
IF CC JUMP User_Label (bp)
IF !CC JUMP User_Label (bp)   versions where the branch is predicted to be taken.
                              Correctly predicting branches improves pipeline performance.
CALL User_Label               PC replaced by address of User_Label; address of next instruction -> RETS
CALL (Preg)                   PC replaced by value in P-register; address of next instruction -> RETS
RTS    return from subroutine (RETS)
RTI    return from interrupt (RETI)
RTX    return from exception (RETX)
RTN    return from NMI (RETN)
RTE    return from emulation (RETE)
The return register used is shown in brackets.

Loop loop_name loopcounter;
Loop_begin loop_name;         placed before the 1st instruction
Loop_end loop_name;           placed after the last instruction
Lsetup (Label_1st_instruction, Label_last) loopcounter;
Can use Loopcounter, Loopcounter = Preg or Loopcounter = Preg >> 1
LTn, LBn, LCn (Loop_Top, Loop_Bottom, Loop_Counter) can be set directly

LOAD / STORE INSTRUCTIONS
reg_lo = uimm16;  reg_hi = uimm16;     half-word loads
reg = uimm16 (Z);                      zero-extended to 32 bits
reg = imm16 (X);                       sign-extended to 32 bits (also imm7 version)

Loading 32-bit values
reg.L = uimm32 & 0xFFFF;  reg.H = (uimm32 >> 16) & 0xFFFF;
BUT  .IMPORT value;  reg.L = value;  reg.H = value;   (the correct half-word is used)

Preg = [indirect_address];  [indirect_address] = Preg;
  where indirect_address is Preg, Preg++, Preg--, Preg + offset, Preg - offset, FP - offset
  Offsets are multiples of 4

Dreg = [indirect_address];  [indirect_address] = Dreg;
  where indirect_address is Preg, Preg++, Preg--, Preg + small / large offset, Preg - large offset,
  FP - offset, Preg ++ Preg, Ireg, Ireg++, Ireg--, Ireg ++ Mreg

Dreg = W[indirect_address] (Z);        zero-extend half-word fetch
Dreg = W[indirect_address] (X);        sign-extend half-word fetch
Dreg = B[indirect_address] (Z);  Dreg = B[indirect_address] (X);
  where indirect_address is Preg, Preg++, Preg--, Preg + offset, Preg - offset
  W access only: Preg ++ Preg; offsets are multiples of 2

Dreg_lo = W[indirect_address];  Dreg_hi = W[indirect_address];
W[indirect_address] = Dreg_lo;  W[indirect_address] = Dreg_hi;
  where indirect_address is Ireg, Ireg++, Ireg--, Preg, Preg ++ Preg
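The half-word split and the (X) / (Z) extensions described above can be modelled on a host machine. The following C++ is only a sketch: the helper names lo16, hi16, extend_x and extend_z are hypothetical and are not part of any Blackfin tool chain.

```cpp
#include <cstdint>

// Hypothetical helpers: the two 16-bit halves of a 32-bit constant,
// mirroring  reg.L = uimm32 & 0xFFFF;  reg.H = (uimm32 >> 16) & 0xFFFF;
constexpr std::uint16_t lo16(std::uint32_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}
constexpr std::uint16_t hi16(std::uint32_t value) {
    return static_cast<std::uint16_t>((value >> 16) & 0xFFFFu);
}

// Model of "reg = imm16 (X)": copy bit 15 into bits 16..31.
constexpr std::uint32_t extend_x(std::uint16_t half) {
    return (half & 0x8000u) ? (0xFFFF0000u | half) : std::uint32_t{half};
}

// Model of "reg = uimm16 (Z)": bits 16..31 become zero.
constexpr std::uint32_t extend_z(std::uint16_t half) {
    return std::uint32_t{half};
}

// -6 as a 16-bit pattern is 0xFFFA.
static_assert(extend_x(0xFFFAu) == 0xFFFFFFFAu, "sign extension");
static_assert(extend_z(0xFFFAu) == 0x0000FFFAu, "zero extension");
```

Recombining (hi16(v) << 16) | lo16(v) gives back the original 32-bit value, which is why the two half-word loads together form a full 32-bit load.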

COMPARE INSTRUCTIONS
CC = Operand_1 == Operand_2;
CC = Operand_1 <= Operand_2;         signed compare
CC = Operand_1 <= Operand_2 (UI);    unsigned compare
CC = Operand_1 <  Operand_2;         signed compare
CC = Operand_1 <  Operand_2 (UI);    unsigned compare

Compare Data Registers -- not parallel (16-bit)
  Operand_1   Dreg
  Operand_2   Dreg or small constant, where small constant is imm3 or uimm3

COMPARE (CONTINUED)
Compare Pointer Registers -- not parallel (16-bit)
  Operand_1   Preg
  Operand_2   Preg or small constant, where small constant is imm3 or uimm3
Compare Accumulator Registers -- not parallel (16-bit)
  Operand_1   A0
  Operand_2   A1
  Always signed compares
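The difference between the signed and the (UI) compares only shows up when the top bit of an operand is set. A minimal host-side sketch in C++ follows; the function names cc_less_signed and cc_less_unsigned are made up for illustration.

```cpp
#include <cstdint>

// Model of  CC = Operand_1 < Operand_2;       (signed compare)
constexpr bool cc_less_signed(std::int32_t a, std::int32_t b) {
    return a < b;
}

// Model of  CC = Operand_1 < Operand_2 (UI);  (unsigned compare)
// The same 32 bits are reinterpreted as unsigned values.
constexpr bool cc_less_unsigned(std::int32_t a, std::int32_t b) {
    return static_cast<std::uint32_t>(a) < static_cast<std::uint32_t>(b);
}

// -6 has the bit pattern 0xFFFFFFFA: small when signed, huge when unsigned.
static_assert(cc_less_signed(-6, 1), "signed: -6 < 1");
static_assert(!cc_less_unsigned(-6, 1), "unsigned: 0xFFFFFFFA is not < 1");
```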

MOVE CC INSTRUCTIONS
Dest OP CC;        Dest is Dreg or statbit
CC OP Source;      Source is Dreg or statbit
OP is =, |=, &=, ^=        e.g. R0 |= CC;
Note: for CC = Dreg, CC = 1 if Dreg != 0

NEGATE CC INSTRUCTIONS
CC = !CC;

MOVE INSTRUCTIONS
genreg = genreg;   genreg = dagreg;   dagreg = genreg;   dagreg = dagreg;
genreg = USP;      USP = genreg;
Dreg = sysreg;     /* sysreg to 32-bit D-register */
sysreg = Dreg;     /* 32-bit D-register to sysreg */
sysreg = Preg;     /* 32-bit P-register to sysreg */
sysreg = USP;
A0 = A1;           /* move 40-bit Accumulator value */
A1 = A0;           /* move 40-bit Accumulator value */
A0 = Dreg;         /* 32-bit D-register to 40-bit A0, sign extended */
A1 = Dreg;         /* 32-bit D-register to 40-bit A1, sign extended */

Accumulator to D-register Move:
Dreg_even = A0 (opt_mode);                   /* move 32-bit A0.W to even Dreg */
Dreg_odd = A1 (opt_mode);                    /* move 32-bit A1.W to odd Dreg */
Dreg_even = A0, Dreg_odd = A1 (opt_mode);    /* move both Accumulators to a register pair */
Dreg_odd = A1, Dreg_even = A0 (opt_mode);    /* move both Accumulators to a register pair */

IF CC DPreg = DPreg;     /* move if CC = 1 */   DPreg is Dreg, Preg, SP, FP
IF !CC DPreg = DPreg;    /* move if CC = 0 */   DPreg is Dreg, Preg, SP, FP

Dreg = Dreg_lo (Z);   Dreg = Dreg_lo (X);
Dreg = Dreg.B (Z);    Dreg = Dreg.B (X);     lowest 8 bits

Acc.X = Dreg_lo;      Least significant 8 bits moved
Dreg_lo = Acc.X;      8 bits moved, sign extended
Acc.L = Dreg_lo;      Least significant 16 bits moved
Dreg_lo = Acc.L;      16 bits moved
Acc.H = Dreg_hi;      Most significant 16 bits moved
Dreg_hi = Acc.H;      16 bits moved

Accumulator to Half D-register Move supports the following options:
  Signed fraction format (default)
  Unsigned fraction format (saturated) (FU)
  Signed and unsigned integer formats (IS) (IU)
  Signed fraction with truncation (T)
  Signed fraction with scaling and rounding (S2RND)
  Signed integer with scaling (ISS2)
  Signed integer with high word extract (IH)
MORE INFO TO BE ADDED


STACK INSTRUCTIONS -- SP points to the last used location
[--SP] = allreg;      allreg = [SP++];
[--SP] = (R7 : Dreglimit, P5 : Preglimit);    or Dreg and Preg ranges on their own
LINK uimm;    (Manual says minimum value is 8, but LINK 0 and LINK 4 seem OK)
              Saves RETS and FP on the stack, copies SP into FP and then decrements SP by uimm
UNLINK;       Copies FP into SP, then FP = Mem[SP++], RETS = Mem[SP++]
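The LINK / UNLINK sequence described above can be traced with a small host-side model. This C++ sketch is an illustration only: StackModel and its members are hypothetical names, memory is a plain array of 32-bit words, and the stack is assumed to grow towards lower addresses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of the stack behaviour of LINK and UNLINK.
struct StackModel {
    std::vector<std::uint32_t> mem;  // word memory, index = byte address / 4
    std::uint32_t sp;                // stack pointer (byte address)
    std::uint32_t fp;                // frame pointer (byte address)
    std::uint32_t rets;              // subroutine return address register

    explicit StackModel(std::uint32_t words)
        : mem(words, 0), sp(words * 4), fp(0), rets(0) {}

    // [--SP] = value;
    void push(std::uint32_t value) { sp -= 4; mem[sp / 4] = value; }

    // value = [SP++];
    std::uint32_t pop() { std::uint32_t value = mem[sp / 4]; sp += 4; return value; }

    // LINK frame_bytes: save RETS and FP, copy SP into FP, then decrement SP.
    void link(std::uint32_t frame_bytes) {
        push(rets);
        push(fp);
        fp = sp;
        sp -= frame_bytes;
    }

    // UNLINK: SP = FP, then FP = Mem[SP++], RETS = Mem[SP++].
    void unlink() {
        sp = fp;
        fp = pop();
        rets = pop();
    }
};
```

After link(24) the old FP sits at FP + 0 and RETS at FP + 4, which matches OLD_FP_LOCATION_ON_STACK and RETS_LOCATION_ON_STACK in the parameter passing example.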

LOGICAL INSTRUCTIONS
Dreg = Dreg1 LOGICAL_OP Dreg2;     LOGICAL_OP is &, |, ^
Dreg = ~Dreg1;                     complement
Also BXOR and BXORSHIFT -- more later

BIT INSTRUCTIONS
BitInstruction (Dreg, bit_position);    where bit_position is 0 to 31
BitInstruction is BITCLR (clear), BITSET (set), BITTGL (toggle)
CC = BITTST (Dreg, bit_position);       bit test
CC = !BITTST (Dreg, bit_position);      inverted bit test
R0 = R1.B (X);  R0 = R1.B (Z);          // Extract and sign-extend or zero-extend a byte value
                                        // CAN'T DO MATH ON A BYTE VALUE DIRECTLY
Dreg = DEPOSIT (backgroundDreg, foregroundDreg);
Dreg = DEPOSIT (Dreg, Dreg) (X);        /* sign-extended */

ARITHMETIC INSTRUCTIONS
dest_reg = ABS src_reg;
dest_reg = src_reg_1 + src_reg_2;
NOTE: dest_reg.LorH = src_reg1.LorH + src_reg2.LorH (mode);    mode = (NS) or (S)
      // Arithmetic is non-saturating (NS) or saturating (S) (normal math is NS)
NOTE: dest_reg = src_reg_1 +|- src_reg_2;    the H-half and L-half operations are both done
      // Can also do +|+, +|-, -|+, -|-
Dreg_lo_hi = Dreg + Dreg (RND20);    STEP 1: Downshift by 4
Dreg_lo_hi = Dreg - Dreg (RND20);    STEP 2: Perform operation
                                     STEP 3: Round and use top 16 bits (fractional number)
Dreg_lo_hi = Dreg + Dreg (RND12);    STEP 1: Upshift by 4
Dreg_lo_hi = Dreg - Dreg (RND12);    STEP 2: Perform operation
                                     STEP 3: Round and use top 16 bits
Dreg = MAX (Dreg, Dreg);    Dreg = MIN (Dreg, Dreg);
Preg -= Preg;               Ireg -= Mreg;
Preg += Preg (BREV);        Ireg += Mreg (opt_brev);
dest_reg = src_reg_0 * src_reg_1 (opt_mode);    (16-bit multiply)
Dreg *= Dreg;                                   (32-bit multiply)
accumulator = src_reg_0 * src_reg_1 (opt_mode);
accumulator += src_reg_0 * src_reg_1 (opt_mode);
accumulator -= src_reg_0 * src_reg_1 (opt_mode);
dest_reg_half = (accumulator = src_reg_0 * src_reg_1) (opt_mode);
dest_reg_half = (accumulator += src_reg_0 * src_reg_1) (opt_mode);
dest_reg_half = (accumulator -= src_reg_0 * src_reg_1) (opt_mode);
dest_reg = (accumulator = src_reg_0 * src_reg_1) (opt_mode);
dest_reg = (accumulator += src_reg_0 * src_reg_1) (opt_mode);
dest_reg = (accumulator -= src_reg_0 * src_reg_1) (opt_mode);
dest_reg = - src_reg;
dest_accumulator = - src_accumulator;
dest_reg = src_reg (RND);           (32-bit to 16-bit round and saturate)
accumulator = accumulator (S);
dest_reg = SIGNBITS sample_register;
dest_reg = src_reg_1 - src_reg_2;
Ireg -= 2;    Ireg -= 4;

ROTATE
dest_reg = ROT src_reg BY rotate_magnitude;
accumulator_new = ROT accumulator_old BY rotate_magnitude;
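To make the (NS) and (S) modes of the 16-bit half-register add concrete, here is a host-side C++ sketch. The function names add16_ns and add16_s are hypothetical; the code models only the numeric result, not the status flags.

```cpp
#include <cstdint>

// Model of  dest.L = src1.L + src2.L (NS);  the result wraps around modulo 2^16.
inline std::int16_t add16_ns(std::int16_t a, std::int16_t b) {
    std::int32_t sum = std::int32_t{a} + std::int32_t{b};
    if (sum > 32767)  sum -= 65536;   // wrapped past the positive limit
    if (sum < -32768) sum += 65536;   // wrapped past the negative limit
    return static_cast<std::int16_t>(sum);
}

// Model of  dest.L = src1.L + src2.L (S);  the result is clamped to the 16-bit range.
inline std::int16_t add16_s(std::int16_t a, std::int16_t b) {
    std::int32_t sum = std::int32_t{a} + std::int32_t{b};
    if (sum > 32767)  sum = 32767;    // saturate at 0x7FFF
    if (sum < -32768) sum = -32768;   // saturate at 0x8000
    return static_cast<std::int16_t>(sum);
}
```

For example, 0x7FFF + 1 wraps to 0x8000 (the most negative value) with (NS) but stays at 0x7FFF with (S), which is usually what a signal-processing loop wants.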

SHIFT / ROTATE INSTRUCTIONS
dest_pntr = (dest_pntr + src_reg) << 1;    Down shift not allowed
dest_pntr = (dest_pntr + src_reg) << 2;
dest_reg = (dest_reg + src_reg) << 1;
dest_reg = (dest_reg + src_reg) << 2;
dest_pntr = adder_pntr + (src_pntr << 1);
dest_pntr = adder_pntr + (src_pntr << 2);

ARITHMETIC SHIFT: ASHIFT or >>>
dest_reg >>>= shift_magnitude;
dest_reg = src_reg >>> shift_magnitude (opt_sat);
dest_reg = src_reg << shift_magnitude (S);
accumulator = accumulator >>> shift_magnitude;
dest_reg = ASHIFT src_reg BY shift_magnitude (opt_sat);
accumulator = ASHIFT accumulator BY shift_magnitude;
NOTE: when the shift magnitude is held in a register, ASHIFT has to use register.L, e.g. R0 = ASHIFT R1 BY R2.L;

LOGICAL SHIFT: LSHIFT or >>
dest_pntr = src_pntr >> 1;    dest_pntr = src_pntr << 1;
dest_pntr = src_pntr >> 2;    dest_pntr = src_pntr << 2;
dest_reg >>= shift_magnitude;    dest_reg <<= shift_magnitude;
dest_reg = src_reg >> shift_magnitude;
dest_reg = src_reg << shift_magnitude;
dest_reg = LSHIFT src_reg BY shift_magnitude;

PARALLEL OPERATION EXAMPLES
32-bit ALU/MAC instruction || 16-bit instruction || 16-bit instruction ;
saa (r1:0, r3:2) || r0=[i0++] || r2=[i1++] ;
mnop || r1 = [i0++] || r3 = [i1++] ;
r7.h=r7.l=sign(r2.h)*r3.h + sign(r2.l)*r3.l || i0+=m3 || r0=[i0] ;
NOTE: If there are two parallel memory operations, only one can involve a Preg
NOTE: If there are two parallel memory operations, only one can be a write

EXTERNAL EVENT MANAGEMENT
NOP             16-bit NOP
MNOP            32-bit NOP, e.g. MNOP || NOP || NOP ;
IDLE;
CSYNC;          core sync
SSYNC;          system sync
CLI Dreg;       clear (disable) interrupts and save the old interrupt state to Dreg
STI Dreg;       set interrupts from Dreg
RAISE uimm4;    force interrupt (effectively a software interrupt of any interrupt)
EXCPT uimm4;    force exception (effectively a software interrupt of any exception)
TESTSET (Preg); The Test and Set Byte (Atomic) instruction loads an indirectly addressed memory byte, tests whether it is zero, then sets the most significant bit of the memory byte without affecting any other bits. If the byte is originally zero, the instruction sets the CC bit. If the byte is originally nonzero, the instruction clears the CC bit. The memory transaction is atomic, meaning it cannot be broken up by an interrupt, unlike the sequence: read memory into R0, test R0, if zero then set R0 = 1, store R0 back to memory.

VIDEO PIXEL INSTRUCTIONS
ALIGN8, ALIGN16, ALIGN24, DISALGNEXCPT, BYTEOP3P (Dual 16-Bit Add / Clip), Dual 16-Bit Accumulator Extraction with Addition, BYTEOP16P (Quad 8-Bit Add), BYTEOP1P (Quad 8-Bit Average - Byte), BYTEOP2P (Quad 8-Bit Average - Half-Word), BYTEPACK (Quad 8-Bit Pack), BYTEOP16M (Quad 8-Bit Subtract), SAA (Quad 8-Bit Subtract-Absolute-Accumulate), BYTEUNPACK (Quad 8-Bit Unpack)
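The test-and-set behaviour described for TESTSET can be imitated on a host with a standard C++ atomic. This is only an analogy under stated assumptions: testset is a hypothetical helper, and std::atomic stands in for the hardware guarantee that the read-test-write cannot be split.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical host-side model of TESTSET (Preg).
// Returns the value CC would take: true if the byte was zero before the operation.
inline bool testset(std::atomic<std::uint8_t>& byte) {
    // fetch_or sets the most significant bit in one indivisible step
    // and hands back the value the byte held beforehand.
    const std::uint8_t old_value = byte.fetch_or(0x80);
    return old_value == 0;
}
```

A typical use is a simple lock: the caller that sees true has acquired the lock, every later caller sees false until the owner writes zero back to the byte.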

VECTOR INSTRUCTIONS (basically two 16-bit operations)
Add on Sign, VIT_MAX (Compare-Select), Vector Arithmetic Shift, Vector Logical Shift, Vector MIN, Vector Multiply, Vector Multiply and Multiply-Accumulate, Vector Negate (Two's Complement), Vector PACK, Vector SEARCH
Example Vector Add / Subtract    dest = src_reg_0 +|+ src_reg_1;
Example Vector MAX               dest_reg = MAX (src_reg_0, src_reg_1) (V);
Example Vector ABS               dest_reg = ABS source_reg (V);

TEST DRIVEN DEVELOPMENT SYNTAX EXAMPLES
// This function is being developed
bool WaitForAWhileASM(short int time_wanted);
// Assume this function is already written.
// It returns the timing accuracy 100 * (2 * time1 - time2) / time2
long int CalculateTwiceAccuracy(short int time1, short int time2);

TEST_CONTROLT(TEST_GROUP_NAME);

TEST(Q2_TESTS, DEVELOPER_TEST) {
    // If the parameter time is less than 0, return false (because it can't be done)
    CHECK( WaitForAWhileASM(-1) == false );
    CHECK( WaitForAWhileASM(1000) == true );

    // When you make the parameter time bigger, the subroutine takes longer to return.
    // Needs to be checked for both a long and a short time:
    // 2 * time for WaitForAWhileASM(X) = time for WaitForAWhileASM(2X)
    // The accuracy of timing should be better than 2%, or 2 parts in 100.
    unsigned long int time1, time2;

    time1 = MEASURE_EXECUTION_TIME(WaitForAWhileASM(400));
    time2 = MEASURE_EXECUTION_TIME(WaitForAWhileASM(800));
    CHECK( CalculateTwiceAccuracy(time1, time2) < 2 );    // 1 is 1%, 2 is 2%

    time1 = MEASURE_EXECUTION_TIME(WaitForAWhileASM(1));
    time2 = MEASURE_EXECUTION_TIME(WaitForAWhileASM(2));
    CHECK( CalculateTwiceAccuracy(time1, time2) < 2 );    // 1 is 1%, 2 is 2%
}

TEST(Q4_TEST, DEVELOPER_TEST) {
    StopCoreTimer( );                             // Stop the timer
    InitializeCoreTimer(0x200000, 0x200000, 1);   // Set some sensible values into the core timer registers
    StartCoreTimer(3);                            // Start the timer
    long int time1 = ReadCoreTimerAndResetASM(0x2000, 0x2000);

    // Assert statement #1: check that the core timer value has got smaller since it was started
    CHECK(time1 < 0x200000);

    // Assert statement #2: from the code we expect 2 writes and 1 read (in total) to occur
    // when we run the ReadCoreTimerAndResetASM( ) function
    WatchDataClass<unsigned long int> coretimer_access(2,
        (unsigned long int *) pTCOUNT, (unsigned long int *) pTPERIOD);
    // NOTE that the 2 in this line means we are specifying 2 memory locations for the
    // WatchDataClass to keep track of.
    // NOTE that the WatchDataClass and WATCH_MEMORY_RANGE keep track of ALL
    // memory read and write operations that occur in the memory locations between the
    // addresses for TCOUNT and TPERIOD
    WATCH_MEMORY_RANGE(coretimer_access, ReadCoreTimerAndResetASM(0x3000, 0x3000));
    CHECK(coretimer_access.getReadsWrites() == 3);

    // Assert statement #3: get the final value of the core-timer registers
    // getFinalValue(0) would get the final value of the TCOUNT register
    // getFinalValue(1) would get the final value of the TPERIOD register
    // It is the order in the WatchDataClass line that counts, not the parameter order in
    // ReadCoreTimerAndResetASM( )
    CHECK(coretimer_access.getFinalValue(1) == 0x3000);
}

TEST_FILE_RUN_NOTIDICATION(TEST_GROUP_NAME);
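The accuracy formula quoted for CalculateTwiceAccuracy, 100 * (2 * time1 - time2) / time2, is easy to check on a host. The sketch below is not the course's implementation: the name CalculateTwiceAccuracySketch and the use of long int parameters are assumptions made here for illustration.

```cpp
// Hypothetical sketch of the accuracy calculation.
// Returns the percentage by which 2 * time1 differs from time2, using integer
// division (truncation towards zero). time2 must be non-zero.
inline long int CalculateTwiceAccuracySketch(long int time1, long int time2) {
    return 100 * (2 * time1 - time2) / time2;
}
```

With time1 = 400 and time2 = 800 the result is 0 (perfect doubling); with time1 = 404 and time2 = 800 it is 1, meaning 1%. Because the formula keeps its sign, a negative result also passes a "< 2" check, so a stricter test may want to compare the absolute value.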


E-Unit functions and constants
Header information
    #define EMBEDDEDBEDUNIT_LITE
    #include <EmbeddedUnit/EmbeddedUnit.h>
Basic EmbeddedUnit
    TEST_CONTROLT(TEST_GROUP_NAME);
    TEST(TEST_NAME) ;
    TEST_FILE_RUN_NOTIDICATION(TEST_GROUP_NAME);
    CHECK(expression)                     e.g. CHECK(a == b);
    XF_CHECK(expression)
    CHECK_EQUAL(expected, actual)
    XF_CHECK_EQUAL(expected, actual)
    CHECK_CLOSE(expected, actual, tolerance)
    CHECK_ARRAY_EQUAL(expected, actual, count)
    CHECK_ARRAY_CLOSE(expected, actual, count, tolerance)
    CHECK_ARRAY2D_CLOSE(expected, actual, rows, columns, tolerance)
    CHECK_THROW(expression, ExpectedExceptionType)
    CHECK_ASSERT(expression)
    REPORT(msg)
    TIME_CONSTRAINT(ms), TIME_CONSTRAINT_US
    TIME_CONSTRAINT_EXEMPT()
    MEASURE_EXECUTION_TIME(time)
    HARD_TIME_CONSTRAINT_TRY(ms), HARD_TIME_CONSTRAINT_CATCH()
    HARD_TIME_CONSTRAINT_END()
    MEMORY_CONSTRAINT(maxChange)
    MEMORY_CONSTRAINT_EXEMPT( )
    void CodeCoverageStartLogging(int loopCompress = 2)
    void CodeCoverageStopLogging(void)

Lab. 1 File names expected
    InitFlashASM.asm               void InitFlashASM(void)
    InitFlashPortASM.asm           void InitFlashPortASM(void)
    InitializiGPIOFlagsASM.asm     void Initialize_GPIOFlagsASM(void)
    ReadFlashLEDASM.asm            unsigned short ReadFlashLEDASM(void)
    ReadGPIOFlagsASM.asm           unsigned short ReadGPIOFlagsASM(void)
    WriteFlashLEDASM.asm           void WriteFlashLEDASM(unsigned short)

Lab. 2 Time Triggered Scheduler functions
void TTCOS_Dispatch_Tasks(void) -- This is the 'dispatcher' function. When a task (function) is due to run, TTCOS_Dispatch_Tasks() will run it. This function must be called (repeatedly) from the main loop.
unsigned char TTCOS_Add_Task(Do_X, delay, period) -- Causes a task (function) to be executed at regular intervals after a fixed delay.
unsigned char TTCOS_Delete_Task(const unsigned char TASK_INDEX) -- Removes a task from the scheduler. Note that this does *not* delete the associated function from memory: it simply means that it is no longer called by the scheduler.
void TTCOS_Init_Scheduler(void) -- Scheduler initialisation function. Prepares scheduler data structures and sets up timer interrupts at the required rate. You must call this function before using the scheduler.
void TTCOS_Start_Scheduler(void) -- Starts the scheduler by enabling interrupts. NOTE: Usually called after all regular tasks are added, to keep the tasks synchronised. NOTE: ONLY THE SCHEDULER INTERRUPT SHOULD BE ENABLED!!!
void TTCOS_Update(void) -- This is the scheduler ISR callback. It is called at a rate determined by the timer settings in TTCOS_Init_Scheduler().
void TTCOS_EnterLowPowerMode(void) -- Causes the processor to enter low power mode.

The following functions are processor dependent and need to be built during Lab 2:
    void SetUpTimerInterrupt(void);
    void StartTimerInterrupts(void);
    void StopTimerInterrupts(void);
    void EnterLowPowerMode(void);

Lab. 3 Additional Time Triggered Scheduler functions
unsigned char TTCOS_Identify_TaskID(Do_X) -- This returns the Task ID of the task Do_X. This is very useful, as you can then do
    unsigned char taskID = TTCOS_Identify_TaskID(Do_X);
    TTCOS_Delete_Task(taskID);
to delete (stop) another task.
I have added the following functions:
unsigned char TTCOS_Stop_Task(Do_X) -- stops another task Do_X.
unsigned char TTCOS_SelfStop_Task(Do_X) -- enables the task Do_X to stop itself from running.
e.g.
    void Task_Foo(void) {
        // Stop task FlashLED5
        TTCOS_Stop_Task(FlashLED5);
        // Restart it running at another flash rate
        TTCOS_Add_Task(FlashLED5, NO_DELAY, 10);
        // Stop this task Task_Foo from running again
        TTCOS_SelfStop_Task(Task_Foo);
    }


PROGRAMMABLE FLAGS (PF) REGISTERS

NOTE: FIO_FLAG_D BITS ARE SET DURING EDGE-TRIGGERED INTERRUPTS AND MUST BE CLEARED INSIDE THE INTERRUPT SERVICE ROUTINE

WATCH-DOG TIMER

WATCH-DOG STATUS REGISTER - WDOG_STAT
The 32-bit Watchdog Status register (WDOG_STAT) contains the current count value, and reads of WDOG_STAT return that value. When the watchdog is enabled, WDOG_STAT decrements by 1 on each SCLK. When it reaches 0, the timer stops counting and the event selected in WDOG_CTL is generated. While the timer is disabled, writing WDOG_CNT pre-loads WDOG_STAT. While the timer is enabled, a write to the WDOG_STAT register reloads it with the value in WDOG_CNT.
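The interaction between WDOG_CNT and WDOG_STAT can be followed with a small host-side model. This C++ sketch is a simplification under stated assumptions: WatchdogModel and its member names are hypothetical, one call to tick() stands for one SCLK, and a write to WDOG_CNT while the timer is enabled is simply ignored because the text above does not describe that case.

```cpp
#include <cstdint>

// Hypothetical model of the watchdog count registers.
struct WatchdogModel {
    std::uint32_t cnt = 0;     // models WDOG_CNT (reload value)
    std::uint32_t stat = 0;    // models WDOG_STAT (current count)
    bool enabled = false;
    bool expired = false;      // set when the selected event would be generated

    // Timer disabled: writing WDOG_CNT also pre-loads WDOG_STAT.
    // Timer enabled: ignored in this sketch (assumption).
    void write_cnt(std::uint32_t value) {
        if (!enabled) { cnt = value; stat = value; }
    }

    // Timer enabled: any write to WDOG_STAT reloads it from WDOG_CNT.
    void write_stat() {
        if (enabled) { stat = cnt; }
    }

    void enable() { enabled = true; }

    // One SCLK: decrement, and stop counting once zero is reached.
    void tick() {
        if (!enabled || expired) { return; }
        if (stat > 0) { --stat; }
        if (stat == 0) { expired = true; }
    }
};
```

In an application the main loop would call the equivalent of write_stat() regularly, so that the count never reaches zero while the software is healthy.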

PROGRAMMABLE FLAGS (PF) REGISTERS

THERE IS A FLAG TOGGLE REGISTER (FIO_FLAG_T) (WRITE 1 TO TOGGLE) AS WELL AS FLAG SET AND FLAG CLEAR

TO ENABLE A PF INTERRUPT, SET THE CORRESPONDING FLAG MASK BIT. TO FORCE AN INTERRUPT (AFTER ENABLE), SET THE CORRESPONDING INTERRUPT SET BIT.

NOTE: THE FOLLOWING INTERRUPT REGISTERS HAVE A SIMILAR FORMAT: FIO_MASKA_C (CLEAR - W1C), FIO_MASKA_T (TOGGLE - W1T)
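The write-1-to-set, write-1-to-clear (W1C) and write-1-to-toggle (W1T) registers mentioned above all act on one underlying value. The C++ sketch below models that idea on a host; FlagRegisterModel and its method names are hypothetical, and a 16-bit register width is assumed.

```cpp
#include <cstdint>

// Hypothetical model of a flag value with set / clear / toggle write ports.
// In each write, bits that are 0 in the mask leave the stored value unchanged.
struct FlagRegisterModel {
    std::uint16_t value = 0;

    // Write 1 to set: bits that are 1 in mask become 1.
    void write_set(std::uint16_t mask) { value |= mask; }

    // Write 1 to clear (W1C): bits that are 1 in mask become 0.
    void write_clear(std::uint16_t mask) { value &= static_cast<std::uint16_t>(~mask); }

    // Write 1 to toggle (W1T): bits that are 1 in mask are inverted.
    void write_toggle(std::uint16_t mask) { value ^= mask; }
};
```

The benefit of this register style is that one store changes selected bits without a read-modify-write sequence, so an interrupt arriving between the read and the write cannot corrupt the other bits.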



INTERRUPT CONTROL

IPEND HAS THE SAME FORMAT AS ILAT BUT IS READ ONLY

SIC_ISR (SYSTEM INTERRUPT STATUS REGISTER) HAS THE SAME FORMAT EXCEPT THAT 0 MEANS DEASSERTED (NO INTERRUPT), 1 MEANS ASSERTED - ALL BITS READ ONLY

INTERRUPT CONTROL

EXAMPLE 68K CODE Conditional operations

SUBROUTINE AND PARAMETER PASSING

MOTOROLA BRANCHES AFTER CMP D0, D1
    BEQ (D0 == D1)
    BNE (D0 != D1)
    BGE (D1 >= D0)
    BGT (D1 > D0)
    BLE (D1 <= D0)
    BHS (D1 >= D0 UI)
    BHI (D1 > D0 UI)
    BLS (D1 <= D0 UI)
    BLO (D1 < D0 UI)


CORE TIMER

GENERAL PURPOSE

TIMER0, TIMER1, TIMER2: all three GP timers have equivalent registers

e.g. TIMERx_WIDTH MEANS REPLACE x BY 0, 1 OR 2: TIMER0_WIDTH, TIMER1_WIDTH, TIMER2_WIDTH

TIMERx_COUNTER, x = 0, 1, 2
TIMERx_PERIOD, x = 0, 1, 2
TIMERx_WIDTH, x = 0, 1, 2
All have the following format

Relationship between PERIOD, WIDTH and COUNT

GENERAL PURPOSE TIMER0, TIMER1, TIMER2

DYNAMIC POWER MANAGEMENT

Also Core Timer TPERIOD

Also TIMERx WIDTH


SPI INTERFACE

THERE IS ALSO READ ONLY SPI_RDBR (RECEIVE) AND SPI_SHADOW (COPY OF SPI_RDBR FOR TEST)

SPI HARDWARE

C++ STACK FRAME

Build the stack using LINK X;

where X = number of bytes of outgoing arguments (16)
        + 4 * number of registers saved
        + 4 * number of long local variables
        + 2 * number of short local variables
        + 1 * number of char local variables
and X is rounded UP to the nearest multiple of 4.

NOTE: only the space for outgoing arguments has a special location on the stack. The order of local variables and save space is the choice of the programmer.
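The LINK size rule above is simple arithmetic, so it can be checked with a few lines of host-side C++. The function names round_up_to_4 and link_size are hypothetical and exist only for this sketch.

```cpp
#include <cstdint>

// Round a byte count up to the next multiple of 4.
constexpr std::uint32_t round_up_to_4(std::uint32_t bytes) {
    return (bytes + 3u) & ~3u;
}

// X for "LINK X": 16 bytes of outgoing-argument space, 4 bytes per saved
// register and per long local, 2 per short local, 1 per char local,
// rounded up to a multiple of 4.
constexpr std::uint32_t link_size(std::uint32_t saved_registers,
                                  std::uint32_t long_locals,
                                  std::uint32_t short_locals,
                                  std::uint32_t char_locals) {
    return round_up_to_4(16u + 4u * saved_registers + 4u * long_locals +
                         2u * short_locals + 1u * char_locals);
}

// Two saved registers and no locals: 16 + 8 = 24.
static_assert(link_size(2, 0, 0, 0) == 24, "16 + 2 * 4");
```

For one saved register, one long, one short and one char the raw sum is 27, so the value passed to LINK would be 28.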


BLACKFIN PIPELINE DURING EX_INTERRUPT_HANDLER ( )

Note: ALL other interrupts are blocked (not good).

If you save RETI to the stack ([--SP] = RETI) as part of your interrupt handler, then the Blackfin hardware automatically allows higher-priority interrupts to interrupt lower-priority interrupts.

From C++ EX_REENTRANT_HANDLER( )