1. MULTICORE INFORMATION AND POPULAR TEXAS INSTRUMENTS MULTICORE
DSP PROCESSORS UDAY WALVEKAR MTECH NIELIT CALICUT
2. WHY MULTICORE Growing gap between processor and memory speeds.
Limits on instruction-level parallelism. Increased power
consumption of single-core processors.
3. MULTICORE
4. MULTICORE A multi-core processor is a single computing
component with two or more independent processing units (called
"cores"). Cores may be homogeneous or heterogeneous. The maximum
possible speedup is governed by Amdahl's law. Multicore designs
developed from instruction-level parallelism and thread-level
parallelism.
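To make the Amdahl's law bound concrete, here is a minimal sketch in C. The function name amdahl_speedup and its parameters are illustrative and do not come from the slides; p is the fraction of the work that can run in parallel and n is the number of cores.

```c
/* Amdahl's law: speedup on n cores when a fraction p of the work
 * (0 <= p <= 1) can run in parallel and the rest stays serial.
 * amdahl_speedup is a name chosen for this sketch. */
double amdahl_speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

With p = 0.9 and n = 4 the speedup is about 3.08, and no number of cores can push it past 1 / (1 - p) = 10.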
5. MULTICORE Cores may or may not share caches. Inter-core
communication uses shared memory or message passing. Parallel
program design steps: partitioning, communication, agglomeration,
mapping.
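As one possible illustration of partitioning and mapping, the sketch below splits n_items work items into contiguous blocks, one per core. The type range_t and the function core_range are hypothetical names introduced here; block partitioning is only one of several strategies.

```c
#include <stddef.h>

/* Half-open index range [begin, end) assigned to one core. */
typedef struct { size_t begin; size_t end; } range_t;

/* Block partitioning: split n_items as evenly as possible across
 * n_cores; the first (n_items % n_cores) cores get one extra item.
 * Hypothetical helper, not part of any TI or OS API. */
range_t core_range(size_t n_items, unsigned n_cores, unsigned core)
{
    size_t base  = n_items / n_cores;
    size_t extra = n_items % n_cores;
    range_t r;
    r.begin = core * base + (core < extra ? core : extra);
    r.end   = r.begin + base + (core < extra ? 1 : 0);
    return r;
}
```

For 10 items on 4 cores this yields the ranges [0,3), [3,6), [6,8) and [8,10).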
6. MULTICORE Simultaneous multithreading (SMT) is present in some
cores but not in others.
7. MULTICORE
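8. MULTICORE Cache coherence problem: cores holding private cached
copies of the same memory location can see stale data. Addressed by
an invalidation protocol with snooping.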
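The idea of invalidation with snooping can be sketched as a small software model. This is a teaching toy, not how any particular TI device implements coherence: it tracks a single memory word, assumes write-through, and all names (line_t, core_read, core_write, N_CORES) are made up for the example.

```c
#include <stdbool.h>

#define N_CORES 4

/* One memory word plus each core's private cached copy of it. */
typedef struct {
    int  memory;
    int  cached[N_CORES];
    bool valid[N_CORES];
} line_t;

/* Read: on a miss (invalid copy) the core refills from memory. */
int core_read(line_t *l, int core)
{
    if (!l->valid[core]) {
        l->cached[core] = l->memory;
        l->valid[core]  = true;
    }
    return l->cached[core];
}

/* Write: every other core "snoops" the write and invalidates its
 * copy, so its next read misses and fetches the new value. */
void core_write(line_t *l, int core, int value)
{
    for (int i = 0; i < N_CORES; i++)
        if (i != core)
            l->valid[i] = false;
    l->cached[core] = value;
    l->valid[core]  = true;
    l->memory       = value;   /* write-through keeps the model simple */
}
```

Without the invalidation loop in core_write, a core that had already cached the word would keep returning the old value.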
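9. MULTICORE PROGRAMMING The default affinity mask is all 1s (the
process may run on any core). The OS scheduler tries to avoid
migrating a process between cores as much as possible (soft
affinity). Hard affinity is set explicitly by the program or user.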
10. MULTICORE PROGRAMMING
#include <sched.h>
int sched_getaffinity(pid_t pid, unsigned int len, unsigned long *mask);
int sched_setaffinity(pid_t pid, unsigned int len, unsigned long *mask);
(Current glibc declares these with size_t cpusetsize and
cpu_set_t *mask, and requires _GNU_SOURCE.)
win@win-Lenovo-Z580:~$ taskset -p 3108
pid 2763's current affinity mask: f
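A minimal sketch of hard affinity on Linux, using the glibc wrappers and the cpu_set_t macros. The helper names (first_allowed_cpu, pin_self_to_cpu, allowed_cpu_count) are hypothetical; only the sched_* calls and CPU_* macros are real glibc interfaces. A pid of 0 means the calling thread.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/* Lowest-numbered CPU in the caller's affinity mask, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hard affinity: restrict the caller to a single CPU.
 * Returns 0 on success, -1 on error (errno set by the call). */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Number of CPUs the caller may currently run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}
```

A mask of f, as in the taskset output above, corresponds to CPUs 0 to 3 being allowed; after pin_self_to_cpu succeeds, allowed_cpu_count returns 1.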
11. MULTICORE TO DSP TI multicore DSPs: TMS320C6474; TMS320C6674
(fixed- and floating-point); 66AK2L06 (ARM + DSP, KeyStone II).
12. TMS320C6474 Three TMS320C64x+™ DSP cores. Instruction cycle
time: 0.83 ns (1.2-GHz device); 1 ns (1-GHz device); 1.18 ns
(850-MHz device). The CPU core structure is the same as that of the
C6713 DSK. The complex multiply (CMPY) instruction takes four
16-bit inputs and produces a 32-bit real and a 32-bit imaginary
output. New instructions include 32-bit multiplication, complex
multiplication, packing, sorting, bit manipulation, and 32-bit
Galois field multiplication.
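The arithmetic behind a complex multiply with four 16-bit inputs and two 32-bit outputs can be modelled in portable C as below. This is a sketch of the math only, not the TI compiler intrinsic: cplx32_t, sat32 and cmpy_model are names invented here, and the saturation of the one overflowing corner case is an assumption of this model rather than documented hardware behaviour.

```c
#include <stdint.h>

/* 32-bit real and imaginary results. */
typedef struct { int32_t re; int32_t im; } cplx32_t;

/* Clamp a 64-bit value into the int32_t range (model assumption). */
static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* (a_re + j*a_im) * (b_re + j*b_im) with 16-bit inputs.
 * Products are formed in 64 bits so the intermediate sum cannot
 * overflow before it is clamped. */
cplx32_t cmpy_model(int16_t a_re, int16_t a_im, int16_t b_re, int16_t b_im)
{
    cplx32_t r;
    r.re = sat32((int64_t)a_re * b_re - (int64_t)a_im * b_im);
    r.im = sat32((int64_t)a_re * b_im + (int64_t)a_im * b_re);
    return r;
}
```

For example, (1 + 2j) * (3 + 4j) gives re = -5 and im = 10.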
13. TMS320C6474 Boot Sequence During boot, the DSP's internal
memory is loaded with program and data sections, and the DSP's
internal registers are programmed with predetermined values. Public
ROM boot: Core 0 is released from reset and begins executing from
the L3 ROM base address. It then brings the other cores out of
reset by setting the EVTPULSE4 bit (bit 4) to 1.
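Setting bit 4 of a memory-mapped register typically looks like the sketch below. The slide does not name the register or its address, so the function takes a pointer supplied by the caller; set_evtpulse4 and EVTPULSE4_BIT are names chosen here, and the read-modify-write form is an assumption that may not suit a real pulse register.

```c
#include <stdint.h>

#define EVTPULSE4_BIT 4u   /* bit position given on the slide */

/* Set bit 4 in the register that evt_reg points to, leaving the
 * other bits unchanged. The register address is NOT specified in
 * the source; the caller must provide it from the device manual. */
void set_evtpulse4(volatile uint32_t *evt_reg)
{
    *evt_reg |= (UINT32_C(1) << EVTPULSE4_BIT);
}
```

On a host machine the function can be exercised against an ordinary variable standing in for the register.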
14. TMS320C6474
15. TMS320C6474
16. TMS320C6474
17. TMS320C6474
18. TMS320C6474 PERIPHERALS The primary purpose of the EDMA3 is
to service user-programmed data transfers between two memory-mapped
slave endpoints on the device. The interrupt controller allows up
to 128 system events to be routed to any of the twelve CPU
interrupt inputs. A race condition may exist when certain masters
write data to the DDR2 memory controller. The inter-integrated
circuit (I2C) module provides an interface between a C64x+ DSP and
other devices compliant with the Philips Semiconductors Inter-IC
bus (I2C bus) specification.
19. TMS320C6474 PERIPHERALS The Ethernet Media Access
Controller (EMAC) module provides an efficient interface between
the C6474 DSP core processor and the network. The device contains
a Semaphore module for managing resources shared by the DSP cores.
It supports the read-modify-write sequence and direct and indirect
accesses, serves 3 masters, and contains 32 semaphores. Frame
synchronization handles timing and time alignment on the device by
coordinating timing between the DSP cores.
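The role of the Semaphore module (32 semaphores shared by 3 masters, guarding read-modify-write sequences) can be imitated in software with C11 atomics. This is a host-side model under stated assumptions, not the TI register interface: owner, sem_try_acquire and sem_release are hypothetical names, and a master is identified simply by an index 0 to 2.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define N_SEMAPHORES 32

/* 0 means free; otherwise (master index + 1) of the current owner.
 * Static storage is zero-initialised, so all semaphores start free. */
static atomic_int owner[N_SEMAPHORES];

/* Try to take semaphore id for the given master.
 * Returns true only if the semaphore was free. */
bool sem_try_acquire(int id, int master)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&owner[id], &expected, master + 1);
}

/* Release semaphore id; succeeds only for the master that owns it. */
bool sem_release(int id, int master)
{
    int expected = master + 1;
    return atomic_compare_exchange_strong(&owner[id], &expected, 0);
}
```

A core would acquire the semaphore, perform its read-modify-write on the shared resource, and then release it so another master can proceed.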
20. TMS320C6674 Four TMS320C66x™ DSP core subsystems, each
running at 1.0 GHz or 1.25 GHz. Network coprocessor. KeyStone
architecture: Multicore Navigator, TeraNet, Multicore Shared Memory
Controller, and HyperLink. The C66x core incorporates 90 new
instructions (compared to the C64x+ core) targeted at
floating-point and vector-math-oriented processing.
21. 66AK2L06 Four TMS320C66x DSP core subsystems, each running at
1.0 GHz or 1.2 GHz. Two ARM Cortex-A15 MPCore™ processors at up to
1.2 GHz.
22. CONCLUSION Realize the importance of multicore. It has large
issues but even larger advantages. THANK YOU