
NEXGEN TECHNOLOGY

www.nexgenproject.com

VLSI PROJECTS 2016

Sno. CODE Topic Abstract YEAR

1. VLSI2016_01

A Low-Cost Low-Power All-Digital Spread-Spectrum Clock Generator

In this brief, a low-cost low-power all-digital spread spectrum clock generator (ADSSCG) is presented. The proposed ADSSCG can provide an accurate programmable spreading ratio with process, voltage, and temperature variations. To maintain the frequency stability while performing triangular modulation, the fast-relocked mechanism is proposed. The proposed fast-relocked ADSSCG is implemented in a standard performance 90-nm CMOS process, and the active area is 200 µm × 200 µm. The experimental results show that the electromagnetic interference reduction is 14.61 dB with a 0.5% spreading ratio and 19.69 dB with a 2% spreading ratio at 270 MHz. The power consumption is 443 µW at 270 MHz with a 1.0 V power supply.

2015-2016

2. VLSI2016_02

A Combined SDC-SDF Architecture for Normal I/O Pipelined Radix-2 FFT

We present an efficient combined single-path delay commutator-feedback (SDC-SDF) radix-2 pipelined fast Fourier transform architecture, which includes log_2 N − 1 SDC stages and 1 SDF stage. The SDC processing engine is proposed to achieve 100% hardware resource utilization by sharing the common arithmetic resources, including both adders and multipliers, in a time-multiplexed approach. Thus, the required number of complex multipliers is reduced to log_4 N − 0.5, compared with log_2 N − 1 for the other radix-2 SDC/SDF architectures. In addition, the proposed architecture requires a roughly minimum number of complex adders, log_2 N + 1, and a complex delay memory of 2N + 1.5 log_2 N − 1.5.

2015-2016
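To make the resource expressions in the abstract above concrete, the following sketch evaluates them for a given FFT size N. The function name and dictionary keys are illustrative assumptions, not names from the paper; the code only plugs N into the formulas as quoted, so fractional results are reported as-is.

```python
def sdc_sdf_resources(n):
    """Evaluate the resource expressions quoted in the abstract for an N-point FFT.

    Hypothetical helper: it only substitutes N into the quoted formulas.
    """
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 4")
    log2n = n.bit_length() - 1  # exact log2 for a power of two
    return {
        "sdc_stages": log2n - 1,
        "sdf_stages": 1,
        "complex_multipliers": log2n / 2 - 0.5,      # log_4 N - 0.5
        "multipliers_other_radix2": log2n - 1,       # log_2 N - 1
        "complex_adders": log2n + 1,                 # log_2 N + 1
        "delay_memory": 2 * n + 1.5 * log2n - 1.5,   # 2N + 1.5 log_2 N - 1.5
    }


if __name__ == "__main__":
    for key, value in sdc_sdf_resources(1024).items():
        print(f"{key}: {value}")
```

For N = 1024 the quoted multiplier expression gives 4.5 against 9 for the other radix-2 SDC/SDF designs, which is where the saving claimed in the abstract comes from.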

3. VLSI2016_03

A Class of SEC-DED-DAEC Codes Derived From Orthogonal Latin Square Codes

Radiation-induced soft errors are a major reliability concern for memories. To ensure that memory contents are not corrupted, single error correction double error detection (SEC-DED) codes are commonly used. However, in advanced technology nodes, soft errors frequently affect more than one memory bit. Since SEC-DED codes cannot correct multiple errors, they are often combined with interleaving. Interleaving, however, impacts memory design and performance and cannot always be used in small memories. This limitation has spurred interest in codes that can correct adjacent bit errors. In particular, several SEC-DED double adjacent error correction (SEC-DED-DAEC) codes have recently been proposed. Implementing DAEC has a cost as it impacts the decoder complexity and delay. Another issue is that most of the new SEC-DED-DAEC codes miscorrect some double nonadjacent bit errors. In this brief, a new class of SEC-DED-DAEC codes is derived from orthogonal Latin square codes. The new codes significantly reduce the decoding complexity and delay. In addition, the codes do not miscorrect any double nonadjacent bit errors. The main disadvantage of the new codes is that they require a larger number of parity check bits. Therefore, they can be useful when decoding delay or complexity is critical or when miscorrection of double nonadjacent bit errors is not acceptable. The proposed codes have been implemented in Hardware Description Language and compared with some of the existing SEC-DED-DAEC codes.

2015-2016

No: 66,4th cross, Venkata nagar, Near SBI ATM, Pondicherry. Email Id: [email protected]

Mobile: 9751442511, 9791938249, Telephone: 0413-2211159

4. VLSI2016_04

Design of Efficient Content Addressable Memories in High-Performance FinFET Technology

Content addressable memories (CAMs) enable high-speed parallel search operations in table lookup-based applications, such as Internet routers and processor caches. Traditional CAM design has always suffered from the high dynamic power consumption associated with its large and active parallel hardware. However, deeply scaled technology nodes, with multigate devices replacing planar MOSFETs, are expected to bring new tradeoffs to CAM design. FinFET, a vertical-channel gate-wraparound double-gate device, has emerged as the best alternative to planar MOSFET. In this brief, for the first time, we explore the design space of symmetric and asymmetric gate-workfunction FinFET CAMs. We propose several design alternatives and evaluate them in terms of their dc and transient metrics for different mismatch probabilities using technology computer-aided design simulations with 22-nm FinFET devices. We also propose two orthogonal layout styles for CAM design and show that one of them (vertical-search line) outperforms the other (vertical-match line) in terms of total power (22.3%) and search delay (5.8%).

2015-2016

5. VLSI2016_05

A New Efficiency-Improvement Low-Ripple Charge-Pump Boost Converter Using Adaptive Slope Generator With Hysteresis Voltage Comparison Techniques

The new efficiency-improvement low-ripple charge pump boost converter using adaptive slope generator with hysteresis voltage comparison techniques is proposed in this paper. This proposed converter can reduce output voltage ripple, because its inductor is connected to the output. This proposed converter adopts a new controlled architecture, self-adaptive slope generator with hysteresis comparison technology, to shorten the transient response. The proposed boost converter has been fabricated with TSMC 0.35-µm CMOS 2P4M processes, and a total chip area of 1.49 mm × 1.49 mm. Its maximum output current is 260 mA when the output voltage is 3.6 V. When the supply voltage is 3.3 V, the output voltage can be 3.6–5.1 V. The maximum efficiency is 90.99% and the minimum output ripple is 10.8 mV. Finally, the theoretical analysis is verified to be correct by the experimental results.

2015-2016

6. VLSI2016_06

A 0.25-V 28-nW 58-dB Dynamic Range Asynchronous Delta Sigma Modulator in 130-nm Digital CMOS Process

In this paper, we present a single-bit clock-less asynchronous delta–sigma modulator (ADSM) operating at just 0.25 V power supply. Several circuit approaches were employed to enable such low-voltage operation and maintain high performance. One approach involved utilizing bulk-driven transistors in subthreshold region with transconductance-enhancement topology. Another approach was to employ distributed transistor layout structure to mitigate the effect of low output impedance due to halo drain implants employed in today’s digital CMOS process. The ADSM achieved a characteristic center frequency of 630 Hz. It had an effective signal-to-noise-plus-distortion ratio (SNDR) of 58 dB or effective number of bits (ENOB) 9 b and just 28-nW power dissipation. A detailed analytical model capturing the effect of non-idealities of the individual circuit components is also presented for the first time with a close agreement with experimental results.

2015-2016
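The SNDR and ENOB figures quoted in the abstract above can be related with the widely used conversion ENOB = (SNDR − 1.76 dB) / 6.02 dB. The sketch below assumes that standard full-scale sine-wave relation; the abstract does not state which conversion the authors used, and the function name is illustrative.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB.

    Assumes the standard full-scale sine-wave relation
    SNDR = 6.02 * ENOB + 1.76 dB (an assumption, not taken from the paper).
    """
    return (sndr_db - 1.76) / 6.02


if __name__ == "__main__":
    print(f"ENOB at 58 dB SNDR: {enob_from_sndr(58.0):.2f} bits")
```

Under that assumption, 58 dB corresponds to a little over 9.3 bits, consistent with the "9 b" stated in the abstract.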

7. VLSI2016_07

Range Unlimited Delay-Interleaving and -Recycling Clock Skew Compensation and Duty-Cycle Correction Circuit

A clock skew-compensation and duty-cycle correction circuit (CSADC) is used as the second-level clock distributing circuit to align a system global clock while maintaining a 50% duty cycle. A power-efficient, range-unlimited, and accuracy-enhanced CSADC, designed mainly with a new delay-interleaving and -recycling technique that mitigates operating frequency limitations while keeping overhead costs low, is proposed in this paper. Our preliminary research results prove the feasibility of the proposed technique and show that the operating frequency ranges from 110 MHz to 1.75 GHz, with the corrected duty cycle varying from 51.2% to 48.9% based on 0.18-µm CMOS technology. Meanwhile, the lock-in time, static phase error, and power consumption are, respectively, 26 clock cycles, 4.2 ps, and 5.58 mW at 1.75 GHz.

2015-2016

8. VLSI2016_08

Obfuscating DSP Circuits via High-Level Transformations

This paper presents a novel approach to design obfuscated circuits for digital signal processing (DSP) applications using high-level transformations, a key-based obfuscating finite-state machine (FSM), and a reconfigurator. The goal is to design DSP circuits that are harder to reverse engineer. High-level transformations of iterative data-flow graphs have been exploited for area-speed-power tradeoffs. This is the first attempt to develop a design flow to apply high-level transformations that not only meet these tradeoffs but also simultaneously obfuscate the architectures both structurally and functionally. Several modes of operations are introduced for obfuscation where the outputs are meaningful from a signal processing point of view, but are functionally incorrect. Examples of such modes include a third-order digital filter that can also implement a sixth-order or ninth-order filter in a time-multiplexed manner. The latter two modes are meaningful but represent functionally incorrect modes. Multiple meaningful modes can be exploited to reconfigure the filter order for different applications. Other modes may correspond to nonmeaningful modes. A correct key input to an FSM activates a reconfigurator. The configure data controls various modes of the circuit operation. Functional obfuscation is accomplished by requiring use of the correct initialization key and configure data. A wrong initialization key fails to enable the reconfigurator, and wrong configure data activates either a meaningful but nonfunctional mode or a nonmeaningful mode. The probability of activating the correct mode is significantly reduced, leading to an obfuscated DSP circuit. Structural obfuscation is also achieved by the proposed methodology via high-level transformations. Experimental results show that the overhead of the proposed methodology is small, while a strong obfuscation is attained. For example, the area overhead for a (3l)th-order IIR filter benchmark is only 17.7% with a 128-bit configuration key, where 1 ≤ l ≤ 8, i.e., the order of this filter should be a multiple of 3, and can vary from 3 to 24.

2015-2016

9. VLSI2016_09

Accelerating Scalar Conversion for Koblitz Curve Cryptoprocessors on Hardware Platforms

Koblitz curves are a class of computationally efficient elliptic curves where scalar multiplications can be accelerated using τNAF representations of scalars. However, conversion from an integer scalar to a short τNAF is a costly operation. In this paper, we improve the recently proposed scalar conversion scheme based on division by τ^2. We apply two levels of optimizations in the scalar conversion architecture. First, we reduce the number of long integer subtractions during the scalar conversion. This optimization reduces the computation cost and also simplifies the critical paths present in the conversion architecture. Then we implement pipelines in the architecture. The pipeline splitting increases the operating frequency without increasing the number of cycles. We have provided detailed experimental results to support our claims made in this paper.

2015-2016

10. VLSI2016_10

Design of Self-Timed Reconfigurable Controllers for Parallel Synchronization via Wagging

Synchronization is an important issue in modern system design as systems-on-chips integrate more diverse technologies, operating voltages, and clock frequencies on a single substrate. This paper presents a methodology for the design and implementation of a self-timed reconfigurable control device suitable for a parallel cascaded flip-flop synchronizer based on a principle known as wagging, through the application of distributed feedback graphs. By modifying the endpoint adjacency of a common behavior graph via one-hot codes, several configurable modes can be implemented in a single design specification, thereby facilitating direct control over the synchronization time and the mean-time between failures of the parallel master-slave latches in the synchronizer. Therefore, the resulting implementation is resistant to process non-idealities, which are present in physical design layouts. This paper includes a discussion of the reconfiguration protocol, and implementations of both a sequential token ring control device, and an interrupt subsystem necessary for reconfiguration, all simulated in UMC 90-nm technology. The interrupt subsystem demonstrates operating frequencies between 505 and 818 MHz per module, with average power consumptions between 70.7 and 90.0 µW in the typical-typical case under a corner analysis.

2015-2016





11. VLSI2016_11

Level-Converting Retention Flip-Flop for Reducing Standby Power in ZigBee SoCs

In this paper, we propose a level-converting retention flip-flop (RFF) for ZigBee systems-on-chips (SoCs). The proposed RFF allows the voltage regulator that generates the core supply voltage (VDD,core) to be turned off in the standby mode, and it thus reduces the standby power of the ZigBee SoCs. The logic states are retained in a slave latch composed of thick-oxide transistors using an I/O supply voltage (VDD,IO) that is always turned on. Level-up conversion from VDD,core to VDD,IO is achieved by an embedded nMOS pass-transistor level-conversion scheme that uses a low-only signal-transmitting technique. By embedding a retention latch and level-up converter into the data-to-output path of the proposed RFF, the RFF resolves the problems of the static RAM-based RFF, such as large dc current and low readability caused by threshold drop. The proposed RFF also does not require additional control signals for power mode transitioning. Using 0.13-µm process technology, we implemented an RFF with VDD,core and VDD,IO of 1.2 and 2.5 V, respectively. The maximum operating frequency is 300 MHz. The active energy of the RFF is 191.70 fJ, and its standby power is 350.25 pW.

2015-2016

12. VLSI2016_12

All Digital Energy Sensing for Minimum Energy Tracking

Minimizing energy consumption is of utmost importance in an energy-starved system with relaxed performance requirements. This brief presents a digital energy sensing method that requires neither a constant voltage reference nor a time reference. An energy minimizing loop uses this to find the minimum energy point and sets the supply voltage between 0.2 and 0.5 V. Energy savings up to 1 275% over existing minimum energy tracking techniques in the literature are achieved.

2015-2016

13. VLSI2016_13

Recursive Approach to the Design of a Parallel Self-Timed Adder

This brief presents a parallel single-rail self-timed adder. It is based on a recursive formulation for performing multibit binary addition. The operation is parallel for those bits that do not need any carry chain propagation. Thus, the design attains logarithmic performance over random operand conditions without any special speedup circuitry or look-ahead schema. A practical implementation is provided along with a completion detection unit. The implementation is regular and does not have any practical limitations of high fanouts. A high fan-in gate is required, but this is unavoidable for asynchronous logic and is managed by connecting the transistors in parallel. Simulations have been performed using an industry standard toolkit that verify the practicality and superiority of the proposed approach over existing asynchronous adders.

2015-2016
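One common way to express multibit addition recursively, which may or may not match the paper's exact formulation, is to repeat a bank of half-adders until no carries remain: the sum word is XORed with the shifted carry word, and new carries are the AND of the two. The software sketch below assumes that formulation; the function name and the returned iteration count are illustrative. The iteration count stands in for the data-dependent completion time of a self-timed circuit.

```python
def recursive_add(a, b, width):
    """Add two unsigned integers by iterating half-adders until carries vanish.

    Returns (sum modulo 2**width, number of carry-resolution iterations).
    This is an assumed, generic formulation, not necessarily the paper's.
    """
    mask = (1 << width) - 1
    s = (a ^ b) & mask                # initial sum bits, all positions in parallel
    c = ((a & b) << 1) & mask         # initial carries, moved to the next position
    iterations = 0
    while c:                          # "completion detection": stop when no carry is left
        s, c = (s ^ c) & mask, ((s & c) << 1) & mask
        iterations += 1
    return s, iterations


if __name__ == "__main__":
    total, steps = recursive_add(5, 3, 8)
    print(total, steps)
```

Operand pairs with no overlapping set bits finish with zero iterations, while long carry chains take more, which is the behaviour the abstract refers to as parallel operation for bits that need no carry propagation.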

14. VLSI2016_14

Novel Reconfigurable Hardware Architecture for Polynomial Matrix Multiplications

In this paper, we introduce a novel reconfigurable hardware architecture for computing the polynomial matrix multiplication (PMM) of polynomial matrices and/or polynomial vectors. The proposed algorithm exploits an extension of the fast convolution technique to multiple-input multiple-output systems. The proposed architecture is the first one devoted to the hardware implementation of PMM. Hardware implementation of the algorithm is achieved via highly pipelined, partly systolic field-programmable gate array (FPGA) architecture. The architecture, which is scalable in terms of the order of the input polynomial matrices, has been designed using the Xilinx system generator tool. We verify the algorithmic accuracy of the architecture through FPGA-in-the-loop hardware co-simulations. The application to sensor array signal processing is highlighted, in terms of strong de-correlation. The results are presented to demonstrate the accuracy and capability of the architecture. The results verify that the proposed solution gives low execution times while limiting the number of required FPGA resources.

2015-2016

15. VLSI2016_15

Implementation of Subthreshold Adiabatic Logic for Ultralow-Power Application

Behavior of adiabatic logic circuits in weak inversion or subthreshold regime is analyzed in depth for the first time in the literature to make great improvement in ultralow-power circuit design. This novel approach is efficacious in low-speed operations where power consumption and longevity are the pivotal concerns instead of performance. The schematic and layout of a 4-bit carry look-ahead adder (CLA) have been implemented to show the workability of the proposed logic. The effect of temperature and process parameter variations on subthreshold adiabatic logic-based 4-bit CLA has also been addressed separately. Postlayout simulations show that subthreshold adiabatic units can save significant energy compared with a logically equivalent static CMOS implementation.

2015-2016

16. VLSI2016_16

FPGA-Based Bit Error Rate Performance Measurement of Wireless Systems

This paper presents the bit error rate (BER) performance validation of digital baseband communication systems on a field-programmable gate array (FPGA). The proposed BER tester (BERT) integrates fundamental baseband signal processing modules of a typical wireless communication system along with a realistic fading channel simulator and an accurate Gaussian noise generator onto a single FPGA to provide an accelerated and repeatable test environment in a laboratory setting. Using a developed graphical user interface, the error rate performance of single- and multiple-antenna systems over a wide range of parameters can be rapidly evaluated. The FPGA-based BERT should reduce the need for time-consuming software-based simulations, thereby increasing productivity. This FPGA-based solution is significantly more cost-effective than conventional performance measurements made using expensive commercially available test equipment and channel simulators.

2015-2016
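For context, the kind of software-based BER simulation that the abstract above aims to replace can be sketched in a few lines. The example assumes uncoded BPSK over an additive white Gaussian noise channel only, with no fading, which is far simpler than the system in the paper; all names are illustrative.

```python
import math
import random


def theoretical_bpsk_ber(ebn0_db):
    """Closed-form BER of uncoded BPSK over AWGN: 0.5 * erfc(sqrt(Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def simulate_bpsk_ber(ebn0_db, num_bits, seed=0):
    """Monte Carlo BER estimate for unit-energy BPSK symbols over AWGN."""
    rng = random.Random(seed)                 # seeded for repeatable runs
    ebn0 = 10 ** (ebn0_db / 10)
    sigma = math.sqrt(1 / (2 * ebn0))         # noise standard deviation per sample
    errors = 0
    for _ in range(num_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        received = symbol + rng.gauss(0.0, sigma)
        decided = 1 if received > 0 else 0    # hard decision at threshold 0
        errors += decided != bit
    return errors / num_bits


if __name__ == "__main__":
    for snr_db in (0, 2, 4, 6):
        print(snr_db, simulate_bpsk_ber(snr_db, 100_000), theoretical_bpsk_ber(snr_db))
```

Each additional decade of BER resolution needs roughly ten times as many simulated bits, which is the cost that motivates moving the measurement loop into hardware.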

17. VLSI2016_17

Algorithm and Architecture Design of the H.265/HEVC Intra Encoder

Improved video coding techniques introduced in the H.265/HEVC standard allow video encoders to achieve better compression efficiencies. On the other hand, the increased complexity requires a new design methodology able to face challenges associated with ever higher spatio-temporal resolutions. The paper presents the computationally-scalable algorithm and its hardware architecture able to support the intra encoding up to the 2160p@30fps resolution. The scalability allows the tradeoff between the throughput and the compression efficiency. In particular, the encoder is able to check a variable number of candidate modes. The rate estimation based on bin counting and the distortion estimation in the transform domain simplify the rate-distortion analysis and enable the evaluation of a great number of candidate intra modes. The encoder preselects candidate modes by the processing of 8×8 predictions computed from original samples. The preselection shares hardware resources used for the processing of predictions generated from reconstructed samples. To support intra 4×4 modes for the 2160p@30fps resolution, the encoder incorporates a separate reconstruction loop. The processing of blocks with different sizes is interleaved to compensate the delay of reconstruction loops. Implementation results show that the encoder utilizes 1086k gates and 52 kB on-chip memories for TSMC 90nm. The main reconstruction loop can operate at 400 MHz, whereas the remaining modules work at 200 MHz. For 2160p@30fps videos, the average BD-Rate is 5.46% compared to the HM software.

2015-2016

18. VLSI2016_18

Pre-Encoded Multipliers Based on Non-Redundant Radix-4 Signed-Digit Encoding

In this paper, we introduce an architecture of pre-encoded multipliers for Digital Signal Processing applications based on off-line encoding of coefficients. To this end, the Non-Redundant radix-4 Signed-Digit (NR4SD) encoding technique, which uses the digit values {−1, 0, +1, +2} or {−2, −1, 0, +1}, is proposed, leading to a multiplier design with less complex partial products implementation. Extensive experimental analysis verifies that the proposed pre-encoded NR4SD multipliers, including the coefficients memory, are more area and power efficient than the conventional Modified Booth scheme.

2015-2016
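The two digit sets named in the abstract above each contain exactly four consecutive values, so every integer has a unique radix-4 expansion over either set. The following sketch shows one straightforward way to compute such an expansion for an arbitrary integer. It is an assumed software illustration with hypothetical names, not the paper's fixed-width two's-complement encoder circuit.

```python
def nr4sd_digits(value, low_digit=-2):
    """Radix-4 digits (least significant first) drawn from
    {low_digit, ..., low_digit + 3}.

    low_digit=-2 gives the set {-2, -1, 0, +1};
    low_digit=-1 gives the set {-1, 0, +1, +2}.
    """
    if low_digit not in (-2, -1):
        raise ValueError("low_digit must be -2 or -1")
    digits = []
    while value != 0:
        # Pick the unique digit in the set congruent to value modulo 4.
        d = (value - low_digit) % 4 + low_digit
        digits.append(d)
        value = (value - d) // 4      # exact division: value - d is a multiple of 4
    return digits or [0]


def from_digits(digits):
    """Rebuild the integer from least-significant-first radix-4 digits."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


if __name__ == "__main__":
    for coeff in (3, -2, 100):
        print(coeff, nr4sd_digits(coeff, -2), nr4sd_digits(coeff, -1))
```

Because only one digit in each set has magnitude 2, each partial product is zero, plus or minus the multiplicand, or a single shifted copy in one sign, which is the simplification the abstract attributes to the encoding.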

19. VLSI2016_19

A High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications

Transpose form finite-impulse response (FIR) filters are inherently pipelined and support multiple constant multiplications (MCM) technique that results in significant saving of computation. However, transpose form configuration does not directly support the block processing unlike direct form configuration. In this paper, we explore the possibility of realization of block FIR filter in transpose form configuration for area-delay efficient realization of large order FIR filters for both fixed and reconfigurable applications. Based on a detailed computational analysis of transpose form configuration of FIR filter, we have derived a flow graph for transpose form block FIR filter with optimized register complexity. A generalized block formulation is presented for transpose form FIR filter. We have derived a general multiplier-based architecture for the proposed transpose form block filter for reconfigurable applications. A low-complexity design using the MCM scheme is also presented for the block implementation of fixed FIR filters. The proposed structure involves significantly less area delay product (ADP) and less energy per sample (EPS) than the existing block implementation of direct-form structure for medium or large filter lengths, while for the short-length filters, the block implementation of direct-form FIR structure has less ADP and less EPS than the proposed structure. Application specific integrated circuit synthesis result shows that the proposed structure for block size 4 and filter length 64 involves 42% less ADP and 40% less EPS than the best available FIR filter structure proposed for reconfigurable applications. For the same filter length and the same block size, the proposed structure involves 13% less ADP and 12.8% less EPS than that of the existing direct-form block FIR structure.

2015-2016
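As background for the abstract above, the sketch below contrasts a direct-form FIR filter with a sample-by-sample transpose-form one and shows that they produce the same output. It does not reproduce the paper's block formulation or MCM optimization; the function names are illustrative assumptions.

```python
def fir_direct(h, x):
    """Direct form: each output is a sum over delayed input samples."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_transpose(h, x):
    """Transpose form: the current sample is multiplied by every coefficient
    at once, and partial sums travel through a register chain."""
    regs = [0] * (len(h) - 1)         # partial-sum registers between adders
    out = []
    for sample in x:
        y = h[0] * sample + (regs[0] if regs else 0)
        # Update registers front to back so regs[i + 1] is still the old value.
        for i in range(len(regs) - 1):
            regs[i] = h[i + 1] * sample + regs[i + 1]
        if regs:
            regs[-1] = h[-1] * sample
        out.append(y)
    return out


if __name__ == "__main__":
    taps = [1, 2, 3]
    signal = [1, 0, 0, 4, 5]
    print(fir_direct(taps, signal))
    print(fir_transpose(taps, signal))
```

In the transpose form every coefficient multiplies the same input sample in a given cycle, which is why it pairs naturally with the multiple constant multiplication technique mentioned in the abstract.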

20. VLSI2016_20

A Novel Photosensitive Tunneling Transistor for Near-Infrared Sensing Applications: Design, Modeling, and Simulation

In this paper, a novel device structure, operating on the principle of band-to-band tunneling, has been designed for near-infrared (1–1.5 µm) multispectral optical sensing applications. A drain current model based on line tunneling approach has been developed to illustrate the device operation. The results of the model are compared with the simulated data for devices with similar dimension and structure, indicating good accuracy of the developed model. Spectral response of the device is studied by estimating the relative values of its transfer as well as output characteristics, and also by measuring the variation of threshold voltage, VT, and ON-state current, ION. VT and ION are found to be sensitive to wavelength variations at moderate gate doping levels. VT is found to increase by ∼40 mV and ION decreases by 35% for a change of illumination wavelength from 1 to 1.5 µm at a gate doping of 1 × 10^18 cm^−3. Peak spectral sensitivity at an illumination intensity of 0.75 W/cm^2 is found to be 318.38, 2.02 × 10^3, and 672.2 corresponding to the change in wavelength from (1–1.2 µm), (1.2–1.45 µm), and (1.45–1.5 µm), respectively.

2015-2016

21. VLSI2016_21

High-Throughput LDPC-Decoder Architecture Using Efficient Comparison Techniques & Dynamic Multi-Frame Processing Schedule

This paper presents an architecture of a block-level-parallel layered decoder for irregular LDPC code. It can be reconfigured to support various block lengths and code rates of the IEEE 802.11n (WiFi) wireless-communication standard. We have proposed efficient comparison techniques for both column and row layered schedule and rejection-based high-speed circuits to compute the two minimum values from multiple inputs required for row layered processing of hardware-friendly min-sum decoding algorithm. The results show good speed with lower area as compared to state-of-the-art circuits. Additionally, this work proposes a dynamic multi-frame processing schedule which efficiently utilizes the layered-LDPC decoding with minimum pipeline stages. The suggested LDPC-decoder architecture has been synthesized and post-layout simulated in 90 nm-CMOS process. This decoder occupies 5.19 area and supports multiple code rates like 1/2, 2/3, 3/4 & 5/6 as well as block-lengths of 648, 1296 & 1944. At a clock frequency of 336 MHz, the proposed LDPC-decoder has achieved better throughput of 5.13 Gbps and energy efficiency of 0.01 nJ/bits/iterations, as compared to the similar state-of-the-art works.

2015-2016
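The abstract above mentions computing the two minimum values needed for min-sum check-node processing. A plain software reference for that step, useful for checking a hardware comparator tree against, might look like the following. It is a generic single-pass search with illustrative names, not the rejection-based circuit proposed in the paper.

```python
def two_minimums(magnitudes):
    """Return (smallest, second smallest, index of smallest) in one pass."""
    if len(magnitudes) < 2:
        raise ValueError("need at least two inputs")
    min1 = min2 = float("inf")
    index1 = -1
    for i, m in enumerate(magnitudes):
        if m < min1:
            min2 = min1               # old minimum becomes the runner-up
            min1, index1 = m, i
        elif m < min2:
            min2 = m
    return min1, min2, index1


def check_node_magnitudes(magnitudes):
    """Min-sum check-node output magnitudes: each edge receives the minimum
    over all *other* edges, which is min2 for the edge holding min1 and
    min1 for every other edge."""
    min1, min2, index1 = two_minimums(magnitudes)
    return [min2 if i == index1 else min1 for i in range(len(magnitudes))]


if __name__ == "__main__":
    print(two_minimums([5, 2, 7, 2, 9]))
    print(check_node_magnitudes([4, 1, 3]))
```

Only two values and one index need to be stored per check node, rather than one message per edge, which is why this step is a natural target for a dedicated comparison circuit.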

22. VLSI2016_22

A New Parallel VLSI Architecture for Real-time Electrical Capacitance Tomography

This paper presents a fixed-point reconfigurable parallel VLSI hardware architecture for real-time Electrical Capacitance Tomography (ECT). It is modular and consists of a front-end module which performs precise capacitance measurements in a time-multiplexed manner using the Capacitance to Digital Converter (CDC) technique. Another FPGA module performs the inverse steps of the tomography algorithm. Dual-port built-in memory banks store the sensitivity matrix, the actual value of the capacitances, and the actual image. A two-dimensional (2D) core multiprocessing elements (PE) engine intercommunicates with these memory banks via parallel buses. A hardware-software co-design methodology was conducted using commercially available tools in order to concurrently tune the algorithms and hardware parameters. Hence, the hardware was designed down to the bit level in order to reduce both the hardware cost and power consumption, while satisfying the real-time constraint. Quantization errors were assessed against the image quality, and bit-level simulations demonstrate the correctness of the design. Further simulations indicate that the proposed architecture achieves a speed-up of up to three orders of magnitude over the software version when the reconstruction algorithm runs on a 2.53 GHz-based Pentium processor or DSP TI’s Delphino TMS320F32837 processor. More specifically, a throughput of 17.241 Kframes/sec for both the Linear-Back Projection (LBP) and modified Landweber algorithms and 8.475 Kframes/sec for the Landweber algorithm with 200 iterations could be achieved. This performance was achieved using an array of [2×2] × [2×2] processing units. This satisfies the real-time constraint of many industrial applications. To the best of the authors’ knowledge, this is the first embedded system which explores the intrinsic parallelism which is available in modern FPGA for ECT tomography.

2015-2016

23. VLSI2016_23

Graph-Based Transistor Network Generation Method for Supergate Design

Transistor network optimization represents an effective way of improving VLSI circuits. This paper proposes a novel method to automatically generate networks with minimal transistor count, starting from an irredundant sum-of-products expression as the input. The method is able to deliver both series–parallel (SP) and non-SP switch arrangements, improving speed, power dissipation, and area of CMOS gates. Experimental results demonstrate expected gains in comparison with related approaches.

2015-2016

24. VLSI2016_24

A Relative Imaging CMOS Image Sensor for High Dynamic Range and High Frame-Rate Machine Vision Imaging Applications

This paper proposes an unconventional image acquisition scheme for machine vision applications, based on detecting ratios of illumination (pixel) intensities. Detecting relative ratios enables capturing the scene features and patterns almost independently from the local scene illumination, resulting in potentially extremely high dynamic range. Moreover, detecting signal ratios using a fully differential circuit optimally suits the intrinsic nature of VLSI design. A scalable and compact hardware implementation is proposed as a proof-of-concept towards relative image acquisition. The proposed photo-current ratio-detecting pixels completely bypass the need of conventional photo-current integration, which enables high frame-rate operation of up to 24000 frames-per-second (fps). The pulse-width modulated output of the proposed pixel is captured by compact column-parallel readout circuits based on digital counters. The developed 32×32 pixel array prototype CMOS image sensor consumes 4 mW of power operating at a nominal 9765 fps frame rate, and 6.8 mW of power operating at a maximum 24000 fps. The presented prototype design is fully scalable towards newer CMOS fabrication nodes and higher sensor resolution.

2015-2016

25. VLSI2016_25

Low-Cost High-Performance VLSI Architecture for Montgomery Modular Multiplication

This paper proposes a simple and efficient Montgomery multiplication algorithm such that the low-cost and high-performance Montgomery modular multiplier can be implemented accordingly. The proposed multiplier receives and outputs the data with binary representation and uses only one-level carry-save adder (CSA) to avoid the carry propagation at each addition operation. This CSA is also used to perform operand pre-computation and format conversion from the carry save format to the binary representation, leading to a low hardware cost and short critical path delay at the expense of extra clock cycles for completing one modular multiplication. To overcome the weakness, a configurable CSA (CCSA), which could be one full-adder or two serial half-adders, is proposed to reduce the extra clock cycles for operand pre-computation and format conversion by half. In addition, a mechanism that can detect and skip the unnecessary carry-save addition operations in the one-level CCSA architecture while maintaining the short critical path delay is developed. As a result, the extra clock cycles for operand pre-computation and format conversion can be hidden and high throughput can be obtained. Experimental results show that the proposed Montgomery modular multiplier can achieve higher performance and significant area–time product improvement when compared with previous designs.

2015-2016
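For readers unfamiliar with the operation named in the abstract above, the sketch below implements the textbook bit-serial (radix-2) Montgomery multiplication in software. It uses ordinary integer addition, so it does not model the carry-save adder, the configurable CSA, or the skipping mechanism proposed in the paper; the function name and argument order are illustrative.

```python
def montgomery_multiply(a, b, n, k):
    """Return a * b * 2**(-k) mod n using the radix-2 Montgomery algorithm.

    Assumes n is odd and 0 <= a, b < n < 2**k.
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(k):
        a_i = (a >> i) & 1            # scan the multiplier from the LSB
        s += a_i * b
        if s & 1:                     # make s even by adding the odd modulus
            s += n
        s >>= 1                       # exact division by 2
    if s >= n:                        # single final correction, since s < 2n
        s -= n
    return s


if __name__ == "__main__":
    n, k = 239, 8
    a, b = 123, 200
    result = montgomery_multiply(a, b, n, k)
    # Undo the 2**(-k) factor to compare against plain modular multiplication.
    print(result, (result * 2 ** k) % n, (a * b) % n)
```

The loop body needs only additions and a one-bit shift, with no trial division by the modulus, which is what makes the method attractive for hardware.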

26. VLSI2016_26

Fully Pipelined Low-Cost and High-Quality Color Demosaicking VLSI Design for Real-Time Video Applications

This paper presents a fully pipelined color demosaicking design. To improve the quality of reconstructed images, a linear deviation compensation scheme was created to increase the correlation between the interpolated and neighboring pixels. Furthermore, immediately interpolated green color pixels are first to be used in hardware-oriented color demosaicking algorithms, which efficiently promoted the quality of the reconstructed image. A boundary detector and boundary mirror machine were added to improve the quality of pixels located in boundaries. In addition, a hardware sharing technique was used to reduce the hardware costs of three interpolators. The VLSI architecture in this work contains only 4.97 K gate counts and the core area is 60,229 µm^2, synthesized using a 0.18-µm CMOS process. The operating frequency of this work is 200 MHz while consuming 4.76 mW. Compared with the previous low-complexity designs, this work has benefits in terms of low cost, low power consumption, and high performance.

2015-2016

27. VLSI2016_27

A Novel Area-Efficient VLSI Architecture for Recursion Computation in LTE Turbo Decoders

Long term evolution (LTE) aims to achieve peak data rates in excess of 300 Mb/s for next-generation wireless communication systems. Turbo codes, the specified channel coding scheme in LTE, suffer from low decoding throughput due to their iterative decoding algorithm. One efficient approach to achieving a promising throughput is to use multiple maximum a posteriori (MAP) cores in parallel, which results in a large area overhead. The two computationally challenging units in a MAP core are the α and β recursion units. Although several methods have been proposed to shorten the critical path of these recursion units, an area-efficient architecture with minimum silicon area is still missing. In this paper, a novel relation between the α and β metrics is introduced, leading to a novel add-compare-select (ACS) architecture. The proposed technique can be applied to both the precise approximation of log-MAP and max-log-MAP ACS architectures. The proposed ACS design, implemented in a 0.13-µm CMOS technology and customized for the LTE standard, results in at most 18.1% less area compared with the designs reported to date while maintaining the same throughput level.

2015-2016
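For readers unfamiliar with the ACS operation named in this abstract, the sketch below shows one step of the standard forward (α) recursion, in both max-log-MAP and log-MAP form. It illustrates the conventional recursion only and does not use the α/β relation proposed in the paper. The branch representation and all names are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

# A trellis branch: (previous state, next state, branch metric gamma).
Branch = Tuple[int, int, float]


def max_star(x: float, y: float) -> float:
    """Jacobian logarithm ln(e^x + e^y), as used by log-MAP."""
    if x == -math.inf:
        return y
    if y == -math.inf:
        return x
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))


def acs_forward_step(alpha_prev: Sequence[float],
                     branches: Sequence[Branch],
                     num_states: int,
                     use_log_map: bool = False) -> List[float]:
    """One add-compare-select step: alpha_k(s) over all branches entering state s."""
    alpha_next = [-math.inf] * num_states
    for s_prev, s_next, gamma in branches:
        candidate = alpha_prev[s_prev] + gamma              # add
        current = alpha_next[s_next]
        if use_log_map:
            alpha_next[s_next] = max_star(current, candidate)
        else:
            alpha_next[s_next] = max(current, candidate)    # compare and select
    return alpha_next
```

The β recursion has the same structure with the branch direction reversed, which is why both units dominate the area of a MAP core.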

28. VLSI2016_28

Comparative Performance Analysis of the Dielectrically Modulated Full-Gate and Short-Gate Tunnel FET-Based Biosensors

In this paper, a short-gate tunneling-field-effect-transistor (SG-TFET) structure has been investigated for dielectrically modulated biosensing applications in comparison with a full-gate tunneling-field-effect-transistor structure of similar dimensions. This paper explores the underlying physics of these architectures and estimates their comparative sensing performance. The sensing performance has been evaluated for both charged and charge-neutral biomolecules using extensive device-level simulation, and the effects of the biomolecule dielectric constant and charge density are also studied. In the SG-TFET architecture, the reduction of the gate length enhances the drain control over the band-to-band tunneling process, and this has been exploited for detection, resulting in superior drain current sensitivity for biomolecule conjugation. The gate and drain biasing conditions show a dominant impact on the sensitivity enhancement in the short-gate biosensors. Therefore, the gate and drain biases are identified as the effective design parameters for efficiency optimization.

2015-2016

29. VLSI2016_29

An Efficient Constant Multiplier Architecture Based on Vertical-Horizontal Binary Common Sub-expression Elimination Algorithm for Reconfigurable FIR Filter Synthesis

This paper proposes an efficient constant multiplier architecture based on the vertical-horizontal binary common sub-expression elimination (VHBCSE) algorithm for designing a reconfigurable finite impulse response (FIR) filter whose coefficients can change dynamically in real time. To design an efficient reconfigurable FIR filter, the proposed VHBCSE algorithm first applies a 2-bit binary common sub-expression elimination (BCSE) algorithm vertically across adjacent coefficients on the 2-D space of the coefficient matrix, and then applies a variable-bit BCSE algorithm horizontally within each coefficient. This technique reduces the average probability of use, or the switching activity, of the multiplier block adders by 6.2% and 19.6% compared with two existing 2-bit and 3-bit BCSE algorithms, respectively. ASIC implementation results of FIR filters using this multiplier show that the proposed VHBCSE algorithm also reduces the average power consumption by 32% and 52%, along with an improvement in the area power product (APP) of 25% and 66%, compared with the 2-bit and 3-bit BCSE algorithms, respectively. As regards the implementation of the FIR filter, improvements of 13% and 28% in area delay product (ADP) and 76.1% and 77.8% in power delay product (PDP) have been achieved over the earlier multiple constant multiplication (MCM) algorithms, viz. faithfully rounded truncated multiple constant multiplication/accumulation (MCMAT) and multi-root binary partition graph (MBPG), respectively. The efficiency shown by comparing the FPGA and ASIC implementations of the reconfigurable FIR filter designed using the VHBCSE-based constant multiplier establishes the suitability of the proposed algorithm for efficient fixed-point reconfigurable FIR filter synthesis.

2015-2016
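The idea behind binary common sub-expression elimination can be illustrated with a small sketch. It covers only plain 2-bit grouping within each coefficient: the partial products x, 2x, and 3x are formed once and reused by every 2-bit group of every coefficient. The vertical step and the variable-bit horizontal step of VHBCSE are not modeled, and the function name and unsigned-coefficient restriction are assumptions made for illustration.

```python
from typing import List, Sequence


def multiplier_block(x: int, coeffs: Sequence[int], coeff_bits: int) -> List[int]:
    """Multiply one input sample by every coefficient using shared 2-bit partial products."""
    for h in coeffs:
        if not 0 <= h < (1 << coeff_bits):
            raise ValueError("coefficients must be unsigned and fit in coeff_bits")
    # Shared sub-expressions, computed once per input sample.
    # Only 3x costs an adder; x and 2x are wiring (a shift).
    shared = (0, x, x << 1, x + (x << 1))
    products = []
    for h in coeffs:
        acc = 0
        for shift in range(0, coeff_bits, 2):
            group = (h >> shift) & 0b11        # select lines of a 4:1 multiplexer
            acc += shared[group] << shift      # final shift-and-add stage
        products.append(acc)
    return products
```

Because the coefficients only drive multiplexer select lines, they can change at run time without altering the adder network, which is what makes this style of multiplier suitable for reconfigurable filters.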

30. VLSI2016_30

VLSI-Assisted Nonrigid Registration Using Modified Demons Algorithm

Increasing demand for high-speed portable modules for multimedia applications has motivated the development of hardware-based solutions for image processing. Most nonrigid image registration algorithms are unsuitable for hardware implementation because of their nonlinearity and computationally intensive nature. In this paper, an algorithm for nonrigid image registration based on a Demons approximation is proposed. The algorithm has been simulated in MATLAB, and the results show a 15% improvement in peak signal-to-noise ratio with a 17% reduction in registration time for a 256 × 256 image over the original Demons algorithm. The proposed algorithm is synthesized in Virtex6-xc6vlx760-2-ff1760, and the maximum synthesized frequency is found to be 174 MHz. The proposed architecture provides a low-cost, high-speed solution for the registration process, which is also helpful for building a portable system.

2015-2016

31. VLSI2016_31

Fine-Grained Access Management in Reconfigurable Scan Networks

Modern VLSI designs incorporate a high amount of instrumentation that supports post-silicon validation and debug, volume test and diagnosis, as well as in-field system monitoring and maintenance. Reconfigurable scan architectures, as allowed by the novel IEEE Std 1149.1-2013 (JTAG) and IEEE Std 1687-2014 (IJTAG), emerge as a scalable mechanism for access to such on-chip instruments. While the on-chip instrumentation is crucial for meeting quality, dependability, and time-to-market goals, it is prone to abuse and threatens system safety and security. A secure access management method is mandatory to ensure that critical instruments are accessible to authorized entities only. This work presents a novel protection method for fine-grained access management in complex reconfigurable scan networks based on a challenge-response authentication protocol. The target scan network is extended with an authorization instrument and Secure Segment Insertion Bits (S2IB) that together control the accessibility of individual instruments. To the best of the authors’ knowledge, this is the first fine-grained access management scheme that scales well with the number of protected instruments and offers a high level of security. Compared with recent state-of-the-art techniques, this scheme is more favorable with respect to implementation cost, performance overhead, and provided security level.

2015-2016

32. VLSI2016_32

A High-Throughput VLSI Architecture for Hard and Soft SC-FDMA MIMO Detectors

This paper introduces a novel low-complexity multiple-input multiple-output (MIMO) detector tailored for single-carrier frequency-division multiple access (SC-FDMA) systems and suitable for efficient hardware implementation. The proposed detector starts with an initial estimate of the transmitted signal based on a minimum mean square error (MMSE) detector. Subsequently, it identifies less reliable symbols, for which more candidates in the constellation are browsed to improve the initial estimate. An efficient high-throughput VLSI architecture is also introduced, achieving superior performance compared with conventional MMSE detectors with less than 28% added complexity. The performance of the proposed design is close to that of the existing maximum likelihood post-detection processing (ML-PDP) scheme, while resulting in a significantly lower complexity, with fewer Euclidean distance (ED) calculations in both the 16-QAM and 64-QAM schemes. The proposed design for the 16-QAM scheme is fabricated in a 0.13-µm CMOS technology and fully tested, achieving a 1.332 Gbps throughput, making it the first fabricated design for SC-FDMA MIMO detectors reported to date. A soft version of the proposed architecture is also introduced, which is customized for coded systems.

2015-2016

33. VLSI2016_33

Partially Parallel Encoder Architecture for Long Polar Codes

Due to its channel-achieving property, the polar code has become one of the most favorable error-correcting codes. As the polar code achieves this property asymptotically, however, it should be long enough to have good error-correcting performance. Although the previous fully parallel encoder is intuitive and easy to implement, it is not suitable for long polar codes because of the huge hardware complexity required. In this brief, we analyze the encoding process from the viewpoint of very-large-scale integration implementation and propose a new efficient encoder architecture that is adequate for long polar codes and effective in alleviating the hardware complexity. As the proposed encoder allows high-throughput encoding with small hardware complexity, it can be systematically applied to the design of any polar code and to any level of parallelism.

2015-2016
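To make the hardware cost concrete, the sketch below implements the standard polar transform, multiplication by the n-fold Kronecker power of the kernel F = [[1, 0], [1, 1]] over GF(2), without bit-reversal permutation. A fully parallel encoder instantiates every XOR in this loop nest at once, that is, (N/2)·log2 N XOR gates for code length N. This is a reference model of the encoding function only, not the partially parallel architecture of the brief; the function name is an assumption.

```python
from typing import List, Sequence


def polar_encode(u: Sequence[int]) -> List[int]:
    """Return u multiplied by the Kronecker power of [[1, 0], [1, 1]] over GF(2)."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    x = [bit & 1 for bit in u]
    step = 1
    while step < n:                       # one butterfly stage per iteration
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]       # upper branch gets the XOR; lower passes through
        step *= 2
    return x
```

A partially parallel design would time-multiplex a smaller number of these XOR operations over several clock cycles, trading throughput for area.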

34. VLSI2016_34

Novel Block-Formulation and Area-Delay-Efficient Reconfigurable Interpolation Filter Architecture for Multi-Standard SDR Applications

A polyphase-based interpolation filter computation involves an input matrix and a coefficient matrix whose sizes depend on the up-sampling factor and the filter length. Both matrices are resized when the up-sampling factor changes. In this paper, interpolation filter computation is analyzed for different up-sampling factors to identify redundant computations and remove them by reusing partial results. Reuse of partial results eliminates the need for matrix resizing in interpolation filter computation. A novel block formulation is presented to share the partial results for parallel computation of filter outputs for different up-sampling factors. Using the proposed block formulation, a parallel multiplier-based reconfigurable architecture is derived for the interpolation filter. The most remarkable aspect of the proposed architecture is that it does not require reconfiguration to compute the filter outputs of an interpolation filter for different up-sampling factors. The proposed structure has a regular data flow and, unlike the existing structures, has no overhead complexity for its reconfigurable feature. Besides, the proposed structure has significantly less register complexity than the existing structure, and its register complexity is independent of the block size. Moreover, the proposed structure can support a higher input-sampling frequency than the existing structure. ASIC synthesis results show that the proposed structure for block size 4, filter length 32, and up-sampling factor 8 involves 13.6 times more area and offers 245 times higher maximum input-sampling frequency compared with the existing multiplier-less structure. It involves 18.6 times less area-delay product (ADP) and 9.5 times less energy per output (EPO) than the existing multiplier-less structure.

2015-2016
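The polyphase decomposition that this abstract builds on can be sketched as follows. The first function interpolates by zero-stuffing and filtering; the second computes the same outputs with one sub-filter per output phase, so no multiplication by a stuffed zero is ever performed. This is the textbook decomposition, not the block formulation proposed in the paper, and the function names and the symbol `L` for the up-sampling factor are assumptions.

```python
from typing import List, Sequence


def interpolate_direct(x: Sequence[int], h: Sequence[int], L: int) -> List[int]:
    """Insert L-1 zeros after each sample, then apply the FIR filter h."""
    up: List[int] = []
    for sample in x:
        up.append(sample)
        up.extend([0] * (L - 1))
    y = []
    for n in range(len(up)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * up[n - k]
        y.append(acc)
    return y


def interpolate_polyphase(x: Sequence[int], h: Sequence[int], L: int) -> List[int]:
    """Compute the same outputs using L sub-filters that run at the input rate."""
    phases = [list(h[p::L]) for p in range(L)]   # phase p holds h[p], h[p+L], ...
    y = []
    for n in range(len(x)):
        for p in range(L):                       # output index n*L + p
            acc = 0
            for k, c in enumerate(phases[p]):
                if n - k >= 0:
                    acc += c * x[n - k]
            y.append(acc)
    return y
```

Note that the shape of `phases` depends on `L`, which is the matrix resizing the abstract refers to when the up-sampling factor changes.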

35. VLSI2016_35

One Minimum Only Trellis Decoder for Non-Binary Low-Density Parity-Check Codes

A one-minimum-only decoder for Trellis-EMS (OMO T-EMS) and for Trellis-Min-max (OMO T-MM) is proposed in this paper. In this novel approach, we avoid computing the second minimum in the messages of the check node processor and propose efficient estimators to infer the second minimum value. By doing so, we greatly reduce the complexity and at the same time improve the latency and throughput of the derived architectures compared with existing implementations of EMS and Min-max decoders. This solution has been applied to various NB-LDPC codes constructed over different Galois fields and with different degree distributions, showing in all cases negligible performance loss compared with the ideal EMS and Min-max algorithms. In addition, two complete decoders for OMO T-EMS and OMO T-MM were implemented for the (837,726) NB-LDPC code over GF(32) for comparison purposes. A 90-nm CMOS process was applied, achieving throughputs of 711 Mbps and 818 Mbps, respectively, at a clock frequency of 250 MHz, with areas of 19.02 and 16.10 after place and route. To the best knowledge of the authors, the proposed decoders have higher throughput and area-time efficiency than any other solution for high-rate NB-LDPC codes with high Galois field order.

2015-2016

36. VLSI2016_36

A Low-Cost Hardware Architecture for Illumination Adjustment in Real-Time Applications

For real-time surveillance and safety applications in intelligent transportation systems, high-speed image enhancement is necessary. In this paper, we propose a fast and efficient illumination adjustment algorithm that is suitable for low-cost very-large-scale integration implementation. Experimental results show that the proposed method requires the fewest operations and achieves visual quality comparable to that of previous techniques. To further meet the requirements of real-time image/video applications, a 16-stage pipelined hardware architecture of our method is implemented as an intellectual property core. Our design yields a processing rate of about 200 MHz using TSMC 0.13-μm technology. Since it can process one pixel per clock cycle, an image with QSXGA resolution (2560 × 2048) takes about 27 ms per frame, which is suitable for real-time applications. In some low-cost intelligent imaging systems, the processing rate can be slowed down, and our hardware core can run at very low power consumption.

2015-2016

37. VLSI2016_37

A 2.5-Gb/s DLL-Based Burst-Mode Clock and Data Recovery Circuit With 4× Oversampling

In this brief, a delay-locked loop (DLL)-based burst-mode clock and data recovery (BMCDR) circuit using a 4× oversampling technique is realized for passive optical networks. With the help of the DLL to track the input phase, the proposed circuit can recover burst-mode data in a short acquisition time and achieve large jitter tolerance. In addition, a 2.5-GHz four-phase clock generator is embedded in the chip. The circuit is implemented in a 0.18-µm CMOS technology, and experiments show that acquisition is accomplished within 31 bit times. For incoming 2.5-Gb/s data with a 2^31 − 1 pseudorandom binary sequence, the retimed data has a root-mean-square jitter of 8.557 ps and a peak-to-peak jitter of 32.0 ps, and the measured bit error rate is less than 10^−10. The area of the whole chip is 1.4 × 1.4 mm², of which the BMCDR circuit core occupies 0.81 × 0.325 mm². The total power consumption is 130 mW from a 1.8 V supply voltage.

2015-2016

38. VLSI2016_38

Aging-Aware Reliable Multiplier Design With Adaptive Hold Logic

Digital multipliers are among the most critical arithmetic functional units. The overall performance of these systems depends on the throughput of the multiplier. Meanwhile, the negative bias temperature instability effect occurs when a pMOS transistor is under negative bias (Vgs = −Vdd), increasing the threshold voltage of the pMOS transistor and reducing multiplier speed. A similar phenomenon, positive bias temperature instability, occurs when an nMOS transistor is under positive bias. Both effects degrade transistor speed, and in the long term, the system may fail due to timing violations. Therefore, it is important to design reliable high-performance multipliers. In this paper, we propose an aging-aware multiplier design with a novel adaptive hold logic (AHL) circuit. The multiplier is able to provide higher throughput through variable latency and can adjust the AHL circuit to mitigate the performance degradation caused by the aging effect. Moreover, the proposed architecture can be applied to a column- or row-bypassing multiplier. The experimental results show that our proposed architecture with 16 × 16 and 32 × 32 column-bypassing multipliers can attain up to 62.88% and 76.28% performance improvement, respectively, compared with 16 × 16 and 32 × 32 fixed-latency column-bypassing multipliers. Furthermore, our proposed architecture with 16 × 16 and 32 × 32 row-bypassing multipliers can achieve up to 80.17% and 69.40% performance improvement compared with 16 × 16 and 32 × 32 fixed-latency row-bypassing multipliers.

2015-2016

39. VLSI2016_39

Reverse Converter Design via Parallel-Prefix Adders: Novel Components, Methodology, and Implementations

In this brief, the implementation of residue number system reverse converters based on well-known regular and modular parallel-prefix adders is analyzed. The VLSI implementation results show a significant delay reduction and area × time² improvements, at the cost of higher power consumption, which is the main reason preventing the use of parallel-prefix adders to achieve high-speed reverse converters in today's systems. Hence, to solve the high power consumption problem, novel specific hybrid parallel-prefix-based adder components that provide a better tradeoff between delay and power consumption are presented for designing reverse converters. A methodology is also described for designing reverse converters based on different kinds of prefix adders. This methodology helps the designer adjust the performance of the reverse converter according to the target application and existing constraints.

2015-2016

40. VLSI2016_40

Fully Reused VLSI Architecture of FM0/Manchester Encoding Using SOLS Technique for DSRC Applications

Dedicated short-range communication (DSRC) is an emerging technique for bringing intelligent transportation systems into our daily life. The DSRC standards generally adopt FM0 and Manchester codes to reach dc-balance, enhancing the signal reliability. Nevertheless, the coding diversity between the FM0 and Manchester codes seriously limits the potential to design a fully reused VLSI architecture for both. In this paper, the similarity-oriented logic simplification (SOLS) technique is proposed to overcome this limitation. The SOLS technique improves the hardware utilization rate from 57.14% to 100% for both FM0 and Manchester encodings. The performance is evaluated by post-layout simulation in Taiwan Semiconductor Manufacturing Company (TSMC) 0.18-µm 1P6M CMOS technology. The maximum operating frequency is 2 GHz and 900 MHz for Manchester and FM0 encodings, respectively. The power consumption is 1.58 mW at 2 GHz for Manchester encoding and 1.14 mW at 900 MHz for FM0 encoding. The core circuit area is 65.98 × 30.43 µm². The encoding capability of this design can fully support the DSRC standards of America, Europe, and Japan.

2015-2016
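The two line codes named in this abstract can be modeled behaviorally as below, with each input bit producing two half-bit output levels. Manchester is modeled as the data bit XORed with a clock that is high in the first half of the bit period; that polarity is an assumed convention. FM0 places a transition at every bit boundary and an extra mid-bit transition only for a 0. The sketch shows the coding rules only, not the SOLS hardware, and the function names and `initial_level` parameter are assumptions.

```python
from typing import List, Sequence


def manchester_encode(bits: Sequence[int]) -> List[int]:
    """Emit (X xor CLK) for CLK = 1 then CLK = 0 within each bit period."""
    out: List[int] = []
    for x in bits:
        out.extend((x ^ 1, x ^ 0))
    return out


def fm0_encode(bits: Sequence[int], initial_level: int = 0) -> List[int]:
    """Emit two half-bit levels per bit following the FM0 transition rules."""
    out: List[int] = []
    level = initial_level            # level of the most recent half-bit
    for x in bits:
        first = level ^ 1            # a transition always occurs at the bit boundary
        second = first ^ 1 if x == 0 else first   # mid-bit transition only for a 0
        out.extend((first, second))
        level = second
    return out
```

The Manchester output depends only on the current bit, whereas FM0 depends on the previous output level; this difference in structure is the coding diversity that makes sharing one datapath between the two codes difficult.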



41. VLSI2016_41

A Fast-Acquisition All-Digital Delay-Locked Loop Using a Starting-Bit Prediction Algorithm for the Successive-Approximation Register

2015-2016

42. VLSI2016_42

A Fully Digital Front-End Architecture for ECG Acquisition System With 0.5 V Supply

2015-2016

43. VLSI2016_43

A Low-Power Robust Easily Cascaded PentaMTJ-Based Combinational and Sequential Circuits

2015-2016

44. VLSI2016_44

A Mixed-Decimation MDF Architecture for Radix-2^k Parallel FFT

2015-2016

45. VLSI2016_45

A SUC-Based Full-Binary 6-bit 3.1-GS/s 17.7-mW Current-Steering DAC in 0.038 mm²

2015-2016

46. VLSI2016_46

Argo: A Real-Time Network-on-Chip Architecture With an Efficient GALS Implementation

2015-2016

47. VLSI2016_47

Design and Low-Complexity Implementation of Matrix–Vector Multiplier for Iterative Methods in Communication Systems

2015-2016

48. VLSI2016_48

Energy and Area Efficient Three-Input XOR/XNORs With Systematic Cell Design Methodology

2015-2016

49. VLSI2016_49

Fault Tolerant Parallel FFTs Using Error Correction Codes and Parseval Checks

2015-2016

50. VLSI2016_50

Graph-Based Transistor Network Generation Method for Supergate Design

2015-2016

51. VLSI2016_51

High-Speed and Energy-Efficient Carry Skip Adder Operating Under a Wide Range of Supply Voltage Levels

2015-2016

52. VLSI2016_52

High-Throughput Power-Efficient VLSI Architecture of Fractional Motion Estimation for Ultra-HD HEVC Video Encoding

2015-2016

53. VLSI2016_53

A Spread Spectrum Clock Generator Using a Programmable Linear Frequency Modulator for Multipurpose Electronic Devices

2015-2016

54. VLSI2016_54

Floating-Point Butterfly Architecture Based on Binary Signed-Digit Representation

2015-2016

55. VLSI2016_55

Further Desensitized FIR Halfband Filters

2015-2016

56. VLSI2016_56

A Modified Partial Product Generator for Redundant Binary Multipliers

2015-2016

57. VLSI2016_57

Implementation of Arithmetic Operations with Time-free Spiking Neural P Systems

2015-2016

58. VLSI2016_58

A Clock and Data Recovery Circuit With Programmable Multi-Level Phase Detector Characteristics and a Built-in Jitter Monitor

2015-2016

59. VLSI2016_59

Unfaithful Glitch Propagation in Existing Binary Circuit Models

2015-2016

60. VLSI2016_60

Early Skip Mode Decision for HEVC Encoder With Emphasis on Coding Quality

2015-2016

61. VLSI2016_61

Two-Step Optimization Approach for the Design of Multiplierless Linear-Phase FIR Filters

2015-2016

62. VLSI2016_62

Energy Consumption of VLSI Decoders

2015-2016

63. VLSI2016_63

Timing Error Tolerance in Small Core Designs for SoC Applications

2015-2016

64. VLSI2016_64

40-Gb/s 0.7-V 2:1 MUX and 1:2 DEMUX with Transformer-Coupled Technique for SerDes Interface

2015-2016

65. VLSI2016_65

Design and Analysis of Inexact Floating-Point Adders

2015-2016

66. VLSI2016_66

In-Field Test for Permanent Faults in FIFO Buffers of NoC Routers

2015-2016

67. VLSI2016_67

Low-Cost High-Performance VLSI Architecture for Montgomery Modular Multiplication

2015-2016

68. VLSI2016_68


Levels

2015-2016

69. VLSI2016_69

Dual-Phase Tapped-Delay-Line Time-to-Digital Converter With On-the-Fly Calibration Implemented in 40 nm FPGA

2015-2016

70. VLSI2016_70

A Low Power and High Sensing Margin Non-Volatile Full Adder Using Racetrack Memory

2015-2016

71. VLSI2016_71

Signal Design for Multiple Antenna Systems With Spatial Multiplexing and Noncoherent Reception

2015-2016

72. VLSI2016_72

Synthesis of Genetic Clock with Combinational Biologic Circuits

2015-2016

73. VLSI2016_73

Aging-Aware Reliable Multiplier Design With Adaptive Hold Logic

2015-2016

74. VLSI2016_74

Fault Tolerant Parallel Filters Based on Error Correction Codes

2015-2016

75. VLSI2016_75

Design and Analysis of Approximate Compressors for Multiplication

2015-2016

76. VLSI2016_76

Novel Design Algorithm for Low Complexity Programmable FIR Filters Based on Extended Double Base Number Systems

2015-2016

77. VLSI2016_77

An Accuracy-Adjustment Fixed-Width Booth Multiplier Based on Multilevel Conditional Probability

2015-2016

78. VLSI2016_78

Floating-Point Butterfly Architecture Based on Binary Signed-Digit Representation

2015-2016

79. VLSI2016_79

Implementation of Subthreshold Adiabatic Logic for Ultralow-Power Application

2015-2016

80. VLSI2016_80

Novel Block-Formulation and Area-Delay-Efficient Reconfigurable Interpolation Filter Architecture for Multi-Standard SDR Applications

2015-2016

81. VLSI2016_81

A High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications

2015-2016

82. VLSI2016_82


Levels

2015-2016

83. VLSI2016_83

Low-Power and Area-Efficient Shift Register Using Pulsed Latches

2015-2016

84. VLSI2016_84

Array-Based Approximate Arithmetic Computing: A General Model and Applications to Multiplier and Squarer Design

2015-2016

85. VLSI2016_85

Recursive Approach to the Design of a Parallel Self-Timed Adder

2015-2016

86. VLSI2016_86

Further Desensitized FIR Halfband Filters

2015-2016

87. VLSI2016_87

Design and Analysis of Inexact Floating-Point Adders

2015-2016

88. VLSI2016_88

Scalable Verification of a Generic End-Around-Carry Adder for Floating-Point Units by Coq

2015-2016

89. VLSI2016_89

An Efficient Constant Multiplier Architecture Based on Vertical-Horizontal Binary Common Sub-expression Elimination Algorithm for Reconfigurable FIR Filter Synthesis

2015-2016

90. VLSI2016_90

A Generalized Algorithm and Reconfigurable Architecture for Efficient and Scalable Orthogonal Approximation of DCT

2015-2016


