Bevan M. Baas

Proceedings of the 18th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2025

Regular Expression Processing on A Many-Core Platform.

Sagar Sajeev

Proceedings of the 18th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2025

Optimizing Internal Data Width in FFT Accelerators for Signal-to-Noise Ratio and Area Efficiency.

Daniel A. Chevy

Proceedings of the 59th Asilomar Conference on Signals, 2025

2023

Energy-efficient canonical Huffman decoders on many-core processor arrays and FPGAs.

Satyabrata Sarangi

Integr., 2023

Many-Core Display Stream Compression Decoders With Simplified Pixel Prediction.

Proceedings of the 66th IEEE International Midwest Symposium on Circuits and Systems, 2023

A Scalable JPEG Encoder on a Many-Core Array.

Thomas Abbott

Proceedings of the 16th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2023

2022

Architecture and 28 nm CMOS Design of a 1886 MBin/sec Context-Adaptive Binary Arithmetic Coder (CABAC) Encoder.

Renjie Chen

Proceedings of the 30th IFIP/IEEE International Conference on Very Large Scale Integration, 2022

A Low-Overhead Method for the Accurate Estimation of the Maximum Operating Clock Frequency.

Proceedings of the 30th IFIP/IEEE International Conference on Very Large Scale Integration, 2022

Efficient and High-Performance Sparse Matrix-Vector Multiplication on a Many-Core Array.

Peiyao Shi

Proceedings of the 15th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2022

2021

Display Stream Compression Decoders for Fine-Grained Many-Core Processor Arrays.

IEEE Trans. Circuits Syst. II Express Briefs, 2021

DeepScaleTool: A Tool for the Accurate Estimation of Technology Scaling in the Deep-Submicron Era.

Satyabrata Sarangi

Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

A 1448-Mpixel/s, 84-pJ/Pixel Display Stream Compression Encoder in 28 nm for 4K Video Resolution.

Proceedings of the IEEE Asian Solid-State Circuits Conference, 2021

Canonical Huffman Decoder on Fine-grain Many-core Processor Arrays.

Satyabrata Sarangi

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

2020

Correction to: Guest Editorial: JSPS Special Issue on 2018 IEEE Signal Processing Systems (SiPS) Workshop.

Tokunbo Ogunfunmi

John McAllister

Mrityunjoy Chakraborty

J. Signal Process. Syst., 2020

Guest Editorial: JSPS Special Issue on 2018 IEEE Signal Processing Systems (SiPS) Workshop.

Tokunbo Ogunfunmi

John McAllister

Mrityunjoy Chakraborty

J. Signal Process. Syst., 2020

Scalable energy-efficient parallel sorting on a fine-grained many-core processor array.

J. Parallel Distributed Comput., 2020

Indexed Color History Many-Core Engines for Display Stream Compression Decoders.

Proceedings of the 27th IEEE International Conference on Electronics, Circuits and Systems, 2020

2019

Corrigendum to "Scaling equations for the accurate prediction of CMOS device performance from 180 nm to 7 nm" [Integr. VLSI J. 58. (2017) 74-81].

Integr., 2019

2018

A Low-Cost Slice Interleaving DSC Decoder Architecture for Real-Time 8K Video Decoding.

Proceedings of the IEEE 61st International Midwest Symposium on Circuits and Systems, 2018

Display Stream Compression Encoder Architectures for Real-time 4K and 8K Video Encoding.

Proceedings of the 52nd Asilomar Conference on Signals, Systems, and Computers, 2018

2017

Hybrid Hardware/Software Floating-Point Implementations for Optimized Area and Throughput Tradeoffs.

Jon J. Pimentel

Brent Bohnenstiehl

IEEE Trans. Very Large Scale Integr. Syst., 2017

Editorial.

IEEE Trans. Very Large Scale Integr. Syst., 2017

KiloCore: A Fine-Grained 1,000-Processor Array for Task-Parallel Applications.

IEEE Micro, 2017

KiloCore: A 32-nm 1000-Processor Computational Array.

IEEE J. Solid State Circuits, 2017

Scaling equations for the accurate prediction of CMOS device performance from 180 nm to 7 nm.

Integr., 2017

A configurable H.265-compatible motion estimation accelerator architecture for realtime 4K video encoding in 65 nm CMOS.

Michael Braly

Proceedings of the IEEE Conference on Dependable and Secure Computing, 2017

2016

A 5.8 pJ/Op 115 billion ops/sec, to 1.78 trillion ops/sec 32nm 1000-processor array.

Proceedings of the 2016 IEEE Symposium on VLSI Circuits, 2016

KiloCore: A 32 nm 1000-processor array.

Proceedings of the 2016 IEEE Hot Chips 28 Symposium (HCS), 2016

2015

Optimizing power of many-core systems by exploiting dynamic voltage, frequency and core scaling.

Mohammad H. Foroozannejad

Soheil Ghiasi

Proceedings of the IEEE 58th International Midwest Symposium on Circuits and Systems, 2015

Area efficient backprojection computation with reduced floating-point word width for SAR image formation.

Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, 2015

A software LDPC decoder implemented on a many-core array of programmable processors.

Brent Bohnenstiehl

Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, 2015

2014

Processor Tile Shapes and Interconnect Topologies for Dense On-Chip Networks.

IEEE Trans. Very Large Scale Integr. Syst., 2014

Achieving High-Performance On-Chip Networks With Shared-Buffer Routers.

Mohammad H. Foroozannejad

IEEE Trans. Very Large Scale Integr. Syst., 2014

Time-Scalable Mapping for Circuit-Switched GALS Chip Multiprocessor Platforms.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Hybrid floating-point modules with low area overhead on a fine-grained processing core.

Jon J. Pimentel

Proceedings of the 48th Asilomar Conference on Signals, Systems and Computers, 2014

Scalable hardware-based power management for many-core systems.

Brent Bohnenstiehl

Proceedings of the 48th Asilomar Conference on Signals, Systems and Computers, 2014

2013

LDPC Decoder with an Adaptive Wordwidth Datapath for Energy and BER Co-Optimization.

Houshmand Shirani-mehr

VLSI Design, 2013

Parallel AES Encryption Engines for Many-Core Processor Arrays.

IEEE Trans. Computers, 2013

2012

A Hexagonal Processor and Interconnect Topology for Many-Core Architecture with Dense On-Chip Networks.

Proceedings of the VLSI-SoC: From Algorithms to Circuits and System-on-Chip Design, 2012

A hexagonal shaped processor and interconnect topology for tightly-tiled many-core architecture.

Proceedings of the 20th IEEE/IFIP International Conference on VLSI and System-on-Chip, 2012

Fine-Grained Energy-Efficient Sorting on a Many-Core Processor Array.

Lucas Stillmaker

Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

2011

A 1080p H.264/AVC Baseline Residual Encoder for a Fine-Grained Many-Core System.

IEEE Trans. Circuits Syst. Video Technol., 2011

Low power LDPC decoder with efficient stopping scheme for undecodable blocks.

Houshmand Shirani-mehr

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

RoShaQ: High-performance on-chip router with shared queues.

Proceedings of the IEEE 29th International Conference on Computer Design, 2011

A fine-grained parallel implementation of a H.264/AVC encoder on a 167-processor computational platform.

Stephen Le

Proceedings of the Conference Record of the Forty Fifth Asilomar Conference on Signals, 2011

A reduced routing network architecture for partial parallel LDPC decoders.

Houshmand Shirani-mehr

Proceedings of the Conference Record of the Forty Fifth Asilomar Conference on Signals, 2011

A high-performance area-efficient AES cipher on a many-core platform.

Proceedings of the Conference Record of the Forty Fifth Asilomar Conference on Signals, 2011

2010

A Split-Decoding Message Passing Algorithm for Low Density Parity Check Decoders.

J. Signal Process. Syst., 2010

A Low-Area Multi-Link Interconnect Architecture for GALS Chip Multiprocessors.

IEEE Trans. Very Large Scale Integr. Syst., 2010

A Low-Complexity Message-Passing Algorithm for Reduced Routing Congestion in LDPC Decoders.

IEEE Trans. Circuits Syst. I Regul. Pap., 2010

A Reconfigurable Source-Synchronous On-Chip Network for GALS Many-Core Platforms.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2010

Circuit modeling for practical many-core architecture design exploration.

Dean Truong

Proceedings of the 47th Design Automation Conference, 2010

2009

High Performance, Energy Efficiency, and Scalability With GALS Chip Multiprocessors.

IEEE Trans. Very Large Scale Integr. Syst., 2009

A 167-Processor Computational Platform in 65 nm CMOS.

IEEE J. Solid State Circuits, 2009

A GALS many-core heterogeneous DSP platform with source-synchronous on-chip interconnection network.

Dean Truong

Proceedings of the Third International Symposium on Networks-on-Chips, 2009

A Low-cost High-speed Source-synchronous Interconnection Technique for GALS Chip Multiprocessors.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Multi-Split-Row Threshold Decoding Implementations for LDPC Codes.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

The Design of a Reconfigurable Continuous-flow Mixed-radix FFT Processor.

Anthony T. Jacobson

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

An Improved Split-Row Threshold Decoding Algorithm for LDPC Codes.

Dean Truong

Proceedings of IEEE International Conference on Communications, 2009

2008

Architecture and Evaluation of an Asynchronous Array of Simple Processors.

J. Signal Process. Syst., 2008

AsAP: An Asynchronous Array of Simple Processors.

IEEE J. Solid State Circuits, 2008

A low-area interconnect architecture for chip multiprocessors.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

Dynamic voltage and frequency scaling circuits with two supply voltages.

Wayne H. Cheng

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

A high-performance parallel CAVLC encoder on a fine-grained many-core system.

Proceedings of the 26th International Conference on Computer Design, 2008

A complete real-time 802.11a baseband receiver implemented on an array of programmable processors.

Proceedings of the 42nd Asilomar Conference on Signals, Systems and Computers, 2008

A thresholding algorithm for improved Split-Row decoding of LDPC codes.

Pascal Urard

Proceedings of the 42nd Asilomar Conference on Signals, Systems and Computers, 2008

2007

A Scalable Dual-Clock FIFO for Data Transfers Between Arbitrary and Haltable Clock Domains.

IEEE Trans. Very Large Scale Integr. Syst., 2007

AsAP: A Fine-Grained Many-Core Platform for DSP Applications.

IEEE Micro, 2007

A Shared Memory Module for Asynchronous Arrays of Processors.

Michael J. Meeuwsen

EURASIP J. Embed. Syst., 2007

High-Throughput LDPC Decoders Using A Multiple Split-Row Method.

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

Performance and Power Analysis of Globally Asynchronous Locally Synchronous Multi-Processor Systems.

Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006

An asynchronous array of simple processors for DSP applications.

Proceedings of the 2006 IEEE International Solid State Circuits Conference, 2006

Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles.

Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

Split-Row: A Reduced Complexity, High Throughput LDPC Decoder Architecture.

Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

Hardware and applications of AsAP: An asynchronous array of simple processors.

Proceedings of the 2006 IEEE Hot Chips 18 Symposium (HCS), 2006

2005

A generalized cached-FFT algorithm.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

1999

A low-power, high-performance, 1024-point FFT processor.

IEEE J. Solid State Circuits, 1999

1998

A 9.5 mW 330 μsec 1024-point FFT processor.
