Zhongfeng Wang

Orcid: 0000-0002-7227-4786

Affiliations:
  • Nanjing University, School of Electronic Science and Engineering, Nanjing, China
  • Broadcom Corporation, San Jose, CA, USA (former)
  • Oregon State University, Corvallis, OR, USA (former)
  • University of Minnesota, Minneapolis, MN, USA (former, PhD 2000)


According to our database, Zhongfeng Wang authored at least 319 papers between 2004 and 2024.

Bibliography

2024
ALT: Area-Efficient and Low-Latency FPGA Design for Torus Fully Homomorphic Encryption.
IEEE Trans. Very Large Scale Integr. Syst., April, 2024

Low-Latency PAE: Permutation-Based Address Encryption Hardware Engine for IoT Real-Time Memory Protection.
IEEE Internet Things J., April, 2024

Mixed Integer Programming based Placement Refinement by RSMT Model with Movable Pins.
ACM Trans. Design Autom. Electr. Syst., March, 2024

Hardware Accelerator Design for Sparse DNN Inference and Training: A Tutorial.
IEEE Trans. Circuits Syst. II Express Briefs, March, 2024

Correlated Channel-Oriented Expectation Propagation-Based Detector for Massive MIMO Systems.
IEEE Trans. Circuits Syst. I Regul. Pap., March, 2024

WinTA: An Efficient Reconfigurable CNN Training Accelerator With Decomposition Winograd.
IEEE Trans. Circuits Syst. I Regul. Pap., February, 2024

Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2024

RISC-V Custom Instructions of Elementary Functions for IoT Endpoint Devices.
IEEE Trans. Computers, February, 2024

NASA-F: FPGA-Oriented Search and Acceleration for Multiplication-Reduced Hybrid Networks.
IEEE Trans. Circuits Syst. I Regul. Pap., January, 2024

CSI-Based MIMO Indoor Positioning Using Attention-Aided Deep Learning.
IEEE Commun. Lett., January, 2024

An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT.
CoRR, 2024

An FPGA-Based Accelerator Enabling Efficient Support for CNNs with Arbitrary Kernel Sizes.
CoRR, 2024

A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference.
CoRR, 2024

BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge.
CoRR, 2024

2023
An Efficient Training Accelerator for Transformers With Hardware-Algorithm Co-Optimization.
IEEE Trans. Very Large Scale Integr. Syst., November, 2023

A Low-Latency Framework With Algorithm-Hardware Co-Optimization for 3-D Point Cloud.
IEEE Trans. Circuits Syst. II Express Briefs, November, 2023

ProMiSE: A High-Performance Programmable Hardware Monitor for High Security Enforcement of Software Execution.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

1+1 <2: Efficient Automatic Standard Cell Sharing Between Digital VLSI Designs for Area Saving.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

ETA: An Efficient Training Accelerator for DNNs Based on Hardware-Algorithm Co-Optimization.
IEEE Trans. Neural Networks Learn. Syst., October, 2023

A New ACD-OMP Accelerator With Clustered Computing Look-Ahead.
IEEE Trans. Very Large Scale Integr. Syst., September, 2023

A Unified Acceleration Solution Based on Deformable Network for Image Pixel Processing.
IEEE Trans. Circuits Syst. II Express Briefs, September, 2023

Automatic Model-Based Dataset Generation for High-Level Vision Tasks of Autonomous Driving in Haze Weather.
IEEE Trans. Ind. Informatics, August, 2023

Fast Hardware Implementation for Extended GCD of Large Numbers in Redundant Representation.
IEEE Trans. Circuits Syst. II Express Briefs, August, 2023

ReAFM: A Reconfigurable Nonlinear Activation Function Module for Neural Networks.
IEEE Trans. Circuits Syst. II Express Briefs, July, 2023

FTA-GAN: A Computation-Efficient Accelerator for GANs With Fast Transformation Algorithm.
IEEE Trans. Neural Networks Learn. Syst., June, 2023

Low-latency Hardware Architecture for VDF Evaluation in Class Groups.
IEEE Trans. Computers, June, 2023

An Efficient Massive MIMO Detector Based on Approximate Expectation Propagation.
IEEE Trans. Very Large Scale Integr. Syst., May, 2023

Rethinking Parallel Memory Access Pattern in Number Theoretic Transform Design.
IEEE Trans. Circuits Syst. II Express Briefs, May, 2023

3.8-Gbps Polar Belief Propagation Decoder on GPU.
IEEE Commun. Lett., May, 2023

An Adaptive Chase-Pyndiah Algorithm for Turbo Product Codes.
IEEE Commun. Lett., April, 2023

AC-PM: An Area-Efficient and Configurable Polynomial Multiplier for Lattice Based Cryptography.
IEEE Trans. Circuits Syst. I Regul. Pap., February, 2023

A High-Speed FPGA-Based Hardware Implementation for Leighton-Micali Signature.
IEEE Trans. Circuits Syst. I Regul. Pap., January, 2023

Fast Successive-Cancellation Decoding of 5G Parity-Check Polar Codes.
IEEE Commun. Lett., January, 2023

Dual-Bit-Wise Stochastic Decoding for Polar Codes.
IEEE Trans. Signal Process., 2023

Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT.
IEEE Trans. Signal Process., 2023

GANDSE: Generative Adversarial Network-based Design Space Exploration for Neural Network Accelerator Design.
ACM Trans. Design Autom. Electr. Syst., 2023

Intelligent Typography: Artistic Text Style Transfer for Complex Texture and Structure.
IEEE Trans. Multim., 2023

Low-Latency Design and Implementation of the Squaring in Class Groups for Verifiable Delay Function Using Redundant Representation.
IACR Trans. Cryptogr. Hardw. Embed. Syst., 2023

An Efficient Accelerator Based on Lightweight Deformable 3D-CNN for Video Super-Resolution.
IEEE Trans. Circuits Syst. I Regul. Pap., 2023

NASA+: Neural Architecture Search and Acceleration for Multiplication-Reduced Hybrid Networks.
IEEE Trans. Circuits Syst. I Regul. Pap., 2023

Reconfigurable and High-Efficiency Polynomial Multiplication Accelerator for CRYSTALS-Kyber.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

A Computationally Efficient Neural Video Compression Accelerator Based on a Sparse CNN-Transformer Hybrid Network.
CoRR, 2023

Efficient N: M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design.
CoRR, 2023

A Precision-Scalable RISC-V DNN Processor with On-Device Learning Capability at the Extreme Edge.
CoRR, 2023

S2R: Exploring a Double-Win Transformer-Based Framework for Ideal and Blind Super-Resolution.
CoRR, 2023

An Efficient Hardware Design for Fast Implementation of HQC.
Proceedings of the 36th IEEE International System-on-Chip Conference, 2023

An FPGA-Based Reconfigurable CNN Training Accelerator Using Decomposable Winograd.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2023

Column-Weighted Probabilistic GDBF Decoder for Irregular LDPC Codes.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2023

Efficient Decryption Architecture for Classic McEliece.
Proceedings of the 24th International Symposium on Quality Electronic Design, 2023

High-Throughput Hardware Implementation for Haraka in SPHINCS+.
Proceedings of the 24th International Symposium on Quality Electronic Design, 2023

Efficient FPGA-Based Accelerator of the L-BFGS Algorithm for IoT Applications.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

Bebert: Efficient And Robust Binary Ensemble Bert.
Proceedings of the IEEE International Conference on Acoustics, 2023

ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

CEST: Computation-Efficient N:M Sparse Training for Deep Neural Networks.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

2022
A Universal Efficient Circular-Shift Network for Reconfigurable Quasi-Cyclic LDPC Decoders.
IEEE Trans. Very Large Scale Integr. Syst., 2022

THETA: A High-Efficiency Training Accelerator for DNNs With Triple-Side Sparsity Exploration.
IEEE Trans. Very Large Scale Integr. Syst., 2022

Efficient Homomorphic Convolution Designs on FPGA for Secure Inference.
IEEE Trans. Very Large Scale Integr. Syst., 2022

An Algorithm-Hardware Co-Optimized Framework for Accelerating N: M Sparse Transformers.
IEEE Trans. Very Large Scale Integr. Syst., 2022

An Efficient Reconfigurable Encoder for the IEEE 1901 Standard.
IEEE Trans. Very Large Scale Integr. Syst., 2022

An Efficient High-Throughput Structured-Light Depth Engine.
IEEE Trans. Very Large Scale Integr. Syst., 2022

Rethinking Adaptive Computing: Building a Unified Model Complexity-Reduction Framework With Adversarial Robustness.
IEEE Trans. Neural Networks Learn. Syst., 2022

FACCU: Enable Fast Accumulation for High-Speed DSP Systems.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

High-Throughput LDPC-CC Decoders Based on Storage, Arithmetic, and Control Improvements.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

A Flexible and Efficient FPGA Accelerator for Various Large-Scale and Lightweight CNNs.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

An Area-Efficient Message Passing Detector for Massive MIMO Systems.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

Hybrid Stochastic-Binary Computing for Low-Latency and High-Precision Inference of CNNs.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

Efficient Software Implementation of the SIKE Protocol Using a New Data Representation.
IEEE Trans. Computers, 2022

RvDfi: A RISC-V Architecture With Security Enforcement by High Performance Complete Data-Flow Integrity.
IEEE Trans. Computers, 2022

A low latency traffic sign detection model with an automatic data labeling pipeline.
Neural Comput. Appl., 2022

LDPC decoding with locally informed dynamic scheduling based on the law of large numbers.
IET Commun., 2022

Iterative Hard Thresholding Algorithm-Based Detector for Compressed OFDM-IM Systems.
IEEE Commun. Lett., 2022

Automatically search an optimal face detector for a specific deployment environment.
EURASIP J. Adv. Signal Process., 2022

Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT.
CoRR, 2022

A Reliability Profile Based Low-Complexity Dynamic Schedule LDPC Decoding.
IEEE Access, 2022

Forecasting Stock Indexes with Metabolic DWT and MWA-GM(1,1).
Proceedings of the 14th International Conference on Wireless Communications and Signal Processing, 2022

Reduction-Free Multiplication for Finite Fields and Polynomial Rings.
Proceedings of the Arithmetic of Finite Fields - 9th International Workshop, 2022

An Efficient FPGA Accelerator for Point Cloud.
Proceedings of the 35th IEEE International System-on-Chip Conference, 2022

A Modified BP Bit-Flipping Algorithm for Polar Codes.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2022

An Efficient Accelerator of Deformable 3D Convolutional Network for Video Super-Resolution.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

An RS-BCH Concatenated FEC Code for Beyond 400 Gb/s Networking.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Accelerating NLP Tasks on FPGA with Compressed BERT and a Hardware-Oriented Early Exit Method.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

An Efficient FPGA-based Accelerator for Deep Forest.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

A Reconfigurable Approach for Deconvolutional Network Acceleration with Fast Algorithm.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

A High-Speed Codec Architecture for Lagrange Coded Computing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Deep Neural Network Interlayer Feature Map Compression Based on Least-Squares Fitting.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Accelerate Three-Dimensional Generative Adversarial Networks Using Fast Algorithm.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

UCViT: Hardware-Friendly Vision Transformer via Unified Compression.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

An Efficient Hardware Architecture for DNN Training by Exploiting Triple Sparsity.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

An Efficient Hardware Accelerator for Sparse Transformer Neural Networks.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Magical-Decomposition: Winning Both Adversarial Robustness and Efficiency on Hardware.
Proceedings of the International Conference on Machine Learning and Cybernetics, 2022

NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

Boosting Both Robustness and Hardware Efficiency via Random Pruning Mask Selection.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2022, 2022

PREFENDER: A Prefetching Defender against Cache Side Channel Attacks as A Pretender.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

FPGA-Accelerated Maze Routing Kernel for VLSI Designs.
Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

Low-Complexity Dynamic Single-Minimum Min-Sum Algorithm and Hardware Implementation for LDPC Codes.
Proceedings of the IEEE Asia Pacific Conference on Circuit and Systems, 2022

Low-Complexity Parallel Syndrome Computation for BCH Decoders Based on Cyclotomic FFT.
Proceedings of the IEEE Asia Pacific Conference on Circuit and Systems, 2022

High-Speed and Low-Complexity Modular Reduction Design for CRYSTALS-Kyber.
Proceedings of the IEEE Asia Pacific Conference on Circuit and Systems, 2022

A Novel Interleaving Scheme for Concatenated Codes on Burst-Error Channel.
Proceedings of the 27th Asia Pacific Conference on Communications, 2022

Performance Analysis of Extended Integrated Interleaved Codes.
Proceedings of the 27th Asia Pacific Conference on Communications, 2022

An Efficient CNN Training Accelerator Leveraging Transposable Block Sparsity.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

2021
CLA Formula and its Acceleration of Architecture Design for Clustered Look-Ahead Pipelined Recursive Digital Filter.
J. Signal Process. Syst., 2021

Fast Modular Multipliers for Supersingular Isogeny-Based Post-Quantum Cryptography.
IEEE Trans. Very Large Scale Integr. Syst., 2021

An Efficient and Flexible Accelerator Design for Sparse Convolutional Neural Networks.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

High-Speed FPGA Implementation of SIKE Based on an Ultra-Low-Latency Modular Multiplier.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Generalized Analog-to-Information Converter With Analysis Sparse Prior.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Low-Latency Hardware Accelerator for Improved Engle-Granger Cointegration in Pairs Trading.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Design of High-Performance and Area-Efficient Decoder for 5G LDPC Codes.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Evaluations on Deep Neural Networks Training Using Posit Number System.
IEEE Trans. Computers, 2021

Low-complexity sphere decoding for MIMO-SCMA systems.
IET Commun., 2021

An Improved Reliability-Based Decoding Algorithm for NB-LDPC Codes.
IEEE Commun. Lett., 2021

An Improved Method for Performance Analysis of Generalized Integrated Interleaved Codes.
IEEE Commun. Lett., 2021

A High-Speed Architecture for the Reduction in VDF Based on a Class Group.
IACR Cryptol. ePrint Arch., 2021

Automatic Generation of Dynamic Inference Architecture for Deep Neural Networks.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2021

A Memory-Efficient Hardware Architecture for Deformable Convolutional Networks.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2021

A Stage-wise Conversion Strategy for Low-Latency Deformable Spiking CNN.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2021

A Reconfigurable Accelerator for Generative Adversarial Network Training Based on FPGA.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2021

An FPGA-Based Reconfigurable Accelerator for Low-Bit DNN Training.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2021

PipeBSW: A Two-Stage Pipeline Structure for Banded Smith-Waterman Algorithm on FPGA.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2021

Counter Random Gradient Descent Bit-Flipping Decoder for LDPC Codes.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2021

LSMQ: A Layer-Wise Sensitivity-Based Mixed-Precision Quantization Method for Bit-Flexible CNN Accelerator.
Proceedings of the 18th International SoC Design Conference, 2021

Low-Latency Architecture for the Parallel Extended GCD Algorithm of Large Numbers.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

High-Speed and Scalable FPGA Implementation of the Key Generation for the Leighton-Micali Signature Protocol.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Transform-Based Feature Map Compression for CNN Inference.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Accelerating 3D Convolutional Neural Networks Using 3D Fast Fourier Transform.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

A DNN Optimization Framework with Unlabeled Data for Efficient and Accurate Reconfigurable Hardware Inference.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

LITNet: A Light-weight Image Transform Net for Image Style Transfer.
Proceedings of the International Joint Conference on Neural Networks, 2021

Elbert: Fast Albert with Confidence-Window Based Early Exit.
Proceedings of the IEEE International Conference on Acoustics, 2021

DARM: A Low-Complexity and Fast Modular Multiplier for Lattice-Based Cryptography.
Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

Flexible-width Bit-level Compressor for Convolutional Neural Network.
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

Federated Regularization Learning: an Accurate and Safe Method for Federated Learning.
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020
GH CORDIC-Based Architecture for Computing Nth Root of Single-Precision Floating-Point Number.
IEEE Trans. Very Large Scale Integr. Syst., 2020

F-DNA: Fast Convolution Architecture for Deconvolutional Network Acceleration.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Information Storage Bit-Flipping Decoder for LDPC Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Efficient Precision-Adjustable Architecture for Softmax Function in Deep Learning.
IEEE Trans. Circuits Syst., 2020

Optimized Trellis-Based Min-Max Decoder for NB-LDPC Codes.
IEEE Trans. Circuits Syst. II Express Briefs, 2020

A Novel Iterative Reliability-Based Majority-Logic Decoder for NB-LDPC Codes.
IEEE Trans. Circuits Syst. II Express Briefs, 2020

A Novel Approximation Methodology and Its Efficient VLSI Implementation for the Sigmoid Function.
IEEE Trans. Circuits Syst., 2020

A Precision-Scalable Energy-Efficient Convolutional Neural Network Accelerator.
IEEE Trans. Circuits Syst., 2020

Fine-Grained Bit-Flipping Decoding for LDPC Codes.
IEEE Trans. Circuits Syst. II Express Briefs, 2020

A lightweight face detector by integrating the convolutional neural network with the image pyramid.
Pattern Recognit. Lett., 2020

Calibration of timing mismatch in TIADC based on monotonicity detecting of sampled data.
IEICE Electron. Express, 2020

Multi-Layer Generalized Integrated Interleaved Codes.
IEEE Commun. Lett., 2020

Faster Software Implementation of the SIKE Protocol Based on A New Data Representation.
IACR Cryptol. ePrint Arch., 2020

Ultra-Fast Modular Multiplication Implementation for Isogeny-Based Post-Quantum Cryptography.
IACR Cryptol. ePrint Arch., 2020

A Universal Approximation Method and Optimized Hardware Architectures for Arithmetic Functions Based on Stochastic Computing.
IEEE Access, 2020

Efficient Inference of Large-Scale and Lightweight Convolutional Neural Networks on FPGA.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

A Reconfigurable Permutation Based Address Encryption Architecture for Memory Security.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

Hardware Accelerator for Multi-Head Attention and Position-Wise Feed-Forward in the Transformer.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

A Configurable FPGA Accelerator of Bi-LSTM Inference with Structured Sparsity.
Proceedings of the 33rd IEEE International System-on-Chip Conference, 2020

Temporal Residual Feature Learning for Efficient 3D Convolutional Neural Network on Action Recognition Task.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2020

A Reconfigurable DNN Training Accelerator on FPGA.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2020

Financial Time Series Forecasting Model Based on EMD and Rolling Grey Model.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2020

A Serial Maximum-likelihood Detection Algorithm for Massive MIMO Systems.
Proceedings of the 18th IEEE International New Circuits and Systems Conference, 2020

A Computation-Efficient Solution for Acceleration of Generative Adversarial Network.
Proceedings of the 18th IEEE International New Circuits and Systems Conference, 2020

Exploring Quantization in Few-Shot Learning.
Proceedings of the 18th IEEE International New Circuits and Systems Conference, 2020

Efficient Hardware Post Processing of Anchor-Based Object Detection on FPGA.
Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI, 2020

A Novel Modular Multiplier for Isogeny-Based Post-Quantum Cryptography.
Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI, 2020

An Implementation of Pre-Quantized Random Demodulator Based on Amplitude-to-Pulse Converter.
Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI, 2020

A Three-Level Scoring System for Fast Similarity Evaluation Based on Smith-Waterman Algorithm.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

An Optimized Compression Strategy for Compressor-Based Approximate Multiplier.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

Hardware Accelerator for Engle-Granger Cointegration in Pairs Trading.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

LSTM-Based Quantitative Trading Using Dynamic K-Top and Kelly Criterion.
Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

Optimizing Stochastic Computing for Low Latency Inference of Convolutional Neural Networks.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

In-Memory Computing: The Next-Generation AI Computing Paradigm.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

An Efficient Accelerator of the Squaring for the Verifiable Delay Function Over a Class Group.
Proceedings of the 2020 IEEE Asia Pacific Conference on Circuits and Systems, 2020

An Efficient FPGA Accelerator Optimized for High Throughput Sparse CNN Inference.
Proceedings of the 2020 IEEE Asia Pacific Conference on Circuits and Systems, 2020

LBFP: Logarithmic Block Floating Point Arithmetic for Deep Neural Networks.
Proceedings of the 2020 IEEE Asia Pacific Conference on Circuits and Systems, 2020

Fast Permutation Architecture on Encrypted Data for Secure Neural Network Inference.
Proceedings of the 2020 IEEE Asia Pacific Conference on Circuits and Systems, 2020

Efficient FPGA design for Convolutions in CNN based on FFT-pruning.
Proceedings of the 2020 IEEE Asia Pacific Conference on Circuits and Systems, 2020

2019
Analysis and Design of a Large Dither Injection Circuit for Improving Linearity in Pipelined ADCs.
IEEE Trans. Very Large Scale Integr. Syst., 2019

Corrections to "Generalized Hyperbolic CORDIC and Its Logarithmic and Exponential Computation With Arbitrary Fixed Base".
IEEE Trans. Very Large Scale Integr. Syst., 2019

Generalized Hyperbolic CORDIC and Its Logarithmic and Exponential Computation With Arbitrary Fixed Base.
IEEE Trans. Very Large Scale Integr. Syst., 2019

Background Calibration of Comparator Offsets in SHA-Less Pipelined ADCs.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

A New Clock Phase Calibration Method in High-Speed and High-Resolution DACs.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

A High-Speed Successive-Cancellation Decoder for Polar Codes Using Approximate Computing.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

FPAP: A Folded Architecture for Energy-Quality Scalable Convolutional Neural Networks.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

A 124-Gb/s Decoder for Generalized Integrated Interleaved Codes.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

An Improved Gradient Descent Bit-Flipping Decoder for LDPC Codes.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

An Efficient Post-Processor for Lowering the Error Floor of LDPC Codes.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

A channel multiplexing digital calibration technique for timing mismatch of time-interleaved ADCs.
IEICE Electron. Express, 2019

Improved Fast-SSC-Flip Decoding of Polar Codes.
IEEE Commun. Lett., 2019

Modified GII-BCH Codes for Low-Complexity and Low-Latency Encoders.
IEEE Commun. Lett., 2019

High-Speed Modular Multipliers for Isogeny-Based Post-Quantum Cryptography.
IACR Cryptol. ePrint Arch., 2019

E-LSTM: An Efficient Hardware Architecture for Long Short-Term Memory.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019

Design Light-weight 3D Convolutional Networks for Video Recognition: Temporal Residual, Fully Separable Block, and Fast Algorithm.
CoRR, 2019

Efficient T-EMS Based Decoding Algorithms for High-Order LDPC Codes.
IEEE Access, 2019

A Hardware-Oriented and Memory-Efficient Method for CTC Decoding.
IEEE Access, 2019

Improved Decoding Algorithms of LDPC Codes Based on Reliability Metrics of Variable Nodes.
IEEE Access, 2019

Training Deep Neural Networks Using Posit Number System.
Proceedings of the 32nd IEEE International System-on-Chip Conference, 2019

Hybrid Preconditioned CG Detection with Sequential Update for Massive MIMO Systems.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

A Low-Complexity Error-and-Erasure Decoding Algorithm for t=2 RS Codes.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

DynExit: A Dynamic Early-Exit Strategy for Deep Residual Networks.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

CLA Formula Aided Fast Architecture Design for Clustered Look-Ahead Pipelined IIR Digital Filter.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

A Low-Latency and Low-Complexity Hardware Architecture for CTC Beam Search Decoding.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

EAGLE: Exploiting Essential Address in Both Weight and Activation to Accelerate CNN Computing.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

A New Inversionless Berlekamp-Massey Algorithm with Efficient Architecture.
Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems, 2019

Hardware Implementation of Improved Fast-SSC-Flip Decoder for Polar Codes.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

A Low-Complexity RS Decoder for Triple-Error-Correcting RS Codes.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

Fast-ABC: A Fast Architecture for Bottleneck-Like Based Convolutional Neural Networks.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

A Decomposition Mapping based Quantized Belief Propagation Decoding for 5G LDPC Codes.
Proceedings of the 19th International Symposium on Communications and Information Technologies, 2019

A Novel Low-Complexity Joint Coding and Decoding Algorithm for NB-LDPC Codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

Methodology for Efficient Reconfigurable Architecture of Generative Neural Network.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

USCA: A Unified Systolic Convolution Array Architecture for Accelerating Sparse Neural Network.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

A New Probabilistic Gradient Descent Bit Flipping Decoder for LDPC Codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

TIE: energy-efficient tensor train-based inference engine for deep neural network.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

A New Fast-SSC-Flip Decoding of Polar Codes.
Proceedings of the 2019 IEEE International Conference on Communications, 2019

A Low-latency Sparse-Winograd Accelerator for Convolutional Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

Redundancy-Aided Iterative Reliability-Based Majority-Logic Decoding for NB-LDPC Codes.
Proceedings of the 13th IEEE International Conference on ASIC, 2019

An Enhanced Offset Min-Sum decoder for 5G LDPC Codes.
Proceedings of the 25th Asia-Pacific Conference on Communications, 2019

2018
Low Complexity Message Passing Detection Algorithm for Large-Scale MIMO Systems.
IEEE Wirel. Commun. Lett., 2018

A Stage-Combined Belief Propagation Decoder for Polar Codes.
J. Signal Process. Syst., 2018

An Energy-Efficient Architecture for Binary Weight Convolutional Neural Networks.
IEEE Trans. Very Large Scale Integr. Syst., 2018

Design of Binary LDPC Codes With Parallel Vector Message Passing.
IEEE Trans. Commun., 2018

An Improved Gauss-Seidel Algorithm and Its Efficient Architecture for Massive MIMO Systems.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

Efficient Hardware Architectures for Deep Convolutional Neural Network.
IEEE Trans. Circuits Syst. I Regul. Pap., 2018

A 21.66 Gbps Nonbinary LDPC Decoder for High-Speed Communications.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

CORDIC-Based Architecture for Computing Nth Root and Its Implementation.
IEEE Trans. Circuits Syst. I Regul. Pap., 2018

Hardware-Oriented Compression of Long Short-Term Memory for Efficient Inference.
IEEE Signal Process. Lett., 2018

Stuck-at-close defect propagation and its blocking technique in CMOL cell mapping.
Microelectron. J., 2018

SGAD: Soft-Guided Adaptively-Dropped Neural Network.
CoRR, 2018

Approximate Comparator: Design and Analysis.
Proceedings of the 2018 IEEE International Workshop on Signal Processing Systems, 2018

Bandwidth Efficient Architectures for Convolutional Neural Network.
Proceedings of the 2018 IEEE International Workshop on Signal Processing Systems, 2018

FPAP: A Folded Architecture for Efficient Computing of Convolutional Neural Networks.
Proceedings of the 2018 IEEE Computer Society Annual Symposium on VLSI, 2018

An Optimized Architecture For Decomposed Convolutional Neural Networks.
Proceedings of the 2018 IEEE Computer Society Annual Symposium on VLSI, 2018

An Efficient Convolution Core Architecture for Privacy-Preserving Deep Learning.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

A New Soft-input Hard-output decoding algorithm for Turbo Product Codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

An Efficient NB-LDPC Decoding Algorithm for Next-Generation Memories.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Eadnet: Efficient Architecture for Decomposed Convolutional Neural Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Fast and Low-Complexity Decoding Algorithm and Architecture for Quadruple-Error-Correcting RS codes.
Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems, 2018

A High-Speed and Low-Complexity Architecture for Softmax Function in Deep Learning.
Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems, 2018

Analysis of the Dual-Threshold-Based Shrinking Scheme for Efficient NB-LDPC Decoding.
Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems, 2018

A High-Throughout Real-Time Prewitt Operator on Embedded NEON+ARM System.
Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems, 2018

A Novel Compiler for Regular Expression Matching Engine Construction.
Proceedings of the 2018 IEEE Asia Pacific Conference on Circuits and Systems, 2018

Efficient Reconfigurable Hardware Core for Convolutional Neural Networks.
Proceedings of the 52nd Asilomar Conference on Signals, Systems, and Computers, 2018

2017
Accelerating Recurrent Neural Networks: A Memory-Efficient Approach.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Efficient Soft Cancelation Decoder Architectures for Polar Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2017

High-Speed Parallel LFSR Architectures Based on Improved State-Space Transformations.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Dynamical Textures Modeling via Joint Video Dictionary Learning.
IEEE Trans. Image Process., 2017

Compressed Level Crossing Sampling for Ultra-Low Power IoT Devices.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

Fully-Parallel Area-Efficient Deep Neural Network Design Using Stochastic Computing.
IEEE Trans. Circuits Syst. II Express Briefs, 2017

Advanced Baseband Processing Algorithms, Circuits, and Implementations for 5G Communication.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2017

Guest Editorial Advanced Baseband Processing Circuits and Systems for 5G Communications.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2017

Analysis and Comparison of FEC Schemes for 200GbE and 400GbE.
IEEE Commun. Stand. Mag., 2017

Reduced complexity message passing detection algorithm in large-scale MIMO systems.
Proceedings of the 9th International Conference on Wireless Communications and Signal Processing, 2017

Low-complexity detection algorithms based on matrix partition for massive MIMO.
Proceedings of the 9th International Conference on Wireless Communications and Signal Processing, 2017

Efficient approximate layered LDPC decoder.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Algorithm and architecture for joint detection and decoding for MIMO with LDPC codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Energy efficient SVM classifier using approximate computing.
Proceedings of the 12th IEEE International Conference on ASIC, 2017

Efficient fast convolution architectures for convolutional neural network.
Proceedings of the 12th IEEE International Conference on ASIC, 2017

Segmented successive cancellation list polar decoding with joint BCH-CRC codes.
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016
Area-Efficient Scaling-Free DFT/FFT Design Using Stochastic Computing.
IEEE Trans. Circuits Syst. II Express Briefs, 2016

An Efficient FPGA Implementation for 2-D MUSIC Algorithm.
Circuits Syst. Signal Process., 2016

Efficient convolution architectures for convolutional neural network.
Proceedings of the 8th International Conference on Wireless Communications & Signal Processing, 2016

Intra-layer nonuniform quantization of convolutional neural network.
Proceedings of the 8th International Conference on Wireless Communications & Signal Processing, 2016

Compressed Power Spectral Density Estimation via Group-Based Total Variation Minimization.
Proceedings of the 2016 IEEE International Workshop on Signal Processing Systems, 2016

An Efficient Hardware Architecture for Lossless Data Compression in Data Center.
Proceedings of the 2016 IEEE International Workshop on Signal Processing Systems, 2016

Beyond 100Gbps Encoder Design for Staircase Codes.
Proceedings of the 2016 IEEE International Workshop on Signal Processing Systems, 2016

Stage-combined belief propagation decoding of polar codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

A high throughput belief propagation decoder architecture for polar codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Area-Efficient Error-Resilient Discrete Fourier Transformation Design using Stochastic Computing.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

2015
A stage-reduced low-latency successive cancellation decoder for polar codes.
Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

2014
Efficient symbol reliability based decoding for QCNB-LDPC codes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

Multilevel error correction scheme for MLC flash memory.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

2013
Memory efficient EMS decoding for non-binary LDPC codes.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

2012
Unified Architecture for Reed-Solomon Decoder Combined With Burst-Error Correction.
IEEE Trans. Very Large Scale Integr. Syst., 2012

Nonbinary LDPC Code Decoder Architecture With Efficient Check Node Processing.
IEEE Trans. Circuits Syst. II Express Briefs, 2012

Efficient EMS decoding for non-binary LDPC codes.
Proceedings of the International SoC Design Conference, 2012

Memory efficient column-layered decoder design for non-binary LDPC codes.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

2011
Reduced-complexity column-layered decoding and implementation for LDPC codes.
IET Commun., 2011

Memory efficient decoder design of nonbinary LDPC codes.
Proceedings of the International SoC Design Conference, 2011

2010
Flexible LDPC Decoder Design for Multigigabit-per-Second Applications.
IEEE Trans. Circuits Syst. I Regul. Pap., 2010

An Efficient VLSI Architecture for Nonbinary LDPC Decoders.
IEEE Trans. Circuits Syst. II Express Briefs, 2010

Efficient Decoder Design for Nonbinary Quasicyclic LDPC Codes.
IEEE Trans. Circuits Syst. I Regul. Pap., 2010

Layered decoding for non-binary LDPC codes.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Low power decoder design for QC-LDPC codes.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

2009
Multi-Gb/s LDPC Code Design and Implementation.
IEEE Trans. Very Large Scale Integr. Syst., 2009

High-Throughput Layered LDPC Decoding Architecture.
IEEE Trans. Very Large Scale Integr. Syst., 2009

An improved scaled DCT architecture.
IEEE Trans. Consumer Electron., 2009

LDPC decoder design for high rate wireless personal area networks.
IEEE Trans. Consumer Electron., 2009

Area-efficient Reed-Solomon decoder design for optical communications.
IEEE Trans. Circuits Syst. II Express Briefs, 2009

Decoder Design for RS-Based LDPC Codes.
IEEE Trans. Circuits Syst. II Express Briefs, 2009

Efficient Shuffle Network Architecture and Application for WiMAX LDPC Decoders.
IEEE Trans. Circuits Syst. II Express Briefs, 2009

An improved min-sum based column-layered decoding algorithm for LDPC codes.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2009

Area-efficient Reed-Solomon Decoder Design for 10-100 Gb/s Applications.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

LDPC Decoder Design for IEEE 802.15 Standard.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Towards an Optimal Trade-off of Viterbi Decoder Design.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

2008
Improved low-complexity low-density parity-check decoding.
IET Commun., 2008

Low-complexity high-speed 4-D TCM decoder.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2008

A low-complexity high-performance noncoherent receiver for GFSK signals.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

Extended layered decoding of LDPC codes.
Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008

Low-complexity shift-LDPC decoder for high-speed communication systems.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Efficient architecture for the Tate pairing in characteristic three.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Efficient radius and list updating units design for list sphere decoders.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Fast point operation architecture for Elliptic Curve Cryptography.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

Efficient decoder design for high-throughput LDPC decoding.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2008

2007
A Memory Efficient Partially Parallel Decoder Architecture for Quasi-Cyclic LDPC Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2007

Low-Complexity High-Speed Decoder Design for Quasi-Cyclic LDPC Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2007

High-Speed Recursion Architectures for MAP-Based Turbo Decoders.
IEEE Trans. Very Large Scale Integr. Syst., 2007

Low-Latency Factorization Architecture for Algebraic Soft-Decision Decoding of Reed-Solomon Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2007

Very Low-Complexity Hardware Interleaver for Turbo Decoding.
IEEE Trans. Circuits Syst. II Express Briefs, 2007

Fast EBCOT Encoder Architecture for JPEG 2000.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2007

Early-Pruning K-Best Sphere Decoder for MIMO Systems.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2007

Studies on Practical Low Complexity Decoding of Low-Density Parity-Check Codes.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2007

Design of Low-Power Memory-Efficient Viterbi Decoder.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2007

Direct Root Computation Architecture for Algebraic Soft-Decision Decoding of Reed-Solomon Codes.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2007), 2007

Efficient Message Passing Architecture for High Throughput LDPC Decoder.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2007), 2007

FPGA Implementation of an Interpolation Processor for Soft-Decision Decoding of Reed-Solomon Codes.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2007), 2007

Factorization Architecture by Direct Root Computation for Algebraic Soft-Decision Decoding of Reed-Solomon Codes.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
High-Speed Interpolation Architecture for Soft-Decision Decoding of Reed-Solomon Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2006

Efficient fast interpolation architecture for soft-decision decoding of Reed-Solomon codes.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Reencoder design for soft-decision decoding of an (255, 239) Reed-Solomon code.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Improved k-best sphere decoding algorithms for MIMO systems.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Area-efficient parallel decoder architecture for high rate QC-LDPC codes.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

A 170 Mbps (8176, 7156) quasi-cyclic LDPC decoder implementation with FPGA.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Study of Early Stopping Criteria for Turbo Decoding and Their Applications in WCDMA Systems.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

An FPGA Implementation of Array LDPC Decoder.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems 2006, 2006

2004
Area efficient decoding of quasi-cyclic low density parity check codes.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

