Ray C. C. Cheung

Orcid: 0000-0002-6764-0729

Affiliations:
  • City University of Hong Kong, Department of Electrical Engineering


According to our database, Ray C. C. Cheung authored at least 170 papers between 2000 and 2024.

Bibliography

2024
An Efficient FPGA-based Depthwise Separable Convolutional Neural Network Accelerator with Hardware Pruning.
ACM Trans. Reconfigurable Technol. Syst., March, 2024

Efficient Blind Hyperspectral Unmixing Framework Based on CUR Decomposition (CUR-HU).
Remote. Sens., March, 2024

REALISE-IoT: RISC-V-Based Efficient and Lightweight Public-Key System for IoT Applications.
IEEE Internet Things J., January, 2024

A Highly-efficient Lattice-based Post-Quantum Cryptography Processor for IoT Applications.
IACR Trans. Cryptogr. Hardw. Embed. Syst., 2024

Revisiting Keccak and Dilithium Implementations on ARMv7-M.
IACR Trans. Cryptogr. Hardw. Embed. Syst., 2024

2023
Efficient Multiple Channels EEG Signal Classification Based on Hierarchical Extreme Learning Machine.
Sensors, November, 2023

High-performance and Configurable SW/HW Co-design of Post-quantum Signature CRYSTALS-Dilithium.
ACM Trans. Reconfigurable Technol. Syst., September, 2023

Efficient and Automatic Breast Cancer Early Diagnosis System Based on the Hierarchical Extreme Learning Machine.
Sensors, September, 2023

Design of a Hippocampal Cognitive Prosthesis Chip.
IEICE Trans. Electron., July, 2023

MUREN: MUltistage Recursive Enhanced Network for Coal-Fired Power Plant Detection.
Remote. Sens., April, 2023

Algorithm-Hardware Co-Design of Split-Radix Discrete Galois Transformation for KyberKEM.
IEEE Trans. Emerg. Top. Comput., 2023

Yet another Improvement of Plantard Arithmetic for Faster Kyber on Low-end 32-bit IoT Devices.
CoRR, 2023

CO-Detector: Towards Complex Object Detection with Cross-Part Feature Learning in Remote Sensing.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2023

Image Super-Resolution and FPGA Hardware Design.
Proceedings of the IEEE International Conference on Signal Processing, 2023

Homomorphic Encryption-Based System Design for Secure Data Processing.
Proceedings of the IEEE International Conference on Signal Processing, 2023

A Platform for Adaptive Interference Mitigation and Intent Analysis Using OpenLANE.
Proceedings of the IEEE International Conference on Signal Processing, 2023

Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

In-Network Aggregation with Transport Transparency for Distributed Training.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Improved Plantard Arithmetic for Lattice-based Cryptography.
IACR Trans. Cryptogr. Hardw. Embed. Syst., 2022

PipeNTT: A Pipelined Number Theoretic Transform Architecture.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

High Throughput Hardware/Software Heterogeneous System for RRPN-Based Scene Text Detection.
IEEE Trans. Computers, 2022

Machine Learning Based Hardware Architecture for DOA Measurement From Mice EEG.
IEEE Trans. Biomed. Eng., 2022

Reconfigurable content-addressable memory (CAM) on FPGAs: A tutorial and survey.
Future Gener. Comput. Syst., 2022

Comp-TCAM: An Adaptable Composite Ternary Content-Addressable Memory on FPGAs.
IEEE Embed. Syst. Lett., 2022

A Versatility-Performance Balanced Hardware Architecture for Scene Text Detection.
Proceedings of the IEEE Smartworld, 2022

Melting Glacier: A 37-Year (1984-2020) High-Resolution Glacier-Cover Record of Mt. Kilimanjaro.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2022

Message from the General Chair and Program Co-Chairs.
Proceedings of the International Conference on Field-Programmable Technology, 2022

Preface.
Proceedings of the International Conference on Field-Programmable Technology, 2022

A High-Performance FPGA Accelerator for CUR Decomposition.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

2021
An Efficient Parallel Processor for Dense Tensor Computation.
IEEE Trans. Very Large Scale Integr. Syst., 2021

Elastic Net Constraint-Based Tensor Model for High-Order Graph Matching.
IEEE Trans. Cybern., 2021

Scalable Fully Pipelined Hardware Architecture for In-Network Aggregated AllReduce Communication.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

A systematic review of blockchain scalability: Issues, solutions, analysis and future research.
J. Netw. Comput. Appl., 2021

Accelerated Updating Mechanisms for FPGA-Based Ternary Content-Addressable Memory.
IEEE Embed. Syst. Lett., 2021

A survey of breakthrough in blockchain technology: Adoptions, applications, challenges and future research.
Comput. Commun., 2021

Efficient High-Performance FPGA-Redis Hybrid NoSQL Caching System for Blockchain Scalability.
Comput. Commun., 2021

LoRaWAN-based Camera with (CIRA) Compression and Image Recovery Algorithm.
Proceedings of the 7th IEEE World Forum on Internet of Things, 2021

Aero-Hydroponic Agriculture IoT System.
Proceedings of the 7th IEEE World Forum on Internet of Things, 2021

Design of a Battery Carrying Barge for Enhancing Autonomous Sailboat's Endurance Capacity.
Proceedings of the IEEE International Conference on Real-time Computing and Robotics, 2021

An FPGA-based MobileNet Accelerator Considering Network Structure Characteristics.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

On the Suitability of Read only Memory for FPGA-Based CAM Emulation Using Partial Reconfiguration.
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2021

2020
RPE-TCAM: Reconfigurable Power-Efficient Ternary Content-Addressable Memory on FPGAs.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Compact Code-Based Signature for Reconfigurable Devices With Side Channel Resilience.
IEEE Trans. Circuits Syst. I Regul. Pap., 2020

Binary convolutional neural network acceleration framework for rapid system prototyping.
J. Syst. Archit., 2020

NetReduce: RDMA-Compatible In-Network Reduction for Distributed DNN Training Acceleration.
CoRR, 2020

A Highly Parallel Constant-Time Almost-Inverse Algorithm.
Proceedings of the IEEE International Conference on Signal Processing, 2020

Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Feature Selection Based on Tensor Decomposition and Object Proposal for Night-Time Multiclass Vehicle Detection.
IEEE Trans. Syst. Man Cybern. Syst., 2019

A robust background initialization algorithm with superpixel motion detection.
Signal Process. Image Commun., 2019

A high performance hardware architecture for non-negative tensor factorization.
Microelectron. J., 2019

High performance hardware architecture for singular spectrum analysis of Hankel tensors.
Microprocess. Microsystems, 2019

D-TCAM: A High-Performance Distributed RAM Based TCAM Architecture on FPGAs.
IEEE Access, 2019

High Performance Power-Efficient Gate-Based CAM for Reconfigurable Computing.
Proceedings of the 15th International Conference on Mobile Ad-Hoc and Sensor Networks, 2019

Reconfigurable RISC-V Secure Processor And SoC Integration.
Proceedings of the IEEE International Conference on Industrial Technology, 2019

Optimized Polynomial Multiplier Over Commutative Rings on FPGAs: A Case Study on BIKE.
Proceedings of the International Conference on Field-Programmable Technology, 2019

Accurate and Compact Convolutional Neural Networks with Trained Binarization.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

An Efficient Application Specific Instruction Set Processor (ASIP) for Tensor Computation.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

Bank-selective Strategy for Gate-based Ternary Content-addressable Memory on FPGAs.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

2018
High-Speed Discrete Gaussian Sampler With Heterodyne Chaotic Laser Inputs.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

FFT-Based McLaughlin's Montgomery Exponentiation without Conditional Selections.
IEEE Trans. Computers, 2018

A fast inter CU decision algorithm for HEVC.
Signal Process. Image Commun., 2018

ASIC Implementation of a Nonlinear Dynamical Model for Hippocampal Prosthesis.
Neural Comput., 2018

Spectral arithmetic in Montgomery modular multiplication.
J. Cryptogr. Eng., 2018

Dynamic Virtual Page-Based Flash Translation Layer With Novel Hot Data Identification and Adaptive Parallelism Management.
IEEE Access, 2018

Lightweight Secure Processor Prototype on FPGA.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

2017
A Fully Pipelined Hardware Architecture for Intra Prediction of HEVC.
IEEE Trans. Circuits Syst. Video Technol., 2017

A Bias-Bounded Digital True Random Number Generator Architecture.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

Compact Constant Weight Coding Engines for the Code-Based Cryptography.
IEEE Trans. Circuits Syst. II Express Briefs, 2017

Toward Practical Code-Based Signature: Implementing Fast and Compact QC-LDGM Signature Scheme on Embedded Hardware.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

Area-Time Efficient Computation of Niederreiter Encryption on QC-MDPC Codes for Embedded Hardware.
IEEE Trans. Computers, 2017

Area-Time Efficient Architecture of FFT-Based Montgomery Multiplication.
IEEE Trans. Computers, 2017

A low power V-band LC VCO with high Q varactor technique in 40 nm CMOS process.
Sci. China Inf. Sci., 2017

Fast HEVC intra coding decision based on statistical cost and corner detection.
Proceedings of the International Conference on Systems, Signals and Image Processing, 2017

High DC gain and wide output swing class-C inverter.
Proceedings of the International SoC Design Conference, 2017

2016
Parameter Space for the Architecture of FFT-Based Montgomery Modular Multiplication.
IEEE Trans. Computers, 2016

An FPGA-Based High-Performance Neural Ensemble Spiking Activity Simulator Utilizing Generalized Volterra Kernel and Complexity Analysis.
J. Circuits Syst. Comput., 2016

FPGA-Based High-Performance Collision Detection: An Enabling Technique for Image-Guided Robotic Surgery.
Frontiers Robotics AI, 2016

2015
Z-TCAM: An SRAM-based Architecture for TCAM.
IEEE Trans. Very Large Scale Integr. Syst., 2015

A Fast CU Size Decision Algorithm for the HEVC Intra Encoder.
IEEE Trans. Circuits Syst. Video Technol., 2015

An Application Specific Instruction Set Processor (ASIP) for Adaptive Filters in Neural Prosthetics.
IEEE ACM Trans. Comput. Biol. Bioinform., 2015

Configurable Architectures for Multi-Mode Floating Point Adders.
IEEE Trans. Circuits Syst. I Regul. Pap., 2015

Fast and Generic Inversion Architectures Over GF(2^m) Using Modified Itoh-Tsujii Algorithms.
IEEE Trans. Circuits Syst. II Express Briefs, 2015

Architecture Support for Task Out-of-Order Execution in MPSoCs.
IEEE Trans. Computers, 2015

Efficient Pairing Computation on Huff Curves.
Cryptologia, 2015

2014
Design Exploration of Geometric Biclustering for Microarray Data Analysis in Data Mining.
IEEE Trans. Parallel Distributed Syst., 2014

Unified Architecture for Double/Two-Parallel Single Precision Floating Point Adder.
IEEE Trans. Circuits Syst. II Express Briefs, 2014

Novel RNS Parameter Selection for Fast Modular Multiplication.
IEEE Trans. Computers, 2014

An FPGA based scalable architecture of a stochastic state point process filter (SSPPF) to track the nonlinear dynamics underlying neural spiking.
Microelectron. J., 2014

GPU-based biclustering for microarray data analysis in neurocomputing.
Neurocomputing, 2014

A perfectly current matched charge pump with wide dynamic range for ultra low voltage applications.
IEICE Electron. Express, 2014

High-speed Polynomial Multiplication Architecture for Ring-LWE and SHE Cryptosystems.
IACR Cryptol. ePrint Arch., 2014

E-TCAM: An Efficient SRAM-Based Architecture for TCAM.
Circuits Syst. Signal Process., 2014

Series Expansion based Efficient Architectures for Double Precision Floating Point Division.
Circuits Syst. Signal Process., 2014

A low-power inverter-based ΣΔ analog-to-digital converter for audio applications.
Sci. China Inf. Sci., 2014

Configurable Architecture for Double/Two-Parallel Single Precision Floating Point Division.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2014

Zero collision attack and its countermeasures on Residue Number System multipliers.
Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014

VLSI architecture of a high-performance neural spiking activity simulator based on generalized Volterra kernel.
Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014

Time-efficient computation of digit serial Montgomery multiplication.
Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014

Trade-offs between the sensitivity and the speed of the FPGA-based sequence aligner.
Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014

A complementary architecture for high-speed true random number generator.
Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

Big data genome sequencing on Zynq based clusters (abstract only).
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Laguerre-volterra model and architecture for MIMO system identification and output prediction.
Proceedings of the 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2014

2013
A Flexible and Customizable Architecture for the Relaxation Labeling Algorithm.
IEEE Trans. Circuits Syst. II Express Briefs, 2013

Real-Time Prediction of Neuronal Population Spiking Activity Using FPGA.
IEEE Trans. Biomed. Circuits Syst., 2013

HEALPIX DCT technique for compressing PCA-based illumination adjustable images.
Neural Comput. Appl., 2013

Parallel architecture for DNA sequence inexact matching with Burrows-Wheeler Transform.
Microelectron. J., 2013

Area-efficient architectures for double precision multiplier on FPGA, with run-time-reconfigurable dual single precision support.
Microelectron. J., 2013

A 0.8-V 230-µW 98-dB DR Inverter-Based ΣΔ Modulator for Audio Applications.
IEEE J. Solid State Circuits, 2013

A ΣΔ modulator using gain-Boost Class-C Inverter for Audio Applications.
J. Circuits Syst. Comput., 2013

A scalable RNS Montgomery multiplier over F_{2^m}.
IEICE Electron. Express, 2013

VLSI Implementation of Double-Precision Floating-Point Multiplier Using Karatsuba Technique.
Circuits Syst. Signal Process., 2013

A memory-based NFA regular expression match engine for signature-based intrusion detection.
Comput. Commun., 2013

Design Automation Framework for Reconfigurable Interconnection Networks.
Comput. J., 2013

A reconfigurable architecture for real-time prediction of neural activity.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Noise filtering and occurrence identification of mouse ultrasonic vocalization call.
Proceedings of the International Conference on Machine Learning and Cybernetics, 2013

Fast simulation of Digital Spiking Silicon Neuron model employing reconfigurable dataflow computing.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

FPGA IP protection by binding Finite State Machine to Physical Unclonable Function.
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013

Design space explorations of Hybrid-Partitioned TCAM (HP-TCAM).
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013

Genome sequencing using mapreduce on FPGA with multiple hardware accelerators (abstract only).
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Binding Hardware IPs to Specific FPGA Device via Inter-twining the PUF Response with the FSM of Sequential Circuits.
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

A customizable Stochastic State Point Process Filter (SSPPF) for neural spiking activity.
Proceedings of the 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2013

2012
Hypergraph based geometric biclustering algorithm.
Pattern Recognit. Lett., 2012

Subthreshold CMOS voltage reference circuit with body bias compensation for process variation.
IET Circuits Devices Syst., 2012

Faster Pairing Coprocessor Architecture.
Proceedings of the Pairing-Based Cryptography - Pairing 2012, 2012

An FPGA-based acceleration platform for auction algorithm.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

FPGA Implementation of SRAM-based Ternary Content Addressable Memory.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Area-Efficient FPGA Implementation of Quadruple Precision Floating Point Multiplier.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

GPU-Based Biclustering for Neural Information Processing.
Proceedings of the Neural Information Processing - 19th International Conference, 2012

Low complexity and hardware-friendly spectral modular multiplication.
Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

Area-Efficient Architectures for Large Integer and Quadruple Precision Floating Point Multipliers.
Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

A dual mode FPGA design for the hippocampal prosthesis.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

High Performance Reconfigurable Architecture for Double Precision Floating Point Division.
Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2012

2011
A High Speed Pairing Coprocessor Using RNS and Lazy Reduction.
IACR Cryptol. ePrint Arch., 2011

High-Performance and Scalable System Architecture for the Real-Time Estimation of Generalized Laguerre-Volterra MIMO Model From Neural Population Spiking Activity.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

Rapid single-chip secure processor prototyping on the OpenSPARC FPGA platform.
Proceedings of the 22nd IEEE International Symposium on Rapid System Prototyping, 2011

Hydrate: Hybrid Reconfigurable Architecture Expressions.
Proceedings of the 2011 International Conference on Field-Programmable Technology, 2011

FPGA Architecture of Generalized Laguerre-Volterra MIMO Model for Neural Population Activities.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2011

FPGA Architecture of Generalized Laguerre-Volterra MIMO Model for Neural Population Spiking Activities.
Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011

A hardware-based computational platform for Generalized Laguerre-Volterra MIMO model for neural activities.
Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011

FPGA Implementation of Pairings Using Residue Number System and Lazy Reduction.
Proceedings of the Cryptographic Hardware and Embedded Systems - CHES 2011 - 13th International Workshop, Nara, Japan, September 28, 2011

2010
Counter Embedded Memory architecture for trusted computing platform.
Proceedings of the 21st IEEE International Symposium on Rapid System Prototyping, 2010

Reconfigurable Number Theoretic Transform architectures for cryptographic applications.
Proceedings of the International Conference on Field-Programmable Technology, 2010

Design Automation for Reconfigurable Interconnection Networks.
Proceedings of the Reconfigurable Computing: Architectures, 2010

2009
Hierarchical Segmentation for Hardware Function Evaluation.
IEEE Trans. Very Large Scale Integr. Syst., 2009

A High-Performance Hardware Architecture for Spectral Hash Algorithm.
Proceedings of the 20th IEEE International Conference on Application-Specific Systems, 2009

2008
Hardware Implementation Trade-Offs of Polynomial Approximations and Interpolations.
IEEE Trans. Computers, 2008

2007
A Flexible Architecture for Precise Gamma Correction.
IEEE Trans. Very Large Scale Integr. Syst., 2007

Hardware Generation of Arbitrary Random Number Distributions From Uniform Distributions Via the Inversion Method.
IEEE Trans. Very Large Scale Integr. Syst., 2007

The exact channel density and compound design for generic universal switch blocks.
ACM Trans. Design Autom. Electr. Syst., 2007

Instrumented Multi-Stage Word-Length Optimization.
Proceedings of the 2007 International Conference on Field-Programmable Technology, 2007

Automatic Accuracy-Guaranteed Bit-Width Optimization for Fixed and Floating-Point Systems.
Proceedings of the FPL 2007, 2007

2006
Accuracy-Guaranteed Bit-Width Optimization.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2006

Decomposition Design Theory and Methodology for Arbitrary-Shaped Switch Boxes.
IEEE Trans. Computers, 2006

Inversion-based hardware gaussian random number generator: A case study of function evaluation via hierarchical segmentation.
Proceedings of the 2006 IEEE International Conference on Field Programmable Technology, 2006

2005
Customizable elliptic curve cryptosystems.
IEEE Trans. Very Large Scale Integr. Syst., 2005

Reconfigurable Acceleration for Monte Carlo Based Financial Simulation.
Proceedings of the 2005 IEEE International Conference on Field-Programmable Technology, 2005

Ziggurat-based Hardware Gaussian Random Number Generator.
Proceedings of the 2005 International Conference on Field Programmable Logic and Applications (FPL), 2005

Reconfigurable Elliptic Curve Cryptosystems on a Chip.
Proceedings of the 2005 Design, 2005

Automating custom-precision function evaluation for embedded processors.
Proceedings of the 2005 International Conference on Compilers, 2005

2004
Customising Hardware Designs for Elliptic Curve Cryptography.
Proceedings of the Computer Systems: Architectures, 2004

A scalable hardware architecture for prime number validation.
Proceedings of the 2004 IEEE International Conference on Field-Programmable Technology, 2004

On Optimal Irregular Switch Box Designs.
Proceedings of the Field Programmable Logic and Application, 2004

A System on Chip Design Framework for Prime Number Validation Using Reconfigurable Hardware.
Proceedings of the Field Programmable Logic and Application, 2004

2003
Further improve circuit partitioning using GBAW logic perturbation techniques.
IEEE Trans. Very Large Scale Integr. Syst., 2003

On optimal hyperuniversal and rearrangeable switch box designs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2003

An FPGA-based re-configurable 24-bit 96kHz sigma-delta audio DAC.
Proceedings of the 2003 IEEE International Conference on Field-Programmable Technology, 2003

2002
On Optimum Designs of Universal Switch Blocks.
Proceedings of the Field-Programmable Logic and Applications, 2002

2001
Further improve circuit partitioning using GBAW logic perturbation techniques.
Proceedings of the Conference on Design, Automation and Test in Europe, 2001

On Optimum Switch Box Designs for 2-D FPGAs.
Proceedings of the 38th Design Automation Conference, 2001

2000
On improved graph-based alternative wiring scheme for multi-level logic optimization.
Proceedings of the 2000 7th IEEE International Conference on Electronics, 2000

