Dongxu Lyu

Orcid: 0000-0001-6826-2670

According to our database, Dongxu Lyu authored at least 20 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of five.

Bibliography

2025
OFQ-LLM: Outlier-Flexing Quantization for Efficient Low-Bit Large Language Model Acceleration.
IEEE Trans. Circuits Syst. I Regul. Pap., August, 2025

Bridge-NDP: Efficient Communication-Computation Overlap in Near Data Processing System.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., August, 2025

An Efficient Multi-View Cross-Attention Accelerator for Vision-Centric 3D Perception in Autonomous Driving.
IEEE Trans. Circuits Syst. I Regul. Pap., July, 2025

Neural Rendering Acceleration With Deferred Neural Decoding and Voxel-Centric Data Flow.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2025

Efficient Hardware Architecture Design for Rotary Position Embedding of Large Language Models.
IEEE J. Emerg. Sel. Topics Circuits Syst., June, 2025

Adaptive Two-Range Quantization and Hardware Co-Design for Large Language Model Acceleration.
IEEE J. Emerg. Sel. Topics Circuits Syst., June, 2025

FATE: Boosting the Performance of Hyper-Dimensional Computing Intelligence with Flexible Numerical DAta TypE.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

AsyncDIMM: Achieving Asynchronous Execution in DIMM-Based Near-Memory Processing.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

HyperDyn: Dynamic Dimensional Masking for Efficient Hyper-Dimensional Computing.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

TAIL: Exploiting Temporal Asynchronous Execution for Efficient Spiking Neural Networks with Inter-Layer Parallelism.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

BitPattern: Enabling Efficient Bit-Serial Acceleration of Deep Neural Networks through Bit-Pattern Pruning.
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

KVO-LLM: Boosting Long-Context Generation Throughput for Batched LLM Inference.
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory.
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

2024
M2M: A Fine-Grained Mapping Framework to Accelerate Multiple DNNs on a Multi-Chiplet Architecture.
IEEE Trans. Very Large Scale Integr. Syst., October, 2024

BSViT: A Bit-Serial Vision Transformer Accelerator Exploiting Dynamic Patch and Weight Bit-Group Quantization.
IEEE Trans. Circuits Syst. I Regul. Pap., September, 2024

A Broad-Spectrum and High-Throughput Compression Engine for Neural Network Processors.
IEEE Trans. Circuits Syst. II Express Briefs, July, 2024

Hardware-oriented algorithms for softmax and layer normalization of large language models.
Sci. China Inf. Sci., 2024

DEFA: Efficient Deformable Attention Acceleration via Pruning-Assisted Grid-Sampling and Multi-Scale Parallel Processing.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

2023
SpOctA: A 3D Sparse Convolution Accelerator with Octree-Encoding-Based Map Search and Inherent Sparsity-Aware Processing.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

FLNA: An Energy-Efficient Point Cloud Feature Learning Accelerator with Dataflow Decoupling.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
