Xiaqing Li

ORCID: 0000-0002-7748-7967

According to our database, Xiaqing Li authored at least 22 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN.
CoRR, June, 2025

Efficient and Fast High-Performance Library Generation for Deep Learning Accelerators.
IEEE Trans. Computers, January, 2025

2024
FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning With Partitioning and Parallelism of Search Space.
IEEE Trans. Parallel Distributed Syst., July, 2024

AGON: Automated Design Framework for Customizing Processors from ISA Documents.
CoRR, 2024

DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Cambricon-C: Efficient 4-Bit Matrix Unit via Primitivization.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Cambricon-D: Full-Network Differential Acceleration for Diffusion Models.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

Explainable and Layout-Aware Timing Prediction.
Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024

Revisiting Automatic Pipelining: Gate-level Forwarding and Speculation.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Chip design with machine learning: a survey from algorithm perspective.
Sci. China Inf. Sci., November, 2023

Cambricon-U: A Systolic Random Increment Memory Architecture for Unary Computing.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

BALTO: fast tensor program optimization with diversity-based active learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Heron: Automatically Constrained High-Performance Library Generation for Deep Learning Accelerators.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Cambricon-P: A Bitflow Architecture for Arbitrary Precision Computing.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

BabelTower: Learning to Auto-parallelized Program Translation.
Proceedings of the International Conference on Machine Learning, 2022

2021
SmartTuning: Selecting Hyper-Parameters of a ConvNet System for Fast Training and Small Working Memory.
IEEE Trans. Parallel Distributed Syst., 2021

MSCU: Accelerating CNN Inference with Multiple Sizes of Compute Unit on FPGAs.
Proceedings of the 14th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2021

2019
HyConv: Accelerating Multi-Phase CNN Computation by Fine-Grained Policy Selection.
IEEE Trans. Parallel Distributed Syst., 2019

2018
Research on Chinese-Tibetan Neural Machine Translation.
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2018

2016
Performance Analysis of GPU-Based Convolutional Neural Networks.
Proceedings of the 45th International Conference on Parallel Processing, 2016

2008
A new asynchronous parallel load flow calculation algorithm.
Proceedings of the 2008 IEEE Conference on Robotics, Automation and Mechatronics, 2008

