Chi-Chih Chang

Orcid: 0009-0001-4011-0867

According to our database, Chi-Chih Chang authored at least 26 papers between 2004 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
DisagMoE: Computation-Communication overlapped MoE Training via Disaggregated AF-Pipe Parallelism.
CoRR, May, 2026

DARE: Diffusion Language Model Activation Reuse for Efficient Inference.
CoRR, May, 2026

Faster LLM Inference via Sequential Monte Carlo.
CoRR, April, 2026

Bit-Serial Acceleration of LLM Inference With Mixture-of-Datatype Quantization.
IEEE Trans. Computers, February, 2026

SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache.
CoRR, January, 2026

2025
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs.
CoRR, December, 2025

Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding.
CoRR, December, 2025

Speculate Deep and Accurate: Lossless and Training-Free Acceleration for Offloaded LLMs via Substitute Speculative Decoding.
CoRR, September, 2025

SplitReason: Learning To Offload Reasoning.
CoRR, April, 2025

xKV: Cross-Layer SVD for KV-Cache Compression.
CoRR, March, 2025

TokenButler: Token Importance is Predictable.
CoRR, March, 2025

SparAMX: Accelerating Compressed LLMs Token Generation on AMX-powered CPUs.
CoRR, February, 2025

The Power of Negative Zero: Datatype Customization for Quantized Large Language Models.
CoRR, January, 2025

Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Quamba: A Post-Training Quantization Recipe for Selective State Space Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Palu: KV-Cache Compression with Low-Rank Projection.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
V"Mean"ba: Visual State Space Models only need 1 hidden dimension.
CoRR, 2024

ELSA: Exploiting Layer-wise N:M Sparsity for Vision Transformer Acceleration.
CoRR, 2024

Palu: Compressing KV-Cache with Low-Rank Projection.
CoRR, 2024

FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Transformer and Its Variants for Identifying Good Dice in Bad Neighborhoods.
Proceedings of the 42nd IEEE VLSI Test Symposium, 2024

ELSA: Exploiting Layer-wise N:M Sparsity for Vision Transformer Acceleration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Q-YOLOP: Quantization-Aware You Only Look Once for Panoptic Driving Perception.
Proceedings of the IEEE International Conference on Multimedia and Expo Workshops, 2023

2004
Embedding information within dynamic visual patterns.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004