Chen Zhang
Orcid: 0000-0003-2762-2726Affiliations:
- Shanghai Jiao Tong University, China
- Alibaba DAMO Academy, Shanghai, China (former)
- Microsoft Research Asia (former)
- Peking University, Center for Energy-Efficient Computing and Applications (CECA), Beijing, China (former)
According to our database1,
Chen Zhang
authored at least 40 papers
between 2013 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
Online presence:
-
on orcid.org
On csauthors.net:
Bibliography
2024
IEEE Trans. Parallel Distributed Syst., July, 2024
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR.
CoRR, 2023
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
Cambricon-R: A Fully Fused Accelerator for Real-Time Learning of Neural Scene Representation.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023
2022
Critique of "MemXCT: Memory-Centric X-Ray CT Reconstruction With Massive Parallelization" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., 2022
Proceedings of the SC22: International Conference for High Performance Computing, 2022
PerFlow: a domain specific framework for automatic performance analysis of parallel applications.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
FreeTensor: a free-form DSL with holistic optimizations for irregular tensor programs.
Proceedings of the PLDI '22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13, 2022
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022
2021
Critique of "Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., 2021
CoRR, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021
2020
SCYLLA: QoE-aware Continuous Mobile Vision with FPGA-based Dynamic Deep Neural Network Reconfiguration.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020
Proceedings of the 28th International Conference on Computational Linguistics, 2020
2019
Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019
Proceedings of the 2019 Workshop on Hot Topics in Video Analytics and Intelligent Edges, 2019
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019
SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
2017
Using Data Compression for Optimizing FPGA-Based Convolutional Neural Network Accelerators.
Proceedings of the Advanced Parallel Processing Technologies, 2017
2016
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016
Caffeine: towards uniformed representation and acceleration for deep convolutional neural networks.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016
2015
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015
2014
An efficient design and implementation of LSM-tree based key-value store on open-channel SSD.
Proceedings of the Ninth Eurosys Conference 2014, 2014
2013
Automatic multidimensional memory partitioning for FPGA-based accelerators (abstract only).
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013
Proceedings of the 50th Annual Design Automation Conference 2013, 2013