Byeongwook Kim

According to our database, Byeongwook Kim authored at least 19 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization.
CoRR, 2024

DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation.
CoRR, 2024

2023
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models.
CoRR, 2023

Winning Both the Accuracy of Floating Point Activation and the Simplicity of Integer Arithmetic.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
nuQmm: Quantized MatMul for Efficient Inference of Large-Scale Generative Language Models.
CoRR, 2022

Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression.
Proceedings of the Tenth International Conference on Learning Representations, 2022

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models.
Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Modulating Regularization Frequency for Efficient Compression-Aware Model Training.
CoRR, 2021

Sequential Encryption of Sparse Neural Networks Toward Optimum Representation of Irregular Sparsity.
CoRR, 2021

Q-Rater: Non-Convex Optimization for Post-Training Uniform Quantization.
CoRR, 2021

2020
BiQGEMM: matrix multiplication with lookup table for binary-coding-based quantized DNNs.
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2020

FleXOR: Trainable Fractional Quantization.
Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation.
Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Structured Compression by Weight Encryption for Unstructured Pruning and Quantization.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Learning Low-Rank Approximation for CNNs.
CoRR, 2019

Structured Compression by Unstructured Pruning for Sparse Quantized Neural Networks.
CoRR, 2019

Network Pruning for Low-Rank Binary Indexing.
CoRR, 2019

2018
DeepTwist: Learning Model Compression via Occasional Weight Distortion.
CoRR, 2018

Retraining-Based Iterative Weight Quantization for Deep Neural Networks.
CoRR, 2018
