Se Jung Kwon

Orcid: 0000-0003-3456-9038

According to our database, Se Jung Kwon authored at least 41 papers between 2012 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models.
CoRR, October, 2025

AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs.
CoRR, October, 2025

SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification.
CoRR, October, 2025

Faster Inference of LLMs using FP8 on the Intel Gaudi.
CoRR, March, 2025

An Investigation of FP8 Across Accelerators for LLM Inference.
CoRR, February, 2025

Debunking the CUDA Myth Towards GPU-based AI Systems.
CoRR, January, 2025

LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Debunking the CUDA Myth Towards GPU-based AI Systems: Evaluation of the Performance and Programmability of Intel's Gaudi NPU for AI Model Serving.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
To FP8 and Back Again: Quantifying the Effects of Reducing Precision on LLM Training Stability.
CoRR, 2024

HyperCLOVA X Technical Report.
CoRR, 2024

No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization.
CoRR, 2024

DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Label-Noise Robust Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Sparsity-Aware Memory Interface Architecture using Stacked XORNet Compression for Accelerating Pruned-DNN Models.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization.
Proceedings of the International Conference on Machine Learning, 2023

Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2023

Winning Both the Accuracy of Floating Point Activation and the Simplicity of Integer Arithmetic.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

TF-MVP: Novel Sparsity-Aware Transformer Accelerator with Mixed-Length Vector Pruning.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
nuQmm: Quantized MatMul for Efficient Inference of Large-Scale Generative Language Models.
CoRR, 2022

Maximum Likelihood Training of Implicit Nonlinear Diffusion Models.
CoRR, 2022

Maximum Likelihood Training of Implicit Nonlinear Diffusion Model.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression.
Proceedings of the Tenth International Conference on Learning Representations, 2022

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Modulating Regularization Frequency for Efficient Compression-Aware Model Training.
CoRR, 2021

Sequential Encryption of Sparse Neural Networks Toward Optimum Representation of Irregular Sparsity.
CoRR, 2021

Q-Rater: Non-Convex Optimization for Post-Training Uniform Quantization.
CoRR, 2021

2020
Adaptive Discrete Event Simulation Systems to Embrace Changes of Requirements Using Event Control Models.
IEEE Trans. Syst. Man Cybern. Syst., 2020

BiQGEMM: matrix multiplication with lookup table for binary-coding-based quantized DNNs.
Proceedings of the International Conference for High Performance Computing, 2020

FleXOR: Trainable Fractional Quantization.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Structured Compression by Weight Encryption for Unstructured Pruning and Quantization.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Learning Low-Rank Approximation for CNNs.
CoRR, 2019

Structured Compression by Unstructured Pruning for Sparse Quantized Neural Networks.
CoRR, 2019

Network Pruning for Low-Rank Binary Indexing.
CoRR, 2019

2018
Simulation-Based Optimization on the System-of-Systems Model via Model Transformation and Genetic Algorithm: A Case Study of Network-Centric Warfare.
Complex., 2018

2013
Integrated hybrid systems modeling and simulation methodology based on HDEVS formalism.
Proceedings of the 2013 Summer Simulation Multiconference, 2013

2012
Design and implementation of event-based DEVS execution environment for faster execution of iterative simulation.
Proceedings of the 2012 Spring Simulation Multiconference, 2012
