Xiaoxia Wu

Fuliang Lu

Lianzhu Zhang

Graphs Comb., May, 2024

Mojito: Motion Trajectory and Intensity Control for Video Generation.

CoRR, 2024

GRIN: GRadient-INformed MoE.

CoRR, 2024

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design.

CoRR, 2024

Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs.

Proceedings of the 2024 USENIX Annual Technical Conference, 2024

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

ZeRO++: Extremely Efficient Collective Communication for Large Model Training.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

A novel bias-alleviated hybrid ensemble model based on over-sampling and post-processing for fair classification.

Connect. Sci., December, 2023

ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks.

CoRR, 2023

ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers.

CoRR, 2023

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies.

Cindy Orozco Bohorquez

Massimiliano Lupo Pasini

CoRR, 2023

DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention.

CoRR, 2023

DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales.

CoRR, 2023

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats.

Yuxiong He

CoRR, 2023

A Comprehensive Study on Post-Training Quantization for Large Language Models.

CoRR, 2023

Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases.

Cheng Li

Yuxiong He

CoRR, 2023

Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases.

Cheng Li

Yuxiong He

Proceedings of the International Conference on Machine Learning, 2023

2022

Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers.

CoRR, 2022

Extreme Compression for Pre-trained Transformers Made Simple and Efficient.

CoRR, 2022

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Adaptive Differentially Private Empirical Risk Minimization.

CoRR, 2021

Hierarchical Learning for Generation with Long Source Sequences.

Tobias Rohde

Yinhan Liu

CoRR, 2021

When Do Curricula Work?

Ethan Dyer

Behnam Neyshabur

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Implicit Regularization and Convergence for Weight Normalization.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Linear Convergence of Adaptive Stochastic Gradient Descent.

Yuege Xie

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Choosing the Sample with Lowest Loss makes SGD Robust.

Vatsal Shah

Sujay Sanghavi

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Implicit Regularization of Normalization Methods.

CoRR, 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network.

Simon S. Du

CoRR, 2019

On structural properties of ABC-minimal chemical trees.

Lianzhu Zhang

Appl. Math. Comput., 2019

AdaGrad stepsizes: sharp convergence over nonconvex landscapes.

Léon Bottou

Proceedings of the 36th International Conference on Machine Learning, 2019

Research on the Development Trend and Coping Strategies of Internet Finance.

Heyu Xiao

Proceedings of the Cyber Security Intelligence and Analytics, 2019

2018

Methionine-Capped Gold Nanoclusters as a Fluorescence-Enhanced Probe for Cadmium(II) Sensing.

Sensors, 2018

Toward Transport Ecosystem Interoperability Enabled by Vendor-Diverse Coherent Optical Sources Over an Open Line System.

Mark Filer

Hacene Chaouch

JOCN, 2018

AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization.

Léon Bottou

CoRR, 2018

WNGrad: Learn the Learning Rate in Gradient Descent.

Léon Bottou

CoRR, 2018

2017

Spanning trees and recurrent configurations of a graph.

Lianzhu Zhang

Haiyan Chen

Appl. Math. Comput., 2017

Interoperation of layer-2/3 modular switches with 8QAM/16QAM integrated coherent optics over 2000 km open line system.

Proceedings of the Optical Fiber Communications Conference and Exhibition, 2017

2014

A 130.7-mm² 2-Layer 32-Gb ReRAM Memory Device in 24-nm Technology.

IEEE J. Solid State Circuits, 2014

Height Probabilities in the Abelian Sandpile Model on the Generalized Trees.

Ars Comb., 2014

2013

A 130.7mm² 2-layer 32Gb ReRAM memory device in 24nm technology.

Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

2012

Electrical Characterization for Intertier Connections and Timing Analysis for 3-D ICs.

IEEE Trans. Very Large Scale Integr. Syst., 2012

Estimating the Proportion of True Null Hypotheses in Nonparametric Exponential Mixture Model with Application to the Leukemia Gene Expression Data.

Commun. Stat. Simul. Comput., 2012

Small Randic Index Ordering of Trees with k Pendant Vertices.

Lian-zhu Zhang

Ars Comb., 2012

2011

Variation-Aware Task and Communication Mapping for MPSoC Architecture.

Yibo Chen

Chrysostomos Nicopoulos

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

Stacking magnetic random access memory atop microprocessors: an architecture-level evaluation.

IET Comput. Digit. Tech., 2011

Optical logic elementary circuits.

IET Circuits Devices Syst., 2011

2010

Design exploration of hybrid caches with disparate memory technologies.

Ramakrishnan Rajamony

ACM Trans. Archit. Code Optim., 2010

Cost-driven 3D integration with interconnect layers.

Proceedings of the 47th Design Automation Conference, 2010

2009

Scan-chain design and optimization for three-dimensional integrated circuits.

Paul Falkenstern

Krishnendu Chakrabarty

ACM J. Emerg. Technol. Comput. Syst., 2009

Exploration of 3D stacked L2 cache design for high performance and efficient thermal control.

Guangyu Sun

Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

Hybrid cache architecture with disparate memory technologies.

Ramakrishnan Rajamony

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Power and performance of read-write aware Hybrid Caches with non-volatile memories.

Proceedings of the Design, Automation and Test in Europe, 2009

2008

Test-Access Solutions for Three-Dimensional SOCs.

Yibo Chen

Krishnendu Chakrabarty

Proceedings of the 2008 IEEE International Test Conference, 2008

Test-access mechanism optimization for core-based three-dimensional SOCs.

Yibo Chen

Krishnendu Chakrabarty

Proceedings of the 26th International Conference on Computer Design, 2008

Comparative analysis of NBTI effects on low power and high performance flip-flops.

Krishnan Ramakrishnan

Proceedings of the 26th International Conference on Computer Design, 2008

Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement.

Proceedings of the 45th Design Automation Conference, 2008

Variability-driven module selection with joint design time optimization and post-silicon tuning.

Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

2007

On-chip bus thermal analysis and optimisation.

Mary Jane Irwin

IET Comput. Digit. Tech., 2007

Scan chain design for three-dimensional integrated circuits (3D ICs).

Paul Falkenstern

Proceedings of the 25th International Conference on Computer Design, 2007

Variation-aware task allocation and scheduling for MPSoC.

Chrysostomos Nicopoulos

Proceedings of the 2007 International Conference on Computer-Aided Design, 2007

2006

Analysis of Subthreshold Finfet Circuits for Ultra-Low Power Design.

Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006

Guaranteeing performance yield in high-level synthesis.

Wei-Lun Hung