Binhang Yuan

Orcid: 0000-0002-3188-2769

According to our database, Binhang Yuan authored at least 71 papers between 2014 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Re:Form - Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny.
CoRR, July, 2025

Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification.
CoRR, June, 2025

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning.
CoRR, June, 2025

Cascadia: A Cascade Serving System for Large Language Models.
CoRR, June, 2025

TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network.
CoRR, June, 2025

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning.
CoRR, May, 2025

HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow.
CoRR, May, 2025

CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Language Models.
CoRR, February, 2025

AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science.
CoRR, February, 2025

Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs.
CoRR, February, 2025

Top Ten Challenges Towards Agentic Neural Graph Databases.
IEEE Data Eng. Bull., 2025

Toppings: CPU-Assisted, Rank-Aware Adapter Serving for LLM Inference.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

Prompt Inversion Attack Against Collaborative Inference of Large Language Models.
Proceedings of the IEEE Symposium on Security and Privacy, 2025

DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Ratel: Optimizing Holistic Data Movement to Fine-tune 100B Model on a Consumer GPU.
Proceedings of the 41st IEEE International Conference on Data Engineering, 2025

MLKV: Efficiently Scaling up Large Embedding Model Training with Disk-based Key-Value Storage.
Proceedings of the 41st IEEE International Conference on Data Engineering, 2025

Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systems.
VLDB J., September, 2024

TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension.
CoRR, 2024

Zero-Indexing Internet Search Augmented Generation for Large Language Models.
CoRR, 2024

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.
CoRR, 2024

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads.
CoRR, 2024

FlashFlex: Accommodating Large Language Model Training over Heterogeneous Environment.
CoRR, 2024

On the Opportunities of (Re)-Exploring Atmospheric Science by Foundation Models: A Case Study.
CoRR, 2024

A Survey of Multimodal Large Language Model from A Data-centric Perspective.
CoRR, 2024

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model.
CoRR, 2024

DeFT: Flash Tree-attention with IO-Awareness for Efficient Tree-search-based LLM Inference.
CoRR, 2024

Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU.
CoRR, 2024

CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference.
CoRR, 2024

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Position: Exploring the Robustness of Pipeline-Parallelism-Based Decentralized Training.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

HexGen: Generative Inference of Large Language Model over Heterogeneous Environment.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Serving Deep Learning Models from Relational Databases.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

2023
Holistic Evaluation of Language Models.
Trans. Mach. Learn. Res., 2023

Exploring the Robustness of Decentralized Training for Large Language Models.
CoRR, 2023

HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment.
CoRR, 2023

Serving Deep Learning Model in Relational Databases.
CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.
CoRR, 2023

CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks.
Proceedings of the International Conference on Machine Learning, 2023

Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning.
Proceedings of the International Conference on Machine Learning, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time.
Proceedings of the International Conference on Machine Learning, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.
Proceedings of the International Conference on Machine Learning, 2023

2022
Bayesian Hierarchical Model for Change Point Detection in Multivariate Sequences.
Technometrics, 2022

Distributed Learning of Fully Connected Neural Networks using Independent Subnet Training.
Proc. VLDB Endow., 2022

A community effort to assess and improve computerized interpretation of 12-lead resting electrocardiogram.
Medical Biol. Eng. Comput., 2022

Stochastic Gradient Descent without Full Data Shuffle.
CoRR, 2022

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees.
CoRR, 2022

In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Decentralized Training of Foundation Models in Heterogeneous Environments.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Efficient flow scheduling in distributed deep learning training with echelon formation.
Proceedings of the 21st ACM Workshop on Hot Topics in Networks, 2022

2021
A Feature Fusion Framework and Its Application to Automatic Seizure Detection.
IEEE Signal Process. Lett., 2021

Lachesis: Automated Partitioning for UDF-Centric Analytics.
Proc. VLDB Endow., 2021

Tensor Relational Algebra for Distributed Machine Learning System Design.
Proc. VLDB Endow., 2021

Distributed Numerical and Machine Learning Computations via Two-Phase Execution of Aggregated Join Trees.
Proc. VLDB Endow., 2021

BAGUA: Scaling up Distributed Learning with System Relaxations.
Proc. VLDB Endow., 2021

Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

2020
Declarative Recursive Computation on an RDBMS: or, Why You Should Use a Database For Distributed Machine Learning.
SIGMOD Rec., 2020

Tensor Relational Algebra for Machine Learning System Design.
CoRR, 2020

Lachesis: Automated Generation of Persistent Partitionings for Big Data Applications.
CoRR, 2020

A Federated Learning Framework for Healthcare IoT devices.
CoRR, 2020

2019
Declarative Recursive Computation on an RDBMS.
Proc. VLDB Endow., 2019

Distributed Learning of Deep Neural Networks using Independent Subnet Training.
CoRR, 2019

WaveletFCNN: A Deep Time Series Classification Model for Wind Turbine Blade Icing Detection.
CoRR, 2019

Diagnosing Cardiac Abnormalities from 12-Lead Electrocardiograms Using Enhanced Deep Convolutional Neural Networks.
Proceedings of the Machine Learning and Medical Engineering for Cardiovascular Health and Intravascular Imaging and Computer Assisted Stenting, 2019

2018
PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development.
Proceedings of the 2018 International Conference on Management of Data, 2018

2017
Abridging source code.
Proc. ACM Program. Lang., 2017

2016
Generating a 3D Normative Infant Cranial Model.
Proceedings of the International Conference on Computational Science 2016, 2016

2014
Effective Video Retargeting With Jittery Assessment.
IEEE Trans. Multim., 2014

