Shwai He

According to our database, Shwai He authored at least 40 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Cognibit: From Digital Exhaustion to Real-World Connection Through Gamified Territory Control and LLM-Powered Twin Networking.
CoRR, April, 2026

Demystifying When Pruning Works via Representation Hierarchies.
CoRR, March, 2026

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model.
CoRR, March, 2026

MoEless: Efficient MoE LLM Serving via Serverless Computing.
CoRR, March, 2026

Uncovering the Redundancy in Transformers via a Unified Study of Layer Dropping.
Trans. Mach. Learn. Res., 2026

2025
Making Large Language Models Efficient Dense Retrievers.
CoRR, December, 2025

Understanding and Harnessing Sparsity in Unified Multimodal Models.
CoRR, December, 2025

Dense Video Understanding with Gated Residual Tokenization.
CoRR, September, 2025

DualSparse-MoE: Coordinating Tensor/Neuron-Level Sparsity with Expert Partition and Reconstruction.
CoRR, August, 2025

CogniPair: From LLM Chatbots to Conscious AI Agents - GNWT-Based Multi-Agent Digital Twins for Social Pairing - Dating & Hiring Applications.
CoRR, June, 2025

CoIn: Counting the Invisible Reasoning Tokens in Commercial Opaque LLM APIs.
CoRR, May, 2025

SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning.
CoRR, April, 2025

Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts.
CoRR, March, 2025

Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques.
Trans. Mach. Learn. Res., 2025

Towards counterfactual fairness through auxiliary variables.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Router-Tuning: A Simple and Effective Approach for Dynamic Depth.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
Accurate prediction of antibody function and structure using bio-inspired antibody language model.
Briefings Bioinform., July, 2024

Towards counterfactual fairness thorough auxiliary variables.
CoRR, 2024

Fair Diagnosis: Leveraging Causal Modeling to Mitigate Medical Bias.
CoRR, 2024

Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers.
CoRR, 2024

What Matters in Transformers? Not All Attention is Needed.
CoRR, 2024

Demystifying the Compression of Mixture-of-Experts Through a Unified Framework.
CoRR, 2024

RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation.
CoRR, 2024

Loki: Low-rank Keys for Efficient Sparse Attention.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Reformatted Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning.
CoRR, 2023

MerA: Merging Pretrained Adapters For Few-Shot Learning.
CoRR, 2023

SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes.
Proceedings of the International Conference on Machine Learning, 2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

PAD-Net: An Efficient Framework for Dynamic Networks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Cherry Hypothesis: Identifying the Cherry on the Cake for Dynamic Networks.
CoRR, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
CoRR, 2022

Vega-MT: The JD Explore Academy Translation System for WMT22.
CoRR, 2022

When Sparsity Meets Dynamic Convolution.
CoRR, 2022

Vega-MT: The JD Explore Academy Machine Translation System for WMT22.
Proceedings of the Seventh Conference on Machine Translation, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Multi-modal Attention Network for Stock Movements Prediction.
CoRR, 2021

