Shwai He

According to our database, Shwai He authored at least 30 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
CogniPair: From LLM Chatbots to Conscious AI Agents - GNWT-Based Multi-Agent Digital Twins for Social Pairing - Dating & Hiring Applications.
CoRR, June, 2025

CoIn: Counting the Invisible Reasoning Tokens in Commercial Opaque LLM APIs.
CoRR, May, 2025

SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning.
CoRR, April, 2025

Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts.
CoRR, March, 2025

Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques.
Trans. Mach. Learn. Res., 2025

Towards counterfactual fairness through auxiliary variables.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Towards counterfactual fairness thorough auxiliary variables.
CoRR, 2024

Fair Diagnosis: Leveraging Causal Modeling to Mitigate Medical Bias.
CoRR, 2024

Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers.
CoRR, 2024

What Matters in Transformers? Not All Attention is Needed.
CoRR, 2024

Demystifying the Compression of Mixture-of-Experts Through a Unified Framework.
CoRR, 2024

RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation.
CoRR, 2024

Accurate prediction of antibody function and structure using bio-inspired antibody language model.
Briefings Bioinform., 2024

Loki: Low-rank Keys for Efficient Sparse Attention.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Reformatted Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning.
CoRR, 2023

MerA: Merging Pretrained Adapters For Few-Shot Learning.
CoRR, 2023

SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes.
Proceedings of the International Conference on Machine Learning, 2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

PAD-Net: An Efficient Framework for Dynamic Networks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Cherry Hypothesis: Identifying the Cherry on the Cake for Dynamic Networks.
CoRR, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
CoRR, 2022

Vega-MT: The JD Explore Academy Translation System for WMT22.
CoRR, 2022

When Sparsity Meets Dynamic Convolution.
CoRR, 2022

Vega-MT: The JD Explore Academy Machine Translation System for WMT22.
Proceedings of the Seventh Conference on Machine Translation, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Multi-modal Attention Network for Stock Movements Prediction.
CoRR, 2021
