Xinghao Chen

ORCID: 0000-0002-2102-8235

Affiliations:
  • Huawei Noah's Ark Lab, Beijing, China
  • Tsinghua University, Department of Electronic Engineering, Beijing, China (former)


According to our database, Xinghao Chen authored at least 96 papers between 2016 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Thinking-while-speaking: A Controlled, Interleaved Reasoning Method for Real-Time Speech Generation.
CoRR, May, 2026

ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices.
CoRR, May, 2026

Near-Policy: Accelerating On-Policy Distillation via Asynchronous Generation and Selective Packing.
CoRR, May, 2026

AlloSR²: Rectifying One-Step Super-Resolution to Stay Real via Allomorphic Generative Flows.
CoRR, April, 2026

SJD-PAC: Accelerating Speculative Jacobi Decoding via Proactive Drafting and Adaptive Continuation.
CoRR, March, 2026

An Empirical Study of World Model Quantization.
CoRR, February, 2026

GenVidBench: A 6-Million Benchmark for AI-Generated Video Detection.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
From Sequential to Spatial: Reordering Autoregression for Efficient Visual Generation.
CoRR, December, 2025

VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse.
CoRR, December, 2025

Towards Lossless Ultimate Vision Token Compression for VLMs.
CoRR, December, 2025

From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs.
CoRR, December, 2025

Nexus: Higher-Order Attention Mechanisms in Transformers.
CoRR, December, 2025

VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm.
CoRR, December, 2025

Ada-MoGE: Adaptive Mixture of Gaussian Expert Model for Time Series Forecasting.
CoRR, December, 2025

ROOT: Robust Orthogonalized Optimizer for Neural Network Training.
CoRR, November, 2025

Positional Preservation Embedding for Multimodal Large Language Models.
CoRR, October, 2025

Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing.
CoRR, October, 2025

Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation.
CoRR, September, 2025

Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking.
CoRR, September, 2025

RDDM: Practicing RAW Domain Diffusion Model for Real-world Image Restoration.
CoRR, August, 2025

OmniEval: A Benchmark for Evaluating Omni-modal Models with Visual, Auditory, and Textual Inputs.
CoRR, June, 2025

RealSR-R1: Reinforcement Learning for Real-World Image Super-Resolution with Vision-Language Chain-of-Thought.
CoRR, June, 2025

EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization.
CoRR, June, 2025

Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition.
CoRR, May, 2025

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity.
CoRR, May, 2025

Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs.
CoRR, May, 2025

GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video.
CoRR, January, 2025

Full-Stage Pseudo Label Quality Enhancement for Weakly-Supervised Temporal Action Localization.
IEEE Trans. Circuits Syst. Video Technol., 2025

ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

SlimLLM: Accurate Structured Pruning for Large Language Models.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

DECO: Unleashing the Potential of ConvNets for Query-based Detection and Segmentation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

TinySAM: Pushing the Envelope for Efficient Segment Anything Model.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model.
CoRR, 2024

No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding.
CoRR, 2024

Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation.
CoRR, 2024

SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Improving Lightweight AdderNet via Distillation From ℓ₂ to ℓ₁-norm.
IEEE Trans. Image Process., 2023

A Survey on Vision Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

DECO: Query-Based End-to-End Object Detection with ConvNets.
CoRR, 2023

PPT: Token Pruning and Pooling for Efficient Vision Transformers.
CoRR, 2023

Less is More: Focus Attention for Efficient DETR.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Random Normalization Aggregation for Adversarial Defense.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Spatial-Channel Token Distillation for Vision MLPs.
Proceedings of the International Conference on Machine Learning, 2022

MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

MTP: Multi-Task Pruning for Efficient Semantic Segmentation Networks.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets.
Proceedings of the Computer Vision - ECCV 2022, 2022

Searching for Energy-Efficient Hybrid Adder-Convolution Neural Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Hire-MLP: Vision MLP via Hierarchical Rearrangement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CMT: Convolutional Neural Networks Meet Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

AutoLoss-GMS: Searching Generalized Margin-based Softmax Loss Function for Person Re-identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Multimodal Token Fusion for Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Handling Long-tailed Feature Distribution in AdderNets.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards Stable and Robust AdderNets.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

An Empirical Study of Adder Neural Networks for Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Winograd Algorithm for AdderNet.
Proceedings of the 38th International Conference on Machine Learning, 2021

Data-Free Knowledge Distillation for Image Super-Resolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Positive-Unlabeled Data Purification in the Wild for Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Distilling Object Detectors via Decoupled Features.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Bi-Stream Pose-Guided Region Ensemble Network for Fingertip Localization From Stereo Images.
IEEE Trans. Neural Networks Learn. Syst., 2020

Pose guided structured region ensemble network for cascaded hand pose estimation.
Neurocomputing, 2020

A Survey on Visual Transformer.
CoRR, 2020

VEGA: Towards an End-to-End Configurable AutoML Pipeline.
CoRR, 2020

Multi-Task Pruning for Semantic Segmentation Networks.
CoRR, 2020

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens.
CoRR, 2020

Kernel Based Progressive Distillation for Adder Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Weakly Supervised Segmentation Guided Hand Pose Estimation During Interaction with Unknown Objects.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer.
Proceedings of the Computer Vision - ECCV 2020, 2020

CARS: Continuous Evolution for Efficient Neural Architecture Search.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
MFA-Net: Motion Feature Augmented Network for Dynamic Hand Gesture Recognition from Skeletal Data.
Sensors, 2019

Blurring-Effect-Free CNN Network of Structural Edge for Focus Stacking.
IEEE Access, 2019

Blurring-Effect-Free CNN for Optimization of Structural Edges in Focus Stacking.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

2018
Region ensemble network: Towards good practices for deep 3D hand pose estimation.
J. Vis. Commun. Image Represent., 2018

SHPR-Net: Deep Semantic Hand Pose Regression From Point Clouds.
IEEE Access, 2018

Interactive Hand Pose Estimation: Boosting accuracy in localizing extended finger joints.
Proceedings of the Visual Information Processing and Communication IX, Burlingame, CA, USA, 28 January 2018, 2018

Spatial-Temporal Attention Res-TCN for Skeleton-Based Dynamic Hand Gesture Recognition.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Bi-stream Region Ensemble Network: Promoting Accuracy in Fingertip Localization from Stereo Images.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
3D Hand Pose Estimation: From Current Achievements to Future Goals.
CoRR, 2017

Towards Good Practices for Deep 3D Hand Pose Estimation.
CoRR, 2017

Two-stream binocular network: Accurate near field finger detection based on binocular images.
Proceedings of the 2017 IEEE Visual Communications and Image Processing, 2017

Region ensemble network: Improving convolutional network for hand pose estimation.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Motion feature augmented recurrent neural network for skeleton-based dynamic hand gesture recognition.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

2016
Two-stream convolutional neural network for accurate RGB-D fingertip detection using depth and edge information.
CoRR, 2016

Accurate fingertip detection from binocular mask images.
Proceedings of the 2016 Visual Communications and Image Processing, 2016

Static hand gesture recognition based on finger root-center-angle and length weighted Mahalanobis distance.
Proceedings of the Real-Time Image and Video Processing 2016, 2016

