Xinghao Chen

ORCID: 0000-0002-2102-8235

Affiliations:

Huawei Noah's Ark Lab, Beijing, China
Tsinghua University, Department of Electronic Engineering, Beijing, China (former)

According to our database¹, Xinghao Chen authored at least 96 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Thinking-while-speaking: A Controlled, Interleaved Reasoning Method for Real-Time Speech Generation.

CoRR, May, 2026

ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices.

CoRR, May, 2026

Near-Policy: Accelerating On-Policy Distillation via Asynchronous Generation and Selective Packing.

CoRR, May, 2026

AlloSR2: Rectifying One-Step Super-Resolution to Stay Real via Allomorphic Generative Flows.

CoRR, April, 2026

SJD-PAC: Accelerating Speculative Jacobi Decoding via Proactive Drafting and Adaptive Continuation.

CoRR, March, 2026

An Empirical Study of World Model Quantization.

CoRR, February, 2026

GenVidBench: A 6-Million Benchmark for AI-Generated Video Detection.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

From Sequential to Spatial: Reordering Autoregression for Efficient Visual Generation.

CoRR, December, 2025

VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse.

CoRR, December, 2025

Towards Lossless Ultimate Vision Token Compression for VLMs.

CoRR, December, 2025

From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs.

CoRR, December, 2025

Nexus: Higher-Order Attention Mechanisms in Transformers.

CoRR, December, 2025

VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm.

CoRR, December, 2025

Ada-MoGE: Adaptive Mixture of Gaussian Expert Model for Time Series Forecasting.

CoRR, December, 2025

ROOT: Robust Orthogonalized Optimizer for Neural Network Training.

CoRR, November, 2025

Positional Preservation Embedding for Multimodal Large Language Models.

CoRR, October, 2025

Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing.

CoRR, October, 2025

Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation.

CoRR, September, 2025

Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking.

CoRR, September, 2025

RDDM: Practicing RAW Domain Diffusion Model for Real-world Image Restoration.

CoRR, August, 2025

OmniEval: A Benchmark for Evaluating Omni-modal Models with Visual, Auditory, and Textual Inputs.

CoRR, June, 2025

RealSR-R1: Reinforcement Learning for Real-World Image Super-Resolution with Vision-Language Chain-of-Thought.

CoRR, June, 2025

EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization.

CoRR, June, 2025

Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition.

CoRR, May, 2025

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity.

CoRR, May, 2025

Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs.

CoRR, May, 2025

GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video.

CoRR, January, 2025

Full-Stage Pseudo Label Quality Enhancement for Weakly-Supervised Temporal Action Localization.

IEEE Trans. Circuits Syst. Video Technol., 2025

ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

SlimLLM: Accurate Structured Pruning for Large Language Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

DECO: Unleashing the Potential of ConvNets for Query-based Detection and Segmentation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba.

Xiaowen Ma

Zhenliang Ni

Xinghao Chen

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

TinySAM: Pushing the Envelope for Efficient Segment Anything Model.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model.

CoRR, 2024

No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding.

CoRR, 2024

Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation.

Xiaowen Ma

Zhenliang Ni

Xinghao Chen

CoRR, 2024

SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation.

Xiaowen Ma

Zhenliang Ni

Xinghao Chen

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Improving Lightweight AdderNet via Distillation From ℓ2 to ℓ1-norm.

IEEE Trans. Image Process., 2023

A Survey on Vision Transformer.

IEEE Trans. Pattern Anal. Mach. Intell., 2023

DECO: Query-Based End-to-End Object Detection with ConvNets.

CoRR, 2023

PPT: Token Pruning and Pooling for Efficient Vision Transformers.

CoRR, 2023

Less is More: Focus Attention for Efficient DETR.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons.

Yixing Xu

Xinghao Chen

Yunhe Wang

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Random Normalization Aggregation for Adversarial Defense.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Spatial-Channel Token Distillation for Vision MLPs.

Proceedings of the International Conference on Machine Learning, 2022

MaskGroup: Hierarchical Point Grouping and Masking for 3D Instance Segmentation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

MTP: Multi-Task Pruning for Efficient Semantic Segmentation Networks.

Xinghao Chen

Yiman Zhang

Yunhe Wang

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets.

Proceedings of the Computer Vision - ECCV 2022, 2022

Searching for Energy-Efficient Hybrid Adder-Convolution Neural Networks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Hire-MLP: Vision MLP via Hierarchical Rearrangement.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CMT: Convolutional Neural Networks Meet Vision Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

AutoLoss-GMS: Searching Generalized Margin-based Softmax Loss Function for Person Re-identification.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Multimodal Token Fusion for Vision Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Handling Long-tailed Feature Distribution in AdderNets.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards Stable and Robust AdderNets.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

An Empirical Study of Adder Neural Networks for Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Winograd Algorithm for AdderNet.

Proceedings of the 38th International Conference on Machine Learning, 2021

Data-Free Knowledge Distillation for Image Super-Resolution.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Positive-Unlabeled Data Purification in the Wild for Object Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Distilling Object Detectors via Decoupled Features.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Bi-Stream Pose-Guided Region Ensemble Network for Fingertip Localization From Stereo Images.

IEEE Trans. Neural Networks Learn. Syst., 2020

Pose guided structured region ensemble network for cascaded hand pose estimation.

Neurocomputing, 2020

A Survey on Visual Transformer.

CoRR, 2020

VEGA: Towards an End-to-End Configurable AutoML Pipeline.

CoRR, 2020

Multi-Task Pruning for Semantic Segmentation Networks.

CoRR, 2020

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens.

CoRR, 2020

Kernel Based Progressive Distillation for Adder Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Weakly Supervised Segmentation Guided Hand Pose Estimation During Interaction with Unknown Objects.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer.

Proceedings of the Computer Vision - ECCV 2020, 2020

CARS: Continuous Evolution for Efficient Neural Architecture Search.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

MFA-Net: Motion Feature Augmented Network for Dynamic Hand Gesture Recognition from Skeletal Data.

Sensors, 2019

Blurring-Effect-Free CNN Network of Structural Edge for Focus Stacking.

IEEE Access, 2019

Blurring-Effect-Free CNN for Optimization of Structural Edges in Focus Stacking.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

2018

Region ensemble network: Towards good practices for deep 3D hand pose estimation.

J. Vis. Commun. Image Represent., 2018

SHPR-Net: Deep Semantic Hand Pose Regression From Point Clouds.

IEEE Access, 2018

Interactive Hand Pose Estimation: Boosting accuracy in localizing extended finger joints.

Proceedings of the Visual Information Processing and Communication IX, Burlingame, CA, USA, 28 January 2018, 2018

Spatial-Temporal Attention Res-TCN for Skeleton-Based Dynamic Hand Gesture Recognition.

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Bi-stream Region Ensemble Network: Promoting Accuracy in Fingertip Localization from Stereo Images.

Proceedings of the British Machine Vision Conference 2018, 2018

2017

3D Hand Pose Estimation: From Current Achievements to Future Goals.

CoRR, 2017

Towards Good Practices for Deep 3D Hand Pose Estimation.

CoRR, 2017

Two-stream binocular network: Accurate near field finger detection based on binocular images.

Proceedings of the 2017 IEEE Visual Communications and Image Processing, 2017

Region ensemble network: Improving convolutional network for hand pose estimation.

Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Motion feature augmented recurrent neural network for skeleton-based dynamic hand gesture recognition.

Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

2016

Two-stream convolutional neural network for accurate RGB-D fingertip detection using depth and edge information.

Hengkai Guo

Guijin Wang

Xinghao Chen

CoRR, 2016

Accurate fingertip detection from binocular mask images.

Xinghao Chen

Guijin Wang

Hengkai Guo

Proceedings of the 2016 Visual Communications and Image Processing, 2016

Static hand gesture recognition based on finger root-center-angle and length weighted Mahalanobis distance.

Xinghao Chen

Chenbo Shi

Bo Liu

Proceedings of the Real-Time Image and Video Processing 2016, 2016
