Xianzhi Du

According to our database, Xianzhi Du authored at least 52 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

LensVLM: Selective Context Expansion for Compressed Visual Representation of Text.

CoRR, May, 2026

CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning.

CoRR, March, 2026

2025

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer.

CoRR, September, 2025

AXLearn: Modular Large Model Training on Heterogeneous Infrastructure.

CoRR, July, 2025

Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations.

CoRR, April, 2025

IDEA Prune: An Integrated Enlarge-and-Prune Pipeline in Generative Language Model Pretraining.

CoRR, March, 2025

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning.

Jean-Philippe Fauconnier

Zhengfeng Lai

Haoxuan You

Zirui Wang

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning.

CoRR, 2024

Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training.

CoRR, 2024

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.

Brandon McKinzie

Zhe Gan

Jean-Philippe Fauconnier

CoRR, 2024

Empowering Unsupervised Domain Adaptation with Large-scale Pre-trained Vision-Language Models.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Ferret: Refer and Ground Anything Anywhere at Any Granularity.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MOFI: Learning Image Representations from Noisy Entity Annotated Images.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Compressing LLMs: The Truth is Rarely Pure and Never Simple.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Guiding Instruction-based Image Editing via Multimodal Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MM1: Methods, Analysis and Insights from Multimodal LLM Pre-training.

Brandon McKinzie

Zhe Gan

Jean-Philippe Fauconnier

Proceedings of the Computer Vision - ECCV 2024, 2024

VeCLIP: Improving CLIP Training via Visual-Enriched Captions.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

From Scarcity to Efficiency: Improving CLIP Training via Visual-enriched Captions.

CoRR, 2023

Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts.

CoRR, 2023

MOFI: Learning Image Representations from Noisy Entity Annotated Images.

CoRR, 2023

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness.

CoRR, 2023

ISAA: Boost Repair Process by Constructing the Degree Constrained Optimal Repair Tree for Erasure-coded Systems.

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

Optimizing Anchor-based Detectors for Autonomous Driving Scenes.

Xianzhi Du

Wei-Chih Hung

Tsung-Yi Lin

CoRR, 2022

Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance.

Proceedings of the International Conference on Machine Learning, 2022

Auto-scaling Vision Transformers without Training.

Proceedings of the Tenth International Conference on Learning Representations, 2022

A Genetic Algorithm-based Construction of Fractional Repetition Codes.

Proceedings of the IEEE Global Communications Conference, 2022

A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation.

CoRR, 2021

Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text.

CoRR, 2021

Revisiting 3D ResNets for Video Recognition.

CoRR, 2021

Simple Training Strategies and Model Scaling for Object Detection.

CoRR, 2021

Dilated SpineNet for Semantic Segmentation.

CoRR, 2021

Revisiting ResNets: Improved Training and Scaling Strategies.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Multitask Deep Neural Networks for Tele-Wide Stereo Matching.

IEEE Access, 2020

FBA-AMNET: Foreground-Background Aware Atrous Multiscale Networks for Stereo Disparity Estimation.

Xianzhi Du

Mostafa El-Khamy

Jungwon Lee

Proceedings of the 2020 IEEE International Conference on Consumer Electronics (ICCE), 2020

Efficient Scale-Permuted Backbone with Learned Resource Distribution.

Proceedings of the Computer Vision - ECCV 2020, 2020

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

TW-SMNet: Deep Multitask Learning of Tele-Wide Stereo Matching.

CoRR, 2019

AMNet: Deep Atrous Multiscale Stereo Disparity Estimation Networks.

Xianzhi Du

Mostafa El-Khamy

Jungwon Lee

CoRR, 2019

Multi-Task Learning of Depth from Tele and Wide Stereo Image Pairs.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Boundary-sensitive Network for Portrait Segmentation.

Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

2018

Fused Deep Neural Networks for Efficient Pedestrian Detection.

CoRR, 2018

2017

Computer Vision and Deep Learning with Applications to Object Detection, Segmentation, and Document Analysis.

Xianzhi Du

PhD thesis, 2017

Fused DNN: A Deep Neural Network Fusion Approach to Fast and Robust Pedestrian Detection.

Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Cyber-physical system enabled nearby traffic flow modelling for autonomous vehicles.

Proceedings of the 36th IEEE International Performance Computing and Communications Conference, 2017

2015

A graphical model approach for matching partial signatures.

Xianzhi Du

David S. Doermann

Wael Abd-Almageed

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014

Signature Matching Using Supervised Topic Models.

Xianzhi Du

David S. Doermann

Wael Abd-Almageed

Proceedings of the 22nd International Conference on Pattern Recognition, 2014

2013

Large-Scale Signature Matching Using Multi-stage Hashing.

Xianzhi Du

Wael Abd-Almageed

David S. Doermann

Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

2011

A Novel Wideband Spatial Power Combining Amplifier Based on Turnstile-Junction Waveguide Divider/Combiner.

IEICE Trans. Electron., 2011
