Xitong Yang

Orcid: 0009-0001-9085-0345

According to our database, Xitong Yang authored at least 45 papers between 2015 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

HyIMC: A 16 TOPS/W Hybrid Analog-Digital In-Memory Computing SoC With Model Compression and Recurrent Optimization for Deep Learning-Based Speech Enhancement.

IEEE J. Emerg. Sel. Topics Circuits Syst., June, 2026

SAM 3D Body: Robust Full-Body Human Mesh Recovery.

CoRR, February, 2026

Robust Adaptive Beamforming for Radar Target Polarization Scattering Matrix Estimation.

IEEE Trans. Signal Process., 2026

2025

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Triantafyllos Afouras

Santhosh Kumar Ramakrishnan

Oluwatumininu Oguntola

Kiran K. Somasundaram

Giovanni Maria Farinella

Int. J. Comput. Vis., December, 2025

EACAS: An Efficient Anonymous Cross-Domain Authentication Scheme in Internet of Vehicles.

IEEE Internet Things J., April, 2025

HyIMC: Analog-Digital Hybrid In-Memory Computing SoC for High-Quality Low-Latency Speech Enhancement.

Proceedings of the Design, Automation & Test in Europe Conference, 2025

Progress-Aware Video Frame Captioning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

An End-to-End In-Memory Computing System Based on a 40-nm eFlash-Based IMC SoC: Circuits, Toolchains, and Systems Co-Design Framework.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., June, 2024

Building an Open-Vocabulary Video CLIP Model With Better Architectures, Optimization and Data.

IEEE Trans. Pattern Anal. Mach. Intell., 2024

Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning.

CoRR, 2024

GenRec: Unifying Video Generation and Recognition with Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos.

Proceedings of the Computer Vision - ECCV 2024, 2024

EgoSG: Learning 3D Scene Graphs from Egocentric RGB-D Sequences.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Learning to Segment Referred Objects from Narrated Egocentric Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Video ReCap: Recursive Captioning of Hour-Long Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.

Triantafyllos Afouras

Santhosh Kumar Ramakrishnan

Oluwatumininu Oguntola

Kiran K. Somasundaram

Giovanni Maria Farinella

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Integrated immunological analysis of single-cell and bulky tissue transcriptomes reveals the role of interactions between M0 macrophages and naïve CD4⁺ T cells in the immunosuppressive microenvironment of cervical cancer.

Comput. Biol. Medicine, September, 2023

MINOTAUR: Multi-task Video Grounding From Multimodal Queries.

CoRR, 2023

Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization.

CoRR, 2023

Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization.

Proceedings of the International Conference on Machine Learning, 2023

Relational Space-Time Query in Long-Form Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Vision Transformers are Good Mask Auto-Labelers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Scalable Neural Representation for Diverse Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Semi-supervised Vision Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

Efficient Video Transformers with Spatial-Temporal Token Selection.

Proceedings of the Computer Vision - ECCV 2022, 2022

ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Long-Term Temporal Modeling for Video Action Understanding.

Xitong Yang

PhD thesis, 2021

Efficient Video Transformers with Spatial-Temporal Token Selection.

CoRR, 2021

Beyond Short Clips: End-to-End Video-Level Learning With Collaborative Memories.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Hierarchical Contrastive Motion Learning for Video Action Recognition.

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

GTA: Global Temporal Attention for Video Action Understanding.

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

A Generic Visualization Approach for Convolutional Neural Networks.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

An Interactive Greedy Approach to Group Sparsity in High Dimensions.

Technometrics, 2019

Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems.

CoRR, 2019

Cross-X Learning for Fine-Grained Visual Categorization.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

STEP: Spatio-Temporal Progressive Learning for Video Action Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Deep Temporal Multimodal Fusion for Medical Procedure Monitoring Using Wearable Sensors.

IEEE Trans. Multim., 2018

Two Stream Self-Supervised Learning for Action Recognition.

CoRR, 2018

The Effectiveness of Instance Normalization: a Strong Baseline for Single Image Dehazing.

CoRR, 2018

Strong Baseline for Single Image Dehazing with Deep Features and Instance Normalization.

Proceedings of the British Machine Vision Conference 2018, 2018

Towards Perceptual Image Dehazing by Physics-Based Disentanglement and Adversarial Training.

Xitong Yang

Zheng Xu

Jiebo Luo

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Tracking Illicit Drug Dealing and Abuse on Instagram Using Multimodal Analysis.

Xitong Yang

Jiebo Luo

ACM Trans. Intell. Syst. Technol., 2017

Deep Multimodal Representation Learning from Temporal Data.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2015

Pinterest Board Recommendation for Twitter Users.

Xitong Yang

Yuncheng Li

Jiebo Luo

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Semantic Video Entity Linking Based on Visual Content and Metadata.

Yuncheng Li

Xitong Yang

Jiebo Luo

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015
