Shentong Mo

Orcid: 0000-0003-3308-9585

According to our database, Shentong Mo authored at least 48 papers between 2018 and 2024.

Bibliography

2024
Context Autoencoder for Self-supervised Representation Learning.
Int. J. Comput. Vis., January, 2024

BSTG-Trans: A Bayesian Spatial-Temporal Graph Transformer for Long-Term Pose Forecasting.
IEEE Trans. Multim., 2024

DailyMAE: Towards Pretraining Masked Autoencoders in One Day.
CoRR, 2024

Text-to-Audio Generation Synchronized with Videos.
CoRR, 2024

Audio-Synchronized Visual Animation.
CoRR, 2024

LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning.
CoRR, 2024

We Choose to Go to Space: Agent-driven Human and Multi-Robot Collaboration in Microgravity.
CoRR, 2024

2023
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation.
CoRR, 2023

Beyond Accuracy: Statistical Measures and Benchmark for Evaluation of Representation from Self-Supervised Learning.
CoRR, 2023

Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling.
CoRR, 2023

MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things.
CoRR, 2023

Exploring Data Augmentations on Self-/Semi-/Fully- Supervised Pre-trained Models.
CoRR, 2023

Tree of Uncertain Thoughts Reasoning for Large Language Models.
CoRR, 2023

Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding.
CoRR, 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation.
CoRR, 2023

DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment.
CoRR, 2023

AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation.
CoRR, 2023

CAVL: Learning Contrastive and Adaptive Representations of Vision and Language.
CoRR, 2023

Variantional autoencoder with decremental information bottleneck for disentanglement.
CoRR, 2023

Multi-level Contrastive Learning for Self-Supervised Vision Transformers.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Representation Disentanglement in Generative Models with Contrastive Learning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Weakly-Supervised Audio-Visual Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DiffComplete: Diffusion-based Generative 3D Shape Completion.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition.
Proceedings of the International Conference on Machine Learning, 2023

Audio-Visual Class-Incremental Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Class-Incremental Grouping Network for Continual Audio-Visual Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Audio-Visual Grouping Network for Sound Localization from Mixtures.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Variational Autoencoders with Decremental Information Bottleneck for Disentanglement.
Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022
Object-wise Masked Autoencoders for Fast Pre-training.
CoRR, 2022

Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding.
CoRR, 2022

HighMMT: Towards Modality and Task Generalization for High-Modality Representation Learning.
CoRR, 2022

Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Closer Look at Weakly-Supervised Audio-Visual Source Localization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Localizing Visual Sounds the Easy Way.
Proceedings of the Computer Vision - ECCV 2022, 2022

Unitail: Detecting, Reading, and Matching in Retail Scene.
Proceedings of the Computer Vision - ECCV 2022, 2022

Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types.
CoRR, 2021

Learning by Examples Based on Multi-level Optimization.
CoRR, 2021

An Empirical Study of Uncertainty Gap for Disentangling Factors.
Proceedings of Trustworthy AI'21: the 1st International Workshop on Trustworthy AI for Multimedia Computing, 2021

OsGG-Net: One-step Graph Generation Network for Unbiased Head Pose Estimation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

EVA-GCN: Head Pose Estimation Based on Graph Convolutional Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Long-Term Head Pose Forecasting Conditioned on the Gaze-Guiding Prior.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Point3D: tracking actions as moving points with 3D CNNs.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

Siamese Prototypical Contrastive Learning.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Towards Improving Spatiotemporal Action Recognition in Videos.
CoRR, 2020

Automatic Speech Verification Spoofing Detection.
CoRR, 2020

2018
SERS spectrum of RHB solution measured on different patterns.
Dataset, November, 2018

