Songyang Zhang

Orcid: 0000-0003-4316-3320

Affiliations:
  • University of Rochester, Rochester, NY, USA


According to our database, Songyang Zhang authored at least 24 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Cross Modality Bias in Visual Question Answering: A Causal View With Possible Worlds VQA.
IEEE Trans. Multim., 2024

2023
Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA.
CoRR, 2023

Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation.
CoRR, 2023

Make-A-Video: Text-to-Video Generation without Text-Video Data.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Multi-Scale 2D Temporal Adjacency Networks for Moment Localization With Natural Language.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Learning a Grammar Inducer from Massive Uncurated Instructional Videos.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Expanding Language-Image Pretrained Models for General Video Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022

MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration.
Proceedings of the Computer Vision - ECCV 2022, 2022

The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Rethinking the Evaluation of Unbiased Scene Graph Generation.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021
Video-aided Unsupervised Grammar Induction.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

SAT: 2D Semantics Assisted Training for 3D Visual Grounding.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Mi YouTube es Su YouTube? Analyzing the Cultures using YouTube Thumbnails of Popular Videos.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

Boundary Proposal Network for Two-stage Natural Language Video Localization.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language.
CoRR, 2020

Global Image Sentiment Transfer.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Content-based Analysis of the Cultural Differences between TikTok and Douyin.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Explorations of skeleton features for LSTM-based action recognition.
Multim. Tools Appl., 2019

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization.
CoRR, 2019

Exploiting Temporal Relationships in Video Moment Localization with Natural Language.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

2018
Fusing Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks.
IEEE Trans. Multim., 2018

2017
On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

