Songyang Zhang

ORCID: 0000-0002-1902-1720

Affiliations:
  • ShanghaiTech University, Shanghai, China


According to our database, Songyang Zhang authored at least 73 papers between 2017 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency.
CoRR, August, 2025

Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis.
CoRR, August, 2025

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward.
CoRR, August, 2025

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
CoRR, July, 2025

CompassJudger-2: Towards Generalist Judge Model via Verifiable Rewards.
CoRR, July, 2025

Rethinking Verification for LLM Code Generation: From Generation to Testing.
CoRR, July, 2025

Coding Triangle: How Does Large Language Model Understand Code?
CoRR, July, 2025

Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective.
CoRR, May, 2025

Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering.
CoRR, May, 2025

PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model.
CoRR, March, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning.
CoRR, February, 2025

LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation.
CoRR, January, 2025

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement.
CoRR, January, 2025

NeedleBench: Evaluating LLM Retrieval and Reasoning Across Varying Information Densities.
Trans. Mach. Learn. Res., 2025

InternLM-Law: An Open-Sourced Chinese Legal Large Language Model.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

OpenHuEval: Evaluating Large Language Model on Hungarian Specifics.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
SGTR+: End-to-End Scene Graph Generation With Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling.
Trans. Mach. Learn. Res., 2024

Are Your LLMs Capable of Stable Reasoning?
CoRR, 2024

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution.
CoRR, 2024

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models.
CoRR, 2024

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?
CoRR, 2024

CIBench: Evaluating Your LLMs with a Code Interpreter Plugin.
CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.
CoRR, 2024

InternLM-Law: An Open Source Chinese Legal Large Language Model.
CoRR, 2024

FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models.
CoRR, 2024

Adapting LLaMA Decoder to Vision Transformer.
CoRR, 2024

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance.
CoRR, 2024

GTA: A Benchmark for General Tool Agents.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Fake Alignment: Are LLMs Really Aligned Well?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

LawBench: Benchmarking Legal Knowledge of Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MMBench: Is Your Multi-modal Model an All-Around Player?
Proceedings of the Computer Vision - ECCV 2024, 2024

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
T-Eval: Evaluating the Tool Utilization Capability Step by Step.
CoRR, 2023

LawBench: Benchmarking Legal Knowledge of Large Language Models.
CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

Learning Referring Video Object Segmentation from Weak Annotation.
CoRR, 2023

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer.
CoRR, 2023

Temporal Segment Transformer for Action Segmentation.
CoRR, 2023

TG-VQA: Ternary Game of Video Question Answering.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Improving Pixel-based MIM by Reducing Wasted Modeling Capability.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Budget-aware Few-shot Learning via Graph Convolutional Network.
CoRR, 2022

Robust Temporally-Coherent Strategy for Few-shot Video Instance Segmentation.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

Learning Semantic Correspondence with Sparse Annotations.
Proceedings of the Computer Vision - ECCV 2022, 2022

Action Quality Assessment with Temporal Parsing Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge.
CoRR, 2021

Dynamic Grained Encoder for Vision Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

An EM Framework for Online Incremental Learning of Semantic Segmentation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Implicit Temporal Alignment for Few-shot Video Classification.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Transformer with Bidirectional Decoder for Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Part-Aware Prototype Network for Few-Shot Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
LatentGNN: Learning Efficient Non-local Relations for Visual Recognition.
Proceedings of the 36th International Conference on Machine Learning, 2019

Dynamic Context Correspondence Network for Semantic Alignment.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Dual Attention Network with Semantic Embedding for Few-Shot Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2017
Generalization Tower Network: A Novel Deep Neural Network Architecture for Multi-Task Learning.
CoRR, 2017

Predicting Salient Face in Multiple-Face Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

