Sitong Wu

ORCID: 0000-0002-2830-2831

According to our database, Sitong Wu authored at least 35 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
SearchGym: Bootstrapping Real-World Search Agents via Cost-Effective and High-Fidelity Environment Simulation.
CoRR, January, 2026

2025
DreamOmni3: Scribble-based Editing and Generation.
CoRR, December, 2025

Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds.
CoRR, December, 2025

Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank.
CoRR, December, 2025

Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning.
CoRR, October, 2025

SmartSwitch: Advancing LLM Reasoning by Overcoming Underthinking via Promoting Deeper Thought Exploration.
CoRR, October, 2025

DreamOmni2: Multimodal Instruction-based Editing and Generation.
CoRR, October, 2025

Understanding Data Influence with Differential Approximation.
CoRR, August, 2025

Demystify Transformers & Convolutions in Modern Image Deep Networks.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2025

YOLO-MFD: Object Detection for Multi-Scenario Fires.
Inf., 2025

CR2PQ: Continuous Relative Rotary Positional Query for Dense Visual Representation Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Data Pruning by Information Maximization.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LYRA: An Efficient and Speech-Centric Framework for Omni-Cognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Mixture-of-Scores: Robust Image-Text Data Valuation via Three Lines of Code.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Logits-Based Finetuning.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

QuickLLaMA: Query-aware Inference Acceleration for Large Language Models.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024
Ensemble Quadratic Assignment Network for Graph Matching.
Int. J. Comput. Vis., September, 2024

RoboCoder: Robotic Learning from Basic Skills to General Tasks with Large Language Models.
CoRR, 2024

SaCo Loss: Sample-Wise Affinity Consistency for Vision-Language Pre-Training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
PRSeg: A Lightweight Patch Rotate MLP Decoder for Semantic Segmentation.
IEEE Trans. Circuits Syst. Video Technol., November, 2023

StructToken: Rethinking Semantic Segmentation With Structural Prior.
IEEE Trans. Circuits Syst. Video Technol., October, 2023

RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension.
CoRR, 2023

AxWin Transformer: A Context-Aware Vision Transformer Backbone with Axial Windows.
CoRR, 2023

Data Pruning via Moving-one-Sample-out.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UniNeXt: Exploring A Unified Architecture for Vision Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Demystify Transformers & Convolutions in Modern Image Deep Networks.
CoRR, 2022

Feature Selective Transformer for Semantic Image Segmentation.
CoRR, 2022

Semantic Diffusion Network for Semantic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CATrans: Context and Affinity Transformer for Few-Shot Segmentation.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Full-Scale Selective Transformer for Semantic Segmentation.
Proceedings of the Computer Vision - ACCV 2022, 2022

Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Fully Transformer Networks for Semantic Image Segmentation.
CoRR, 2021

Proxy Graph Matching with Proximal Matching Networks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2019
Learning the implicit strain reconstruction in ultrasound elastography using privileged information.
Medical Image Anal., 2019

2018
Direct Reconstruction of Ultrasound Elastography Using an End-to-End Deep Neural Network.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

