Hongsuck Seo

ORCID: 0000-0002-6747-1101

According to our database, Hongsuck Seo authored at least 50 papers between 2012 and 2026.

Bibliography

2026
CRIT: Graph-Based Automatic Data Synthesis to Enhance Cross-Modal Multi-Hop Reasoning.
CoRR, April, 2026

2025
Direct Diffusion Score Preference Optimization via Stepwise Contrastive Policy-Pair Supervision.
CoRR, December, 2025

Robust Image Self-Recovery against Tampering using Watermark Generation with Pixel Shuffling.
CoRR, November, 2025

Breaking the Visual Shortcuts in Multimodal Knowledge-Based Visual Question Answering.
CoRR, November, 2025

Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos.
CoRR, November, 2025

GOAT: A Training Framework for Goal-Oriented Agent with Tools.
CoRR, October, 2025

Seg4Diff: Unveiling Open-Vocabulary Segmentation in Text-to-Image Diffusion Transformers.
CoRR, September, 2025

Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression.
CoRR, April, 2025

Spectral-Adaptive Modulation Networks for Visual Perception.
CoRR, March, 2025

LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Bridging Audio and Vision: Zero-Shot Audiovisual Segmentation by Connecting Pretrained Models.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DGMO: Training-Free Audio Source Separation through Diffusion-Guided Mask Optimization.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Cross-Modal Watermarking for Authentic Audio Recovery and Tamper Localization in Synthesized Audiovisual Forgeries.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DialNav: Multi-Turn Dialog Navigation with a Remote Guide.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global Sensemaking.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Random Conditioning for Diffusion Model Compression with Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Multi-Granularity Video Object Segmentation.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

TrackIME: Enhanced Video Point Tracking via Instance Motion Estimation.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Pseudo-RIS: Distinctive Pseudo-Supervision Generation for Referring Image Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Learning Correlation Structures for Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation.
CoRR, 2023

IFSeg: Image-free Semantic Segmentation via Vision-Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Zero-shot Referring Image Segmentation with Global-Local Context Features.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
AVATAR submission to the Ego4D AV Transcription Challenge.
CoRR, 2022

AVATAR: Unconstrained Audiovisual Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Learning Audio-Video Modalities from Image Captions.
Proceedings of the Computer Vision - ECCV 2022, 2022

End-to-end Generative Pretraining for Multimodal Video Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Look Before You Speak: Visually Contextualized Utterances.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Reinforcing an Image Caption Generator Using Off-Line Human Feedback.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Combinatorial Inference against Label Noise.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Regularizing Neural Networks via Stochastic Branch Layers.
Proceedings of The 11th Asian Conference on Machine Learning, 2019

2018
Confidence Calibration in Deep Neural Networks through Stochastic Inferences.
CoRR, 2018

CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps.
Proceedings of the Computer Vision - ECCV 2018, 2018

Attentive Semantic Alignment with Offset-Aware Correlation Kernels.
Proceedings of the Computer Vision - ECCV 2018, 2018

Progressive Attention Networks for Visual Attribute Prediction.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Visual Reference Resolution using Attention Memory for Visual Dialog.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

MarioQA: Answering Questions by Watching Gameplay Videos.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Hierarchical Attention Networks.
CoRR, 2016

Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Conversational Knowledge Teaching Agent that uses a Knowledge Base.
Proceedings of the SIGDIAL 2015 Conference, 2015

2014
Grammatical error correction based on learner comprehension model in oral conversation.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

2012
Generating grammar questions using corpus data in L2 learning.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Grammatical Error Annotation for Korean Learners of Spoken English.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

A Meta Learning Approach to Grammatical Error Correction.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

