Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Bridging Audio and Vision: Zero-Shot Audiovisual Segmentation by Connecting Pretrained Models.

Seung-jae Lee

Paul Hongsuck Seo

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DGMO: Training-Free Audio Source Separation through Diffusion-Guided Mask Optimization.

Geonyoung Lee

Geonhee Han

Paul Hongsuck Seo

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Cross-Modal Watermarking for Authentic Audio Recovery and Tamper Localization in Synthesized Audiovisual Forgeries.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DialNav: Multi-Turn Dialog Navigation with a Remote Guide.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ReTAG: Retrieval-Enhanced, Topic-Augmented Graph-Based Global Sensemaking.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Random Conditioning for Diffusion Model Compression with Distillation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Multi-Granularity Video Object Segmentation.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

TrackIME: Enhanced Video Point Tracking via Instance Motion Estimation.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Pseudo-RIS: Distinctive Pseudo-Supervision Generation for Referring Image Segmentation.

Seonghoon Yu

Paul Hongsuck Seo

Jeany Son

Proceedings of the Computer Vision - ECCV 2024, 2024

Learning Correlation Structures for Vision Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation.

CoRR, 2023

IFSeg: Image-free Semantic Segmentation via Vision-Language Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Zero-shot Referring Image Segmentation with Global-Local Context Features.

Seonghoon Yu

Paul Hongsuck Seo

Jeany Son

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR.

Paul Hongsuck Seo

Arsha Nagrani

Cordelia Schmid

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

AVATAR submission to the Ego4D AV Transcription Challenge.

Paul Hongsuck Seo

Arsha Nagrani

Cordelia Schmid

CoRR, 2022

AVATAR: Unconstrained Audiovisual Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Learning Audio-Video Modalities from Image Captions.

Proceedings of the Computer Vision - ECCV 2022, 2022

End-to-end Generative Pretraining for Multimodal Video Captioning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Look Before You Speak: Visually Contextualized Utterances.

Paul Hongsuck Seo

Arsha Nagrani

Cordelia Schmid

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Reinforcing an Image Caption Generator Using Off-Line Human Feedback.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Combinatorial Inference against Label Noise.

Paul Hongsuck Seo

Geeho Kim

Bohyung Han

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences.

Seonguk Seo

Paul Hongsuck Seo

Bohyung Han

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Regularizing Neural Networks via Stochastic Branch Layers.

Proceedings of The 11th Asian Conference on Machine Learning, 2019

2018

Confidence Calibration in Deep Neural Networks through Stochastic Inferences.

Seonguk Seo

Paul Hongsuck Seo

Bohyung Han

CoRR, 2018

CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps.

Proceedings of the Computer Vision - ECCV 2018, 2018

Attentive Semantic Alignment with Offset-Aware Correlation Kernels.

Proceedings of the Computer Vision - ECCV 2018, 2018

Progressive Attention Networks for Visual Attribute Prediction.

Proceedings of the British Machine Vision Conference 2018, 2018

2017

Visual Reference Resolution using Attention Memory for Visual Dialog.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

MarioQA: Answering Questions by Watching Gameplay Videos.

Proceedings of the IEEE International Conference on Computer Vision, 2017

2016

Hierarchical Attention Networks.

CoRR, 2016

Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction.

Hyeonwoo Noh

Paul Hongsuck Seo

Bohyung Han

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Conversational Knowledge Teaching Agent that uses a Knowledge Base.

Proceedings of the SIGDIAL 2015 Conference, 2015

2014

Grammatical error correction based on learner comprehension model in oral conversation.

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

2012

Generating grammar questions using corpus data in L2 learning.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Grammatical Error Annotation for Korean Learners of Spoken English.

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

A Meta Learning Approach to Grammatical Error Correction.

Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

Hongsuck Seo
