Yu Wang

IEEE ACM Trans. Audio Speech Lang. Process., 2024

DialogMCF: Multimodal Context Flow for Audio Visual Scene-Aware Dialog.

Zhe Chen

Hongcheng Liu

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Navigating Chemical-Linguistic Sharing Space with Heterogeneous Molecular Encoding.

CoRR, 2024

Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal.

CoRR, 2024

AuscultaBase: A Foundational Step Towards AI-Powered Body Sound Diagnostics.

CoRR, 2024

Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models.

CoRR, 2024

Multi-granularity Score-based Generative Framework Enables Efficient Inverse Design of Complex Organics.

CoRR, 2024

Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm.

CoRR, 2024

Decoding Linguistic Representations of Human Brain.

CoRR, 2024

Reconstruct the Pruned Model without Any Retraining.

CoRR, 2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation.

CoRR, 2024

SubGDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning.

CoRR, 2024

MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts.

CoRR, 2024

M3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.

CoRR, 2024

Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator.

CoRR, 2024

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview.

Heyang Liu

CoRR, 2024

M2K-VDG: Model-Adaptive Multimodal Knowledge Anchor Enhanced Video-grounded Dialogue Generation.

CoRR, 2024

Peer-review-in-LLMs: Automatic Evaluation Method for LLMs in Open-environment.

CoRR, 2024

Annotation-free Audio-Visual Segmentation.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

SubgDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

TAIA: Large Language Models are Out-of-Distribution Data Learners.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Protein-Ligand Interaction Prior for Binding-aware 3D Molecule Diffusion Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MSG-BART: Multi-Granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-Grounded Dialogue Generation.

Proceedings of the IEEE International Conference on Acoustics, 2024

RA2FD: Distilling Faithfulness into Efficient Dialogue Systems.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

HSDreport: Heart Sound Diagnosis with Echocardiography Reports.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

ParCo: Part-Coordinating Text-to-Motion Synthesis.

Proceedings of the Computer Vision - ECCV 2024, 2024

Pruning before Fine-tuning: A Retraining-free Compression Framework for Pre-trained Language Models.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

CE-VDG: Counterfactual Entropy-based Bias Reduction for Video-grounded Dialogue Generation.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

GP-nano: a geometric graph network for nanobody polyreactivity prediction.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2024

SDA: Semantic Discrepancy Alignment for Text-conditioned Image Retrieval.

Yuchen Yang

Proceedings of the Findings of the Association for Computational Linguistics, 2024

MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

DictLLM: Harnessing Key-Value Data Structures with Large Language Models for Enhanced Medical Diagnostics.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Binding-Adaptive Diffusion Models for Structure-Based Drug Design.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Running ahead of evolution - AI-based simulation for predicting future high-risk SARS-CoV-2 variants.

Int. J. High Perform. Comput. Appl., November, 2023

Self-Supervised Masking for Unsupervised Anomaly Detection and Localization.

IEEE Trans. Multim., 2023

Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning.

CoRR, 2023

An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models.

CoRR, 2023

LibriSQA: Advancing Free-form and Open-ended Spoken Question Answering with a Novel Dataset and Framework.

CoRR, 2023

Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation.

CoRR, 2023

SelfEvolve: A Code Evolution Framework via Large Language Models.

Shuyang Jiang

Yuhao Wang

CoRR, 2023

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery.

CoRR, 2023

Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition.

Zihan Zhao

CoRR, 2023

Discover and Align Taxonomic Context Priors for Open-world Semi-Supervised Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Contrastive Learning Based ASR Robust Knowledge Selection For Spoken Dialogue System.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Self-Improvement of Non-autoregressive Model via Sequence-Level Distillation.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Fuzzy Positive Learning for Semi-Supervised Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Enhanced Multimodal Representation Learning with Cross-modal KD.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

PCR: Pessimistic Consistency Regularization for Semi-Supervised Segmentation.

CoRR, 2022

Alignment-free metal ion-binding site prediction from protein sequence through pretrained language model and multi-task learning.

Briefings Bioinform., 2022

Unsupervised Ensemble Distillation for Multi-Organ Segmentation.

Proceedings of the 19th IEEE International Symposium on Biomedical Imaging, 2022

Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition.

Zihan Zhao

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

LAR-SR: A Local Autoregressive Model for Image Super-Resolution.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Efficient Use of End-to-End Data in Spoken Language Processing.

Yiting Lu

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Spoken Language 'Grammatical Error Correction'.

Yiting Lu

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Non-Native Children's Automatic Speech Recognition: The INTERSPEECH 2020 Shared Task ALTA Systems.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

General Sequence Teacher-Student Learning.

Mark John Francis Gales

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Non-native Speaker Verification for Spoken Language Assessment.

Linlin Wang

CoRR, 2019

Disfluency Detection for Spoken Learner English.

Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Impact of ASR Performance on Spoken Grammatical Error Detection.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Non-Intrusive POLQA Estimation of Speech Quality using Recurrent Neural Networks.

Proceedings of the 27th European Signal Processing Conference, 2019

Learning Between Different Teacher and Student Models in ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Model-Based Speech Enhancement in the Modulation Domain.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Towards automatic assessment of spontaneous spoken English.

Konstantinos Kyriakopoulos

Andrey Malinin

Rogier C. van Dalen

M. Rashid

Speech Commun., 2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks.

CoRR, 2018

Sequence Teacher-Student Training of Acoustic Models for Automatic Free Speaking Language Assessment.

Anton Ragni

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Impact of ASR Performance on Free Speaking Language Assessment.

Konstantinos Kyriakopoulos

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Future Word Contexts in Neural Network Language Models.

CoRR, 2017

An attention based model for off-topic spontaneous spoken response detection: An Initial Study.

Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Use of Graphemic Lexicons for Spoken Language Assessment.

Konstantinos Kyriakopoulos

Anton Ragni

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

A data-driven non-intrusive measure of speech quality and intelligibility.

Speech Commun., 2016

Speech enhancement using an MMSE spectral amplitude estimator based on a modulation domain Kalman filter with a Gamma prior.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Off-topic Response Detection for Spontaneous Spoken English Assessment.

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2014

Speech enhancement using a modulation domain Kalman filter post-processor with a Gaussian Mixture noise model.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Speech enhancement using a robust Kalman filter post-processor in the modulation domain.

Proceedings of the IEEE International Conference on Acoustics, 2013

A subspace method for speech enhancement in the modulation domain.
