Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study.

Menglong Cui

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

MobileSteward: Integrating Multiple App-Oriented Agents with Self-Evolution to Automate Cross-App Instructions.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

X-ARES: A Comprehensive Framework for Assessing Audio Encoder Performance.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Text-Enhanced Audio Encoder for Large Language Model based Speech Recognition via Cross-Modality Pre-training with Unpaired Audio-Text Data.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

StarVC: A Unified Auto-Regressive Framework for Joint Text and Speech Generation in Voice Conversion.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

GLCLAP: A Novel Contrastive Learning Pre-trained Model for Contextual Biasing in ASR.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Q-Frame: Query-Aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models.

Proceedings of the 2025 IEEE International Conference on Knowledge Graph (ICKG), 2025

LLaVA-SG: Leveraging Scene Graphs as Visual Semantic Expression in Vision-Language Models.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Let Your Car Listen to Your Respiration Contactlessly with Ubiquitous Acoustic Signals.

Proceedings of the Companion of the 2025 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2025

BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

SPO: Self Preference Optimization with Self Regularization.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

MAKAR: a Multi-Agent framework based Knowledge-Augmented Reasoning for Grounded Multimodal Named Entity Recognition.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

DiffRhythm+: Controllable and Flexible Full-Length Song Generation with Preference Optimization.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Weaving Context Across Images: Improving Vision-Language Models through Focus-Centric Visual Chains.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Browsing Like Human: A Multimodal Web Agent with Experiential Fast-and-Slow Thinking.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Global Eye: Breaking the "Fixed Thinking Pattern" during the Instruction Expansion Process.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Demystifying Small Language Models for Edge Deployment.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security.

CoRR, 2024

MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Mixture of Diverse Size Experts.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM.

Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

ToolRerank: Adaptive and Hierarchy-Aware Reranking for Tool Retrieval.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

A Comprehensive Evaluation of Quantization Strategies for Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

CBSiMT: Mitigating Hallucination in Simultaneous Machine Translation with Weighted Prefix-to-Prefix Training.

CoRR, 2023

From Indeterminacy to Determinacy: Augmenting Logical Reasoning Capabilities with Large Language Models.

CoRR, 2023

CMATH: Can Your Language Model Pass Chinese Elementary School Math Test?

CoRR, 2023

UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning.

CoRR, 2023

Overview of the NLPCC 2023 Shared Task 9: User Feedback Prediction and Response Generation.

Proceedings of the Natural Language Processing and Chinese Computing, 2023

The Xiaomi AI Lab's Speech Translation Systems for IWSLT 2023 Offline Task, Simultaneous Task and Speech-to-Speech Task.

Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Improving Bilingual TTS Using Language And Phonology Embedding With Embedding Strength Modulator.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

LightClone: Speaker-guided Parallel Subnet Selection for Few-shot Voice Cloning.

Jie Wu

Jian Luan

Yujun Wang

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Rethinking the Reasonability of the Test Set for Simultaneous Machine Translation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Joint Training and Decoding for Multilingual End-to-End Simultaneous Speech Translation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Exploring All-In-One Knowledge Distillation Framework for Neural Machine Translation.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Exploring Better Text Image Translation with Multimodal Codebook.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

BERT-ERC: Fine-Tuning BERT Is Enough for Emotion Recognition in Conversation.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Improve Bilingual TTS Using Dynamic Language and Phonology Embedding.

Fengyu Yang

Jian Luan

Yujun Wang

CoRR, 2022

J-TranPSP: A Joint Transition-based Model for Prosodic Structure Prediction, Word Segmentation and PoS Tagging.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation.

Fengyu Yang

Jian Luan

Yujun Wang

Proceedings of the IEEE International Conference on Acoustics, 2022

MSDTRON: A High-Capability Multi-Speaker Speech Synthesis System for Diverse Data Using Characteristic Information.

Proceedings of the IEEE International Conference on Acoustics, 2022

PAMA-TTS: Progression-Aware Monotonic Attention for Stable SEQ2SEQ TTS with Accurate Phoneme Duration Control.

Yunchao He

Jian Luan

Yujun Wang

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Effective and Differentiated Use of Control Information for Multi-speaker Speech Synthesis.

CoRR, 2021

Noise Robust Singing Voice Synthesis Using Gaussian Mixture Variational Autoencoder.

Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021

2020

HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis.

CoRR, 2020

PPSpeech: Phrase based Parallel End-to-End TTS System.

Yahuan Cong

Ran Zhang

Jian Luan

CoRR, 2020

DeepSinger: Singing Voice Synthesis with Data Mined From the Web.

Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Adversarially Trained Multi-Singer Sequence-to-Sequence Singing Synthesizer.

Jie Wu

Jian Luan

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transfer Learning for Improving Singing-Voice Detection in Polyphonic Instrumental Music.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Vocal Pitch Extraction in Polyphonic Music Using Convolutional Residual Network.

Mingye Dong

Jie Wu

Jian Luan

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2012

Expand CRF to Model Long Distance Dependencies in Prosodic Break Prediction.

Jian Luan

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2010

Improvement on plural unit selection and fusion.

Jian Luan

Jian Li

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009

The Toshiba Mandarin TTS System for the Blizzard Challenge 2009.

Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009

2008

The Toshiba Mandarin TTS System for the Blizzard Challenge 2008.

Proceedings of the Blizzard Challenge 2008, 2008

2007

Codebook-Based Pseudo-Impostor Data Generation and Template Compression for Text-Dependent Speaker Verification.

IEICE Trans. Inf. Syst., 2007

2006

Template Compression and Distance Normalization for Reliable Text-dependent Speaker Verification.

Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006

Frame-level Nonlinearity for Robust DTW-based Speaker Verification.

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Jian Luan
