Lu Lu

Orcid: 0000-0002-5476-5768

Affiliations:

Bytedance Inc., ByteDance AI Lab, Speech and Audio Team

According to our database, Lu Lu authored at least 40 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

End-to-end Listen, Look, Speak and Act.

CoRR, October, 2025

Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice.

CoRR, July, 2025

SALMONN-omni: A Standalone Speech LLM without Codec Injection for Full-duplex Conversation.

CoRR, May, 2025

Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context.

CoRR, March, 2025

DECC: Delay-Aware Edge-Cloud Collaboration for Accelerating DNN Inference.

IEEE Trans. Emerg. Top. Comput., 2025

Spy Inside: Scalable Verification of Dependable Transformers for Event Time Series Systems.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Slice Sandwich: Jagged Slicing Multi-Tier Dynamic Resources for Diversified V2X Services.

IEEE Trans. Mob. Comput., May, 2024

SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation.

CoRR, 2024

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation.

CoRR, 2024

NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training.

CoRR, 2024

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition.

CoRR, 2024

A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR.

CoRR, 2024

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models.

CoRR, 2024

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Can Large Language Models Understand Spatial Audio?

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Challenges in Training PINNs: A Loss Landscape Perspective.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SALMONN: Towards Generic Hearing Abilities for Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

PolyVoice: Language Models for Speech to Speech Translation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Connecting Speech Encoder and Large Language Model for ASR.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Extending Large Language Models for Speech and Audio Captioning.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Extending Multilingual ASR to New Languages Using Supplementary Encoder and Decoder Components.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Multi-SP Network Slicing Parallel Relieving Edge Network Conflict.

IEEE Trans. Parallel Distributed Syst., November, 2023

Learning-Based Real-Time Transmission Control for Multi-Path TCP Networks.

IEEE Trans. Cogn. Commun. Netw., October, 2023

Network Meets ChatGPT: Intent Autonomous Management, Control and Operation.

J. Commun. Inf. Networks, September, 2023

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models.

CoRR, 2023

Language-specific Acoustic Boundary Learning for Mandarin-English Code-switching Speech Recognition.

CoRR, 2023

PolyVoice: Language Models for Speech to Speech Translation.

CoRR, 2023

Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System.

CoRR, 2023

Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions and Prospects.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Language-specific Boundary Learning for Improving Mandarin-English Code-switching Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Knowledge Distillation Approach for Efficient Internal Language Model Estimation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AudioQR: Deep Neural Audio Watermarks For QR Code.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Improving Large-Scale Deep Biasing With Phoneme Features and Text-Only Data in Streaming Transducer.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
