Kaitao Song

ORCID: 0000-0002-4046-8594

According to our database, Kaitao Song authored at least 47 papers between 2018 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

Learning Domain Invariant Prompt for Vision-Language Models.

IEEE Trans. Image Process., 2024

EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms.

CoRR, 2024

TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models.

CoRR, 2024

Can Graph Learning Improve Task Planning?

CoRR, 2024

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.

CoRR, 2024

EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model.

CoRR, 2024

EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction.

CoRR, 2024

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

PromptTTS 2: Describing and Generating Voices with Text Prompt.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Improving Large Language Models in Event Relation Logical Prediction.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

TaskBench: Benchmarking Large Language Models for Task Automation.

CoRR, 2023

Learning To Teach Large Language Models Logical Reasoning.

CoRR, 2023

PromptTTS 2: Describing and Generating Voices with Text Prompt.

CoRR, 2023

Deliberate then Generate: Enhanced Prompting Framework for Text Generation.

CoRR, 2023

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace.

CoRR, 2023

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

End-to-End Word-Level Pronunciation Assessment with MASK Pre-training.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

CircuitNet: A Generic Neural Network to Realize Universal Circuit Motif Modeling.

Proceedings of the International Conference on Machine Learning, 2023

A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Leveraging Pretrained Representations With Task-Related Keywords for Alzheimer's Disease Detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Towards Understanding Omission in Dialogue Summarization.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

DiffusionNER: Boundary Diffusion for Named Entity Recognition.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

PVT v2: Improved baselines with Pyramid Vision Transformer.

Comput. Vis. Media, 2022

Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One.

CoRR, 2022

Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Analyzing and Mitigating Interference in Neural Architecture Search.

Proceedings of the International Conference on Machine Learning, 2022

A Study on the Efficacy of Model Pre-Training In Developing Neural Text-to-Speech System.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Coarse-to-fine: A dual-view attention network for click-through rate prediction.

Knowl. Based Syst., 2021

PVTv2: Improved Baselines with Pyramid Vision Transformer.

CoRR, 2021

MPN: Multi-scale Progressive Restoration Network for Unsupervised Defect Detection.

Proceedings of the Pattern Recognition and Computer Vision - 4th Chinese Conference, 2021

NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search.

Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling.

Wei-Qiang Zhang

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Bi-Modal Progressive Mask Attention for Fine-Grained Recognition.

IEEE Trans. Image Process., 2020

LightPAFF: A Two-Stage Distillation Framework for Pre-training and Fine-tuning.

CoRR, 2020

MPNet: Masked and Permuted Pre-training for Language Understanding.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Neural Machine Translation with Error Correction.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

2019

MASS: Masked Sequence to Sequence Pre-training for Language Generation.

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Hybrid Self-Attention Network for Machine Translation.

CoRR, 2018

Generating Adversarial Examples With Conditional Generative Adversarial Net.

Proceedings of the 24th International Conference on Pattern Recognition, 2018

Double Path Networks for Sequence to Sequence Learning.

Proceedings of the 27th International Conference on Computational Linguistics, 2018
