Kaitong Cai

According to our database¹, Kaitong Cai authored at least 31 papers between 2025 and 2026.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of three.

Bibliography

2026

Aligning Multimodal Sequential Recommendations via Robust Direct Preference Optimization with Sparse MoE.

CoRR, March, 2026

AgriWorld: A World Tools Protocol Framework for Verifiable Agricultural Reasoning with Code-Executing LLM Agents.

CoRR, February, 2026

Process-of-Thought Reasoning for Videos.

CoRR, February, 2026

Spectral Gating Networks.

CoRR, February, 2026

Why Keep Your Doubts to Yourself? Trading Visual Uncertainties in Multi-Agent Bandit Systems.

CoRR, January, 2026

LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Top-Down Semantic Refinement for Image Captioning.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

3DAlign-DAER: Dynamic Attention Policy and Efficient Retrieval Strategy for Fine-grained 3D-Text Alignment at Scale.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Cost-Effective Communication: An Auction-based Method for Language Agent Interaction.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

RaCoT: Plug-and-Play Contrastive Example Generation Mechanism for Enhanced LLM Reasoning Reliability.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Self-Rewarded Multimodal Coherent Reasoning Across Diverse Visual Domains.

CoRR, December, 2025

CoAgent: Collaborative Planning and Consistency Agent for Coherent Video Generation.

CoRR, December, 2025

RevFFN: Memory-Efficient Full-Parameter Fine-Tuning of Mixture-of-Experts LLMs with Reversible Blocks.

CoRR, December, 2025

FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models.

CoRR, December, 2025

SirenPose: Dynamic Scene Reconstruction via Geometric Supervision.

CoRR, December, 2025

LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction.

CoRR, December, 2025

PTTA: A Pure Text-to-Animation Framework for High-Quality Creation.

CoRR, December, 2025

STORM: Search-Guided Generative World Models for Robotic Manipulation.

CoRR, December, 2025

HybridToken-VLM: Hybrid Token Compression for Vision-Language Models.

CoRR, December, 2025

MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models.

CoRR, December, 2025

Causal Invariance and Counterfactual Learning Driven Cooperative Game for Multi-Label Classification.

CoRR, December, 2025

Guardian: Decoupling Exploration from Safety in Reinforcement Learning.

CoRR, October, 2025

Backward-Friendly Optimization: Training Large Language Models with Approximate Gradients under Memory Constraints.

CoRR, October, 2025

Learning Dynamics of VLM Finetuning.

CoRR, October, 2025

Failure-Driven Workflow Refinement.

CoRR, October, 2025

CF-VLM: CounterFactual Vision-Language Fine-tuning.

CoRR, June, 2025

Kolmogorov-Arnold Fourier Networks.

CoRR, February, 2025

MAT-Agent: Adaptive Multi-Agent Training Optimization.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

CCG: Rare-Label Prediction via Neural SEM-Driven Causal Game.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
