Animesh Sinha

According to our database, Animesh Sinha authored at least 20 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
ViTok-v2: Scaling Native Resolution Auto-Encoders to 5 Billion Parameters.
CoRR, May, 2026

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling.
CoRR, February, 2026

Non-Markov Multi-Round Conversational Image Generation with History-Conditioned MLLMs.
CoRR, January, 2026

Conversational Image Generation: Towards Multi-Round Personalized Generation with Multi-Modal Language Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2026

2025
MoCha: Towards Movie-Grade Talking Character Synthesis.
CoRR, March, 2025

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Deep reinforcement learning in chemistry: A review.
J. Comput. Chem., 2024

DirectorLLM for Human-Centric Video Generation.
CoRR, 2024

Imagine yourself: Tuning-Free Personalized Image Generation.
CoRR, 2024

Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression.
Proceedings of the Computer Vision - ECCV 2024, 2024

Context Diffusion: In-Context Aware Image Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

GenTron: Diffusion Transformers for Image and Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
qLEET: visualizing loss landscapes, expressibility, entangling power and training trajectories for parameterized quantum circuits.
Quantum Inf. Process., 2023

Gen2Det: Generate to Detect.
CoRR, 2023

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation.
CoRR, 2023

2022
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval.
CoRR, 2022

CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Qubit Routing Using Graph Neural Network Aided Monte Carlo Tree Search.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Large-Scale Attribute-Object Compositions.
CoRR, 2021

