Sayak Paul

Orcid: 0000-0003-0217-0778

According to our database, Sayak Paul authored at least 23 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors.
CoRR, February, 2026

TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models.
CoRR, February, 2026

Margin-Aware Preference Optimization for Aligning Diffusion Models Without Reference.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Factuality Matters: When Image Generation and Editing Meet Structured Visuals.
CoRR, October, 2025

Fine-Grained Perturbation Guidance via Attention Head Selection.
CoRR, June, 2025

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis.
CoRR, May, 2025

DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
FiVL: A Framework for Improved Vision-Language Alignment.
CoRR, 2024

A Noise is Worth Diffusion Guidance.
CoRR, 2024

FastRM: An efficient and automatic explainability framework for multimodal generative models.
CoRR, 2024

PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models.
CoRR, 2024

Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss.
CoRR, 2024

Getting it Right: Improving Spatial Consistency in Text-to-Image Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

2022
Vision Transformers Are Robust Learners.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Flood Segmentation on Sentinel-1 SAR Imagery with Semi-Supervised Learning.
CoRR, 2021

Fast and Accurate Quantized Camera Scene Detection on Smartphones, Mobile AI 2021 Challenge: Report.
CoRR, 2021

2020
A review of deep learning with special emphasis on architectures, applications and recent trends.
Knowl. Based Syst., 2020

G-SimCLR: Self-Supervised Contrastive Learning with Guided Projection via Pseudo Labelling.
Proceedings of the 20th International Conference on Data Mining Workshops, 2020

2019
A Review of Deep Learning with Special Emphasis on Architectures, Applications and Recent Trends.
CoRR, 2019

