Peter Vajda

ORCID: 0000-0001-9046-480X

According to our database, Peter Vajda authored at least 85 papers between 2006 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models.

CoRR, April, 2025

MoCha: Towards Movie-Grade Talking Character Synthesis.

CoRR, March, 2025

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation.

Philippe Hansen-Estruch

CoRR, January, 2025

Text-to-Image Generation Post-Training with Pixel-Space Loss.

Proceedings of the 3rd International Workshop on Rich Media With Generative AI, 2025

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation.

Philippe Hansen-Estruch

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Observation and Local Prediction of the Vertical Gravity Gradient: Review Paper.

IEEE Instrum. Meas. Mag., September, 2024

An Investigation on Hardware-Aware Vision Transformer Scaling.

ACM Trans. Embed. Comput. Syst., May, 2024

GROWTH-23: An integrated code for inversion of complete Bouguer gravity anomaly or temporal gravity changes.

Antonio G. Camacho

Peter Vajda

José Fernández

Comput. Geosci., January, 2024

DirectorLLM for Human-Centric Video Generation.

CoRR, 2024

Pixel-Space Post-Training of Latent Diffusion Models.

CoRR, 2024

Imagine yourself: Tuning-Free Personalized Image Generation.

CoRR, 2024

Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation.

CoRR, 2024

Animated Stickers: Bringing Stickers to Life with Video Diffusion.

CoRR, 2024

AVID: Any-Length Video Inpainting with Diffusion Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Cache Me if You Can: Accelerating Diffusion Models through Block Caching.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ControlRoom3D: Room Generation Using Semantic Proxy Rooms.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MixRT: Mixed Neural Representations For Real-Time NeRF Rendering.

Proceedings of the International Conference on 3D Vision, 2024

2023

Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack.

CoRR, 2023

Pruning Compact ConvNets for Efficient Inference.

CoRR, 2023

XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse.

Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

A Practical Stereo Depth System for Smart Glasses.

Jialiang Wang

Daniel Scharstein

Akash Bapat

Kevin Blackburn-Matzen

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference.

CoRR, 2022

3D-Aware Encoding for Style-based Neural Radiance Fields.

CoRR, 2022

Data Efficient Language-Supervised Zero-Shot Recognition with Optimal Transport Distillation.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models.

Proceedings of the Computer Vision - ECCV 2022, 2022

Open-Set Semi-Supervised Object Detection.

Proceedings of the Computer Vision - ECCV 2022, 2022

INGeo: Accelerating Instant Neural Scene Reconstruction with Noisy Geometry Priors.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Cross-Domain Adaptive Teacher for Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Data Efficient Language-supervised Zero-shot Recognition with Optimal Transport Distillation.

CoRR, 2021

Cross-Domain Object Detection via Adaptive Self-Training.

CoRR, 2021

FBNetV5: Neural Architecture Search for Multiple Tasks in One Run.

CoRR, 2021

Image2Point: 3D Point-Cloud Understanding with Pretrained 2D ConvNets.

CoRR, 2021

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Unbiased Teacher for Semi-Supervised Object Detection.

Proceedings of the 9th International Conference on Learning Representations, 2021

Visual Transformers: Where Do Transformers Really Belong in Vision Models?

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Rethinking the Self-Attention in Vision Transformers.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Tackling the Ill-Posedness of Super-Resolution Through Adaptive Target Generation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

FBNetV3: Joint Architecture-Recipe Search Using Predictor Pretraining.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Data-Efficient Language-Supervised Zero-Shot Learning With Self-Distillation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

2020

One shot 3D photography.

ACM Trans. Graph., 2020

FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge.

CoRR, 2020

Visual Transformers: Token-based Image Representation and Processing for Computer Vision.

CoRR, 2020

FBNetV3: Joint Architecture-Recipe Search using Neural Acquisition Function.

CoRR, 2020

Learning the Loss Functions in a Discriminative Space for Video Restoration.

CoRR, 2020

SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation.

Proceedings of the Computer Vision - ECCV 2020, 2020

Learning to Generate Grounded Visual Captions Without Localization Supervision.

Proceedings of the Computer Vision - ECCV 2020, 2020

Deep Space-Time Video Upsampling Networks.

Proceedings of the Computer Vision - ECCV 2020, 2020

Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild.

Proceedings of the Computer Vision - ECCV 2020, 2020

FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Learning to Generate Grounded Image Captions without Localization Supervision.

CoRR, 2019

Efficient Segmentation: Learning Downsampling Near Semantic Boundaries.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Machine Learning at Facebook: Understanding Inference at the Edge.

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Precision Highway for Ultra Low-Precision Quantization.

CoRR, 2018

Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search.

CoRR, 2018

Value-Aware Quantization for Training and Inference of Neural Networks.

Eunhyeok Park

Sungjoo Yoo

Peter Vajda

Proceedings of the Computer Vision - ECCV 2018, 2018

2017

DSD: Dense-Sparse-Dense Training for Deep Neural Networks.

Proceedings of the 5th International Conference on Learning Representations, 2017

2014

Real-time query-by-image video search system.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

2013

Geotag Propagation with User Trust Modeling.

Proceedings of the Social Media Retrieval, 2013

Comparative Study of Trust Modeling for Automatic Landmark Tagging.

IEEE Trans. Inf. Forensics Secur., 2013

EigenNews: a personalized news video delivery platform.

Proceedings of the ACM Multimedia Conference, 2013

Eigennews: Generating and delivering personalized news video.

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

Analysis of visual similarity in news videos with robust and memory-efficient image retrieval.

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

2012

In Tags We Trust: Trust modeling in social tagging of multimedia content.

IEEE Signal Process. Mag., 2012

Geotag propagation in social networks based on user trust model.

Multim. Tools Appl., 2012

2011

Object Duplicate Detection.

Péter Vajda

PhD thesis, 2011

Epitomize Your Photos.

Int. J. Comput. Games Technol., 2011

Social game epitome versus automatic visual analysis.

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

2010

Robust Duplicate Detection of 2D and 3D Objects.

Int. J. Multim. Data Eng. Manag., 2010

3D object duplicate detection for video retrieval.

Proceedings of the 11th International Workshop on Image Analysis for Multimedia Interactive Services, 2010

Object-based tag propagation for semi-automatic annotation of images.

Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

2009

Graph-based approach for 3D object duplicate detection.

Proceedings of the 10th Workshop on Image Analysis for Multimedia Interactive Services, 2009

Analysis of the Limits of Graph-Based Object Duplicate Detection.

Peter Vajda

Lutz Goldmann

Touradj Ebrahimi

Proceedings of the 11th IEEE International Symposium on Multimedia, 2009

2008

Parameter Control Methods for Selection Operators in Genetic Algorithms.

Péter Vajda

Ágoston E. Eiben

Wiebe Hordijk

Proceedings of the Parallel Problem Solving from Nature, 2008

Towards Fully Automatic Image Segmentation Evaluation.

Proceedings of the Advanced Concepts for Intelligent Vision Systems, 2008

2007

Hungarian WordNet and representation of verbal event structure.

Acta Cybern., 2007

2006

Morphdb.hu: Hungarian lexical database and morphological grammar.

Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006
