Stephen Gould

ORCID: 0000-0001-8929-7899

According to our database, Stephen Gould authored at least 163 papers between 2007 and 2024.

View-coherent correlation consistency for semi-supervised semantic segmentation.
Pattern Recognit., March, 2024

eDiGS: Extended Divergence-Guided Shape Implicit Neural Representation for Unoriented Point Clouds.
World Sci. Annu. Rev. Artif. Intell., 2024

Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder.
Trans. Mach. Learn. Res., 2024

Temporally Grounding Instructional Diagrams in Unconstrained Videos.
CoRR, 2024

Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation.
CoRR, 2024

The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
CoRR, 2024

NeRFEditor: Differentiable Style Decomposition for 3D Scene Editing.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

LipAT: Beyond Style Transfer for Controllable Neural Simulation of Lipstick using Cosmetic Attributes.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Ray Deformation Networks for Novel View Synthesis of Refractive Objects.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

IKEA Ego 3D Dataset: Understanding furniture assembly actions from ego-view 3D Point Clouds.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Bi-directional Training for Composed Image Retrieval via Text Prompt Learning.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

An Empirical Study Into What Matters for Calibrating Vision-Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Neuro-Symbolic Learning of Lifted Action Models from Visual Traces.
Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling, 2024

Bidirectionally self-normalizing neural networks.
Neural Networks, October, 2023

Hamilton transversals in random Latin squares.
Random Struct. Algorithms, March, 2023

Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines.
CoRR, 2023

3D-GPT: Procedural 3D Modeling with Large Language Models.
CoRR, 2023

PMaF: Deep Declarative Layers for Principal Matrix Features.
CoRR, 2023

Towards Understanding Gradient Approximation in Equality Constrained Deep Declarative Networks.
CoRR, 2023

Adaptive Cross Batch Normalization for Metric Learning.
CoRR, 2023

3DInAction: Understanding Human Actions in 3D Point Clouds.
CoRR, 2023

Learning to Select Camera Views: Efficient Multiview Understanding at Few Glances.
CoRR, 2023

Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation.
CoRR, 2023

Revisiting Implicit Differentiation for Learning Problems in Optimal Control.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation.
Proceedings of the International Conference on Machine Learning, 2023

Deep Declarative Dynamic Time Warping for End-to-End Learning of Alignment Paths.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Exploring Predicate Visual Context in Detecting of Human-Object Interactions.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Scaling Data Generation in Vision-and-Language Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Navigational Visual Representations with Semantic Map Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Aligning Step-by-Step Instructional Diagrams to Video Demonstrations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

High-Fidelity Guided Image Synthesis with Latent Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Octree Guided Unoriented Surface Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Fine-Grained Classification via Categorical Memory Networks.
IEEE Trans. Image Process., 2022

Deep Declarative Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Almost all optimally coloured complete graphs contain a rainbow Hamilton path.
J. Comb. Theory B, 2022

Understanding and Improving the Role of Projection Head in Self-Supervised Learning.
CoRR, 2022

NeRFEditor: Differentiable Style Decomposition for Full 3D Scene Editing.
CoRR, 2022

Learning to Structure an Image with Few Colors and Beyond.
CoRR, 2022

Multi-View Correlation Consistency for Semi-Supervised Semantic Segmentation.
CoRR, 2022

Exploiting Problem Structure in Deep Declarative Networks: Two Case Studies.
CoRR, 2022

On the Strong Correlation Between Model Invariance and Generalization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

GoferBot: A Visual Guided Human-Robot Collaborative Assembly System.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DiGS: Divergence guided shape implicit neural representation for unoriented point clouds.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Counting Hamilton cycles in Dirac hypergraphs.
Comb. Probab. Comput., 2021

DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Rethinking conditional GAN training: An approach using geometrically structured latent manifolds.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

What Does Rotation Prediction Tell Us about Classifier Accuracy under Varying Testing Environments?
Proceedings of the 38th International Conference on Machine Learning, 2021

Conditional Generative Modeling via Learning the Latent Space.
Proceedings of the 9th International Conference on Learning Representations, 2021

A Regularized Wasserstein Framework for Graph Kernels.
Proceedings of the IEEE International Conference on Data Mining, 2021

Spatially Conditioned Graphs for Detecting Human-Object Interactions.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Contextually Plausible and Diverse 3D Human Motion Prediction.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VLN BERT: A Recurrent Vision-and-Language BERT for Navigation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Representation Learning on Unit Ball with 3D Roto-translational Equivariance.
Int. J. Comput. Vis., 2020

Semantics for Robotic Mapping, Perception and Interaction: A Survey.
Found. Trends Robotics, 2020

Spatio-attentive Graphs for Human-Object Interaction Detection.
CoRR, 2020

A Recurrent Vision-and-Language BERT for Navigation.
CoRR, 2020

How to train your conditional GAN: An approach using geometrically structured latent manifolds.
CoRR, 2020

DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video.
CoRR, 2020

Bidirectional Self-Normalizing Neural Networks.
CoRR, 2020

A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews.
CoRR, 2020

ArTIST: Autoregressive Trajectory Inpainting and Scoring for Tracking.
CoRR, 2020

Blended Convolution and Synthesis for Efficient Discrimination of 3D Shapes.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Language and Visual Entity Relationship Graph for Agent Navigation.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Spectral-GANs for High-Resolution 3D Point-cloud Generation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

A Signal Propagation Perspective for Pruning Neural Networks at Initialization.
Proceedings of the 8th International Conference on Learning Representations, 2020

Sub-Instruction Aware Vision-and-Language Navigation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Multiview Detection with Feature Perspective Transformation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Solving the Blind Perspective-n-Point Problem End-to-End with Robust Differentiable Geometric Optimization.
Proceedings of the Computer Vision - ECCV 2020, 2020

DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learning to Structure an Image With Few Colors.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Inferring Temporal Compositions of Actions Using Probabilistic Automata.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

A Stochastic Conditioning Scheme for Diverse Human Motion Prediction.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Joint Unsupervised Learning of Optical Flow and Egomotion with Bi-Level optimization.
Proceedings of the 8th International Conference on 3D Vision, 2020

Visual Permutation Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Second-order Temporal Pooling for Action Recognition.
Int. J. Comput. Vis., 2019

Sampling Good Latent Variables via CPP-VAEs: VAEs with Condition Posterior as Prior.
CoRR, 2019

Deep Declarative Networks: A New Hope.
CoRR, 2019

Learning Variations in Human Motion via Mix-and-Match Perturbation.
CoRR, 2019

Learning to Find Common Objects Across Image Collections.
CoRR, 2019

Learning to Find Common Objects Across Few Image Collections.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Incorporating Network Built-in Priors in Weakly-Supervised Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Neural Algebra of Classifiers.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Partially-Supervised Image Captioning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Video Representation Learning Using Discriminative Pooling.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Non-Linear Temporal Subspace Representations for Activity Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Discriminatively Learned Hierarchical Rank Pooling Networks.
Int. J. Comput. Vis., 2017

Human Action Forecasting by Learning Task Grammars.
CoRR, 2017

Action Representation Using Classifier Decision Boundaries.
CoRR, 2017

Incorporating Network Built-in Priors in Weakly-supervised Semantic Segmentation.
CoRR, 2017

Bottom-Up and Top-Down Attention for Image Captioning and VQA.
CoRR, 2017

Higher-Order Pooling of CNN Features via Kernel Linearization for Action Recognition.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Guided Open Vocabulary Image Captioning with Constrained Beam Search.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Human Pose Forecasting via Deep Markov Models.
Proceedings of the 2017 International Conference on Digital Image Computing: Techniques and Applications, 2017

Unsupervised Human Action Detection by Action Matching.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Self-Supervised Video Representation Learning with Odd-One-Out Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

DeepPermNet: Visual Permutation Learning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Generalized Rank Pooling for Activity Recognition.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization.
CoRR, 2016

Extending gene ontology in the context of extracellular RNA and vesicle communication.
J. Biomed. Semant., 2016

Segmentation of developing human embryo in time-lapse microscopy.
Proceedings of the 13th IEEE International Symposium on Biomedical Imaging, 2016

Learning End-to-end Video Classification with Rank-Pooling.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2016, 2016

Deep Convolutional Neural Networks for Human Embryonic Cell Counting.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

SPICE: Semantic Propositional Image Caption Evaluation.
Proceedings of the Computer Vision - ECCV 2016, 2016

Depth Dropout: Efficient Training of Residual Convolutional Neural Networks.
Proceedings of the 2016 International Conference on Digital Image Computing: Techniques and Applications, 2016

Discriminative Hierarchical Rank Pooling for Activity Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Dynamic Image Networks for Action Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Multi-Target Tracking With Time-Varying Clutter Rate and Detection Profile: Application to Time-Lapse Cell Microscopy Sequences.
IEEE Trans. Medical Imaging, 2015

Learning Weighted Lower Linear Envelope Potentials in Binary Markov Random Fields.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

Deep CNN Ensemble with Data Augmentation for Object Detection.
CoRR, 2015

The Angry Birds AI Competition.
AI Mag., 2015

Multi-class Semantic Video Segmentation with Exemplar-Based Object Reasoning.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

A Linear Chain Markov Model for Detection and Localization of Cells in Early Stage Embryo Development.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

Detecting Abnormal Cell Division Patterns in Early Stage Human Embryo Development.
Proceedings of the Machine Learning in Medical Imaging - 6th International Workshop, 2015

Automated monitoring of human embryonic cells up to the 5-cell stage in time-lapse microscopy images.
Proceedings of the 12th IEEE International Symposium on Biomedical Imaging, 2015

Hierarchical Higher-Order Regression Forest Fields: An Application to 3D Indoor Scene Labelling.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

A Unified Graphical Models Framework for Automated Mitosis Detection in Human Embryos.
IEEE Trans. Medical Imaging, 2014

Scene understanding by labeling pixels.
Commun. ACM, 2014

Joint semantic and geometric segmentation of videos with a stage model.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

A unified graphical models framework for automated human embryo tracking in time lapse microscopy.
Proceedings of the IEEE 11th International Symposium on Biomedical Imaging, 2014

Superpixel Graph Label Transfer with Learned Distance Metric.
Proceedings of the Computer Vision - ECCV 2014, 2014

Reflective Features Detection and Hierarchical Reflections Separation in Image Sequences.
Proceedings of the 2014 International Conference on Digital Image Computing: Techniques and Applications, 2014

An Exemplar-Based CRF for Multi-instance Object Segmentation.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Determining Interacting Objects in Human-Centric Activities via Qualitative Spatio-Temporal Reasoning.
Proceedings of the Computer Vision - ACCV 2014, 2014

Reliable Point Correspondences in Scenes Dominated by Highly Reflective and Largely Homogeneous Surfaces.
Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

Discriminative learning with latent variables for cluttered indoor scene understanding.
Commun. ACM, 2013

The Cyber Challenge - a Robotics Project to Enthuse.
Proceedings of the 10th IFAC Symposium on Advances in Control Education, 2013

A framework for generating realistic synthetic sequences of total internal reflection fluorescence microscopy images.
Proceedings of the 10th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2013

A Multiple Model Probability Hypothesis Density Tracker for Time-Lapse Cell Microscopy Sequences.
Proceedings of the Information Processing in Medical Imaging, 2013

Efficient Extraction and Representation of Spatial Information from Video Data.
Proceedings of the IJCAI 2013, 2013

Multi-instance Object Segmentation with Exemplars.
Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

DARWIN: a framework for machine learning and computer vision research and development.
J. Mach. Learn. Res., 2012

Application of the IMM-JPDA Filter to Multiple Target Tracking in Total Internal Reflection Fluorescence Microscopy Images.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2012, 2012

Towards unsupervised semantic segmentation of street scenes from motion cues.
Proceedings of the Image and Vision Computing New Zealand, 2012

On Learning Higher-Order Consistency Potentials for Multi-class Pixel Labeling.
Proceedings of the Computer Vision - ECCV 2012, 2012

PatchMatchGraph: Building a Graph of Dense Patch Correspondences for Label Transfer.
Proceedings of the Computer Vision - ECCV 2012, 2012

Multiclass pixel labeling with non-local matching constraints.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

A Noise Tolerant Watershed Transformation with Viscous Force for Seeded Image Segmentation.
Proceedings of the Computer Vision - ACCV 2012, 2012

Max-margin Learning for Lower Linear Envelope Potentials in Binary Markov Random Fields.
Proceedings of the 28th International Conference on Machine Learning, 2011

Simultaneous Multi-class Pixel Labeling over Coherent Image Sets.
Proceedings of the 2011 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2011

Accelerated dual decomposition for MAP inference.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

A Unified Contour-Pixel Model for Figure-Ground Segmentation.
Proceedings of the Computer Vision - ECCV 2010, 2010

Single image depth estimation from predicted semantic labels.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Region-based Segmentation and Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

High-accuracy 3D sensing for mobile manipulation: Improving object detection and door opening.
Proceedings of the 2009 IEEE International Conference on Robotics and Automation, 2009

Decomposing a scene into geometric and semantically consistent regions.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Alphabet SOUP: A framework for approximate energy minimization.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Multi-Class Segmentation with Relative Location Prior.
Int. J. Comput. Vis., 2008

Projected Subgradient Methods for Learning Sparse Gaussians.
Proceedings of the UAI 2008, 2008

Cascaded Classification Models: Combining Models for Holistic Scene Understanding.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Learning Bounded Treewidth Bayesian Networks.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video.
Proceedings of the IJCAI 2007, 2007
