Zhiyong Wang

ORCID: 0000-0002-8043-0312

Affiliations:
  • University of Sydney, School of Information Technologies, NSW, Australia
  • Hong Kong Polytechnic University, Hong Kong (PhD)


According to our database, Zhiyong Wang authored at least 175 papers between 1999 and 2024.

Bibliography

2024
TLDW: Extreme Multimodal Summarization of News Videos.
IEEE Trans. Circuits Syst. Video Technol., March, 2024

Siamese Biattention Pooling Network for Change Detection in Remote Sensing.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2024

SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Autoregressive Omni-Aware Outpainting for Open-Vocabulary 360-Degree Image Generation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Federated Unsupervised Cluster-Contrastive learning for person Re-identification: A coarse-to-fine approach.
Comput. Vis. Image Underst., December, 2023

Cascade Multi-Level Transformer Network for Surgical Workflow Analysis.
IEEE Trans. Medical Imaging, October, 2023

Multi-Level Adversarial Spatio-Temporal Learning for Footstep Pressure Based FoG Detection.
IEEE J. Biomed. Health Informatics, August, 2023

A Sparse Framework for Robust Possibilistic K-Subspace Clustering.
IEEE Trans. Fuzzy Syst., April, 2023

Graph Fusion Network-Based Multimodal Learning for Freezing of Gait Detection.
IEEE Trans. Neural Networks Learn. Syst., March, 2023

Region Assisted Sketch Colorization.
IEEE Trans. Image Process., 2023

Part to Whole: Collaborative Prompting for Surgical Instrument Segmentation.
CoRR, 2023

HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding.
CoRR, 2023

Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance.
CoRR, 2023

Bridging the Gap: Fine-to-Coarse Sketch Interpolation Network for High-Quality Animation Sketch Inbetweening.
CoRR, 2023

Robust Audio Anti-Spoofing with Fusion-Reconstruction Learning on Multi-Order Spectrograms.
CoRR, 2023

Robust Knowledge Adaptation for Federated Unsupervised Person ReID.
CoRR, 2023

Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning.
Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing, 2023

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

TopicCAT: Unsupervised Topic-Guided Co-Attention Transformer for Extreme Multimodal Summarisation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Exploring Coarse-to-Fine Action Token Localization and Interaction for Fine-grained Video Action Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Embedding the Self-Organisation of Deep Feature Maps in the Hamburger Framework can Yield Better and Interpretable Results.
Proceedings of the International Joint Conference on Neural Networks, 2023

Online Visual SLAM Adaptation against Catastrophic Forgetting with Cycle-Consistent Contrastive Learning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Material-Aware Self-Supervised Network for Dynamic 3D Garment Simulation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Self-Supervised Multimodal Fusion Network for Knee Osteoarthritis Severity Grading.
Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, 2023

Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Multi-Scale Control Signal-Aware Transformer for Motion Synthesis without Phase.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Adversarial Evolving Neural Network for Longitudinal Knee Osteoarthritis Prediction.
IEEE Trans. Medical Imaging, 2022

Action Recognition With Motion Diversification and Dynamic Selection.
IEEE Trans. Image Process., 2022

Graph Convolutional Dictionary Selection With L₂,ₚ Norm for Video Summarization.
IEEE Trans. Image Process., 2022

Multi-scale Features Fusion for the Detection of Tiny Bleeding in Wireless Capsule Endoscopy Images.
ACM Trans. Internet Things, 2022

Vision-Enhanced and Consensus-Aware Transformer for Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., 2022

Affective Audio Annotation of Public Speeches with Convolutional Clustering Neural Network.
IEEE Trans. Affect. Comput., 2022

Enhanced Local and Global Learning for Rotation-Invariant Point Cloud Representation.
IEEE Multim., 2022

TLDW: Extreme Multimodal Summarisation of News Videos.
CoRR, 2022

Sign Language Translation with Hierarchical Spatio-Temporal Graph Neural Network.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

OTExtSum: Extractive Text Summarisation with Optimal Transport.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Skin Lesion Recognition with Class-Hierarchy Regularized Hyperbolic Embeddings.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

Deep Laparoscopic Stereo Matching with Transformers.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

Multi-Scale Attention based Transformer U-NET for Change Detection.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2022

Confidence-Calibrated Face Image Forgery Detection with Contrastive Representation Distillation.
Proceedings of the Computer Vision - ACCV 2022, 2022

2021
Joint Input and Output Space Learning for Multi-Label Image Classification.
IEEE Trans. Multim., 2021

Patch Based Video Summarization With Block Sparse Representation.
IEEE Trans. Multim., 2021

Short-Term Lesion Change Detection for Melanoma Screening With Novel Siamese Neural Network.
IEEE Trans. Medical Imaging, 2021

Keyframe Extraction From Laparoscopic Videos via Diverse and Weighted Dictionary Selection.
IEEE J. Biomed. Health Informatics, 2021

Similarity Based Block Sparse Subset Selection for Video Summarization.
IEEE Trans. Circuits Syst. Video Technol., 2021

ERINet: Enhanced rotation-invariant network for point cloud classification.
Pattern Recognit. Lett., 2021

Coupling matrix manifolds assisted optimization for optimal transport problems.
Mach. Learn., 2021

Deep 3D reconstruction: methods, data, and challenges.
Frontiers Inf. Technol. Electron. Eng., 2021

Sign Language Translation with Hierarchical Spatio-Temporal Graph Neural Network.
CoRR, 2021

Deep Learning Techniques for In-Crop Weed Identification: A Review.
CoRR, 2021

Single-channel EEG based insomnia detection with domain adaptation.
Comput. Biol. Medicine, 2021

A Multi-task Kernel Learning Algorithm for Survival Analysis.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2021

Keyframe Extraction from Motion Capture Sequences with Graph based Deep Reinforcement Learning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Topic-Guided Local-Global Graph Neural Network For Image Captioning.
Proceedings of the 2021 IEEE International Conference on Multimedia & Expo Workshops, 2021

Learning Efficient Rotation Representation for Point Cloud via Local-Global Aggregation.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Attention-based Long-term Modeling for Deep Visual Odometry.
Proceedings of the 2021 Digital Image Computing: Techniques and Applications, 2021

2020
Non-Contact Sleep Stage Detection Using Canonical Correlation Analysis of Respiratory Sound.
IEEE J. Biomed. Health Informatics, 2020

A Residual Based Attention Model for EEG Based Sleep Staging.
IEEE J. Biomed. Health Informatics, 2020

Vision-Based Freezing of Gait Detection With Anatomic Directed Graph Representation.
IEEE J. Biomed. Health Informatics, 2020

Graph Sequence Recurrent Neural Network for Vision-Based Freezing of Gait Detection.
IEEE Trans. Image Process., 2020

Real-time hand posture recognition using hand geometric features and Fisher Vector.
Signal Process. Image Commun., 2020

Learning visual relationship and context-aware attention for image captioning.
Pattern Recognit., 2020

Video summarization via block sparse dictionary selection.
Neurocomputing, 2020

Graph weeds net: A graph-based deep learning method for weed recognition.
Comput. Electron. Agric., 2020

3D Hand Pose Estimation with Disentangled Cross-Modal Latent Space.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

FCP Filter: A Dynamic Clustering-Prediction Framework for Customer Behavior.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2020

Speaker-Aware Monaural Speech Separation.
Proceedings of the Interspeech 2020, 2020

Correlation-Aware Next Basket Recommendation Using Graph Attention Networks.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Efficient Brain Tumor Segmentation with Dilated Multi-fiber Network and Weighted Bi-directional Feature Pyramid Network.
Proceedings of the Digital Image Computing: Techniques and Applications, 2020

Multitask Learning for Video-based Surgical Skill Assessment.
Proceedings of the Digital Image Computing: Techniques and Applications, 2020

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Feature covariance matrix-based dynamic hand gesture recognition.
Neural Comput. Appl., 2019

Robust video summarization using collaborative representation of adjacent frames.
Multim. Tools Appl., 2019

3D human pose estimation from range images with depth difference and geodesic distance.
J. Vis. Commun. Image Represent., 2019

A study on multi-kernel intuitionistic fuzzy C-means clustering with multiple attributes.
Neurocomputing, 2019

Coupling Matrix Manifolds and Their Applications in Optimal Transport.
CoRR, 2019

Regularized Fuzzy Discriminant Analysis for Hyperspectral Image Classification With Noisy Labels.
IEEE Access, 2019

Learning Shared and Cluster-Specific Dictionaries for Single Image Super-Resolution.
IEEE Access, 2019

Morphological Filtering Enhanced Empirical Wavelet Transform for Mode Decomposition.
IEEE Access, 2019

Video Summarization via Nonlinear Sparse Dictionary Selection.
IEEE Access, 2019

IntersectGAN: Learning Domain Intersection for Generating Images with Multiple Attributes.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Stacked Memory Network for Video Summarization.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

2018
Real-Time Long-Term Tracking With Prediction-Detection-Correction.
IEEE Trans. Multim., 2018

Hyperspectral Image Classification With Global-Local Discriminant Analysis and Spatial-Spectral Context.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2018

Exploiting spatial-temporal context for trajectory based action video retrieval.
Multim. Tools Appl., 2018

Video Summarization via Weighted Neighborhood Based Representation.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Generative Adversarial Network (GAN) Based Data Augmentation for Palmprint Recognition.
Proceedings of the 2018 Digital Image Computing: Techniques and Applications, 2018

Convolutional 3D Attention Network for Video Based Freezing of Gait Recognition.
Proceedings of the 2018 Digital Image Computing: Techniques and Applications, 2018

Forward-Backward Nonlinear Sparse Dictionary Selection Based Video Summarization.
Proceedings of the Fourth IEEE International Conference on Multimedia Big Data, 2018

Vision-Based Freezing of Gait Detection with Anatomic Patch Based Representation.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Eye tracking data guided feature selection for image classification.
Pattern Recognit., 2017

Learning universal multiview dictionary for human action recognition.
Pattern Recognit., 2017

Visual tracking utilizing robust complementary learner and adaptive refiner.
Neurocomputing, 2017

Video summarization via temporal collaborative representation of adjacent frames.
Proceedings of the 2017 International Symposium on Intelligent Signal Processing and Communication Systems, 2017

Matrix Neural Networks.
Proceedings of the Advances in Neural Networks - ISNN 2017 - 14th International Symposium, 2017

Nonlinear kernel sparse dictionary selection for video summarization.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Exploring the influence of feature representation for dictionary selection based video summarization.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Decision tree based sleep stage estimation from nocturnal audio signals.
Proceedings of the 22nd International Conference on Digital Signal Processing, 2017

Video Summarization via Simultaneous Block Sparse Representation.
Proceedings of the 2017 International Conference on Digital Image Computing: Techniques and Applications, 2017

Exploring Kernel Based Spatial Context for CNN Based Hyperspectral Image Classification.
Proceedings of the 2017 International Conference on Digital Image Computing: Techniques and Applications, 2017

SCUT-MMSIG: A Multimodal Online Signature Database.
Proceedings of the Biometric Recognition - 12th Chinese Conference, 2017

2016
A Scalable Approach for Content-Based Image Retrieval in Peer-to-Peer Networks.
IEEE Trans. Knowl. Data Eng., 2016

Investigating the impact of frame rate towards robust human action recognition.
Signal Process., 2016

Robust foreground object segmentation from handheld camera videos with occlusion map.
Multim. Tools Appl., 2016

Atmospheric turbulence mitigation based on turbulence extraction.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A Multiview Joint Sparse Representation with Discriminative Dictionary for Melanoma Detection.
Proceedings of the 2016 International Conference on Digital Image Computing: Techniques and Applications, 2016

2015
Exploratory Product Image Search With Circle-to-Search Interaction.
IEEE Trans. Circuits Syst. Video Technol., 2015

Video summarization via minimum sparse reconstruction.
Pattern Recognit., 2015

Resource restricted on-line Video Summarization with Minimum Sparse Reconstruction.
Proceedings of the 2015 Picture Coding Symposium, 2015

Spatial-temporal correlation for trajectory based action video retrieval.
Proceedings of the 17th IEEE International Workshop on Multimedia Signal Processing, 2015

Discovering Commonness and Specificness for Human Action Recognition.
Proceedings of the 2nd ACM International Workshop on Human-centered Event Understanding from Multimedia, 2015

Unsupervised snore detection from respiratory sound signals.
Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Automatic Preview Frame Selection for Online Videos.
Proceedings of the 2015 International Conference on Digital Image Computing: Techniques and Applications, 2015

2014
A Top-Down Approach for Video Summarization.
ACM Trans. Multim. Comput. Commun. Appl., 2014

Browse-to-Search: Interactive Exploratory Search with Visual Entities.
ACM Trans. Inf. Syst., 2014

A Bag-of-Importance Model With Locality-Constrained Coding Based Feature Learning for Video Summarization.
IEEE Trans. Multim., 2014

Unsupervised Spectral Mixture Analysis of Highly Mixed Data With Hopfield Neural Network.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2014

Spectral embedding based facial expression recognition with multiple features.
Neurocomputing, 2014

Multimedia Sensor Networks.
Int. J. Distributed Sens. Networks, 2014

L2,0 constrained sparse dictionary selection for video summarization.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

EAGLE: A novel descriptor for identifying plant species using leaf lamina vascular features.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2014

Iterative keyframe selection by orthogonal subspace projection.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

2013
Realistic Human Action Recognition With Multimodal Feature Selection and Fusion.
IEEE Trans. Syst. Man Cybern. Syst., 2013

Keypoint-Based Keyframe Selection.
IEEE Trans. Circuits Syst. Video Technol., 2013

Learning realistic facial expressions from web images.
Pattern Recognit., 2013

Semantic context based refinement for news video annotation.
Multim. Tools Appl., 2013

Discriminative two-level feature selection for realistic human action recognition.
J. Vis. Commun. Image Represent., 2013

Fast human action classification and VOI localization with enhanced sparse coding.
J. Vis. Commun. Image Represent., 2013

A bag-of-importance model for video summarization.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

Graph cuts based relevance feedback in image retrieval.
Proceedings of the IEEE International Conference on Image Processing, 2013

A supervised multiview spectral embedding method for neuroimaging classification.
Proceedings of the IEEE International Conference on Image Processing, 2013

2012
What is happening: annotating images with verbs.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Browse-to-search.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Evaluating the impact of frame rate on video based human action recognition.
Proceedings of the Image and Vision Computing New Zealand, 2012

Content-Based Image Retrieval in P2P Networks with Bag-of-Features.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, 2012

How Many Frames Does Facial Expression Recognition Require?
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, 2012

Video Summarization with Global and Local Features.
Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, 2012

Unsupervised Spectral Mixture Analysis with Hopfield Neural Network for hyperspectral images.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Video Object Segmentation with Occlusion Map.
Proceedings of the 2012 International Conference on Digital Image Computing Techniques and Applications, 2012

Single Image Dehazing with White Balance Correction and Image Decomposition.
Proceedings of the 2012 International Conference on Digital Image Computing Techniques and Applications, 2012

Unsupervised Text Segmentation using LDA and MCMC.
Proceedings of the Tenth Australasian Data Mining Conference, AusDM 2012, Sydney, 2012

2011
Improving Spatial-Spectral Endmember Extraction in the Presence of Anomalous Ground Objects.
IEEE Trans. Geosci. Remote. Sens., 2011

StoryImaging: a media-rich presentation system for textual stories.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Structure context of local features in realistic human action recognition.
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2011

Leaf Image Classification with Shape Context and SIFT Descriptors.
Proceedings of the 2011 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2011

Structural Image Classification with Graph Neural Networks.
Proceedings of the 2011 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2011

2010
Spatial Purity Based Endmember Extraction for Spectral Mixture Analysis.
IEEE Trans. Geosci. Remote. Sens., 2010

An efficient retinex-like brightness normalization method for coding camera flashes and strong brightness variation in videos.
Signal Process. Image Commun., 2010

Mixture Analysis by Multichannel Hopfield Neural Network.
IEEE Geosci. Remote. Sens. Lett., 2010

Video Summarization with Visual and Semantic Features.
Proceedings of the Advances in Multimedia Information Processing - PCM 2010, 2010

Two-step similarity matching for Content-Based Video Retrieval in P2P networks.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Adaptive reference frame selection for near-duplicate video shot detection.
Proceedings of the International Conference on Image Processing, 2010

Harvesting Web Images for Realistic Facial Expression Recognition.
Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, 2010

Realistic Human Action Recognition with Audio Context.
Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, 2010

Improving News Video Annotation with Semantic Context.
Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, 2010

2009
Two-level indexing for high-dimensional range queries in peer-to-peer networks.
Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Reliable object recognition using SIFT features.
Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Improved concept similarity measuring in the visual domain.
Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Hierarchical Gaussian Mixture Model for Image Annotation via PLSA.
Proceedings of the Fifth International Conference on Image and Graphics, 2009

2008
Image annotation with parametric mixture model based multi-class multi-labeling.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

Measuring semantic similarity between concepts in visual domain.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

Windowing technique for the DCT based retinex algorithm to handle videos with brightness variations coded using the H.264.
Proceedings of the International Conference on Image Processing, 2008

Retinex based motion estimation for sequences with brightness variations and its application to H.264.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Concept Constrained Image Region Annotation.
Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007

2006
Annotating Image Regions Using Spatial Context.
Proceedings of the Eighth IEEE International Symposium on Multimedia (ISM 2006), 2006

Utilizing Structural Context for Region Classification.
Proceedings of the Intelligent Information Processing III, 2006

2004
Comparison of image partition methods for adaptive image categorization based on structural image representation.
Proceedings of the 8th International Conference on Control, 2004

2003
Efficient Learning in Adaptive Processing of Data Structures.
Neural Process. Lett., 2003

Content-Based Image Retrieval with Relevance Feedback Using Adaptive Processing of Tree-Structure Image Representation.
Int. J. Image Graph., 2003

Region-of-interest based flower images retrieval.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Robust Learning in Adaptive Processing of Data Structures for Tree Representation Based Image Classification.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Fuzzy integral for leaf image retrieval.
Proceedings of the 2002 IEEE International Conference on Fuzzy Systems, 2002

2001
Adaptive Processing of Tree-Structure Image Representation.
Proceedings of the Advances in Multimedia Information Processing, 2001

2000
Leaf Image Retrieval with Shape Features.
Proceedings of the Advances in Visual Information Systems, 4th International Conference, 2000

1999
Block-Constrained Fractal Coding Scheme for Image Retrieval.
Proceedings of the Visual Information and Information Systems, 1999
