Ziyu Guo

This is a disambiguation page: it contains multiple papers by persons with the same or a similar name.

Bibliography

2026
MCIRP: A multi-granularity cross-modal interaction model based on relational propagation for Multimodal Named Entity Recognition with multiple images.
Inf. Process. Manag., 2026

2025
LMT++: Adaptively Collaborating LLMs With Multi-Specialized Teachers for Continual VQA in Robotic Surgical Videos.
IEEE Trans. Medical Imaging, November, 2025

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark.
CoRR, October, 2025

BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities.
CoRR, October, 2025

IndusGCC: A Data Benchmark and Evaluation Framework for GUI-Based General Computer Control in Industrial Automation.
CoRR, September, 2025

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs.
CoRR, August, 2025

MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models.
CoRR, August, 2025

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos.
CoRR, June, 2025

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking.
CoRR, May, 2025

CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms.
CoRR, May, 2025

Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO.
CoRR, May, 2025

UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens.
CoRR, May, 2025

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT.
CoRR, May, 2025

StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion.
CoRR, March, 2025

Less is More: Improving Motion Diffusion Models with Sparse Keyframes.
CoRR, March, 2025

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model.
CoRR, March, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems.
CoRR, March, 2025

Efficient UAV Swarm-Based Multi-Task Federated Learning with Dynamic Task Knowledge Sharing.
CoRR, March, 2025

MAB-Based Channel Scheduling for Asynchronous Federated Learning in Non-Stationary Environments.
CoRR, March, 2025

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency.
CoRR, February, 2025

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step.
CoRR, January, 2025

IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models.
CoRR, January, 2025

A power-iteration-based beamformer for large-scale antenna arrays.
Microelectron. J., 2025

Bifurcation of sub-harmonic solutions of some 4-dimensional piecewise smooth near-Hamiltonian systems.
Commun. Nonlinear Sci. Numer. Simul., 2025

Adverse childhood experiences and short-form video addiction: A serial mediation model of resilience and life satisfaction.
Comput. Hum. Behav., 2025

Background Calibration for Mixed Mismatches in TIADC Using Taylor-Volterra-Series-Based Model.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2025

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Multi-range Random Walk Based Graph Neural Network.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2025

Let's Verify and Reinforce Image Generation Step by Step.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
LIHAN: A Lattice-Guided Incomplete Heterogeneous Information Network Embedding Model for Node Classification.
IEEE Trans. Comput. Soc. Syst., December, 2024

Segmented Storage Based on Parallel Execution for IoT Blockchains.
IEEE Internet Things J., November, 2024

An Ultra-Efficient and Compact Flipping-Input Synchronized Switch Harvesting on Inductor and Capacitors Interface Circuit for Piezoelectric Energy Harvesting.
IEEE Trans. Circuits Syst. II Express Briefs, August, 2024

A RIS-Based Vehicle DOA Estimation Method With Integrated Sensing and Communication System.
IEEE Trans. Intell. Transp. Syst., June, 2024

A Security Evaluation Model for Edge Information Systems Based on Index Screening.
IEEE Internet Things J., June, 2024

An Energy-Efficient BNN Accelerator With Two-Stage Value Prediction for Sparse-Edge Gesture Recognition.
IEEE Trans. Circuits Syst. I Regul. Pap., January, 2024

Angular Parameter Estimation for Incoherently Distributed Sources With Single RF Chain.
IEEE Trans. Signal Process., 2024

Aligning knowledge concepts to whole slide images for precise histopathology image analysis.
npj Digit. Medicine, 2024

Transformer-based vision-language alignment for robot navigation and question answering.
Inf. Fusion, 2024

Point Cloud Understanding via Attention-Driven Contrastive Learning.
CoRR, 2024

Artificial Intelligence for Biomedical Video Generation.
CoRR, 2024

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines.
CoRR, 2024

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners.
CoRR, 2024

MAVIS: Mathematical Visual Instruction Tuning.
CoRR, 2024

TripletMix: Triplet Data Augmentation for 3D Understanding.
CoRR, 2024

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
CoRR, 2024

siRNADiscovery: a graph neural network for siRNA efficacy prediction via deep RNA sequence analysis.
Briefings Bioinform., 2024

Hardware Acceleration of Phase and Gain Control for Analog Beamforming.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

Grouped Graph Neural Networks for Anomaly Detection in Time Series.
Proceedings of the International Joint Conference on Neural Networks, 2024

X-former Elucidator: Reviving Efficient Attention for Long Context Language Modeling.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Personalize Segment Anything Model with One Shot.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Proceedings of the Computer Vision - ECCV 2024, 2024

No Time to Train: Empowering Non-Parametric Networks for Few-Shot 3D Scene Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SignVTCL: Multi-Modal Continuous Sign Language Recognition Enhanced by Visual-Textual Contrastive Learning.
Proceedings of the 35th British Machine Vision Conference, 2024

Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Spatio-Temporal Pivotal Graph Neural Networks for Traffic Flow Forecasting.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Robust beamforming design for UAV communications based on integrated sensing and communication.
EURASIP J. Wirel. Commun. Netw., December, 2023

A Security Resilience Metric Framework Based on the Evolution of Attack and Defense Scenarios.
IEEE Internet Things J., October, 2023

Data Sharing Privacy Metrics Model Based on Information Entropy and Group Privacy Preference.
Cryptogr., March, 2023

A security evaluation model for multi-source heterogeneous systems based on IOT and edge computing.
Clust. Comput., February, 2023

A foreground digital calibration algorithm for time-interleaved ADCs with low computational complexity.
Microelectron. J., 2023

ImageBind-LLM: Multi-modality Instruction Tuning.
CoRR, 2023

Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following.
CoRR, 2023

Less is More: Towards Efficient Few-shot 3D Semantic Segmentation via Training-free Networks.
CoRR, 2023

Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation.
CoRR, 2023

Personalize Segment Anything Model with One Shot.
CoRR, 2023

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance.
CoRR, 2023

Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training.
CoRR, 2023

Bifurcation of periodic orbits and its application for high-dimensional piecewise smooth near integrable systems with two switching manifolds.
Commun. Nonlinear Sci. Numer. Simul., 2023

Design and application of a novel approximate constraint tracking robust control for permanent magnet synchronous motor.
Comput. Chem. Eng., 2023

Prostate Segmentation in MRI Using Transformer Encoder and Decoder Framework.
IEEE Access, 2023

Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

DS-Point: A Dual-Scale 3D Framework for Point Cloud Understanding.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2023

HIGT: Hierarchical Interaction Graph-Transformer for Whole Slide Image Analysis.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Hardware-Efficient Beamspace Direction-of-Arrival Estimator for Unequal-Sized Subarrays.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

Low-Cost Beamforming and DOA Estimation Based on One-Bit Reconfigurable Intelligent Surface.
IEEE Signal Process. Lett., 2022

Cascade wavelet transform based convolutional neural networks with application to image classification.
Neurocomputing, 2022

Can Language Understand Depth?
CoRR, 2022

MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection.
CoRR, 2022

Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Can Language Understand Depth?
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

A Multiple Charge Extractions and Multiple Precharge Interface Circuit for Piezoelectric Energy Harvesting.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Security Evaluation Model of Blockchain System Based on Combination Weighting and Grey Clustering.
Proceedings of the 7th IEEE International Conference on Data Science in Cyberspace, 2022

PointCLIP: Point Cloud Understanding by CLIP.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Unambiguous Direction-of-Arrival Estimation for Improved Scanning Efficiency in Subarray-Based Hybrid Array.
IEEE Trans. Veh. Technol., 2021

Reconfigurable Intelligent Surface Aided Sparse DOA Estimation Method With Non-ULA.
IEEE Signal Process. Lett., 2021

Comparative Analysis of Two Machine Learning Algorithms in Predicting Site-Level Net Ecosystem Exchange in Major Biomes.
Remote. Sens., 2021

DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion.
CoRR, 2021

Improved Heatmap-Based Landmark Detection.
Proceedings of the Deep Generative Models, and Data Augmentation, Labelling, and Imperfections, 2021

Enhancing Continuous Service of Information Systems Based on Cyber Resilience.
Proceedings of the Sixth IEEE International Conference on Data Science in Cyberspace, 2021

2020
SGDAN - A Spatio-Temporal Graph Dual-Attention Neural Network for Quantified Flight Delay Prediction.
Sensors, 2020

An Ego Network Embedding Model via Neighbors Sampling and Self-attention Mechanism.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020

A Big Service with Network Represent Learning for Quantified Flight Delay Prediction.
Proceedings of the 2020 IEEE International Conference on Web Services, 2020

2019
Psychological Gender Express via Mobile Social Network Activities: An Experimental Study on a Gay Network Data.
IEEE Access, 2019

SGNN: A Graph Neural Network Based Federated Learning Approach by Hiding Structure.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018
Public Opinion Spamming: A Model for Content and Users on Sina Weibo.
Proceedings of the 10th ACM Conference on Web Science, 2018

Social Media vs. News Media: Analyzing Real-World Events from Different Perspectives.
Proceedings of the Database and Expert Systems Applications, 2018

Detect Cooperative Hyping Among VIP Users and Spammers in Sina Weibo.
Proceedings of the Computer Supported Cooperative Work and Social Computing, 2018

2017
Millimeter-Wave Channel Estimation Based on 2-D Beamspace MUSIC Method.
IEEE Trans. Wirel. Commun., 2017

2016
Transmitter-Centric Channel Estimation and Low-PAPR Precoding for Millimeter-Wave MIMO Systems.
IEEE Trans. Commun., 2016

2015
Dynamic State Estimation and Anomaly Detection in Smart Grid Using Point-Based Gaussian Approximation Filtering.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

2014
Sequential Joint Spectrum Sensing and Channel Estimation for Dynamic Spectrum Access.
IEEE J. Sel. Areas Commun., 2014

2012
One stone two birds: synchronization relaxation and redundancy removal in GPU-CPU translation.
Proceedings of the International Conference on Supercomputing, 2012

2011
Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation.
Proceedings of the Languages and Compilers for Parallel Computing, 2011

On-the-fly elimination of dynamic irregularities for GPU computing.
Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, 2011

Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Streamlining GPU applications on the fly: thread divergence elimination through runtime thread-data remapping.
Proceedings of the 24th International Conference on Supercomputing, 2010

2009
Single-particle 3d reconstruction from cryo-electron microscopy images on GPU.
Proceedings of the 23rd international conference on Supercomputing, 2009

