Wenhai Wang

Orcid: 0000-0003-3707-6546

According to our database, Wenhai Wang authored at least 239 papers between 1999 and 2026.

Collaborative distances:

Dijkstra number of three.
Erdős number of three.

Bibliography

2026

Discovering explicit and implicit causality for bioprocess factor forecasting.

Inf. Sci., 2026

Semantic-aware testing for object detection systems.

Inf. Softw. Technol., 2026

2025

Androfim: few-shot android malware family detection based on image representation.

Cybersecur., December, 2025

Low-Light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion.

ACM Trans. Multim. Comput. Commun. Appl., November, 2025

A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook.

ACM Comput. Surv., November, 2025

ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution.

CoRR, October, 2025

MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites.

CoRR, October, 2025

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models.

CoRR, October, 2025

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning.

CoRR, October, 2025

Rounding-Guided Backdoor Injection in Deep Learning Model Quantization.

CoRR, October, 2025

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints.

CoRR, October, 2025

Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding.

CoRR, October, 2025

SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation.

CoRR, October, 2025

Sequential Diffusion Language Models.

CoRR, September, 2025

Cuckoo Attack: Stealthy and Persistent Attacks Against AI-IDE.

CoRR, September, 2025

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data.

CoRR, September, 2025

GenExam: A Multidisciplinary Text-to-Image Exam.

CoRR, September, 2025

Pose-Guided Transformer for Fine-Grained Action Quality Assessment.

IEEE Trans. Circuits Syst. Video Technol., August, 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency.

CoRR, August, 2025

ORFuzz: Fuzzing the "Other Side" of LLM Safety - Testing Over-Refusal.

CoRR, August, 2025

CrossPL: Evaluating Large Language Models on Cross Programming Language Code Generation.

CoRR, July, 2025

MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents.

CoRR, July, 2025

ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding.

CoRR, July, 2025

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning.

CoRR, July, 2025

Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models.

CoRR, July, 2025

InternSpatial: A Comprehensive Dataset for Spatial Reasoning in Vision-Language Models.

CoRR, June, 2025

OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis.

CoRR, June, 2025

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces.

CoRR, June, 2025

Scuzer: A Scheduling Optimization Fuzzer for TVM.

ACM Trans. Softw. Eng. Methodol., May, 2025

KG4RecEval: Does Knowledge Graph Really Matter for Recommender Systems?

ACM Trans. Inf. Syst., May, 2025

EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models.

CoRR, May, 2025

ZeroGUI: Automating Online GUI Learning at Zero Human Cost.

CoRR, May, 2025

Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings.

CoRR, May, 2025

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows.

CoRR, May, 2025

Fair-PP: A Synthetic Dataset for Aligning LLM with Personalized Preferences of Social Equity.

CoRR, May, 2025

EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning.

CoRR, May, 2025

Recurrent Neural Unit With Frequency Attention for Specific Emitter Identification.

IEEE Trans. Cogn. Commun. Netw., April, 2025

Demystify Transformers & Convolutions in Modern Image Deep Networks.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2025

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models.

CoRR, April, 2025

Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR.

CoRR, April, 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

CoRR, April, 2025

BEVFormer: Learning Bird's-Eye-View Representation From LiDAR-Camera via Spatiotemporal Transformers.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2025

LuaTaint: A Static Analysis System for Web Configuration Interface Vulnerability of Internet of Things Devices.

IEEE Internet Things J., March, 2025

ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting.

CoRR, March, 2025

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework.

CoRR, March, 2025

ModiGen: A Large Language Model-Based Workflow for Multi-Task Modelica Code Generation.

CoRR, March, 2025

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning.

CoRR, March, 2025

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning.

CoRR, March, 2025

LLM4EFFI: Leveraging Large Language Models to Enhance Code Efficiency and Correctness.

CoRR, February, 2025

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference.

CoRR, February, 2025

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling.

CoRR, January, 2025

Dynamic cross-layer security risk assessment and mitigation for cyber-physical power systems.

Pengchao Yao

Qiang Yang

Wenhai Wang

Reliab. Eng. Syst. Saf., 2025

S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models.

Proc. ACM Softw. Eng., 2025

Beyond Static Pattern Matching? Rethinking Automatic Cryptographic API Misuse Detection in the Era of LLMs.

Proc. ACM Softw. Eng., 2025

SSFuzz: State-Guided Fuzzing With Shared Feedback for Black-Box IoT Devices.

IEEE Internet Things J., 2025

BinEGA: Enhancing DNN-based Binary Code Similarity Detection through Efficient Graph Alignment.

Proceedings of the IEEE International Conference on Software Analysis, 2025

MQueez: Specification-Driven Fuzzing for MQTT Broker (Registered Report).

Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2025

UltraModel: A Modeling Paradigm for Industrial Objects.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Diffuse&Refine: Intrinsic Knowledge Generation and Aggregation for Incremental Object Detection.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

CoMemo: LVLMs Need Image Context with Image Memory.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Modeling and Verifying Concurrent Reactive Systems Using Separation Logic.

Proceedings of the Formal Methods and Software Engineering, 2025

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Docopilot: Improving Multimodal Models for Document-Level Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Generalized Security-Preserving Refinement for Concurrent Systems.

Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, 2025

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference.

Maosongcao Maosongcao

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Denoising Diffusion Straightforward Models for Energy Conversion Monitoring Data Imputation.

IEEE Trans. Ind. Informatics, October, 2024

Causality Enhanced Global-Local Graph Neural Network for Bioprocess Factor Forecasting.

IEEE Trans. Ind. Informatics, October, 2024

VLG: General Video Recognition with Web Textual Knowledge.

Int. J. Comput. Vis., October, 2024

From Coarse to Fine: Hierarchical Zero-Shot Fault Diagnosis With Multigrained Attributes.

IEEE Trans. Fuzzy Syst., May, 2024

Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Security-Enhanced Operational Architecture for Decentralized Industrial Internet of Things: A Blockchain-Based Approach.

IEEE Internet Things J., March, 2024

Gaussian dynamic recurrent unit for emitter classification.

Expert Syst. Appl., March, 2024

Mini-InternVL: a flexible-transfer pocket multi-modal model with 5% parameters and 90% performance.

Vis. Intell., 2024

Canonical Variate Analysis for Detecting False Data Injection Attacks in Alternating Current State Estimation.

IEEE Trans. Netw. Sci. Eng., 2024

Feature Selection Based on Intrusive Outliers Rather Than All Instances.

IEEE Trans. Image Process., 2024

Fast Fourier Transform With Multihead Attention for Specific Emitter Identification.

IEEE Trans. Instrum. Meas., 2024

DTIN: Dual Transformer-based Imputation Nets for multivariate time series emitter missing data.

Knowl. Based Syst., 2024

Bayesian and stochastic game joint approach for Cross-Layer optimal defensive Decision-Making in industrial Cyber-Physical systems.

Inf. Sci., 2024

DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction.

CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

CoRR, 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization.

CoRR, 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance.

CoRR, 2024

Agents4PLC: Automating Closed-loop PLC Code Generation and Verification in Industrial Control Systems using LLM-based Agents.

CoRR, 2024

MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding.

CoRR, 2024

Optimizing 4D Lookup Table for Low-light Video Enhancement via Wavelet Priori.

CoRR, 2024

Characterizing and Evaluating the Reliability of LLMs against Jailbreak Attacks.

CoRR, 2024

Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs.

CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.

CoRR, 2024

Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization.

CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

CoRR, 2024

LLMs Meet Multimodal Generation and Editing: A Survey.

CoRR, 2024

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models.

CoRR, 2024

Does Knowledge Graph Really Matter for Recommender Systems?

CoRR, 2024

LuaTaint: A Static Taint Analysis System for Web Interface Framework Vulnerability of IoT Devices.

CoRR, 2024

FoolSDEdit: Deceptively Steering Your Edits Towards Targeted Attribute-aware Distribution.

CoRR, 2024

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer.

CoRR, 2024

Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion.

CoRR, 2024

FAMCF: A few-shot Android malware family classification framework.

Comput. Secur., 2024

MMInstruct: a high-quality multi-modal instruction tuning dataset with extensive diversity.

Sci. China Inf. Sci., 2024

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites.

Sci. China Inf. Sci., 2024

Statistical knowledge and game-theoretic integrated model for cross-layer impact assessment in industrial cyber-physical systems.

Adv. Eng. Informatics, 2024

LCG-YOLO: A Real-Time Surface Defect Detection Method for Metal Components.

IEEE Access, 2024

Critical Code Guided Directed Greybox Fuzzing for Commits.

Proceedings of the 33rd USENIX Security Symposium, 2024

Exploring ChatGPT's Capabilities on Vulnerability Management.

Proceedings of the 33rd USENIX Security Symposium, 2024

SyzTrust: State-aware Fuzzing on Trusted OS Designed for IoT Devices.

Proceedings of the IEEE Symposium on Security and Privacy, 2024

Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Needle In A Multimodal Haystack.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Fault Diagnosis of Blast Furnace Throat Temperature Monitoring Device Based on Residual Analysis.

Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, 2024

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World.

Proceedings of the Computer Vision - ECCV 2024, 2024

ControlLLM: Augment Language Models with Tools by Searching on Graphs.

Proceedings of the Computer Vision - ECCV 2024, 2024

Distilling Knowledge from Large-Scale Image Models for Object Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AVSegFormer: Audio-Visual Segmentation with Transformer.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

A traffic anomaly detection approach based on unsupervised learning for industrial cyber-physical system.

Knowl. Based Syst., November, 2023

Cloud-edge coordinated traffic anomaly detection for industrial cyber-physical systems.

Expert Syst. Appl., November, 2023

A novel mesh discretization strategy for numerical solution of optimal control problems in aerospace engineering.

J. Frankl. Inst., September, 2023

A novel radar operating mode identification approach based on variational relevance vector machine with chaotic gravitational search optimization.

Trans. Inst. Meas. Control, May, 2023

Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

A Novel Time-Domain Graph Tensor Attention Network for Specific Emitter Identification.

IEEE Trans. Instrum. Meas., 2023

BSMD: A blockchain-based secure storage mechanism for big spatio-temporal data.

Future Gener. Comput. Syst., 2023

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks.

CoRR, 2023

A Survey of Reasoning with Foundation Models.

CoRR, 2023

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving.

CoRR, 2023

Prompting Frameworks for Large Language Models: A Survey.

CoRR, 2023

How ChatGPT is Solving Vulnerability Management Problem.

CoRR, 2023

ControlLLM: Augment Language Models with Tools by Searching on Graphs.

CoRR, 2023

Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models.

CoRR, 2023

AVSegFormer: Audio-Visual Segmentation with Transformer.

CoRR, 2023

Denoising Diffusion Semantic Segmentation with Mask Prior Modeling.

CoRR, 2023

VideoChat: Chat-Centric Video Understanding.

CoRR, 2023

InternGPT: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language.

CoRR, 2023

A Survey of Historical Learning: Learning Models with Learning History.

CoRR, 2023

Champion Solution for the WSDM2023 Toloka VQA Challenge.

CoRR, 2023

How IoT Re-using Threatens Your Sensitive Data: Exploring the User-Data Disposal in Used IoT Devices.

Proceedings of the 44th IEEE Symposium on Security and Privacy, 2023

RFT: Toward Highly Reliable Flow Data Transmission in Network Measurement.

Proceedings of the 20th Annual IEEE International Conference on Sensing, 2023

VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Vision Transformer Adapter for Dense Predictions.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

FB-BEV: BEV Representation from Forward-Backward View Transformations.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Applying Rely-Guarantee Reasoning on Concurrent Memory Management and Mailbox in μC/OS-II: A Case Study.

Proceedings of the Formal Methods for Industrial Critical Systems, 2023

Static Semantics Reconstruction for Enhancing JavaScript-WebAssembly Multilingual Malware Detection.

Proceedings of the Computer Security - ESORICS 2023, 2023

CP-BCS: Binary Code Summarization Guided by Control Flow Graph and Pseudo Code.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Planning-oriented Autonomous Driving.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Optimal Stealthy Attack to Remote Estimator for Estimation Error Regulation.

Proceedings of the American Control Conference, 2023

2022

A Novel Aggregated Multipath Extreme Gradient Boosting Approach for Radar Emitter Classification.

IEEE Trans. Ind. Electron., 2022

Adversarial Malicious Encrypted Traffic Detection Based on Refined Session Analysis.

Symmetry, 2022

PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Density Peak Clustering with connectivity estimation.

Knowl. Based Syst., 2022

On Efficient Reinforcement Learning for Full-length Game of StarCraft II.

J. Artif. Intell. Res., 2022

A novel locality-sensitive hashing relational graph matching network for semantic textual similarity measurement.

Expert Syst. Appl., 2022

PVT v2: Improved baselines with Pyramid Vision Transformer.

Comput. Vis. Media, 2022

Goal-oriented Autonomous Driving.

CoRR, 2022

Demystify Transformers & Convolutions in Modern Image Deep Networks.

CoRR, 2022

Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe.

CoRR, 2022

Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality.

CoRR, 2022

Hybrid Cloud-Edge Collaborative Data Anomaly Detection in Industrial Sensor Networks.

CoRR, 2022

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers.

CoRR, 2022

WegFormer: Transformers for Weakly Supervised Semantic Segmentation.

CoRR, 2022

Deep weighted joint distribution adaption network for fault diagnosis of blast furnace ironmaking process.

Comput. Chem. Eng., 2022

Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Incremental Few-Shot Semantic Segmentation via Embedding Adaptive-Update and Hyper-class Representation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

SLIME: program-sensitive energy allocation for fuzzing.

Proceedings of the ISSTA '22: 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual Event, South Korea, July 18, 2022

Polygon-Free: Unconstrained Scene Text Detection with Box Annotations.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition.

Proceedings of the Computer Vision - ECCV 2022, 2022

BEVFormer: Learning Bird's-Eye-View Representation from Multi-camera Images via Spatiotemporal Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition.

CoRR, 2021

FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation.

CoRR, 2021

ARTS: Eliminating Inconsistency between Text Detection and Recognition with Auto-Rectification Text Spotter.

CoRR, 2021

Panoptic SegFormer.

CoRR, 2021

An empirical evaluation of attention-based multi-head models for improved turbofan engine remaining useful life prediction.

CoRR, 2021

Learning Class-level Prototypes for Few-shot Learning.

CoRR, 2021

Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers.

CoRR, 2021

PVTv2: Improved Baselines with Pyramid Vision Transformer.

CoRR, 2021

PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text.

CoRR, 2021

An Introduction of mini-AlphaStar.

CoRR, 2021

DetCo: Unsupervised Contrastive Learning for Object Detection.

CoRR, 2021

Trans2Seg: Transparent Object Segmentation with Transformer.

CoRR, 2021

A novel adaptive generic model control strategy for internal thermally coupled air separation columns with multivariable recursive estimation.

Ja'afar Sulaiman Zangina

Comput. Chem. Eng., 2021

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

IFIZZ: Deep-State and Efficient Fault-Scenario Generation to Test IoT Firmware.

Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, 2021

Segmenting Transparent Objects in the Wild with Transformer.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

DetCo: Unsupervised Contrastive Learning for Object Detection.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

A novel intrusion detection system based on an optimal hybrid kernel extreme learning machine.

Knowl. Based Syst., 2020

SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training.

CoRR, 2020

False Data Injection Attacks and Corresponding Countermeasure in DC Microgrid.

CoRR, 2020

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Guided Refine-Head for Object Detection.

Lingyun Zeng

You Song

Wenhai Wang

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

TK-Text: Multi-shaped Scene Text Detection via Instance Segmentation.

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Segmenting Transparent Objects in the Wild.

Proceedings of the Computer Vision - ECCV 2020, 2020

Scene Text Image Super-Resolution in the Wild.

Proceedings of the Computer Vision - ECCV 2020, 2020

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting.

Proceedings of the Computer Vision - ECCV 2020, 2020

Differentiable Hierarchical Graph Grouping for Multi-person Pose Estimation.

Proceedings of the Computer Vision - ECCV 2020, 2020

PolarMask: Single Shot Instance Segmentation With Polar Representation.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

TextSR: Content-Aware Text Super-Resolution Guided by Recognition.

[BibT_eX]

[DOI]

CoRR, 2019

Shape Robust Text Detection with Progressive Scale Expansion Network.

[BibT_eX]

[DOI]

CoRR, 2019

Cropout: A General Mechanism for Reducing Overfitting on Convolutional Neural Networks.

[BibT_eX]

[DOI]

Proceedings of the International Joint Conference on Neural Networks, 2019

Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network.

[BibT_eX]

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Shape Robust Text Detection With Progressive Scale Expansion Network.

[BibT_eX]

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Selective Kernel Networks.

[BibT_eX]

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Nonzero-Dynamics Stealthy Attack and Its Impacts Analysis in DC Microgrids.

[BibT_eX]

[DOI]

Proceedings of the 2019 American Control Conference, 2019

2018

Shape Robust Text Detection with Progressive Scale Expansion Network.

[BibT_eX]

[DOI]

CoRR, 2018

Hand Pose Estimation with Attention-and-Sequence Network.

[BibT_eX]

[DOI]

Tianping Hu

Wenhai Wang

Tong Lu

Proceedings of the Advances in Multimedia Information Processing - PCM 2018, 2018

A Novel 3D Human Action Recognition Framework for Video Content Analysis.

[BibT_eX]

[DOI]

Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Cloud of Line Distribution and Random Forest Based Text Detection from Natural/Video Scene Images.

[BibT_eX]

[DOI]

Wenhai Wang

Yirui Wu

Palaiahnakote Shivakumara

Tong Lu

Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Mixed Link Networks.

[BibT_eX]

[DOI]

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

2017

Cloud of Line Distribution for Arbitrary Text Detection in Scene/Video/License Plate Images.

[BibT_eX]

[DOI]

Wenhai Wang

Yirui Wu

Shivakumara Palaiahnakote

Tong Lu

Jun Liu

Proceedings of the Advances in Multimedia Information Processing - PCM 2017, 2017

Visual Robotic Object Grasping Through Combining RGB-D Data and 3D Meshes.

[BibT_eX]

[DOI]

Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network.

[BibT_eX]

[DOI]

Yirui Wu

Wenhai Wang

Shivakumara Palaiahnakote

Tong Lu

Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

2016

Biomechanics of high-grade spondylolisthesis with and without reduction.

[BibT_eX]

[DOI]

Medical Biol. Eng. Comput., 2016

2015

Remaining Useful Life Prediction for a Nonlinear Heterogeneous Wiener Process Model With an Adaptive Drift.

[BibT_eX]

[DOI]

IEEE Trans. Reliab., 2015

2013

A new fault detection method for computer networks.

[BibT_eX]

[DOI]

Reliab. Eng. Syst. Saf., 2013

On soft fault diagnosis method based HHT for analog circuits.

[BibT_eX]

[DOI]

Proceedings of the 10th IEEE International Conference on Control and Automation, 2013

A modular design approach for coal-fired power plant control system.

[BibT_eX]

[DOI]

Suwen Huang

Wenhai Wang

Proceedings of the 10th IEEE International Conference on Control and Automation, 2013

2012

Performance Degradation Monitoring for Onboard Speed Sensors of Trains.

[BibT_eX]

[DOI]

Zhengguo Xu

Wenhai Wang

Youxian Sun

IEEE Trans. Intell. Transp. Syst., 2012

2003

Sufficient conditions for the convergence of open-closed-loop PID-type iterative learning control for nonlinear time-varying systems.

[BibT_eX]

[DOI]

Jianxia Shou

Daoying Pi

Wenhai Wang

Proceedings of the IEEE International Conference on Systems, 2003

Associative classifier modeling method based on rough set theory and factor analysis technology.

[BibT_eX]

[DOI]

Xin Ma

Wenhai Wang

Youxian Sun

Proceedings of the IEEE International Conference on Systems, 2003

1999

Optimal robust digital control of systems with uncertainty.

[BibT_eX]

[DOI]

Xiang Liu

Youxian Sun

Wenhai Wang

Proceedings of the 5th European Control Conference, 1999
