MlyPredCSED: based on extreme point deviation compensated clustering combined with cross-scale convolutional neural networks to predict multiple lysine sites in human.

Yun Zuo

Xingze Fang

Briefings Bioinform., March, 2025

Towards General Visual-Linguistic Face Forgery Detection (V2).

CoRR, February, 2025

ME-FAS: Multimodal Text Enhancement for Cross-Domain Face Anti-Spoofing.

IEEE Trans. Inf. Forensics Secur., 2025

M3ixup: A multi-modal data augmentation approach for image captioning.

Pattern Recognit., 2025

Optical remote sensing image salient object detection via bidirectional cross-attention and attention restoration.

Pattern Recognit., 2025

The Evolution of E-commerce Leadership: Traits, Innovation, and Performance Across Time.

Jiaguo Liu

Siping Liang

Jiayi Ji

Proceedings of the E-Business. Generative Artificial Intelligence and Management Transformation, 2025

Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

GSAlign: Geometric and Semantic Alignment Network for Aerial-Ground Person Re-Identification.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

HRSeg: High-Resolution Visual Perception and Enhancement for Reasoning Segmentation.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

MIHBench: Benchmarking and Mitigating Multi-Image Hallucinations in Multimodal Large Language Models.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Multi-Modal Object Re-identification via Sparse Mixture-of-Experts.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Towards Semantic Equivalence of Tokenization in Multimodal LLM.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ACL: Activating Capability of Linear Attention for Image Restoration.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Towards General Visual-Linguistic Face Forgery Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding.

CoRR, 2024

Any-to-3D Generation via Hybrid Diffusion Supervision.

CoRR, 2024

Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension.

CoRR, 2024

TraDiffusion: Trajectory-Based Training-Free Image Generation.

CoRR, 2024

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.

CoRR, 2024

INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model.

CoRR, 2024

HRSAM: Efficiently Segment Anything in High-Resolution Images.

CoRR, 2024

Evaluating and Analyzing Relationship Hallucinations in LVLMs.

CoRR, 2024

Synergistic Dual Spatial-aware Generation of Image-to-text and Text-to-image.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

3D-GRES: Generalized 3D Referring Expression Segmentation.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model.

Proceedings of the Computer Vision - ECCV 2024, 2024

Multi-branch Collaborative Learning Network for 3D Visual Grounding.

Proceedings of the Computer Vision - ECCV 2024, 2024

APL: Anchor-Based Prompt Learning for One-Stage Weakly Supervised Referring Expression Comprehension.

Proceedings of the Computer Vision - ECCV 2024, 2024

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Toward Open-Set Human Object Interaction Detection.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Towards local visual modeling for image captioning.

Pattern Recognit., June, 2023

Knowing What it is: Semantic-Enhanced Dual Attention Transformer.

IEEE Trans. Multim., 2023

Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.

IEEE Trans. Multim., 2023

X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation.

CoRR, 2023

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning.

CoRR, 2023

M3PS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization in E-commerce.

CoRR, 2023

Semi-Supervised Panoptic Narrative Grounding.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Real-Time Panoptic Narrative Grounding by an End-to-End Grounding Network.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

CIMTx: An R Package for Causal Inference with Multiple Treatments using Observational Data.

Liangyuan Hu

Jiayi Ji

R J., December, 2022

Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning.

IEEE Trans. Image Process., 2022

Spatiotemporal Evolution of the Carbon Fluxes from Bamboo Forests and their Response to Climate Change Based on a BEPS Model in China.

Remote. Sens., 2022

2021

Remote Sensing Estimation of Bamboo Forest Aboveground Biomass Based on Geographically Weighted Regression.

Remote. Sens., 2021

Multiscale leaf area index assimilation for Moso bamboo forest based on Sentinel-2 and MODIS data.

Int. J. Appl. Earth Obs. Geoinformation, 2021

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Dual-level Collaborative Transformer for Image Captioning.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Attacking Image Captioning Towards Accuracy-Preserving Target Words Removal.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019

Semantic-aware Image Deblurring.

CoRR, 2019

Variational Structured Semantic Inference for Diverse Image Captioning.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Jiayi Ji
