Li Shen

ORCID: 0000-0001-5659-3464

Affiliations:
  • Sun Yat-sen University Shenzhen Campus, School of Cyber Science and Technology, Shenzhen, China
  • JD Explore Academy, Beijing, China (2021 - 2024)
  • Tencent, Shenzhen, China (2017 - 2021)
  • South China University of Technology, Guangzhou, China (PhD 2017)


According to our database, Li Shen authored at least 299 papers between 2017 and 2025.

Bibliography

2025
Toward Understanding the Generalizability of Delayed Stochastic Gradient Descent.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

Asymmetrically Decentralized Federated Learning.
IEEE Trans. Computers, August, 2025

Constraint Boundary Wandering Framework: Enhancing Constrained Optimization With Deep Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

On Nonconvex SGD Under Unbounded Noise With Weak Gradient Lipschitz and Delayed Stochastic Gradient.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

Data-Adaptive Weight-Ensembling for Multi-task Model Fusion.
Int. J. Comput. Vis., August, 2025

ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning.
Int. J. Comput. Vis., August, 2025

LOST: Low-rank and Sparse Pre-training for Large Language Models.
CoRR, August, 2025

Graph Convolutional Mixture-of-Experts Learner Network for Long-Tailed Domain Generalization.
IEEE Trans. Circuits Syst. Video Technol., July, 2025

DFedADMM: Dual Constraint Controlled Model Inconsistency for Decentralize Federated Learning.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2025

Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion.
CoRR, June, 2025

GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching.
CoRR, June, 2025

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning.
CoRR, June, 2025

AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs.
CoRR, June, 2025

MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models.
CoRR, June, 2025

TrojanTO: Action-Level Backdoor Attacks against Trajectory Optimization Models.
CoRR, June, 2025

Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning.
CoRR, June, 2025

DGL-GAN: discriminator-guided GAN compression.
Vis. Comput., May, 2025

Revisiting Flatness-Aware Optimization in Continual Learning With Orthogonal Gradient Projection.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2025

LightSAM: Parameter-Agnostic Sharpness-Aware Minimization.
CoRR, May, 2025

Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer.
CoRR, May, 2025

Decision Flow Policy Optimization.
CoRR, May, 2025

Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning.
CoRR, May, 2025

Multimodal Reasoning Agent for Zero-Shot Composed Image Retrieval.
CoRR, May, 2025

Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging.
CoRR, May, 2025

Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought.
CoRR, May, 2025

MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval.
CoRR, May, 2025

R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search.
CoRR, May, 2025

R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO.
CoRR, May, 2025

CTRAP: Embedding Collapse Trap to Safeguard Large Language Models from Harmful Fine-Tuning.
CoRR, May, 2025

Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning.
CoRR, May, 2025

Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities.
CoRR, May, 2025

Federated Learning With Only Positive Labels by Exploring Label Correlations.
IEEE Trans. Neural Networks Learn. Syst., April, 2025

AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization.
CoRR, April, 2025

Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection.
CoRR, April, 2025

Neuron-level Balance between Stability and Plasticity in Deep Reinforcement Learning.
CoRR, April, 2025

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2025

On Efficient Training of Large-Scale Deep Learning Models.
ACM Comput. Surv., March, 2025

Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG.
CoRR, March, 2025

A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models.
CoRR, February, 2025

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam.
CoRR, February, 2025

On Theoretical Limits of Learning with Label Differential Privacy.
CoRR, February, 2025

SeWA: Selective Weight Average via Probabilistic Masking.
CoRR, February, 2025

HRP: High-Rank Preheating for Superior LoRA Initialization.
CoRR, February, 2025

Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More.
CoRR, February, 2025

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging.
CoRR, February, 2025

Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment.
CoRR, February, 2025

Winning Prize Comes from Losing Tickets: Improve Invariant Learning by Exploring Variant Parameters for Out-of-Distribution Generalization.
Int. J. Comput. Vis., January, 2025

TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs.
CoRR, January, 2025

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation.
CoRR, January, 2025

O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning.
CoRR, January, 2025

Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging.
CoRR, January, 2025

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent.
CoRR, January, 2025

QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning.
Trans. Mach. Learn. Res., 2025

Are Large Language Models Really Robust to Word-Level Perturbations?
Trans. Mach. Learn. Res., 2025

Cross-Domain Diffusion With Progressive Alignment for Efficient Adaptive Retrieval.
IEEE Trans. Image Process., 2025

A Pyramid Fusion MLP for Dense Prediction.
IEEE Trans. Image Process., 2025

DFedGFM: Pursuing global consistency for Decentralized Federated Learning via global flatness and global momentum.
Neural Networks, 2025

Communication-efficient distributed learning with Local Immediate Error Compensation.
Neural Networks, 2025

Learning from models beyond fine-tuning.
Nat. Mac. Intell., 2025

Code-switching finetuning: Bridging multilingual pretrained language models for enhanced cross-lingual performance.
Eng. Appl. Artif. Intell., 2025

Enhancing column generation by reinforcement learning-based hyper-heuristic for vehicle routing and scheduling problems.
Comput. Ind. Eng., 2025

Prompt Tuning with Diffusion for Few-Shot Pre-trained Policy Generalization.
Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems, 2025

Enhancing Learning with Label Differential Privacy by Vector Approximation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Dynamic Neural Fortresses: An Adaptive Shield for Model Extraction Defense.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Understanding the Stability-based Generalization of Personalized Federated Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

PEARL: Towards Permutation-Resilient LLMs.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Edit Once, Update Everywhere: A Simple Framework for Cross-Lingual Knowledge Synchronization in LLMs.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?
ACM Trans. Multim. Comput. Commun. Appl., December, 2024

Master-Slave Deep Architecture for Top-K Multiarmed Bandits With Nonlinear Bandit Feedback and Diversity Constraints.
IEEE Trans. Neural Networks Learn. Syst., December, 2024

FedGAMMA: Federated Learning With Global Sharpness-Aware Minimization.
IEEE Trans. Neural Networks Learn. Syst., December, 2024

Continual Learning From a Stream of APIs.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

On Transforming Reinforcement Learning With Transformers: The Development Trajectory.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Neural-aware Decoupling Fusion based Personalized Federated Learning for Intelligent Sensing.
ACM Trans. Sens. Networks, November, 2024

SPORT: A Subgraph Perspective on Graph Classification with Label Noise.
ACM Trans. Knowl. Discov. Data, November, 2024

Meta-Learning Without Data via Unconditional Diffusion Models.
IEEE Trans. Circuits Syst. Video Technol., November, 2024

A Unified Analysis of AdaGrad With Weighted Aggregation and Momentum Acceleration.
IEEE Trans. Neural Networks Learn. Syst., October, 2024

Quantum Imitation Learning.
IEEE Trans. Neural Networks Learn. Syst., October, 2024

Efficient Federated Learning With Enhanced Privacy via Lottery Ticket Pruning in Edge Computing.
IEEE Trans. Mob. Comput., October, 2024

Retain and Adapt: Online Sequential EEG Classification With Subject Shift.
IEEE Trans. Artif. Intell., September, 2024

Multi-Scenario and Multi-Task Aware Feature Interaction for Recommendation System.
ACM Trans. Knowl. Discov. Data, July, 2024

Generalized Embedding Machines for Recommender Systems.
Mach. Intell. Res., June, 2024

Messages are Never Propagated Alone: Collaborative Hypergraph Neural Network for Time-Series Forecasting.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Local AdaGrad-type algorithm for stochastic convex-concave optimization.
Mach. Learn., April, 2024

AdaSAM: Boosting sharpness-aware minimization with adaptive learning rate and momentum for training deep neural networks.
Neural Networks, January, 2024

Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning.
IEEE Trans. Serv. Comput., 2024

Revisiting Discrete Soft Actor-Critic.
Trans. Mach. Learn. Res., 2024

Visual Prompt Based Personalized Federated Learning.
Trans. Mach. Learn. Res., 2024

Dynamic PDGAN: discriminator-boosted knowledge distillation for StyleGANs.
Adv. Math. Commun., 2024

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search.
CoRR, 2024

DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs.
CoRR, 2024

Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs.
CoRR, 2024

Exploring the Generalization Capabilities of AID-based Bi-level Optimization.
CoRR, 2024

A Unified Analysis for Finite Weight Averaging.
CoRR, 2024

AGLP: A Graph Learning Perspective for Semi-supervised Domain Adaptation.
CoRR, 2024

DiM: f-Divergence Minimization Guided Sharpness-Aware Optimization for Semi-supervised Medical Image Segmentation.
CoRR, 2024

Aligning Few-Step Diffusion Models with Dense Reward Difference Learning.
CoRR, 2024

Continual Task Learning through Adaptive Policy Self-Composition.
CoRR, 2024

Stability and Generalization for Distributed SGDA.
CoRR, 2024

Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning.
CoRR, 2024

Communication Learning in Multi-Agent Systems from Graph Modeling Perspective.
CoRR, 2024

Towards Constraint-aware Learning for Resource Allocation in NFV-enabled Networks.
CoRR, 2024

Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging.
CoRR, 2024

Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces.
CoRR, 2024

SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery.
CoRR, 2024

Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace.
CoRR, 2024

Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models.
CoRR, 2024

Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation.
CoRR, 2024

Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration.
CoRR, 2024

OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement.
CoRR, 2024

DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion.
CoRR, 2024

USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding.
CoRR, 2024

Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal.
CoRR, 2024

Convergent Differential Privacy Analysis for General Federated Learning: the f-DP Perspective.
CoRR, 2024

Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models.
CoRR, 2024

SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models.
CoRR, 2024

Sequential Federated Learning in Hierarchical Architecture on Non-IID Datasets.
CoRR, 2024

Byzantine-resilient Federated Learning Employing Normalized Gradients on Non-IID Datasets.
CoRR, 2024

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities.
CoRR, 2024

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork.
CoRR, 2024

Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion.
CoRR, 2024

FusionBench: A Comprehensive Benchmark of Deep Model Fusion.
CoRR, 2024

AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization.
CoRR, 2024

Learning with User-Level Local Differential Privacy.
CoRR, 2024

Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo.
CoRR, 2024

Continuous Spiking Graph Neural Networks.
CoRR, 2024

A General and Efficient Federated Split Learning with Pre-trained Image Transformers for Heterogeneous Data.
CoRR, 2024

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning.
CoRR, 2024

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping.
CoRR, 2024

Solving Continual Offline Reinforcement Learning with Decision Transformer.
CoRR, 2024

Decomposed Prompt Decision Transformer for Efficient Unseen Task Generalization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

PrimKD: Primary Modality Guided Multimodal Fusion for RGB-D Semantic Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MuEP: A Multimodal Benchmark for Embodied Planning with Foundation Models.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Representation Surgery for Multi-Task Model Merging.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Generalization Analysis of Stochastic Weight Averaging with General Sampling.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Q-value Regularized Transformer for Offline Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

DREAM: Dual Structured Exploration with Mixup for Open-set Graph Domain Adaption.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

AdaMerging: Adaptive Model Merging for Multi-Task Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

A Unified and General Framework for Continual Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Parameter-Efficient Multi-Task Model Fusion with Partial Linearization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Learning Multi-Agent Communication from Graph Modeling Perspective.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Improving Non-Transferable Representation Learning by Harnessing Content and Style.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Training A Secure Model Against Data-Free Model Extraction.
Proceedings of the Computer Vision - ECCV 2024, 2024

Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

Sheared Backpropagation for Fine-Tuning Foundation Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Free: Faster and Better Data-Free Meta-Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Decentralized Directed Collaboration for Personalized Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Revisiting Knowledge Distillation for Autoregressive Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Neural Network Approximation for Pessimistic Offline Reinforcement Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
OMG: Towards Effective Graph Classification Against Label Noise.
IEEE Trans. Knowl. Data Eng., December, 2023

Distributionally Robust Memory Evolution With Generalized Divergence for Continual Learning.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Efficient Federated Learning Via Local Adaptive Amended Optimizer With Linear Speedup.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Hierarchical Detailed Intermediate Supervision for Image-to-Image Translation.
IEICE Trans. Inf. Syst., December, 2023

Prescribed Safety Performance Imitation Learning From a Single Expert Dataset.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance.
Int. J. Comput. Vis., October, 2023

Task-Adaptive Feature Disentanglement and Hallucination for Few-Shot Classification.
IEEE Trans. Circuits Syst. Video Technol., August, 2023

Differentiable Neural Architecture Search for Extremely Lightweight Image Super-Resolution.
IEEE Trans. Circuits Syst. Video Technol., June, 2023

Curriculum-Based Asymmetric Multi-Task Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Reducing bi-level feature redundancy for unsupervised domain adaptation.
Pattern Recognit., May, 2023

Efficient-Adam: Communication-Efficient Distributed Adam.
IEEE Trans. Signal Process., 2023

Dynamic Contrastive Distillation for Image-Text Retrieval.
IEEE Trans. Multim., 2023

Fusion of Global and Local Knowledge for Personalized Federated Learning.
Trans. Mach. Learn. Res., 2023

FedDAG: Federated DAG Structure Learning.
Trans. Mach. Learn. Res., 2023

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion.
CoRR, 2023

Task-Distributionally Robust Data-Free Meta-Learning.
CoRR, 2023

Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz.
CoRR, 2023

Learn From Model Beyond Fine-Tuning: A Survey.
CoRR, 2023

Asymmetrically Decentralized Federated Learning.
CoRR, 2023

Which mode is better for federated learning? Centralized or Decentralized.
CoRR, 2023

Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models.
CoRR, 2023

Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation.
CoRR, 2023

Deep Model Fusion: A Survey.
CoRR, 2023

FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data.
CoRR, 2023

MerA: Merging Pretrained Adapters For Few-Shot Learning.
CoRR, 2023

Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints.
CoRR, 2023

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent.
CoRR, 2023

DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning.
CoRR, 2023

Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy.
CoRR, 2023

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer.
CoRR, 2023

Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning.
CoRR, 2023

Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training.
CoRR, 2023

Prompt-Tuning Decision Transformer with Preference Ranking.
CoRR, 2023

Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy.
CoRR, 2023

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review.
CoRR, 2023

Graph Decision Transformer.
CoRR, 2023

SGDA: Towards 3D Universal Pulmonary Nodule Detection via Slice Grouped Domain Attention.
CoRR, 2023

OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System.
CoRR, 2023

Subspace based Federated Unlearning.
CoRR, 2023

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE.
CoRR, 2023

Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach.
CoRR, 2023

SaFormer: A Conditional Sequence Modeling Approach to Offline Safe Reinforcement Learning.
CoRR, 2023

Enhancing Adversarial Training via Reweighting Optimization Trajectory.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

An Efficient Dataset Condensation Plugin and Its Application to Continual Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Defending against Data-Free Model Extraction by Distributionally Robust Defensive Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Understanding How Consistency Works in Federated Learning via Stage-wise Relaxed Initialization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Stable Backdoor Purification through Feature Shift Tuning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Federated Learning with Manifold Regularization and Normalized Update Reaggregation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Dynamic Sparsity Is Channel-Level Sparsity Learner.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LGViT: Dynamic Early Exiting for Accelerating Vision Transformer.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Off-policy Imitation Learning from Visual Inputs.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification.
Proceedings of the International Conference on Machine Learning, 2023

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape.
Proceedings of the International Conference on Machine Learning, 2023

Improving the Model Consistency of Decentralized Federated Learning.
Proceedings of the International Conference on Machine Learning, 2023

Are Large Kernels Better Teachers than Transformers for ConvNets?
Proceedings of the International Conference on Machine Learning, 2023

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning.
Proceedings of the International Conference on Machine Learning, 2023

Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Harnessing Out-Of-Distribution Examples via Augmenting Content and Style.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Enhancing Fine-Tuning based Backdoor Defense with Sharpness-Aware Minimization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Global Balanced Experts for Federated Long-Tailed Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Data Augmented Flatness-aware Gradient Projection for Continual Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Zero-shot Sharpness-Aware Quantization for Pre-trained Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Towards Making the Most of ChatGPT for Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

MetaMix: Towards Corruption-Robust Continual Learning with Temporally Self-Adaptive Data Transformation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Make Landscape Flatter in Differentially Private Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Robust Generalization Against Photon-Limited Corruptions via Worst-Case Sharpness Minimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning.
Proceedings of the Conference on Lifelong Learning Agents, 2023

Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Evaluating Model-Free Reinforcement Learning toward Safety-Critical Tasks.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

FedABC: Targeting Fair Competition in Personalized Federated Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Offline Quantum Reinforcement Learning in a Conservative Manner.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
AlphaGAN: Fully Differentiable Architecture Search for Generative Adversarial Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Informative pairs mining based adaptive metric learning for adversarial domain adaptation.
Neural Networks, 2022

Towards harnessing feature embedding for robust learning with noisy labels.
Mach. Learn., 2022

Stochastic Client Selection for Federated Learning With Volatile Clients.
IEEE Internet Things J., 2022

On Transforming Reinforcement Learning by Transformer: The Development Trajectory.
CoRR, 2022

Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE.
CoRR, 2022

Strength-Adaptive Adversarial Training.
CoRR, 2022

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving.
CoRR, 2022

Robust Weight Perturbation for Adversarial Training.
CoRR, 2022

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation.
CoRR, 2022

Robust Unlearnable Examples: Protecting Data Against Adversarial Learning.
CoRR, 2022

Achieving Personalized Federated Learning with Sparse Local Models.
CoRR, 2022

Meta-learning without data via Wasserstein distributionally-robust model fusion.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Enhancing Top-N Item Recommendations by Peer Collaboration.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DEAL: An Unsupervised Domain Adaptive Framework for Graph-level Classification.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Penalized Proximal Policy Optimization for Safe Reinforcement Learning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Robust Weight Perturbation for Adversarial Training.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Understanding Robust Overfitting of Adversarial Training and Beyond.
Proceedings of the International Conference on Machine Learning, 2022

Improving Task-free Continual Learning by Distributionally Robust Memory Evolution.
Proceedings of the International Conference on Machine Learning, 2022

Deep Neural Network Fusion via Graph Matching with Applications to Model Ensemble and Federated Learning.
Proceedings of the International Conference on Machine Learning, 2022

DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training.
Proceedings of the International Conference on Machine Learning, 2022

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions.
Proceedings of the Computer Vision - ECCV 2022, 2022

Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning to Learn and Remember Super Long Multi-Domain Task Sequence.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Quantized Adam with Error Feedback.
ACM Trans. Intell. Syst. Technol., 2021

UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing.
IEEE Trans. Image Process., 2021

Knowledge Distillation With Multi-Objective Divergence Learning.
IEEE Signal Process. Lett., 2021

DGL-GAN: Discriminator Guided Learning for GAN Compression.
CoRR, 2021

Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer.
CoRR, 2021

Federated Causal Discovery.
CoRR, 2021

End-to-End Adaptive Monte Carlo Denoising and Super-Resolution.
CoRR, 2021

Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Minimax Problems.
CoRR, 2021

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

DAG-GAN: Causal Structure Learning with Generative Adversarial Nets.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020
MAP Inference Via ℓ₂-Sphere Linear Program Reformulation.
Int. J. Comput. Vis., 2020

Adaptive Compact Attention For Few-shot Video-to-video Translation.
CoRR, 2020

Task-agnostic Temporally Consistent Facial Video Editing.
CoRR, 2020

Generalized Embedding Machines for Recommender Systems.
CoRR, 2020

A Block Decomposition Algorithm for Sparse Optimization.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
MAP Inference via L2-Sphere Linear Program Reformulation.
CoRR, 2019

Discrete Trust-aware Matrix Factorization for Fast Recommendation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
A Generalized Matrix Splitting Algorithm.
CoRR, 2018

2017
Adaptive Proximal Average Approximation for Composite Convex Minimization.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

