Yu Yang

Orcid: 0000-0002-6591-7704

Affiliations:
  • OpenAI, San Francisco, CA, USA
  • University of California, Los Angeles (UCLA), CA, USA (PhD 2024)


According to our database, Yu Yang authored at least 31 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
RedCodeAgent: Automatic Red-teaming Agent against Diverse Code Agents.
CoRR, October, 2025

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI.
CoRR, 2024

Memory-efficient Training of LLMs with Larger Mini-batches.
CoRR, 2024

AIR-Bench 2024: A Safety Benchmark Based on Risk Categories from Regulations and Policies.
CoRR, 2024

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies.
CoRR, 2024

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Few-shot Adaptation to Distribution Shifts By Mixing Source and Target Embeddings.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Towards Mitigating Spurious Correlations in the Wild: A Benchmark & a more Realistic Dataset.
CoRR, 2023

Eliminating Spurious Correlations from Pre-trained Models via Data Mixing.
CoRR, 2023

Network Transplanting for the Functionally Modular Architecture.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Robust Learning with Progressive Data Expansion Against Spurious Correlation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning.
Proceedings of the International Conference on Machine Learning, 2023

Towards Sustainable Learning: Coresets for Data-efficient Deep Learning.
Proceedings of the International Conference on Machine Learning, 2023

NeSSA: Near-Storage Data Selection for Accelerated Machine Learning Training.
Proceedings of the 15th ACM/USENIX Workshop on Hot Topics in Storage and File Systems, 2023

2022
Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attacks.
CoRR, 2022

Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attack.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Not All Poisons are Created Equal: Robust Training against Data Poisoning.
Proceedings of the International Conference on Machine Learning, 2022

2019
Visual graph mining for graph matching.
Comput. Vis. Image Underst., 2019

A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks.
CoRR, 2019

Unsupervised Learning of Neural Networks to Explain Neural Networks (extended abstract).
CoRR, 2019

Network Transplanting (extended abstract).
CoRR, 2019

Explaining AlphaGo: Interpreting Contextual Effects in Neural Networks.
CoRR, 2019

Interpreting CNNs via Decision Trees.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Unsupervised Learning of Neural Networks to Explain Neural Networks.
CoRR, 2018

Network Transplanting.
CoRR, 2018

Interpreting CNNs via Decision Trees.
CoRR, 2018

