Steven Basart

According to our database, Steven Basart authored at least 18 papers between 2019 and 2024.

Bibliography

2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
CoRR, 2024

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning.
CoRR, 2024

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal.
Proceedings of the Forty-first International Conference on Machine Learning, 2024


2023
Representation Engineering: A Top-Down Approach to AI Transparency.
CoRR, 2023

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark.
CoRR, 2023

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark.
Proceedings of the International Conference on Machine Learning, 2023

2022
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Scaling Out-of-Distribution Detection for Real-World Settings.
Proceedings of the International Conference on Machine Learning, 2022

2021
Towards Robustness of Neural Networks.
CoRR, 2021

Measuring Coding Challenge Competence With APPS.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Measuring Mathematical Problem Solving With the MATH Dataset.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Measuring Massive Multitask Language Understanding.
Proceedings of the 9th International Conference on Learning Representations, 2021

Aligning AI With Shared Human Values.
Proceedings of the 9th International Conference on Learning Representations, 2021

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Natural Adversarial Examples.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2019
A Benchmark for Anomaly Segmentation.
CoRR, 2019

DIODE: A Dense Indoor and Outdoor DEpth Dataset.
CoRR, 2019

