Eric Wong

ORCID: 0000-0002-8568-6659

Affiliations:
  • University of Pennsylvania, Department of Computer and Information Science, Philadelphia, PA, USA
  • Massachusetts Institute of Technology (MIT), CSAIL, Cambridge, MA, USA (former)
  • Carnegie Mellon University, Machine Learning Department, Pittsburgh, PA, USA (former, PhD 2020)


According to our database, Eric Wong authored at least 56 papers between 2015 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Probabilistic Soundness Guarantees in LLM Reasoning Chains.
CoRR, July, 2025

Instruction Following by Boosting Attention of Large Language Models.
CoRR, June, 2025

Benchmarking Misuse Mitigation Against Covert Adversaries.
CoRR, June, 2025

The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models.
CoRR, May, 2025

Probabilistic Stability Guarantees for Feature Attributions.
CoRR, April, 2025

CTSketch: Compositional Tensor Sketching for Scalable Neurosymbolic Learning.
CoRR, March, 2025

NSF-SciFy: Mining the NSF Awards Database for Scientific Claims.
CoRR, March, 2025

Adaptively evaluating models with task elicitation.
CoRR, March, 2025

Where's the Bug? Attention Probing for Scalable Fault Localization.
CoRR, February, 2025

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks.
Trans. Mach. Learn. Res., 2025

Jailbreaking Black Box Large Language Models in Twenty Queries.
Proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning, 2025

Avoiding Copyright Infringement via Large Language Model Unlearning.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Towards Style Alignment in Cross-Cultural Translation.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
TorchQL: A Programming Framework for Integrity Constraints in Machine Learning.
Proc. ACM Program. Lang., 2024

Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning.
CoRR, 2024

The FIX Benchmark: Extracting Features Interpretable to eXperts.
CoRR, 2024

Avoiding Copyright Infringement via Machine Unlearning.
CoRR, 2024

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing.
CoRR, 2024

Data-Efficient Learning with Neural Programs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Towards Compositionality in Concept Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Evaluating Groups of Features via Consistency, Contiguity, and Stability.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Initialization Matters for Adversarial Transfer Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Sum-of-Parts Models: Faithful Attributions for Groups of Features.
CoRR, 2023

MDB: Interactively Querying Datasets and Models.
CoRR, 2023

Rectifying Group Irregularities in Explanations for Distribution Shift.
CoRR, 2023

Do Machine Learning Models Learn Common Sense?
CoRR, 2023

In-context Example Selection with Influences.
CoRR, 2023

Adversarial Prompting for Black Box Foundation Models.
CoRR, 2023

Adversarial robustness in discontinuous spaces via alternating sampling & descent.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Stability Guarantees for Feature Attributions with Multiplicative Smoothing.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Faithful Chain-of-Thought Reasoning.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Proceedings of the International Conference on Machine Learning, 2023

TopEx: Topic-based Explanations for Model Comparison.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Comparing Styles across Languages.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

A Data-Based Perspective on Transfer Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
When does Bias Transfer in Transfer Learning?
CoRR, 2022

Missingness Bias in Model Debugging.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Certified Patch Robustness via Smoothed Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Provable, Structured, and Efficient Methods for Robustness of Deep Networks to Adversarial Examples.
PhD thesis, 2021

DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting.
CoRR, 2021

Leveraging Sparse Linear Layers for Debuggable Deep Networks.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning perturbation sets for robust machine learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Neural Network Virtual Sensors for Fuel Injection Quantities with Provable Performance Specifications.
Proceedings of the IEEE Intelligent Vehicles Symposium, 2020

Overfitting in adversarially robust deep learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Adversarial Robustness Against the Union of Multiple Perturbation Models.
Proceedings of the 37th International Conference on Machine Learning, 2020

Fast is better than free: Revisiting adversarial training.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Wasserstein Adversarial Examples via Projected Sinkhorn Iterations.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Scaling provable adversarial defenses.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
A Semismooth Newton Method for Fast, Generic Convex Programming.
Proceedings of the 34th International Conference on Machine Learning, 2017

2015
An SVD and Derivative Kernel Approach to Learning from Geometric Data.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

