Valerie Chen

ORCID: 0009-0007-2783-0265

According to our database, Valerie Chen authored at least 45 papers between 2019 and 2026.

Bibliography

2026
RECAP: An End-to-End Platform for Capturing, Replaying, and Analyzing AI-Assisted Programming Interactions.
CoRR, May, 2026

Comparing Developer and LLM Biases in Code Evaluation.
CoRR, March, 2026

RCTs & Human Uplift Studies: Methodological Challenges and Practical Solutions for Frontier AI Evaluation.
CoRR, March, 2026

A Rubric-Supervised Critic from Sparse Real-World Outcomes.
CoRR, March, 2026

How Well Does Agent Development Reflect Real-World Work?
CoRR, March, 2026

GameDevBench: Evaluating Agentic Capabilities Through Game Development.
CoRR, February, 2026

Beyond the Commit: Developer Perspectives on Productivity with AI Coding Assistants.
CoRR, February, 2026

SWE-Tester: Training Open-Source LLMs for Issue Reproduction in Real-World Repositories.
CoRR, January, 2026

Developer Interaction Patterns with Proactive AI: A Five-Day Field Study.
Proceedings of the 31st International Conference on Intelligent User Interfaces, 2026

Coding Agents with Multimodal Browsing are Generalist Problem Solvers.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

Third Workshop on Human-Centered Evaluation and Auditing of Language Models: AI Agents-in-the-Loop.
Proceedings of the Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems, 2026

Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows.
Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems, 2026

2025
EDIT-Bench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits.
CoRR, November, 2025

The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents.
CoRR, November, 2025

Completion ≠ Collaboration: Scaling Collaborative Effort with Agents.
CoRR, October, 2025

TOM-SWE: User Mental Modeling For Software Engineering Agents.
CoRR, October, 2025

How can we assess human-agent interactions? Case studies in software agent design.
CoRR, October, 2025

Why Do Decision Makers (Not) Use AI? A Cross-Domain Analysis of Factors Impacting AI Adoption.
CoRR, August, 2025

Beyond Memorization: Mapping the Originality-Quality Frontier of Language Models.
CoRR, April, 2025

The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers.
Trans. Mach. Learn. Res., 2025

CodingGenie: A Proactive LLM-Powered Programming Assistant.
Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, 2025

Copilot Arena: A Platform for Code LLM Evaluation in the Wild.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Need Help? Designing Proactive AI Assistants for Programming.
Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, 2025

When Benchmarks Talk: Re-Evaluating Code LLMs with Interactive Feedback.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Learning Personalized Decision Support Policies.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans.
Trans. Mach. Learn. Res., 2024

Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design.
Trans. Assoc. Comput. Linguistics, 2024

Modulating Language Model Experiences through Frictions.
CoRR, 2024

On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Perspectives on incorporating expert feedback into model updates.
Patterns, July, 2023

Assisting Human Decisions in Document Matching.
Trans. Mach. Learn. Res., 2023

Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations.
Proc. ACM Hum. Comput. Interact., 2023

Do LLMs exhibit human-like response biases? A case study in survey design.
CoRR, 2023

A Case Study on Designing Evaluations of ML Explanations with Simulated User Studies.
CoRR, 2023

FeedbackLogs: Recording and Incorporating Stakeholder Feedback into Machine Learning Pipelines.
Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, 2023

Are Model Explanations Useful in Practice? Rethinking How to Support Human-ML Interactions.
Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 2023

2022
Interpretable machine learning: moving from mythos to diagnostics.
Commun. ACM, 2022

Bayesian Persuasion for Algorithmic Recourse.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Use-Case-Grounded Simulations for Explanation Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Towards Connecting Use Cases and Methods in Interpretable Machine Learning.
CoRR, 2021

Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Task-Aware Novelty Detection for Visual-based Deep Learning in Autonomous Systems.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

2019
Secure Computation for Machine Learning With SPDZ.
CoRR, 2019

Video-Text Compliance: Activity Verification Based on Natural Language Instructions.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Novelty Detection via Network Saliency in Visual-Based Deep Learning.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2019

