Seonglae Cho
Orcid: 0009-0008-2809-5861
According to our database, Seonglae Cho authored at least 10 papers between 2023 and 2026.
Bibliography
2026
Control Reinforcement Learning: Interpretable Token-Level Steering of LLMs via Sparse Autoencoder Features.
CoRR, February, 2026
The Confidence Manifold: Geometric Structure of Correctness Representations in Language Models.
CoRR, February, 2026
AgentGraph: Trace-to-Graph Platform for Interactive Analysis and Robustness Testing in Agentic AI Systems.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection.
CoRR, August, 2025
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies.
CoRR, June, 2025
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Datasets Dependency.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 2025
LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 2025
2024
RTSUM: Relation Triple-based Interpretable Summarization with Multi-level Salience Visualization.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, 2024
2023
RTSUM: Relation Triple-based Interpretable Summarization with Multi-level Salience Visualization.
CoRR, 2023