Haiyan Zhao
Orcid: 0009-0006-5358-6895
Affiliations:
- New Jersey Institute of Technology, Newark, NJ, USA
According to our database, Haiyan Zhao authored at least 15 papers between 2024 and 2025.
  
  
Bibliography
2025
- CoRR, September, 2025
- Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering. CoRR, May, 2025
- Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders. CoRR, May, 2025
- A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models. CoRR, March, 2025
- SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models. CoRR, February, 2025
- Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability. CoRR, January, 2025
- Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution. Proceedings of the Thirteenth International Conference on Learning Representations, 2025
- Exploring Concept Depth: How Large Language Models Acquire Knowledge and Concept at Different Layers? Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024
- ACM Trans. Intell. Syst. Technol., April, 2024
- CoRR, 2024
- Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? CoRR, 2024
- CoRR, 2024
- Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability. CoRR, 2024
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
- Proceedings of the Findings of the Association for Computational Linguistics, 2024