Yingnong Dang

Orcid: 0009-0006-0184-9681

According to our database, Yingnong Dang authored at least 54 papers between 2011 and 2024.

Bibliography

2024
Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction.
CoRR, 2024

UniLog: Automatic Logging via LLM and In-Context Learning.
Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, 2024

2023
Xpert: Empowering Incident Management with Query Recommendations via Large Language Models.
CoRR, 2023

EDITS: An Easy-to-difficult Training Strategy for Cloud Failure Prediction.
Companion Proceedings of the ACM Web Conference 2023, 2023

Assess and Summarize: Improve Outage Understanding with Large Language Models.
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023

Aegis: Attribution of Control Plane Change Impact across Layers and Components for Cloud Systems.
Proceedings of the 45th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2023

CONAN: Diagnosing Batch Failures for Cloud Systems.
Proceedings of the 45th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2023

Towards Lightweight, Model-Agnostic and Diversity-Aware Active Anomaly Detection.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
An Intelligent Framework for Timely, Accurate, and Comprehensive Cloud Incident Detection.
ACM SIGOPS Oper. Syst. Rev., 2022

UniParser: A Unified Log Parser for Heterogeneous Log Data.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

SPINE: a scalable log parser with feedback guidance.
Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2022

An empirical investigation of missing data handling in cloud node failure prediction.
Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2022

An empirical study of log analysis at Microsoft.
Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2022

RESIN: A Holistic Service for Dealing with Memory Leaks in Production Cloud Infrastructure.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

NENYA: Cascade Reinforcement Learning for Cost-Aware Failure Mitigation at Microsoft 365.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

2021
NTAM: Neighborhood-Temporal Attention Model for Disk Failure Prediction in Cloud Platforms.
Proceedings of the WWW '21: The Web Conference 2021, 2021

Onion: identifying incident-indicating logs for cloud systems.
Proceedings of the ESEC/FSE '21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021

HALO: Hierarchy-aware Fault Localization for Cloud Systems.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

How Long Will it Take to Mitigate this Incident for Online Service Systems?
Proceedings of the 32nd IEEE International Symposium on Software Reliability Engineering, 2021

CARE: Infusing Causal Aware Thinking to Root Cause Analysis in Cloud System.
HAOC '21: Proceedings of the 1st Workshop on High Availability and Observability of Cloud Systems, 2021

2020
Breaking hypothesis testing for failure rates.
CoRR, 2020

How to mitigate the incident? an effective troubleshooting guide recommendation technique for online service systems.
Proceedings of the ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020

Efficient incident identification from multi-dimensional issue reports via meta-heuristic search.
Proceedings of the ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020

Towards intelligent incident management: why we need it and how we make it.
Proceedings of the ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020

Predictive and Adaptive Failure Mitigation to Avert Production Cloud VM Interruptions.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Gandalf: An Intelligent, End-To-End Analytics Service for Safe Deployment in Large-Scale Cloud Infrastructure.
Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation, 2020

How Incidental are the Incidents? Characterizing and Prioritizing Incidents for Large-Scale Online Service Systems.
Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020

2019
Annual Interruption Rate as a KPI, its measurement and comparison.
CoRR, 2019

Outage Prediction and Diagnosis for Cloud Service Systems.
Proceedings of the World Wide Web Conference, 2019

Cross-dataset Time Series Anomaly Detection for Cloud Systems.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Robust log-based anomaly detection on unstable log data.
Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019

Continuous Incident Triage for Large-Scale Online Service Systems.
Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019

AIOps: real-world challenges and research innovations.
Proceedings of the 41st International Conference on Software Engineering: Companion Proceedings, 2019

An empirical investigation of incident triage for online service systems.
Proceedings of the 41st International Conference on Software Engineering: Software Engineering in Practice, 2019

Neural Feature Search: A Neural Architecture for Automated Feature Engineering.
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019

2018
Improving Service Availability of Cloud Systems by Predicting Disk Error.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Predicting Node failure in cloud service systems.
Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018

Capturing and Enhancing In Situ System Observability for Failure Detection.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Deepview: Virtual Disk Failure Diagnosis and Pattern Detection for Azure.
Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation, 2018

2017
Transferring Code-Clone Detection and Analysis to Practice.
Proceedings of the 39th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice Track, 2017

Gray Failure: The Achilles' Heel of Cloud-Scale Systems.
Proceedings of the 16th Workshop on Hot Topics in Operating Systems, 2017

2015
YADING: Fast Clustering of Large-Scale Time Series Data.
Proc. VLDB Endow., 2015

Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis.
Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015

2014
Predicting Consistency-Maintenance Requirement of Code Clones at Copy-and-Paste Time.
IEEE Trans. Software Eng., 2014

2013
Software Analytics in Practice.
IEEE Softw., 2013

Mining succinct and high-coverage API usage patterns from source code.
Proceedings of the 10th Working Conference on Mining Software Repositories, 2013

2012
How do software engineers understand code changes?: an exploratory study in industry.
Proceedings of the 20th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-20), 2012

Can I clone this piece of code here?
Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, 2012

Performance debugging in the large via mining millions of stack traces.
Proceedings of the 34th International Conference on Software Engineering, 2012

ReBucket: A method for clustering duplicate crash reports based on call stack similarity.
Proceedings of the 34th International Conference on Software Engineering, 2012

Teaching and Training for Software Analytics.
Proceedings of the 25th IEEE Conference on Software Engineering Education and Training, 2012

XIAO: tuning code clones at hands of engineers in practice.
Proceedings of the 28th Annual Computer Security Applications Conference, 2012

2011
Code clone detection experience at Microsoft.
Proceedings of the 5th ICSE International Workshop on Software Clones, 2011

