2025
-
TrainVerify: Equivalence-Based Verification for Distributed LLM Training
Yunchi Lu, Youshan Miao, Cheng Tan, Peng Huang, Yi Zhu, Xian Zhang, Fan Yang
SOSP 2025 citation
slides software arXiv
-
Optimistic Recovery for High-Availability Software via Partial Process State Preservation
Yuzhuo Jing, Yuqi Mai, Angting Cai, Yi Chen, Wanning He, Xiaoyang Qian, Peter M. Chen, Peng Huang
SOSP 2025 citation
slides software
-
Mitigating Application Resource Overload with Targeted Task Cancellation
Yigong Hu, Zeyin Zhang, Yicheng Liu, Yile Gu, Shuangyu Lei, Baris Kasikci, Peng Huang
SOSP 2025 citation
slides software
-
Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks
Yuxuan Jiang, Ziming Zhou, Boyu Xu, Beijie Liu, Runhui Xu, Peng Huang
OSDI 2025 citation
slides software arXiv
-
Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems
Chang Lou, Dimas Shidqi Parikesit, Yujin Huang, Zhewen Yang, Senapati Diwangkara, Yuzhuo Jing, Achmad Imam Kistijantoro, Ding Yuan, Suman Nath, Peng Huang
OSDI 2025 citation software
-
One-Size-Fits-None: Understanding and Enhancing Slow-Fault Tolerance in Modern Distributed Systems
Ruiming Lu, Yunchi Lu, Yuxuan Jiang, Guangtao Xue, Peng Huang
NSDI 2025 citation
slides software
2024
2023
-
Simplifying Cloud Management with Cloudless Computing
Yiming Qiu, Patrick Tser Jern Kon, Jiarong Xing, Yibo Huang, Hongyi Liu, Xinyu Wang, Peng Huang, Mosharaf Chowdhury, Ang Chen
HotNets 2023 citation
-
Pushing Performance Isolation Boundaries into Application with pBox
Yigong Hu, Gongqi Huang, Peng Huang
SOSP 2023 citation
slides software
-
Effective Performance Issue Diagnosis with Value-Assisted Cost Profiling
Lingmei Weng, Yigong Hu, Peng Huang, Jason Nieh, Junfeng Yang
EuroSys 2023 citation
slides software
2022
-
Operating System Support for Safe and Efficient Auxiliary Execution
Yuzhuo Jing, Peng Huang
OSDI 2022 citation
slides software
-
Demystifying and Checking Silent Semantic Violations in Large Distributed Systems
Chang Lou, Yuzhuo Jing, Peng Huang
OSDI 2022 citation
slides software
-
RESIN: A Holistic Service for Dealing with Memory Leaks in Production Cloud Infrastructure
Chang Lou, Cong Chen, Peng Huang, Yingnong Dang, Si Qin, Xinsheng Yang, Xukun Li, Qingwei Lin, Murali Chintalapati
OSDI 2022 citation
slides
2021
2020
-
Automated Reasoning and Detection of Specious Configuration in Large Systems with Symbolic Execution
Yigong Hu, Gongqi Huang, Peng Huang
OSDI 2020 citation
slides software tech report
-
Predictive and Adaptive Failure Mitigation to Avert Production Cloud VM Interruptions
Sebastien Levy, Randolph Yao, Youjiang Wu, Yingnong Dang, Peng Huang, Zheng Mu, Pu Zhao, Tarun Ramani, Naga Govindaraju, Xukun Li, Qingwei Lin, Gil Lapid Shafriri, Murali Chintalapati
OSDI 2020 citation tech report
-
Understanding, Detecting and Localizing Partial Failures in Large System Software [Best Paper Award]
Chang Lou, Peng Huang, Scott Smith
NSDI 2020 citation
slides
-
Gandalf: An Intelligent, End-To-End Analytics Service for Safe Deployment in Large-Scale Cloud Infrastructure
Ze Li, Qian Cheng, Ken Hsieh, Yingnong Dang, Peng Huang, Pankaj Singh, Xinsheng Yang, Qingwei Lin, Youjiang Wu, Sebastien Levy, Murali Chintalapati
NSDI 2020 citation
-
Scaling Performance Issue Detection and Diagnosis in Cloud Infrastructures
Yigong Hu, Ze Li, Peng Huang, Suhas Pinnamaneni, Francis David, Yingnong Dang, Murali Chintalapati
AAAI-20 Workshop on Cloud Intelligence citation
2019
-
Comprehensive and Efficient Runtime Checking in System Software through Watchdogs
Chang Lou, Peng Huang, Scott Smith
HotOS 2019 citation
slides
-
URSA: Hybrid Block Storage for Cloud-Scale Virtual Disks
Huiba Li, Yiming Zhang, Dongsheng Li, Zhiming Zhang, Shengyun Liu, Peng Huang, Zheng Qin, Kai Chen, Yongqiang Xiong
EuroSys 2019 citation
-
A Case for Lease-Based, Utilitarian Resource Management on Mobile Devices [Best Paper Award]
Yigong Hu, Suyi Liu, Peng Huang
ASPLOS 2019 citation
slides software
-
AIOps: Real-World Challenges and Research Innovations
Yingnong Dang, Qingwei Lin, Peng Huang
ICSE 2019 Technical Briefings citation
2018
-
Capturing and Enhancing In Situ System Observability for Failure Detection
Peng Huang, Chuanxiong Guo, Jacob R. Lorch, Lidong Zhou, Yingnong Dang
OSDI 2018 citation
slides software
-
End-to-End Automated Exploit Generation for Validating the Security of Processor Designs [Best Paper Candidate]
Rui Zhang, Calvin Deutschbein, Peng Huang, Cynthia Sturton
MICRO 2018 citation
-
TerseCades: Efficient Data Compression in Stream Processing
Gennady Pekhimenko, Chuanxiong Guo, Myeongjae Jeon, Peng Huang, Lidong Zhou
USENIX ATC 2018 citation
2017
2016
-
Early Detection of Configuration Errors to Reduce Failure Damage [Best Paper Award]
Tianyin Xu, Xinxin Jin, Peng Huang, Yuanyuan Zhou, Shan Lu, Long Jin, Shankar Pasupathy
OSDI 2016 citation
-
DefDroid: Towards a More Defensive Mobile OS Against Disruptive App Behavior
Peng Huang, Tianyin Xu, Xinxin Jin, Yuanyuan Zhou
MobiSys 2016 citation
slides video website
-
Saving Mobile App Developers from Network Disruptions
Xinxin Jin, Peng Huang, Tianyin Xu, Yuanyuan Zhou
EuroSys 2016 citation
2015
2014
2013
-
Do Not Blame Users for Misconfigurations
Tianyin Xu, Jiaqi Zhang, Peng Huang, Jing Zheng, Tianwei Sheng, Ding Yuan, Yuanyuan Zhou, Shankar Pasupathy
SOSP 2013 citation
-
eDoctor: Automatically Diagnosing Abnormal Battery Drain Issues on Smartphones
Xiao Ma, Peng Huang, Xinxin Jin, Pei Wang, Soyeon Park, Yuanyuan Zhou, Lawrence K. Saul, Geoffrey M. Voelker
NSDI 2013 citation
-
Understanding Latent Interactions in Online Social Networks
Jing Jiang, Christo Wilson, Xiao Wang, Wenpeng Sha, Peng Huang, Yafei Dai, Ben Y. Zhao
TWEB 7(4), Oct. 2013 citation
2012
2010
-
Understanding Latent Interactions in Online Social Networks
Jing Jiang, Christo Wilson, Xiao Wang, Peng Huang, Wenpeng Sha, Yafei Dai, Ben Y. Zhao
IMC 2010 citation
-
A Multiple User Sharing Behaviors Based Approach for Fake File Detection in P2P Environments
Jing Jiang, Yongjun Li, Qinyuan Feng, Peng Huang, Yafei Dai
SCIS 53(11), Nov. 2010 citation