Chokchai Leangsuksun
According to our database, Chokchai Leangsuksun
Bibliography
2016
A failure index for HPC applications.
J. Parallel Distrib. Comput., 2016
2014
Reliability-aware performance model for optimal GPU-enabled cluster environment.
The Journal of Supercomputing, 2014
2013
Reliability model of a system of k nodes with simultaneous failures for high-performance computing applications.
IJHPCA, 2013
2012
A Reliability Model for Cloud Computing for High Performance Computing Applications.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012
Workshop on Resiliency in High Performance Computing (Resilience) in Clusters, Clouds, and Grids.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012
An Economic Model for Maximizing Profit of a Cloud Service Provider.
Proceedings of the Seventh International Conference on Availability, 2012
2011
Baler: deterministic, lossless log message clustering tool.
Computer Science - R&D, 2011
High Availability on Cloud with HA-OSCAR.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
Workshop on Resiliency in High Performance Computing (Resilience) in Clusters, Clouds, and Grids.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
Framework for Enabling System Understanding.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
Two-level checkpoint/restart modeling for GPGPU.
Proceedings of the 9th IEEE/ACS International Conference on Computer Systems and Applications, 2011
2010
Reliability of a System of k Nodes for High Performance Computing Applications.
IEEE Trans. Reliability, 2010
Incremental Checkpoint Schemes for Weibull Failure Distribution.
Int. J. Found. Comput. Sci., 2010
Proficiency Metrics for Failure Prediction in High Performance Computing.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2010
Benefits of Software Rejuvenation on HPC Systems.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2010
2009
A tunable holistic resiliency approach for high-performance computing systems.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009
HPC failure prediction proficiency metrics.
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009
VCCP: A transparent, coordinated checkpointing system for virtualization-based cluster computing.
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009
Blue Gene/L Log Analysis and Time to Interrupt Estimation.
Proceedings of the Fourth International Conference on Availability, 2009
2008
An optimal checkpoint/restart model for a large scale high performance computing system.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Reliability-Aware Approach: An Incremental Checkpoint/Restart Model in HPC Environments.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008
Symmetric Active/Active High Availability for High-Performance Computing System Services: Accomplishments and Limitations.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008
A Framework for Proactive Fault Tolerance.
Proceedings of the Third International Conference on Availability, 2008
Symmetric Active/Active Replication for Dependent Services.
Proceedings of the Third International Conference on Availability, 2008
2007
Evaluation of fault-tolerant policies using simulation.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007
A reliability-aware approach for an optimal checkpoint/restart model in HPC environments.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007
Reliability-aware resource allocation in HPC systems.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007
Transparent Symmetric Active/Active Replication for Service-Level High Availability.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007
On Programming Models for Service-Level High Availability.
Proceedings of the Second International Conference on Availability, 2007
2006
MOLAR: adaptive runtime support for high-end computing operating and runtime systems.
Operating Systems Review, 2006
Symmetric Active/Active High Availability for High-Performance Computing System Services.
JCP, 2006
Availability Modeling and Evaluation on High Performance Cluster Computing Systems.
Journal of Research and Practice in Information Technology, 2006
Policy-Based Access Control Framework for Grid Computing.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006
Work in Progress: RASS Framework for a Cluster-Aware SELinux.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006
A Novel Computational Framework for Fast Distributed Computing and Knowledge Integration for Microarray Gene Expression Data Analysis.
Proceedings of the 20th International Conference on Advanced Information Networking and Applications (AINA 2006), 2006
Availability Modeling and Analysis on High Performance Cluster Computing Systems.
Proceedings of the First International Conference on Availability, 2006
Active/Active Replication for Highly Available HPC System Services.
Proceedings of the First International Conference on Availability, 2006
2005
Achieving high availability and performance computing with an HA-OSCAR cluster.
Future Generation Comp. Syst., 2005
Performance of an Operating High Energy Physics Data Grid: D0SAR-Grid
CoRR, 2005
UML-based Beowulf Cluster Availability Modeling.
Proceedings of the International Conference on Software Engineering Research and Practice, 2005
OOMSE-An Object Oriented Markov Chain Specification and Evaluation Framework.
Proceedings of the 17th International Conference on Software Engineering and Knowledge Engineering (SEKE'2005), 2005
Grid-Aware HA-OSCAR.
Proceedings of the 19th Annual International Symposium on High Performance Computing Systems and Applications (HPCS 2005), 2005
Reliability-aware resource management for computational grid/cluster environments.
Proceedings of the 6th IEEE/ACM International Conference on Grid Computing (GRID 2005), 2005
Reliability-aware Checkpoint/Restart Scheme: A Performability Trade-off.
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005
Job-Site Level Fault Tolerance for Cluster and Grid environments.
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005
Feasibility study and early experimental results towards cluster survivability.
Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005
A light-weight solution for large sparse Markov processes.
Proceedings of the 43rd Annual Southeast Regional Conference, 2005
A framework for cluster availability specification and evaluation.
Proceedings of the 43rd Annual Southeast Regional Conference, 2005
2004
Highly Reliable Linux HPC Clusters: Self-Awareness Approach.
Proceedings of the Parallel and Distributed Processing and Applications, 2004
Building highly available HPC clusters with HA-OSCAR.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004
2003
Reliability Modeling Using UML.
Proceedings of the International Conference on Software Engineering Research and Practice, 2003
Dependability Prediction of High Availability OSCAR Cluster Server.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2003
Availability Prediction and Modeling of High Availability OSCAR Cluster.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003
2000
The Enhanced Service Manager: A service management system for next-generation networks.
Bell Labs Technical Journal, 2000
1994
ASC: An Associative-Computing Paradigm.
IEEE Computer, 1994
A Task Graph Centroid.
Proceedings of the Third International Symposium on High Performance Distributed Computing, 1994