Xipeng Shen

Yan Solihin

Proceedings of the 2021 International Symposium on Secure and Private Execution Environment Design (SEED), 2021

Brief Industry Paper: Towards Real-Time 3D Object Detection for Autonomous Vehicles with Pruning Search.

Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021

Exploring deep reuse in winograd CNN inference.

Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

Understanding and bridging the gaps in current GNN performance optimizations.

Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

HPCFAIR: Enabling FAIR AI for HPC Applications.

Tristan Vanderbruggen

Barbara M. Chapman

Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021

HPC Ontology: Towards a Unified Ontology for Managing Training Datasets and AI Models for High-Performance Computing.

Chunhua Liao

Pei-Hung Lin

Gaurav Verma

Tristan Vanderbruggen

Murali Emani

Zifan Nan

Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021

PCCS: Processor-Centric Contention-aware Slowdown Model for Heterogeneous System-on-Chips.

Yuanchao Xu

Mehmet Esat Belviranli

Jeffrey S. Vetter

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Supporting Legacy Libraries on Non-Volatile Memory: A User-Transparent Approach.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Revisit the Scalability of Deep Auto-Regressive Models for Graph Generation.

Seung-Hwan Lim

Proceedings of the International Joint Conference on Neural Networks, 2021

Simple Augmentation Goes a Long Way: ADRL for DNN Quantization.

Proceedings of the 9th International Conference on Learning Representations, 2021

Recurrent Neural Networks Meet Context-Free Grammar: Two Birds with One Stone.

Proceedings of the IEEE International Conference on Data Mining, 2021

G-TADOC: Enabling Efficient GPU-Based Text Analytics without Decompression.

Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Hardware-Based Address-Centric Acceleration of Key-Value Store.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Best-Effort Lazy Evaluation for Python Software Built on APIs.

Guoqiang Zhang

Proceedings of the 35th European Conference on Object-Oriented Programming, 2021

Deep NLP-based co-evolvement for synthesizing code analysis from natural language.

Proceedings of the CC '21: 30th ACM SIGPLAN International Conference on Compiler Construction, 2021

RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Enabling Runtime SpMV Format Selection through an Overhead Conscious Method.

IEEE Trans. Parallel Distributed Syst., 2020

DIAC: An Inter-app Conflicts Detector for Open IoT Systems.

Xinyi Li

Lei Zhang

ACM Trans. Embed. Comput. Syst., 2020

Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device.

CoRR, 2020

Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices.

CoRR, 2020

CoCoPIE: Making Mobile AI Sweet As PIE - Compression-Compilation Co-Design Goes a Long Way.

CoRR, 2020

Special Issue: Graph Computing.

Concurr. Comput. Pract. Exp., 2020

HISyn: human learning-inspired natural language programming.

Zifan Nan

Proceedings of the ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020

FLEET: Flexible Efficient Ensemble Training for Heterogeneous Deep Neural Networks.

Laxmikant Kishor Mokadam

Seung-Hwan Lim

Robert M. Patton

Proceedings of the Third Conference on Machine Learning and Systems, 2020

Hardware-Based Domain Virtualization for Intra-Process Isolation of Persistent Memory Objects.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

HARP: holistic analysis for refactoring Python-based analytics programs.

Proceedings of the ICSE '20: 42nd International Conference on Software Engineering, Seoul, South Korea, 27 June, 2020

MKPipe: a compiler framework for optimizing multi-kernel workloads in OpenCL for FPGA.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

Enabling Efficient Random Access to Hierarchically-Compressed Data.

Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

MERR: Improving Security of Persistent Memory Objects via Efficient Memory Exposure Reduction and Randomization.

Yuanchao Xu

Yan Solihin

Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU.

Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019

Wootz: a compiler-based framework for fast CNN pruning via composability.

Seung-Hwan Lim

Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2019

In-Place Zero-Space Memory Protection for CNN.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

IA-graph based inter-app conflicts detection in open IoT systems.

Xinyi Li

Lei Zhang

Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

Deep reuse: streamline CNN inference on the fly via coarse-grained computation reuse.

Proceedings of the ACM International Conference on Supercomputing, 2019

Adaptive Deep Reuse: Accelerating CNN Training on the Fly.

Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Streamline Density Peak Clustering for Practical Adoptions.

Min Chi

Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

HiWayLib: A Software Framework for Enabling High Performance Communications for Heterogeneous Pipeline Computations.

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018

Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights.

Proc. VLDB Endow., 2018

Editorial for the Special Issue on In-Memory Computing.

Róbert Lovas

Xiaofei Liao

J. Parallel Distributed Comput., 2018

Resolving the GPU responsiveness dilemma through program transformations.

Frontiers Comput. Sci., 2018

Hyperparameter Optimization for Effort Estimation.

CoRR, 2018

Why Software Effort Estimation Needs SBSE.

CoRR, 2018

Exploring flexible communications for streamlining DNN ensemble training pipelines.

Proceedings of the International Conference for High Performance Computing, 2018

Bridging the gap between deep learning and sparse matrix format selection.

Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Footprint modeling of cache associativity and granularity.

Proceedings of the International Symposium on Memory Systems, 2018

Overhead-Conscious Format Selection for SpMV-Based Applications.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Taming the "Monster": Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data.

Proceedings of the 32nd International Conference on Supercomputing, 2018

LEEM: Lean Elastic EM for Gaussian Mixture Model via Bounds-Based Filtering.

Proceedings of the IEEE International Conference on Data Mining, 2018

Reuse-Centric K-Means Configuration.

Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

FALCON: A Fast Drop-In Replacement of Citation KNN for Multiple Instance Learning.

Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

Rethinking compilers in the rise of machine learning and AI (keynote).

Proceedings of the 27th International Conference on Compiler Construction, 2018

2017

Optimizing Data Placement on GPU Memory: A Portable Approach.

IEEE Trans. Computers, 2017

GLORE: generalized loop redundancy elimination upon LER-notation.

Yufei Ding

Proc. ACM Program. Lang., 2017

Understanding co-run performance on CPU-GPU integrated processors: observations, insights, directions.

Frontiers Comput. Sci., 2017

Egeria: a framework for automatic synthesis of HPC advising tools through multi-layered natural language processing.

Hamid Krim

Proceedings of the International Conference for High Performance Computing, 2017

POSTER: An Infrastructure for HPC Knowledge Sharing and Reuse.

Yue Zhao

Chunhua Liao

Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU.

Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Generalizations of the theory and deployment of triangular inequality for compiler-based strength reduction.

Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2017

Versapipe: a versatile programming framework for pipelined computing on GPU.

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Efficient support of position independence on non-volatile memory.

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Bridging the gap between memory performance and massive parallelism: the critical role of programming systems innovations (keynote).

Proceedings of the 2017 ACM SIGPLAN International Symposium on Memory Management, 2017

Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

LCD: A Fast Contrastive Divergence Based Algorithm for Restricted Boltzmann Machine.

Randall Pittman

Proceedings of the 2017 IEEE International Conference on Data Mining, 2017

Sweet KNN: An Efficient KNN on GPU through Reconciliation between Redundancy Removal and Regularity.

Guoyang Chen

Yufei Ding

Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

POSTER: Bridging the Gap Between Deep Learning and Sparse Matrix Format Selection.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

POSTER: Cutting the Fat: Speeding Up RBM for Fast Deep Learning Through Generalized Redundancy Elimination.

Randall Pittman

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

Examining and Reducing the Influence of Sampling Errors on Feedback-Driven Optimizations.

ACM Trans. Archit. Code Optim., 2016

Tuning for software analytics: Is it really necessary?

Wei Fu

Tim Menzies

Inf. Softw. Technol., 2016

Data-centric combinatorial optimization of parallel code.

Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

Coherence-Free Multiview: Enabling Reference-Discerning Data Placement on GPU.

Guoyang Chen

Proceedings of the 2016 International Conference on Supercomputing, 2016

Towards Ontology-Based Program Analysis.

Proceedings of the 30th European Conference on Object-Oriented Programming, 2016

The workshop on compiler-driven performance.

Proceedings of the 26th Annual International Conference on Computer Science and Software Engineering, 2016

OpenCL-based erasure coding on heterogeneous architectures.

Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

2015

TOP: A Framework for Enabling Algorithmic Optimizations for Distance-Related Problems.

Proc. VLDB Endow., 2015

Enabling Portable Optimizations of Data Placement on GPU.

IEEE Micro, 2015

Enhancing domain specific language implementations through ontology.

Proceedings of the 5th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, 2015

Autotuning algorithmic choice for input sensitivity.

Yufei Ding

Jason Ansel

Kalyan Veeramachaneni

Una-May O'Reilly

Saman P. Amarasinghe

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

Free launch: optimizing GPU dynamic kernel launches through thread reuse.

Guoyang Chen

Proceedings of the 48th International Symposium on Microarchitecture, 2015

Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations.

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup.

Proceedings of the 32nd International Conference on Machine Learning, 2015

Software Engagement with Sleeping CPUs.

Proceedings of the 15th Workshop on Hot Topics in Operating Systems, 2015

14th compiler-driven performance workshop.

Proceedings of the 25th Annual International Conference on Computer Science and Software Engineering, 2015

On-the-Fly Principled Speculation for FSM Parallelization.

Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014

Space-efficient multi-versioning for input-adaptive feedback-driven program optimizations.

Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

Call sequence prediction through probabilistic calling automata.

Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

PORPLE: An Extensible Optimizer for Portable Data Placement on GPU.

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Understanding Co-run Degradations on Integrated Heterogeneous Processors.

Proceedings of the Languages and Compilers for Parallel Computing, 2014

Localization of concurrency bugs using shared memory access pairs.

Proceedings of the ACM/IEEE International Conference on Automated Software Engineering, 2014

SatScore: uncovering and avoiding a principled pitfall in responsiveness measurements of app launches.

Mingzhou Zhou

Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2014

Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation.

Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

Finding the limit: examining the potential and complexity of compilation scheduling for JIT-based runtime systems.

Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

SM-centric transformation: circumventing hardware restrictions for flexible GPU scheduling.

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013

HPar: A practical parallel parser for HTML - taming HTML complexities for parallel parsing.

ACM Trans. Archit. Code Optim., 2013

An Infrastructure for Tackling Input-Sensitivity of GPU Program Optimizations.

Int. J. Parallel Program., 2013

Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU.

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Software-level scheduling to exploit non-uniformly shared data cache on GPGPU.

Weilin Wang

Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 2013

Do computer programs have to be as dumb as they are?: input-centric dynamic program optimizations.

Proceedings of the 7th ACM Workshop on Virtual Machines and Intermediate Languages (VMIL@SPLASH '13), 2013

A Versatile Performance and Energy Simulation Tool for Composite GPU Global Memory.

Proceedings of the 2013 IEEE 21st International Symposium on Modelling, 2013

Simple Profile Rectifications Go a Long Way - Statistically Exploring and Alleviating the Effects of Sampling Errors for Program Optimizations.

Proceedings of the ECOOP 2013 - Object-Oriented Programming, 2013

Profmig: A framework for flexible migration of program profiles across software versions.

Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization, 2013

Exploring hybrid memory for GPU energy efficiency through software-hardware co-design.

Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012

The Significance of CMP Cache Sharing on Contemporary Multithreaded Applications.

IEEE Trans. Parallel Distributed Syst., 2012

A study towards optimal data layout for GPU computing.

Han Li

Proceedings of the 2012 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '12, 2012

Exploiting inter-sequence correlations for program behavior prediction.

Proceedings of the 27th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2012

Optimal Co-Scheduling to Minimize Makespan on Chip Multiprocessors.

Proceedings of the Job Scheduling Strategies for Parallel Processing, 2012

One stone two birds: synchronization relaxation and redundancy removal in GPU-CPU translation.

Ziyu Guo

Proceedings of the International Conference on Supercomputing, 2012

Speculative parallelization needs rigor: probabilistic analysis for optimal speculation of finite-state machine applications.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011

The Complexity of Optimal Job Co-Scheduling on Chip Multiprocessors and Heuristics-Based Solutions.

IEEE Trans. Parallel Distributed Syst., 2011

A step towards transparent integration of input-consciousness into dynamic program optimizations.

Kai Tian

Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2011

Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation.

Ziyu Guo

Proceedings of the Languages and Compilers for Parallel Computing, 2011

On-the-fly elimination of dynamic irregularities for GPU computing.

Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, 2011

Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU.

Ziyu Guo

Eddy Zheng Zhang

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

An input-centric paradigm for program dynamic optimizations.

Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010

LU Decomposition on Cell Broadband Engine: An Empirical Study to Exploit Heterogeneous Chip Multiprocessors.

Proceedings of the Network and Parallel Computing, IFIP International Conference, 2010

Array Regrouping on CMP with Non-uniform Cache Sharing.

Proceedings of the Languages and Compilers for Parallel Computing, 2010

Streamlining GPU applications on the fly: thread divergence elimination through runtime thread-data remapping.

Proceedings of the 24th International Conference on Supercomputing, 2010

Combining Locality Analysis with Online Proactive Job Co-scheduling in Chip Multiprocessors.

Kai Tian

Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Exploiting statistical correlations for proactive prediction of program behaviors.

Proceedings of the CGO 2010, 2010

Is Reuse Distance Applicable to Data Locality Analysis on Chip Multiprocessors?

Proceedings of the Compiler Construction, 19th International Conference, 2010

2009

Program locality analysis using reuse distance.

ACM Trans. Program. Lang. Syst., 2009

The study and handling of program inputs in the selection of garbage collectors.

ACM SIGOPS Oper. Syst. Rev., 2009

Influence of program inputs on the selection of garbage collectors.

Proceedings of the 5th International Conference on Virtual Execution Environments, 2009

A cross-input adaptive framework for GPU program optimizations.

Yixun Liu

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Speculation with Little Wasting: Saving Cost in Software Speculation through Transparent Learning.

Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Cross-Input Learning and Discriminative Prediction in Evolvable Virtual Machines.

Proceedings of the CGO 2009, 2009

A study on optimally co-scheduling jobs of different lengths on chip multiprocessors.

Kai Tian

Proceedings of the 6th Conference on Computing Frontiers, 2009

2008

Scalable Implementation of Efficient Locality Approximation.

Jonathan Shaw

Proceedings of the Languages and Compilers for Parallel Computing, 2008

Adaptive speculation in behavior-oriented parallelization.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Adaptive Software Speculation for Enhancing the Cost-Efficiency of Behavior-Oriented Parallelization.

Proceedings of the 2008 International Conference on Parallel Processing, 2008

Exploration of the Influence of Program Inputs on CMP Co-scheduling.

Proceedings of the Euro-Par 2008, 2008

Analysis and approximation of optimal co-scheduling on chip multiprocessors.

Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007

Miss Rate Prediction Across Program Inputs and Cache Configurations.

IEEE Trans. Computers, 2007

Predicting locality phases for dynamic memory optimization.

J. Parallel Distributed Comput., 2007

Locality approximation using time.

Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2007

Software behavior oriented parallelization.

Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, 2007

Modeling Relations between Inputs and Dynamic Behavior for General Programs.

Proceedings of the Languages and Compilers for Parallel Computing, 2007

A Key-based Adaptive Transactional Memory Executor.

Tongxin Bai

Chengliang Zhang

William N. Scherer III

Michael L. Scott

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Analysis of input-dependent program behavior using active profiling.

Proceedings of the Workshop on Experimental Computer Science, 2007

Bridging Inputs and Program Dynamic Behavior.

Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2006

Program-level adaptive memory management.

Proceedings of the 5th International Symposium on Memory Management, 2006

2005

Parallelization of Utility Programs Based on Behavior Phase Analysis.

Proceedings of the Languages and Compilers for Parallel Computing, 2005

Lightweight reference affinity analysis.

Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Gated memory control for memory monitoring, leak detection and garbage collection.

Proceedings of the 2005 workshop on Memory System Performance, 2005

2004

Learning multi-label scene classification.

Pattern Recognit., 2004

Multilabel machine learning and its application to semantic scene classification.

Proceedings of the Storage and Retrieval Methods and Applications for Multimedia 2004, 2004

Array regrouping and structure splitting using whole-program reference affinity.

Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation, 2004

Phase-Based Miss Rate Prediction Across Program Inputs.

Proceedings of the Languages and Compilers for High Performance Computing, 2004

Adaptive Data Partition for Sorting Using Probability Distribution.

Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Locality phase prediction.

Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003

A Hierarchical Model of Reference Affinity.

Proceedings of the Languages and Compilers for Parallel Computing, 2003

2001

The study of the effect of training set on statistical language modeling.

Bo Xu

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Study and auto-detection of stress based on tonal pitch range in Mandarin.

Bo Xu

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000

A CART-Based Hierarchical Stochastic Model for Prosodic Phrasing in Chinese.
