Xiaoming Li

Orcid: 0000-0002-5079-3219

Affiliations:
  • University of Delaware, Department of Electrical and Computer Engineering, Newark, DE, USA


According to our database, Xiaoming Li authored at least 42 papers between 2007 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
CoachGPT: A Scaffolding-based Academic Writing Assistant.
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

2024
Memory Efficiency Oriented Fine-Grain Representation and Optimization of FFT.
Proceedings of the International Symposium on Memory Systems, 2024

2023
A Distributed Pricing Strategy for Edge Computation Offloading Optimization in Autonomous Driving.
IEEE Netw., September, 2023

On Memory Codelets: Prefetching, Recoding, Moving and Streaming Data.
CoRR, 2023

DEMAC: A Platform for Education in High-performance Computing, Bridging the Gap Between Users and Hardware.
Proceedings of the Workshop on Computer Architecture Education, 2023

Memory Transfer Decomposition: Exploring Smart Data Movement Through Architecture-Aware Strategies.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

A gem5 Implementation of the Sequential Codelet Model: Reducing Overhead and Expanding the Software Memory Interface.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Towards Fault Tolerance and Resilience in the Sequential Codelet Model.
Proceedings of the High Performance Computing - 10th Latin American Conference, 2023

2022
Chiplets and the Codelet Model.
CoRR, 2022

Programming Autonomous Machines.
CoRR, 2022

Automatic Asynchronous Execution of Synchronously Offloaded OpenMP Target Regions.
Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022

Programming Autonomous Machines: Special Session Paper.
Proceedings of the International Conference on Embedded Software, 2022

2021
Fast Monotonicity Preserving Text Sorting On GPU.
Proceedings of the IEEE International Performance, 2021

An Efficient Shuffle-Light FFT Library.
Proceedings of the IEEE International Performance, 2021

2020
G-Code Re-compilation and Optimization for Faster 3D Printing.
Proceedings of the Languages and Compilers for Parallel Computing, 2020

Fast Convolutional Neural Networks with Fine-Grained FFTs.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2018
CSSMT: Compiler Based Software Simultaneous Multithreading (SMT).
Proceedings of the 26th Euromicro International Conference on Parallel, 2018

2017
A scalable interface-resolved simulation of particle-laden flow using the lattice Boltzmann method.
Parallel Comput., 2017

Scalable Top-K Query Processing Using Graphics Processing Unit.
Proceedings of the Languages and Compilers for Parallel Computing, 2017

Improving Retrieval Effectiveness for Temporal-Constrained Top-K Query Processing.
Proceedings of the Information Retrieval Technology, 2017

2015
Network and Parallel Computing.
Int. J. Parallel Program., 2015

FreshBreeze: A Data Flow Approach for Meeting DDDAS Challenges.
Proceedings of the International Conference on Computational Science, 2015

A Thread Merging Transformation to Improve Throughput of Multiple Programs.
Proceedings of the 29th IEEE International Conference on Advanced Information Networking and Applications, 2015

2014
Page Classifier and Placer: A Scheme of Managing Hybrid Caches.
Proceedings of the Network and Parallel Computing, 2014

Input-adaptive parallel sparse fast Fourier transform for stream processing.
Proceedings of the 2014 International Conference on Supercomputing, 2014

A Dataflow Programming Language and its Compiler for Streaming Systems.
Proceedings of the International Conference on Computational Science, 2014

2013
An Input-Adaptive Algorithm for High Performance Sparse Fast Fourier Transform.
Proceedings of the Languages and Compilers for Parallel Computing, 2013

A hybrid GPU/CPU FFT library for large FFT problems.
Proceedings of the IEEE 32nd International Performance Computing and Communications Conference, 2013

2012
Static micro-scheduling: Resource contention relief in multithreaded programs.
Proceedings of the 31st IEEE International Performance Computing and Communications Conference, 2012

2011
A Code Merging Optimization Technique for GPU.
Proceedings of the Languages and Compilers for Parallel Computing, 2011

Using GPUs to compute large out-of-card FFTs.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Source Code Partitioning in Program Optimization.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

2010
Software-based branch predication for AMD GPUs.
SIGARCH Comput. Archit. News, 2010

An empirically tuned 2D and 3D FFT library on CUDA GPU.
Proceedings of the 24th International Conference on Supercomputing, 2010

A Micro-benchmark Suite for AMD GPUs.
Proceedings of the 39th International Conference on Parallel Processing, 2010

2009
DFT Performance Prediction in FFTW.
Proceedings of the Languages and Compilers for Parallel Computing, 2009

Iterative layer-based raytracing on CUDA.
Proceedings of the 28th International Performance Computing and Communications Conference, 2009

Performance modeling for DFT algorithms in FFTW.
Proceedings of the 23rd international conference on Supercomputing, 2009

CUDA Memory Optimizations for Large Data-Structures in the Gravit Simulator.
Proceedings of the ICPPW 2009, 2009

A control-structure splitting optimization for GPGPU.
Proceedings of the 6th Conference on Computing Frontiers, 2009

2007
Automatic Program Segment Similarity Detection in Targeted Program Performance Improvement.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Experience of Optimizing FFT on Intel Architectures.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

