Wei Xue
Orcid: 0000-0001-9740-6581
Affiliations:
- Tsinghua University, Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Beijing, China
- Qinghai University, China
- National Supercomputing Center in Wuxi, China
According to our database, Wei Xue authored at least 100 papers between 2004 and 2025.
Links
Online presence:
- on orcid.org
Bibliography
2025
IEEE Trans. Parallel Distributed Syst., September, 2025
CoRR, July, 2025
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025
An AI-Enhanced 1km-Resolution Seamless Global Weather and Climate Model to Achieve Year-Scale Simulation Speed using 34 Million Cores.
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2025
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
2024
CCF Trans. High Perform. Comput., June, 2024
J. Supercomput., May, 2024
Parallel optimization and application of unstructured sparse triangular solver on new generation of Sunway architecture.
Parallel Comput., 2024
Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development.
CoRR, 2024
Full Lifecycle Data Analysis on a Large-scale and Leadership Supercomputer: What Can We Learn from It?
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
FP16 Acceleration in Structured Multigrid Preconditioner for Real-World Applications.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
BoostN: Optimizing Imbalanced Neighborhood Communication on Homogeneous Many-Core System.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024
2023
Redesign and Accelerate the AIREBO Bond-Order Potential on the New Sunway Supercomputer.
IEEE Trans. Parallel Distributed Syst., December, 2023
CCF Trans. High Perform. Comput., December, 2023
Bio-ESMD: A Data Centric Implementation for Large-Scale Biological System Simulation on Sunway TaihuLight Supercomputer.
IEEE Trans. Parallel Distributed Syst., March, 2023
Model guided algorithm optimization for tridiagonal solver on many-core architectures.
CCF Trans. High Perform. Comput., March, 2023
ACM Trans. Storage, February, 2023
IEEE Trans. Parallel Distributed Syst., 2023
FengWu-4DVar: Coupling the Data-driven Weather Forecasting Model with 4D Variational Assimilation.
CoRR, 2023
Automatic Search Guided Code Optimization Framework for Mixed-Precision Scientific Applications.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
69.7-PFlops Extreme Scale Earthquake Simulation with Crossing Multi-faults and Topography on Sunway.
Proceedings of the International Conference for High Performance Computing, 2023
5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer.
Proceedings of the International Conference for High Performance Computing, 2023
Rapid simulations of atmospheric data assimilation of hourly-scale phenomena with modern neural networks.
Proceedings of the International Conference for High Performance Computing, 2023
Enabling Real World Scale Structural Superlubricity All-Atom Simulation on the Next-Generation Sunway Supercomputer.
Proceedings of the International Conference for High Performance Computing, 2023
HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers.
Proceedings of the 21st USENIX Conference on File and Storage Technologies, 2023
2022
IEEE Trans. Parallel Distributed Syst., 2022
Jdebug: A Fast, Non-Intrusive and Scalable Fault Locating Tool for Ten-Million-Scale Parallel Applications.
IEEE Trans. Parallel Distributed Syst., 2022
Optimization of Reactive Force Field Simulation: Refactor, Parallelization, and Vectorization for Interactions.
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Computers, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
2021
Many-core acceleration of the first-principles all-electron quantum perturbation calculations.
Comput. Phys. Commun., 2021
CoRR, 2021
Editorial for the special issue on large-scale AI in classical HPC environment and AI for science.
CCF Trans. High Perform. Comput., 2021
LMFF: efficient and scalable layered materials force field on heterogeneous many-core processors.
Proceedings of the International Conference for High Performance Computing, 2021
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021
2020
Millimeter-Scale and Billion-Atom Reactive Force Field Simulation on Sunway Taihulight.
IEEE Trans. Parallel Distributed Syst., 2020
Lessons Learned from Optimizing the Sunway Storage System for Higher Application I/O Performance.
J. Comput. Sci. Technol., 2020
CCF Trans. High Perform. Comput., 2020
CCF Trans. High Perform. Comput., 2020
APMT: an automatic hardware counter-based performance modeling tool for HPC applications.
CCF Trans. High Perform. Comput., 2020
Cell-list based molecular dynamics on many-core processors: a case study on sunway TaihuLight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2020
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020
Proceedings of the Algorithms and Architectures for Parallel Processing, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
An automatic performance model-based scheduling tool for coupled climate system models.
J. Parallel Distributed Comput., 2019
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation, 2019
Proceedings of the 48th International Conference on Parallel Processing, 2019
Proceedings of the 17th USENIX Conference on File and Storage Technologies, 2019
2018
Proceedings of the International Conference for High Performance Computing, 2018
Redesigning LAMMPS for peta-scale and hundred-billion-atom simulation on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2018
swSpTRSV: a fast sparse triangular solve with sparse level tile layout on sunway architectures.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
vSensor: leveraging fixed-workload snippets of programs for performance variance detection.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Taming the "Monster": Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Proceedings of the 32nd International Conference on Supercomputing, 2018
A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010.
Proceedings of the 47th International Conference on Parallel Processing, 2018
2017
IEEE Micro, 2017
Proceedings of the International Conference for High Performance Computing, 2017
18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios.
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
2016
Sci. China Inf. Sci., 2016
Proceedings of the International Conference for High Performance Computing, 2016
Refactoring and optimizing the community atmosphere model (CAM) on the sunway taihulight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
Accelerating the 3D euler atmospheric solver through heterogeneous CPU-GPU platforms.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016
Unleashing the performance potential of CPU-GPU platforms for the 3D atmospheric Euler solver.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016
2015
Solving the Global Atmospheric Equations through Heterogeneous Reconfigurable Platforms.
ACM Trans. Reconfigurable Technol. Syst., 2015
IEEE Trans. Computers, 2015
ParSA: High-throughput scientific data analysis framework with distributed file system.
Future Gener. Comput. Syst., 2015
2014
Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Scaling and analyzing the stencil performance on multi-core and many-core architectures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
A highly-efficient and green data flow engine for solving euler atmospheric equations.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014
2013
Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform.
J. Supercomput., 2013
Concurr. Comput. Pract. Exp., 2013
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
Accelerating solvers for global atmospheric equations through mixed-precision data flow engine.
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013
Proceedings of the Euro-Par 2013 Parallel Processing, 2013
Proceedings of the 2013 IEEE Symposium on Low-Power and High-Speed Chips, 2013
2012
Fast time domain simulation of power systems using multilevel preconditioners with adaptive reconstruction strategies.
Simul. Model. Pract. Theory, 2012
2010
2007
ACM Trans. Storage, 2007
IEEE Trans. Computers, 2007
Proceedings of the International Conference on Networking, 2007
Proceedings of the International Conference on Networking, 2007
2006
Proceedings of the Computational Science, 2006
Proceedings of the Computational Science, 2006
2005
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2005
Parallel Algorithm and Implementation for Realtime Dynamic Simulation of Power System.
Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005
Proceedings of the Computational Science, 2005
2004
Proceedings of the Parallel and Distributed Processing and Applications, 2004