Meena Arunachalam

Orcid: 0000-0002-3155-6269

According to our database, Meena Arunachalam authored at least 19 papers between 2015 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Pretraining large language models with MXFP4 on Native FP4 Hardware.
CoRR, May, 2026

Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks.
CoRR, March, 2026

2025
Delivering MLPerf Submissions on AMD Instinct GPUs: Journey to Leadership Performance.
Proceedings of the Performance Evaluation and Benchmarking, 2025

Multi-Dimensional ML-Pipeline Optimization in Cost-Effective Disaggregated Datacenter.
Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

TPNM: A CXL Based General Purpose Tiered Process Near Memory Framework.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2025

2024
Parallelization Strategies for DLRM Embedding Bag Operator on AMD CPUs.
IEEE Micro, 2024

Early prediction of onset of sepsis in Clinical Setting.
CoRR, 2024

PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

2023
Optimizing CPU Performance for Recommendation Systems At-Scale.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Quantization for Bayesian Deep Learning: Low-Precision Characterization and Robustness.
Proceedings of the IEEE International Symposium on Workload Characterization, 2023

2022
End-to-End Industrial IoT: Software Optimization and Acceleration.
IEEE Internet Things Mag., 2022

Strategies for Optimizing End-to-End Artificial Intelligence Pipelines on Intel Xeon Processors.
CoRR, 2022

2019
Multiobjective evaluation and optimization of CMT-bone on multiple CPU/GPU systems.
Sustain. Comput. Informatics Syst., 2019

Architecture-Centric Bottleneck Analysis for Deep Neural Network Applications.
Proceedings of the 26th IEEE International Conference on High Performance Computing, 2019

2018
A Learning-Guided Hierarchical Approach for Biomedical Image Segmentation.
Proceedings of the 31st IEEE International System-on-Chip Conference, 2018

Efficient K nearest neighbor algorithm implementations for throughput-oriented architectures.
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018

Multiobjective Evaluation and Optimization of CMT-bone on Intel Knights Landing.
Proceedings of the Ninth International Green and Sustainable Computing Conference, 2018

2015
Machine learning techniques for improved data prefetching.
Proceedings of the 5th International Conference on Energy Aware Computing Systems & Applications, 2015

Optimizing Non-contiguous Memory Access on Intel Xeon Phi Coprocessors.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

