Argobots: A lightweight low-level threading and tasking framework S Seo, A Amer, P Balaji, C Bordage, G Bosilca, A Brooks, P Carns, ... IEEE Transactions on Parallel and Distributed Systems 29 (3), 512-526, 2017 | 157 | 2017 |
MPI+Threads: Runtime contention and remedies A Amer, H Lu, Y Wei, P Balaji, S Matsuoka ACM SIGPLAN Notices 50 (8), 239-248, 2015 | 64 | 2015 |
The glorious Glasgow Haskell compilation system user’s guide GHC Team Version 7 (3), 2002-2007, 2005 | 46* | 2005 |
MPI+ULT: Overlapping communication and computation with user-level threads H Lu, S Seo, P Balaji 2015 IEEE 17th International Conference on High Performance Computing and …, 2015 | 40 | 2015 |
P-GAS: Parallelizing a cycle-accurate event-driven many-core processor simulator using parallel discrete event simulation H Lv, Y Cheng, L Bai, M Chen, D Fan, N Sun 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation, 1-8, 2010 | 28 | 2010 |
MPICH user’s guide P Balaji, W Bland, W Gropp, R Latham, H Lu, AJ Pena, K Raffenetti, S Seo, ... Argonne National Laboratory, 2014 | 25 | 2014 |
Lessons learned implementing user-level failure mitigation in MPICH W Bland, H Lu, S Seo, P Balaji 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2015 | 22 | 2015 |
Characterizing MPI and hybrid MPI+Threads applications at scale: Case study with BFS A Amer, H Lu, P Balaji, S Matsuoka 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2015 | 22 | 2015 |
Argo: An exascale operating system and runtime S Perarnau, R Gupta, P Beckman, P Balaji, C Bordage, G Bosilca, ... The International Conference for High Performance Computing, Networking …, 2015 | 19 | 2015 |
Understanding parallelism in graph traversal on multi-core clusters H Lv, G Tan, M Chen, N Sun Computer Science-Research and Development 28, 193-201, 2013 | 15 | 2013 |
Lock contention management in multithreaded MPI A Amer, H Lu, P Balaji, M Chabbi, Y Wei, J Hammond, S Matsuoka ACM Transactions on Parallel Computing (TOPC) 5 (3), 1-21, 2019 | 11 | 2019 |
Locking aspects in multithreaded MPI implementations A Amer, H Lu, Y Wei, J Hammond, S Matsuoka, P Balaji Argonne National Lab., Tech. Rep. P6005-0516, 2016 | 11 | 2016 |
Reducing communication in parallel breadth-first search on distributed memory systems H Lu, G Tan, M Chen, N Sun 2014 IEEE 17th International Conference on Computational Science and …, 2014 | 8 | 2014 |
MPICH User's Guide, Version 3.1.1 P Balaji, W Bland, W Gropp, R Latham, H Lu, A Pena, K Raffenetti, ... Mathematics and Computer Science Division Argonne National Laboratory …, 2014 | 8 | 2014 |
MPICH User’s Guide, Version 3.2 A Amer, P Balaji, W Bland, W Gropp, R Latham, H Lu, L Oden, A Pena, ... Argonne National Laboratory, 2015 | 5 | 2015 |
MPICH Installer’s Guide P Balaji, W Bland, W Gropp, R Latham, H Lu, AJ Pena, K Raffenetti, S Seo, ... | 4 | 2014 |
G-Cluster: Instruction-Level Many-Core Processor Cluster Simulation at Scale H Lv, Y Cheng, L Bai, M Chen, D Fan, X Lu, C Zhang, N Sun Advanced System Laboratory, NCIC: ICT. Technology Report, 2010 | 1 | 2010 |