Kaiming Ouyang
Verified email at nvidia.com
Title
Cited by
Year
FT-CNN: Algorithm-based fault tolerance for convolutional neural networks
K Zhao, S Di, S Li, X Liang, Y Zhai, J Chen, K Ouyang, F Cappello, ...
IEEE Transactions on Parallel and Distributed Systems 32 (7), 1677-1689, 2020
Cited by 102, 2020
Correcting soft errors online in fast Fourier transform
X Liang, J Chen, D Tao, S Li, P Wu, H Li, K Ouyang, Y Liu, F Song, ...
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by 42, 2017
TSM2: optimizing tall-and-skinny matrix-matrix multiplication on GPUs
J Chen, N Xiong, X Liang, D Tao, S Li, K Ouyang, K Zhao, ...
Proceedings of the ACM International Conference on Supercomputing, 106-116, 2019
Cited by 35, 2019
Silent data corruption resilient two-sided matrix factorizations
P Wu, N DeBardeleben, Q Guan, S Blanchard, J Chen, D Tao, X Liang, ...
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of …, 2017
Cited by 31, 2017
Fault tolerant one-sided matrix decompositions on heterogeneous systems with GPUs
J Chen, H Li, S Li, X Liang, P Wu, D Tao, K Ouyang, Y Liu, K Zhao, ...
SC18: International Conference for High Performance Computing, Networking …, 2018
Cited by 25, 2018
FT-iSort: efficient fault tolerance for introsort
S Li, H Li, X Liang, J Chen, E Giem, K Ouyang, K Zhao, S Di, F Cappello, ...
Proceedings of the International Conference for High Performance Computing …, 2019
Cited by 16, 2019
CAB-MPI: Exploring interprocess work-stealing towards balanced MPI communication
K Ouyang, M Si, A Hori, Z Chen, P Balaji
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by 8, 2020
Daps: a dynamic asynchronous progress stealing model for MPI communication
K Ouyang, M Si, A Hori, Z Chen, P Balaji
2021 IEEE International Conference on Cluster Computing (CLUSTER), 516-527, 2021
Cited by 3, 2021
Accelerating MPI collectives with process-in-process-based multi-object techniques
J Huang, K Ouyang, Y Zhai, J Liu, M Si, K Raffenetti, H Zhou, A Hori, ...
Proceedings of the 32nd International Symposium on High-Performance Parallel …, 2023
Cited by 1, 2023
On the difference between shared memory and shared address space in HPC communication
A Hori, K Ouyang, B Gerofi, Y Ishikawa
Asian Conference on Supercomputing Frontiers, 59-78, 2022
Cited by 1, 2022
KF K-means: A High Performance K-means Implementation using Kernel Fusion
K Ouyang, V Tran, J Liu, BM Wong, Z Chen
2023 IEEE International Conference on Big Data (BigData), 121-127, 2023
2023
PiP-MColl: Process-in-Process-based Multi-object MPI Collectives
J Huang, K Ouyang, Y Zhai, J Liu, M Si, K Raffenetti, H Zhou, A Hori, ...
2023 IEEE International Conference on Cluster Computing (CLUSTER), 354-364, 2023
2023
Exploring Interprocess Techniques for High-Performance MPI Communication
K Ouyang
University of California, Riverside, 2022
2022
TSM2
J Chen, N Xiong, X Liang, D Tao, S Li, K Ouyang, K Zhao, ...
Proceedings of the ACM International Conference on Supercomputing, 2019
2019
Exploring Interprocess Work Stealing for Balanced MPI Communication
K Ouyang, M Si, Z Chen
Cell 64, 64KB, 0