Brucek Khailany
Senior Director of VLSI Research, NVIDIA
SCNN: An accelerator for compressed-sparse convolutional neural networks
A Parashar, M Rhu, A Mukkara, A Puglielli, R Venkatesan, B Khailany, ...
ACM SIGARCH Computer Architecture News 45 (2), 27-40, 2017
GPUs and the future of parallel computing
SW Keckler, WJ Dally, B Khailany, M Garland, D Glasco
IEEE Micro 31 (5), 7-17, 2011
Imagine: Media processing with streams
B Khailany, WJ Dally, UJ Kapasi, P Mattson, J Namkoong, JD Owens, ...
IEEE Micro 21 (2), 35-46, 2001
Programmable stream processors
UJ Kapasi, S Rixner, WJ Dally, B Khailany, JH Ahn, P Mattson, JD Owens
Computer 36 (8), 54-62, 2003
Register organization for media processing
S Rixner, WJ Dally, B Khailany, P Mattson, UJ Kapasi, JD Owens
Proceedings Sixth International Symposium on High-Performance Computer …, 2000
Timeloop: A systematic approach to DNN accelerator evaluation
A Parashar, P Raina, YS Shao, YH Chen, VA Ying, A Mukkara, ...
2019 IEEE international symposium on performance analysis of systems and …, 2019
Simba: Scaling deep-learning inference with multi-chip-module-based architecture
YS Shao, J Clemons, R Venkatesan, B Zimmer, M Fojtik, N Jiang, B Keller, ...
Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019
The Imagine stream processor
UJ Kapasi, WJ Dally, S Rixner, JD Owens, B Khailany
Proceedings. IEEE International Conference on Computer Design: VLSI in …, 2002
A bandwidth-efficient architecture for media processing
S Rixner, WJ Dally, UJ Kapasi, B Khailany, A Lopez-Lagunas, PR Mattson, ...
Proceedings. 31st Annual ACM/IEEE International Symposium on …, 1998
DREAMPlace: Deep learning toolkit-enabled GPU acceleration for modern VLSI placement
Y Lin, S Dhar, W Li, H Ren, B Khailany, DZ Pan
Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019
CudaDMA: optimizing GPU memory bandwidth via warp specialization
M Bauer, H Cook, B Khailany
Proceedings of 2011 international conference for high performance computing …, 2011
Evaluating the Imagine stream architecture
JH Ahn, WJ Dally, B Khailany, UJ Kapasi, A Das
ACM SIGARCH Computer Architecture News 32 (2), 14, 2004
Unifying primary cache, scratch, and register file memories in a throughput processor
M Gebhart, SW Keckler, B Khailany, R Krashinsky, WJ Dally
2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 96-106, 2012
A programmable 512 GOPS stream processor for signal, image, and video processing
BK Khailany, T Williams, J Lin, EP Long, M Rygh, DFW Tovey, WJ Dally
IEEE Journal of Solid-State Circuits 43 (1), 202-213, 2008
Efficient conditional operations for data-parallel architectures
UJ Kapasi, WJ Dally, S Rixner, PR Mattson, JD Owens, B Khailany
Proceedings of the 33rd annual ACM/IEEE International Symposium on …, 2000
MAGNet: A modular accelerator generator for neural networks
R Venkatesan, YS Shao, M Wang, J Clemons, S Dai, M Fojtik, B Keller, ...
2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 1-8, 2019
High performance graph convolutional networks with applications in testability analysis
Y Ma, H Ren, B Khailany, H Sikka, L Luo, K Natarajan, B Yu
Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019
Stream processors: Programmability and efficiency: Will this new kid on the block muscle out ASIC and DSP?
WJ Dally, UJ Kapasi, B Khailany, JH Ahn, A Das
Queue 2 (1), 52-62, 2004
A 0.32–128 TOPS, scalable multi-chip-module-based deep neural network inference accelerator with ground-referenced signaling in 16 nm
B Zimmer, R Venkatesan, YS Shao, J Clemons, M Fojtik, N Jiang, B Keller, ...
IEEE Journal of Solid-State Circuits 55 (4), 920-932, 2020
GRANNITE: Graph neural network inference for transferable power estimation
Y Zhang, H Ren, B Khailany
2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020