  • [NeurIPS'21] "NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM", Connor Holmes, Minjia Zhang, Yuxiong He, and Bo Wu. Thirty-fifth Conference on Neural Information Processing Systems, 2021.

  • [PACT'21] "Dryadic: Flexible and Fast Graph Pattern Matching at Scale", Daniel Mawhirter, Samuel Reinehr, Wei Han, Noah Fields, Miles Claver, Connor Holmes, Jedidiah McClurg, Tongping Liu, and Bo Wu. The 30th International Conference on Parallel Architectures and Compilation Techniques, 2021.

  • [OSR'21] "GraphZero: A High-Performance Subgraph Matching System", Daniel Mawhirter, Sam Reinehr, Connor Holmes, Tongping Liu, and Bo Wu. ACM SIGOPS Operating Systems Review, Volume 55, Issue 1, 2021.

  • [TKDE'21] "Automatic Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures", Feng Zhang, Jidong Zhai, Bo Wu, Bingsheng He, Wenguang Chen, and Xiaoyong Du. IEEE Transactions on Knowledge and Data Engineering, 2021.

  • [SOSP'19] "AutoMine: Harmonizing High-Level Abstraction and High Performance for Graph Mining", Daniel Mawhirter and Bo Wu. ACM Symposium on Operating Systems Principles, Huntsville, Ontario, Canada, October, 2019. Acceptance ratio: 13.8% (38/276).

  • [LCPC'19] "FLARE: Flexibly Sharing Commodity GPUs to Enforce QoS and Improve Utilization", Wei Han, Daniel Mawhirter, Lin Ma, Chen Tian, and Bo Wu. The 32nd Workshop on Languages and Compilers for Parallel Computing, Atlanta, October, 2019.

  • [EuroSys'19] "GRNN: Low-Latency and Scalable RNN Inference on GPUs", Connor Holmes, Daniel Mawhirter, Yuxiong He, Feng Yan, and Bo Wu. European Conference on Computer Systems, Dresden, Germany, March, 2019. Acceptance ratio: 21.8% (45/206).

  • [ICS'19] "Laius: Towards Latency Awareness and Improved Utilization of Spatial Multitasking Accelerators in Datacenters", Wei Zhang, Weihao Cui, Kaihua Fu, Quan Chen, Daniel Mawhirter, Bo Wu, Chao Li and Minyi Guo. International Conference on Supercomputing, Phoenix, Arizona, USA, June, 2019. Acceptance ratio: 23.3% (45/193).

  • [PACT'18] "GraphPhi: Efficient Parallel Graph Processing on Emerging Throughput-oriented Architectures", Zhen Peng, Alexander Powell, Bo Wu, Tekin Bicer and Bin Ren. The 27th International Conference on Parallel Architectures and Compilation Techniques, Limassol, Cyprus, November, 2018. Acceptance ratio: 29% (36/126).

  • [CCGrid'18] "ApproxG: Fast Approximate Parallel Graphlet Counting Through Accuracy Control", Daniel Mawhirter, Bo Wu, Dinesh Mehta and Chao Ai. The 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Washington DC, May, 2018. Acceptance ratio: 20.8% (52/250).

  • [FCS'18] "Resolving the GPU responsiveness dilemma through program transformations", Qi Zhu, Bo Wu, Xipeng Shen, Kai Shen, Li Shen, and Zhiying Wang. Frontiers of Computer Science, Springer, 2018, 12 (3): 545-559.

  • [ASPLOS'17] "FLEP: Enabling Flexible and Efficient Preemption on GPUs", Bo Wu, Xu Liu, Xiaobo Zhou, and Changjun Jiang. The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Xi'an, China, April, 2017. Acceptance ratio: 17% (56/321).

  • [PACT'17] "Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU", Wei Han, Daniel Mawhirter, Matthew Buland, and Bo Wu. The 26th International Conference on Parallel Architectures and Compilation Techniques, Portland, Oregon, Sep. 2017. Acceptance ratio: 23% (25/108). Nominated for best paper award.

  • [ICS'17] "ScalaFSM: Enabling Scalability-Sensitive Speculative Parallelization for FSM Computations", Junqiao Qiu, Zhijia Zhao, Bo Wu, Abhinav Vishnu and Shuaiwen Leon Song. The International Conference on Supercomputing, Chicago, IL, June, 2017. Acceptance ratio: 16%.

  • [IPDPS'17] "Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems", Qi Zhu, Bo Wu, Xipeng Shen, Li Shen and Zhiying Wang. The 31st IEEE International Parallel & Distributed Processing Symposium, Orlando, Florida, May, 2017. Acceptance ratio: 23%.

  • [CGO'17] "FinePar: Irregularity-Aware Fine-Grained Workload Partitioning on Integrated Architectures", Feng Zhang, Bo Wu, Jidong Zhai, Bingsheng He, and Wenguang Chen. The International Symposium on Code Generation and Optimization, Austin, TX, Feb, 2017. Acceptance ratio: 22% (26/114).

  • [Book Chapter] "Data Placement on GPUs", Xipeng Shen and Bo Wu. To appear as a chapter in "Advances in GPU Research and Practice", by H. Sarbazi-Azad (editor), Elsevier, 2016.

  • [Book Chapter] "Software-Level Task Scheduling on GPUs", Bo Wu and Xipeng Shen. To appear as a chapter in "Advances in GPU Research and Practice", by H. Sarbazi-Azad (editor), Elsevier, 2016.

  • [TC'16] "Optimizing Data Placement on GPU Memory: A Portable Approach", Guoyang Chen, Xipeng Shen, Bo Wu, and Dong Li. The IEEE Transactions on Computers, 2016. To appear.

  • [FCS'16] "Understanding Co-run Performance on CPU-GPU Integrated Processors: Observations, Insights, Directions", Qi Zhu, Bo Wu, Kai Shen, and Xipeng Shen. Frontiers of Computer Science, Springer, 2016. To appear.

  • [TACO'16] "Examining and Reducing the Influence of Sampling Errors on Feedback-Driven Optimizations", Mingzhou Zhou, Bo Wu, Xipeng Shen, Yaoqing Gao, and Graham Yiu. The ACM Transactions on Architecture and Code Optimization, 2016. To appear.

  • [SC'15] "ScaAnalyzer: A Tool to Identify Memory Scalability Bottlenecks in Parallel Programs", Xu Liu and Bo Wu. The International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, TX, Nov, 2015. Acceptance ratio: 22%. Best paper award (1 out of 358 submissions).

  • [IEEE/Micro'15] "Enabling Portable Optimizations of Data Placement on GPU", Guoyang Chen, Bo Wu, Dong Li and Xipeng Shen. July/August Issue, The Heterogeneous Computing special issue of IEEE Micro, 2015.

  • [HotOS'15] "Software Engagement with Sleeping CPUs", Qi Zhu, Meng Zhu, Bo Wu, Xipeng Shen, Kai Shen and Zhiying Wang. The 15th Workshop on Hot Topics in Operating Systems, Kartause Ittingen, Switzerland, May, 2015. Acceptance ratio: 32% (29/90).

  • [ICS'15] "Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations", Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen and Jeffrey Vetter. The 29th International Conference on Supercomputing, Newport Beach, CA, June, 2015. Acceptance ratio: 25%.

  • [MICRO'14] "PORPLE: An Extensible Optimizer for Portable Data Placement on GPU", Guoyang Chen, Bo Wu, Dong Li and Xipeng Shen. The 47th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, UK, Dec, 2014. Acceptance ratio: 19% (53/273).

  • [LCPC'14] "Understanding Co-Run Degradations on Integrated Heterogeneous Processors", Qi Zhu, Bo Wu, Xipeng Shen, Li Shen and Zhiying Wang. The 27th International Workshop on Languages and Compilers for Parallel Computing, Hillsboro, OR, Sep, 2014.

  • [PACT'14 poster] "SM-Centric Transformation: Circumventing Hardware Restrictions for Flexible GPU Scheduling", Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, Jeffrey Vetter. The 23rd International Conference on Parallel Architectures and Compilation Techniques, Edmonton, Alberta, Canada, Aug. 2014.

  • [OOPSLA'14] "Call Sequence Prediction through Probabilistic Calling Automata", Zhijia Zhao, Bo Wu, Mingzhou Zhou, Yufei Ding, Jianhua Sun, Xipeng Shen, and Youfeng Wu. ACM SIGPLAN conference on Systems, Programming, Languages and Applications, Portland, USA, 2014. Acceptance ratio: 28% (53/186).

  • [ASPLOS'14] "Challenging the "Embarrassingly Sequential": Parallelizing Finite State Machine-Based Computations through Principled Speculation", Zhijia Zhao, Bo Wu, Xipeng Shen, The Nineteenth International Conference on Architectural Support for Programming Languages and Operating Systems, Salt Lake City, Utah, Mar, 2014. Acceptance ratio: 23% (49/217).

  • [PACT'13] "Exploring Hybrid Memory for GPU Energy Efficiency through Software-Hardware Co-Design", Bin Wang, Bo Wu, Dong Li, Xipeng Shen, Weikuan Yu, Yizheng Jiao, Jeffrey Vetter, The 22nd International Conference on Parallel Architectures and Compilation Techniques, Edinburgh, Scotland, Sep, 2013. Acceptance ratio: 17% (36/208).

  • [MSPC'13 poster] "Software-level Scheduling to Exploit Non-uniformly Shared Data Cache", Bo Wu, Weilin Wang, Xipeng Shen, ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, Seattle, USA, June, 2013. Two-page position paper.

  • [ECOOP'13] "Simple Profile Rectifications Go A Long Way: Demystifying the Influence of Sampling Errors on Feedback Driven Program Optimizations", Bo Wu, Mingzhou Zhou, Xipeng Shen, Yaoqing Gao, Raul Silvera, Graham Yiu, European Conference on Object-oriented Programming, Montpellier, France, July, 2013. Acceptance ratio: 25%.

  • [PPoPP'13] "Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced GPU Memory Accesses", Bo Wu, Zhijia Zhao, Eddy Zhang, Yunlian Jiang, Xipeng Shen, 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Shenzhen, China, Feb, 2013. Acceptance ratio: 18%.

  • [CGO'13] "ProfMig: The First Framework for Migrating Program Profiles Across Software Versions", Mingzhou Zhou, Bo Wu, Yufei Ding, Xipeng Shen, International Symposium on Code Generation and Optimization, Shenzhen, China, Feb, 2013. Acceptance ratio: 28%.

  • [OOPSLA'12] "Exploiting Inter-Sequence Correlations for Program Behavior Prediction", Bo Wu, Zhijia Zhao, Xipeng Shen, Yunlian Jiang, Yaoqing Gao, Raul Silvera, The 27th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications, Tucson, Arizona, USA, Oct, 2012. Acceptance ratio: 25%.

  • [PACT'12 poster] "Speculative Parallelization Needs Rigor: Probabilistic Analysis for Optimal Speculation of Finite State Machine Applications", Zhijia Zhao, Bo Wu, Xipeng Shen, The Twenty-first International Conference on Parallel Architectures and Compilation Techniques, two-page poster paper, Minneapolis, MN, USA, Sep, 2012.

  • [ICS'12] "One Stone Two Birds: Synchronization Relaxation and Redundancy Removal in GPU-CPU Translation", Ziyu Guo, Bo Wu, and Xipeng Shen, ACM International Conference on Supercomputing, Venice, Italy, 2012. Acceptance ratio: 22%.

  • [PACT'11] "Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control", Bo Wu, Eddy Zhang, Xipeng Shen, The Twentieth International Conference on Parallel Architectures and Compilation Techniques, Galveston Island, Texas, USA, Oct, 2011. Acceptance ratio: 16% (36/221).

  • [PACT'11 SRC] "Probabilistic Models towards Optimal Speculation of DFA Applications", Zhijia Zhao and Bo Wu, PACT 2011 ACM Student Research Competition, Galveston Island, Texas, USA, Oct, 2011. (Second place among 29 submissions.)