Zizhong (Jeffrey) Chen

Assistant Professor
Colorado School of Mines

Address:   Department of Mathematical and Computer Sciences
           Colorado School of Mines
           1500 Illinois Street, Golden, CO 80401, USA

Office:    Chauvenet Hall 122
Telephone: +1 303 384 2326
Fax:       +1 303 273 3875
Email:     zchen_AT_mines_DOT_edu


Announcement: Research Assistant positions in high performance computing (HPC) are available for Ph.D. dissertation students. I also enjoy working with well-motivated graduate and undergraduate students on other HPC-related projects. If you are interested in HPC, feel free to contact me.


Biography
Research Interests
  • High Performance Computing
  • Fault Tolerance and Checkpointing
  • Numerical Algorithms and Software
  • Large Scale Computer Simulations


Teaching

Previous Teaching
  • CSCI 440/563: Parallel Computing, Spring 2009 (CSM)
  • CSCI 598 B: Fault Tolerant Computing, Fall 2008 (CSM)
  • CS 550: Distributed Computing Systems, Spring 2008 (JSU)
  • CS 553: Simulation, Modeling, and Forecasting, Fall 2007 (JSU)
  • CS 450/450G: Computer Networking, Spring 2007 (JSU)
  • CS 331: Data Structures and Algorithms, Fall 2006, Fall 2007 (JSU)
  • MS 112: Pre-calculus Algebra, Fall 2007, Spring 2008 (JSU)
  • CS 201: Introduction to Information Technology, Fall 2006, Spring 2007, Fall 2007, Spring 2008 (JSU)


Publications (student co-authors underlined)

Journal Articles
  • "Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing."
    Zizhong Chen and Jack Dongarra.
    IEEE Transactions on Computers, Vol. 58, No. 11, November, 2009.
  • "Pipelining Parallel Image Compositing and Delivery for Efficient Remote Visualization."
    Qishi Wu, Jinzhu Gao, Zizhong Chen, and Mengxia Zhu.
    Journal of Parallel and Distributed Computing, Vol. 69, No. 3, March, 2009.
  • "Algorithm-Based Fault Tolerance for Fail-Stop Failures."
    Zizhong Chen and Jack Dongarra.
    IEEE Transactions on Parallel and Distributed Systems, Vol. 19, No. 12, December, 2008.
  • "Recovery Patterns for Iterative Methods in a Parallel Unstable Environment."
    Julien Langou, Zizhong Chen, George Bosilca, and Jack Dongarra.
    SIAM Journal on Scientific Computing, 30(1):102-116, 2007.
  • "Self Adapting Numerical Software (SANS) Effort."
    Jack Dongarra, George Bosilca, Zizhong Chen, Victor Eijkhout, Graham Fagg, Erika Fuentes, Julien Langou, Piotr Luszczek, Jelena Pjesivac-Grbovic, Keith Seymour, Haihang You, and Sathish S. Vadhiyar.
    IBM Journal of Research and Development, Volume 50, Number 2/3, Page 223-238, 2006.
  • "Process Fault-Tolerance: Semantics, Design and Applications for High Performance Computing."
    Graham E. Fagg, Edgar Gabriel, Zizhong Chen, Thara Angskun, George Bosilca, Jelena Pjesivac-Grbovic, and Jack Dongarra.
    International Journal of High Performance Computing Applications, Volume 19, Number 4, Page 465-477, Winter, 2005.
  • "Condition Numbers of Gaussian Random Matrices."
    Zizhong Chen and Jack J. Dongarra.
    SIAM Journal on Matrix Analysis and Applications, Volume 27, Number 3, Page 603-620, 2005.
  • "Self Adapting Software for Numerical Linear Algebra and LAPACK for Clusters."
    Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche.
    Parallel Computing, Volume 29, Number 11-12, Page 1723-1743, November-December, 2003.
Book Chapters
  • "Scalable Fault Tolerance for Large-Scale Parallel and Distributed Computing."
    Zizhong Chen.
    Handbook of Research on Scalable Computing Technologies, IGI Global, 2009.
  • "Disaster Survival Guide in Petascale Computing: An Algorithmic Approach."
    Jack J. Dongarra, Zizhong Chen, George Bosilca, and Julien Langou.
    Petascale Computing: Algorithms and Applications, Chapman & Hall / CRC Press, 2007.
Papers in Conference Proceedings
  • "Optimal Real Number Codes for Fault Tolerant Matrix Operations."
    Zizhong Chen.
    Proceedings of the ACM/IEEE SC09 Conference, Portland, OR, November 14-20, 2009. ACM Press. 22% acceptance ratio.
  • "N-Level Diskless Checkpointing."
    Doug Hakkarinen and Zizhong Chen.
    Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications (HPCC-09), Seoul, Korea, June 25-27, 2009. IEEE Computer Society Press. 23% acceptance ratio.
  • "A Scalable Checkpoint Encoding Algorithm for Diskless Checkpointing."
    Zizhong Chen and Jack Dongarra.
    Proceedings of the 11th IEEE High Assurance Systems Engineering Symposium (HASE'08), Nanjing, China, December 3-5, 2008. IEEE Computer Society Press. 22% acceptance ratio for full papers.
  • "Extending Algorithm-based Fault Tolerance to Tolerate Fail-stop Failures in High Performance Distributed Environments."
    Zizhong Chen.
    Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, DPDNS'08 Workshop, Miami, FL, USA, April 14-18, 2008. IEEE Computer Society Press.
  • "Performance of MPI Broadcast Algorithms."
    Daniel M. Wadsworth and Zizhong Chen.
    Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, PDSEC'08 Workshop, Miami, FL, USA, April 14-18, 2008. IEEE Computer Society Press.
  • "An Efficient Packet Loss Recovery Methodology for Video-over-IP."
    Ming Yang, Nikolaos Bourbakis, Zizhong Chen, and Guillermo Francia, III.
    Proceedings of the 9th IASTED International Conference on Signal and Image Processing (SIP2007), Honolulu, Hawaii, USA, August 20-22, 2007.
  • "An Efficient Recovery Scheme for Supercomputing Clusters and Grids."
    Zizhong Chen, Ming Yang, Monica Trifas, and Jack Dongarra.
    Proceedings of the 6th International Conference on Distributed Computing and Applications for Business, Engineering and Sciences (DCABES2007), Yichang, Hubei, P. R. China, August 14-17, 2007.
  • "Self Adaptive Application Level Fault Tolerance for Parallel and Distributed Computing."
    Zizhong Chen, Ming Yang, Guillermo Francia, III, and Jack Dongarra.
    Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium, DPDNS'07 Workshop, Long Beach, CA, USA, March 26-29, 2007. IEEE Computer Society Press.
  • "An Efficient Audio-Video Synchronization Methodology."
    Ming Yang, Nikolaos Bourbakis, Zizhong Chen, and Monica Trifas.
    Proceedings of the 2007 IEEE International Conference on Multimedia & Expo (ICME 2007), Beijing, P. R. China, July 2-5, 2007. IEEE Computer Society Press.
  • "Algorithm-Based Checkpoint-Free Fault Tolerance for Parallel Matrix Computations on Volatile Resources."
    Zizhong Chen and Jack Dongarra.
    Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2006), Rhodes Island, Greece, April 25-29, 2006. IEEE Computer Society Press.
  • "Fault Tolerant High Performance Computing by a Coding Approach."
    Zizhong Chen, Graham E. Fagg, Edgar Gabriel, Julien Langou, Thara Angskun, George Bosilca, and Jack J. Dongarra.
    Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'05), Chicago, Illinois, USA, June 15-17, 2005. ACM Press.
  • "Numerically Stable Real Number Codes Based on Random Matrices."
    Zizhong Chen and Jack J. Dongarra.
    Proceedings of the 5th International Conference on Computational Science (ICCS2005), Atlanta, Georgia, USA, May 22-25, 2005. LNCS 3514, Springer-Verlag.
  • "Extending the MPI Specification for Process Fault Tolerance on High Performance Computing Systems."
    Graham E. Fagg, Edgar Gabriel, George Bosilca, Thara Angskun, Zizhong Chen, Jelena Pjesivac-Grbovic, Kevin London, and Jack J. Dongarra.
    Proceedings of the 19th International Supercomputer Conference (ISC2004), Heidelberg, Germany, June 21-24, 2004.
  • "LAPACK for Clusters Project: An Example of Self Adapting Numerical Software."
    Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche.
    Proceedings of the 37th Hawaii International Conference on System Sciences (HICSS-37), Kauai, Hawaii, USA, January 5-8, 2004. IEEE Computer Society Press.
  • "Fault Tolerant Communication Library and Applications for High Performance Computing."
    Graham E. Fagg, Edgar Gabriel, Zizhong Chen, Thara Angskun, George Bosilca, Antonin Bukovsky, and Jack J. Dongarra.
    Proceedings of the 4th Los Alamos Computer Science Institute Symposium (LACSI'03), Santa Fe, NM, USA, October 27-29, 2003.
  • "Self Adaptive Software for Numerical Linear Algebra Library Routines on Clusters."
    Zizhong Chen, Jack Dongarra, Piotr Luszczek, and Kenneth Roche.
    Proceedings of the 3rd International Conference on Computational Science, WoPLA'03 Workshop, Melbourne, Australia, June 7-9, 2003. LNCS 2659, Springer-Verlag.


Presentations
    - 2009 -
  • Invited Talk, HPC Resiliency Summit at Los Alamos Computer Science Symposium 2009 (LACSS 2009), Santa Fe, New Mexico, USA, Oct 14th, 2009.
  • SIAT Seminar, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, July 24, 2009.
  • Computational Mathematics Colloquium, University of Colorado, Denver, Colorado, April 29, 2009.
  • MCS Colloquium, Colorado School of Mines, Golden, Colorado, March 27, 2009.
  • CARDI Lunch Talk, Colorado School of Mines, Golden, Colorado, March 19, 2009.

    - 2008 -
  • Engineering Graduate Colloquium, Colorado School of Mines, Golden, Colorado, November 11, 2008.
  • BMAC Seminar, Colorado State University, Fort Collins, Colorado, November 10, 2008.
  • Invited Talk, Colorado School of Mines, Golden, Colorado, February 11, 2008.
  • The 13th IEEE Workshop on Dependable Parallel, Distributed, and Network-Centric Systems (DPDNS 2008), Miami, Florida, USA, April 14-18, 2008.

    - 2007 -
  • Colloquium, Jacksonville State University, Jacksonville, Alabama, June 20, 2007.
  • The 12th IEEE Workshop on Dependable Parallel, Distributed, and Network-Centric Systems (DPDNS 2007), Long Beach, California, USA, March 26-29, 2007.

    - 2006 -
  • Colloquium, Jacksonville State University, Jacksonville, Alabama, September 14, 2006.
  • Invited Talk, University at Albany, State University of New York, Albany, New York, USA, May 10, 2006.
  • Invited Talk, Montana State University, Bozeman, Montana, April 27, 2006.
  • Invited Talk, Florida International University, Miami, Florida, USA, March 9, 2006.
  • The 12th SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, California, USA, February 22-24, 2006.
  • Ph.D. Dissertation Defense, Knoxville, Tennessee, USA, February 20, 2006.

    - 2005 -
  • The 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 05), Chicago, Illinois, USA, June 15-17, 2005.
  • Invited Talk, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA, July 12, 2005.
  • The 12th International Linear Algebra Society Conference, Regina, Saskatchewan, Canada, June 26-29, 2005.
  • The 5th International Conference on Computational Science, Atlanta, Georgia, USA, May 22-25, 2005.
  • The 7th IMACS International Symposium on Iterative Methods in Scientific Computing, Toronto, Ontario, Canada, May 5-8, 2005.

    - 2004 -
  • ICL 2004 Retreat, Townsend, Tennessee, USA, September 2-3, 2004.

Professional Activities

Program/General/Steering/Publicity Chair
  • Program Chair, The 2009 International Symposium on Scientific and Engineering Computing (SEC-09), Vancouver, Canada, August 29-31, 2009.
  • Workshop Chair, the 12th IEEE International Conference on Computational Science and Engineering (CSE-09), Vancouver, Canada, August 29-31, 2009.
  • Program Vice Chair, the 11th IEEE International Conference on High Performance Computing and Communications (HPCC-09), Seoul, Korea, June 25-27, 2009.
  • Program Co-Chair, the 9th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC-08), Miami, Florida, USA, April 14-18, 2008.
Technical Program Committee
  • The 3rd International Workshop on Resiliency in High Performance Computing (Resilience 2010), Melbourne, Victoria, Australia, May 17-20, 2010.
  • The 12th IEEE International Conference on Computational Science and Engineering (CSE-09), Vancouver, Canada, August 29-31, 2009.
  • The 2009 IEEE Workshop on Grid and P2P Systems and Applications (GridPeer 2009), San Francisco, CA, USA, August 2-6, 2009.
  • The 10th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC-09), Rome, Italy, May 25-29, 2009.
  • The 2009 International Conference on Bio-Science and Bio-Technology (BSBT 2009), Jeju Island, Korea, December 10-12, 2009.
  • The 4th International Symposium on Innovations and Real-time Applications of Distributed Sensor Networks (IRA-DSN 2009), Hangzhou, P. R. China, May 18-21, 2009.
  • The 10th IEEE International Conference on High Performance Computing and Communications (HPCC-08), Dalian, China, September 25-27, 2008.
  • The 2008 International Conference on Bio-Science and Bio-Technology (BSBT 2008), Hainan Island, China, December 13-15, 2008.
  • The 2007 IEEE International Conference on Networking, Architecture, and Storage (NAS-07), Guilin, P. R. China, July 29-31, 2007.
Reviewer for Journals and Conferences
  • IEEE Transactions on Parallel and Distributed Systems
  • IEEE Transactions on Dependable and Secure Computing
  • IEEE Transactions on Circuits and Systems I
  • ACM Transactions on Autonomous Adaptive Systems
  • ACM Journal of Experimental Algorithmics
  • Journal of Supercomputing
  • International Journal of Distributed Sensor Networks
  • International Journal of Parallel, Emergent and Distributed Systems
  • International Journal of Future Generation Communication and Networking
  • BMC Genomics
  • Journal of Parallel and Distributed Computing
  • Various conferences and workshops in computer science and engineering

Reviewer for Grant Proposals
  • National Telecommunications and Information Administration, U.S. Department of Commerce









Last update: Sept 16, 2009