G. Călin Caşcaval

Office address
Barefoot Networks
2185 Park Blvd, Palo Alto, CA, 94306
Email: cascaval@acm.org
LinkedIn: https://www.linkedin.com/in/cascaval

Dr. Calin Cascaval is Director of Compilers and Tools at Barefoot Networks, where he is in charge of the compiler toolchain for the P4 language. Previously he held positions at Qualcomm Research and IBM Research. As Sr. Director at Qualcomm Research, he led Qualcomm's strategy on power aware computing and the development of the Qualcomm Symphony System Manager (formerly known as Qualcomm MARE), a runtime system and programming model for heterogeneous computing. He also led the team that developed parallel math libraries, the Zoomm parallel browser, and the MCJS parallel JavaScript engine. At the IBM T.J. Watson Research Center, Calin worked on systems software, programming models, and compilers for large scale parallel systems projects, including Blue Gene and PERCS. Calin has a PhD in Computer Science from the University of Illinois at Urbana-Champaign. He is a senior member of ACM and IEEE. Calin has more than 50 peer-reviewed publications and more than 20 awarded patents. Calin works extensively with academia. As part of his service activities, he is currently the Steering Committee chair for the ACM SIGPLAN PPoPP Symposium and Guest Editor for the 2015 IEEE Micro Special Issue on Mobile Systems. He is passionate about programming and making it easier to target concurrent systems.

Professional Experience

Sept 2016 - present
Director
Barefoot Networks, Palo Alto, CA.

Chief Bit Orchestrator -- in charge of compilers and tools for the P4 language.

Oct 2009 - Sept 2016
Sr. Director of Engineering (2013-), Director (2009-2013)
Qualcomm Research Silicon Valley, Santa Clara, CA.

Led Qualcomm's Power Aware Computing strategy. Led projects on heterogeneous mobile computing, including the Symphony System Manager, and on parallel libraries, e.g., best-in-class ARM math libraries (Snapdragon Math Libraries).
Led the development of the first end-to-end parallel browser and parallel JavaScript engine for mobile devices.
Responsible for management, mentoring, and hiring.
Led collaborations with academia (UC Berkeley, UIUC, UT Austin, University of Washington, Georgia Tech).

Sept 2004 - Oct 2009
Manager, Programming Models and Tools for Scalable Systems Group
IBM T.J. Watson Research Center, Yorktown Heights, NY.

Responsibilities included leading research projects with globally distributed teams of 10-15 people and managing a 7-person research team, as well as participating in developing and driving the IBM Research strategy in compilers and systems software.

Leader of the PERCS (DARPA HPCS) Compilers team. As part of this effort we explored and developed a number of technologies to improve programmer productivity.

Led exploratory projects in parallel programming models and parallel languages to improve programmer productivity.

Mentoring IBM employees, PhD students and student interns.
Extensive collaborations with academia.

July 2000 - Sept. 2004
Research Staff Member
IBM T.J. Watson Research Center, Yorktown Heights, NY.
Participated in the design and development of system software and performance analysis tools as part of several projects.

Aug. 1996 - June 2000
Graduate Research Assistant
Computer Science Department, Univ. of Illinois at Urbana-Champaign, Urbana, IL.
Involved in the Polaris and Delphi projects, working on compile-time performance prediction, data locality, and parallel programming models.

May 1995 - July 1996
Research Associate
CyberMarche, Inc., Morgantown WV
Responsible for design, implementation, and testing in several projects targeted towards knowledge gathering and sharing for software project management.
Additional responsibilities included system administration for a computer network consisting of Sun and IBM-PC computers.

Aug. 1993 - May 1995
Graduate Research Assistant
Concurrent Engineering Research Center-WVU, Morgantown, WV

Jun. 1991 - Aug. 1993
Research Associate
IPA (Institute for Design in Automation), Cluj-Napoca, Romania

Education

PhD in Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, June 2000
Thesis Title: Compile-time Performance Prediction of Scientific Programs. Available as UIUC Technical Report UIUCDCS-R-2000-2167.
MS in Computer Science, West Virginia University, Morgantown, WV, May 1995
Thesis Title: Optimizing Communication in Parallel Compilers
MS in Computer Engineering, Technical University Cluj-Napoca, Romania, June 1991
Thesis Title: Sistem pentru Analiza si Recunoasterea Semnalului Vocal (Speech Analysis and Recognition System)

Awards

Keynote Presentations

  1. Qualcomm Symphony: Orchestrating Heterogeneity for Power Aware Computing, Workshop on Architectures and Systems for Real-time Mobile Vision Applications (ASR-MOV) - In conjunction with CGO 2016
  2. Are scripting languages ready for mobile computing?, The 2014 International Symposium on Code Generation and Optimization, CGO 2014
  3. Parallel Programming for Mobile Computing, The 22nd International Conference on Parallel Architectures and Compilation Techniques, PACT 2013
  4. Programming for Mobile Gadgets, The First International Workshop on Parallelism in Mobile Platforms, PRISM-1 - In conjunction with HPCA 2013
  5. Power Programming, Programming Models for Emerging Architectures (PMEA) - In conjunction with PACT 2010

Publications

Google Scholar citations: 3368 (as of 9/12/2016).

Five most relevant recent publications:
  1. Deoptimization for dynamic language JITs on typed, stack-based virtual machines
    Madhukar N. Kedlaya, Behnam Robatmili, Calin Cascaval, Ben Hardekopf
    Virtual Execution Environments (VEE 2014), Mar 2014. Best Paper Award.

  2. Zoomm: A Parallel Web Browser Engine for Multicore Mobile Devices
    Calin Cascaval, Seth Fowler, Pablo Montesinos, Wayne Piekarski, Mehrdad Reshadi, Behnam Robatmili, Michael Weber, and Vrajesh Bhavsar
    Proceedings of The 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2013), Shenzhen, China, Feb 2013.

  3. How Much Parallelism is There in Irregular Applications?
    Milind Kulkarni, Martin Burtscher, R. Inkulu, Keshav Pingali, Calin Cascaval
    Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'09, Raleigh, NC, February 2009

  4. Software transactional memory: why is it only a research toy?
    Calin Cascaval, Colin Blundell, Maged Michael, Harold W. Cain, Peng Wu, Stefanie Chiras, Sid Chatterjee
    Communications of the ACM, Nov 2008

  5. Bulk Disambiguation of Speculative Threads in Multiprocessors
    Luis Ceze, James Tuck, Calin Cascaval, and Josep Torrellas
    Proceedings of the 33rd Annual International Symposium on Computer Architecture, ISCA 2006, Boston, MA, June 2006

  1. Concurrency in Mobile Browser Engines
    Calin Cascaval, Pablo Montesinos-Ortego, Behnam Robatmili, Dario Suarez-Gracia
    IEEE Pervasive Computing, Vol. 14, Issue 3, July-Sept, 2015

  2. MuscalietJS: rethinking layered dynamic web runtimes
    Behnam Robatmili, Calin Cascaval, Mehrdad Reshadi, Madhukar N. Kedlaya, Seth Fowler, Vrajesh Bhavsar, Michael Weber, Ben Hardekopf
    Virtual Execution Environments (VEE 2014), Mar 2014

  3. Deoptimization for dynamic language JITs on typed, stack-based virtual machines
    Madhukar N. Kedlaya, Behnam Robatmili, Calin Cascaval, Ben Hardekopf
    Virtual Execution Environments (VEE 2014), Mar 2014. Best Paper Award.

  4. Zoomm: A Parallel Web Browser Engine for Multicore Mobile Devices
    Calin Cascaval, Seth Fowler, Pablo Montesinos, Wayne Piekarski, Mehrdad Reshadi, Behnam Robatmili, Michael Weber, and Vrajesh Bhavsar
    Proceedings of The 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2013), Shenzhen, China, Feb 2013.

  5. Automatic Discovery of Performance and Energy Pitfalls in HTML and CSS
    Adrian Sampson, Calin Cascaval, Luis Ceze, Pablo Montesinos, and Dario Suarez-Gracia
    2012 IEEE International Symposium on Workload Characterization (IISWC), Nov 2012

  6. Multidimensional dynamic behavior in mobile computing
    Mehrdad Reshadi, Calin Cascaval
    2012 IEEE International Symposium on Workload Characterization, Nov 2012

  7. A Case for Parallelizing Web Pages
    Haohui Mai, Shuo Tang, Samuel T. King, Calin Cascaval, Pablo Montesinos
    4th USENIX Workshop on Hot Topics in Parallelism (HotPar'12), Jun 2012

  8. Heterogenous Systems Programming
    Calin Cascaval, Pablo Montesinos
    2nd Workshop on SoC Architecture, Accelerators and Workloads (SAW-2), Feb, 2011

  9. A Taxonomy of Accelerator Architectures and their Programming Models
    Calin Cascaval, Siddhartha Chatterjee, Hubertus Franke, Kevin Gildea, and Pratap Pattnaik
    IBM Journal of Research and Development, vol 54, issue 5, Sept/Oct 2010

  10. The Bulk Multicore Architecture for Improved Programmability
    Josep Torrellas, Luis Ceze, James Tuck, Calin Cascaval, Pablo Montesinos, Wonsun Ahn, Milos Prvulovic
    Communications of the ACM, Dec 2009

  11. Analytical Modeling of Pipeline Parallelism
    Angeles Navarro, Rafael Asenjo, Siham Tabik and Calin Cascaval
    Proceedings of the IEEE International Conference on Parallel Architectures and Compilation Techniques (PACT 2009), Raleigh, NC, Sept 2009

  12. Load balancing using work-stealing for pipeline parallelism in emerging applications
    Angeles Navarro, Rafael Asenjo, Siham Tabik, Calin Cascaval
    ICS '09 Proceedings of the 23rd International conference on Supercomputing, 2009

  13. Porting k-means clustering to accelerators with the APGAS runtime
    David Cunningham, Sayantan Sur, George Almasi, Vijay Saraswat, and Calin Cascaval
    Proceedings of the first Workshop on Asynchrony in the Partition Global Address Space Languages, Yorktown Heights, NY, June 2009 (with ICS09)

  14. Parallelization Spectroscopy: Analysis of Thread-level Parallelism in HPC Programs
    Arun Kejariwal and Calin Cascaval
    Proceedings of the 2nd Workshop on Parallel Execution of Sequential Programs on Multi-core Architectures (PESPMA 2009), Austin, TX, June 2009

  15. Scalable RDMA performance in PGAS languages
    Montse Farreras, George Almasi, Calin Cascaval, and Toni Cortes
    Proceedings of the IEEE International Parallel & Distributed Processing Symposium, Rome, Italy, May 2009

  16. Lonestar: A Suite of Parallel Irregular Programs
    Milind Kulkarni, Martin Burtscher, Calin Cascaval, and Keshav Pingali
    Proceedings of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software, Boston, MA, April 2009

  17. How much parallelism is there in irregular applications?
    Milind Kulkarni, Martin Burtscher, R. Inkulu, Keshav Pingali, Calin Cascaval
    Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'09, Raleigh, NC, February 2009

  18. Parallelization Spectroscopy: Analysis of thread level parallelism in HPC programs
    Arun Kejariwal, Calin Cascaval
    Poster presentation at the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'09, Raleigh, NC, February 2009

  19. Software transactional memory: why is it only a research toy?
    Calin Cascaval, Colin Blundell, Maged Michael, Harold W. Cain, Peng Wu, Stefanie Chiras, Sid Chatterjee
    Communications of the ACM, Nov 2008

  20. Compiler and Runtime Techniques for Software Transactional Memory Optimization
    Peng Wu, Maged Michael, Christoph von Praun, Takuya Nakaike, Rajesh Bordawekar, Harold W. Cain, Calin Cascaval, Siddhartha Chatterjee, Stefanie Chiras, Rui Hou, Mark Mergen, Xiaowei Shen, Michael F. Spear, Hua Yong Wang, Kun Wang
    In the Journal of Concurrency and Computation: Practice and Experience

  21. Compiler-driven Program Dependence Profiling to Guide Program Parallelization
    Peng Wu, Arun Kejariwal, and Calin Cascaval
    Proceedings of the 21st International Workshop on Languages and Compilers for Parallel Computing, LCPC'08, Edmonton, AB, 2008


  22. Modeling optimistic concurrency using quantitative dependence analysis
    Christoph von Praun, Rajesh Bordawekar, and Calin Cascaval
    Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'08, Salt Lake City, UT, February 2008

  23. Concurrency Control with Data Coloring
    Luis Ceze, Christoph von Praun, Calin Cascaval, Pablo Montesinos, and Josep Torrellas
    Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, Seattle, WA, March 2008

  24. Multidimensional blocking in UPC
    Christopher Barton, Calin Cascaval, George Almasi, Rahul Garg, Jose Nelson Amaral, and Montse Farreras
    Proceedings of the 20th International Workshop on Languages and Compilers for Parallel Computing, LCPC'07, Urbana, IL, 2007

  25. Implicit parallelism with ordered transactions
    Christoph von Praun, Luis Ceze, and Calin Cascaval
    Proceedings of the 2007 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'07, San Jose, CA, March 2007

  26. A Characterization of Shared Data Access Patterns in UPC Programs
    Christopher Barton, Calin Cascaval, and Jose Nelson Amaral
    Proceedings of the 19th International Workshop on Languages and Compilers for Parallel Computing, LCPC'06, New Orleans, LA, 2006

  27. Bulk Disambiguation of Speculative Threads in Multiprocessors
    Luis Ceze, James Tuck, Calin Cascaval, and Josep Torrellas
    Proceedings of the 33rd Annual International Symposium on Computer Architecture, ISCA 2006, Boston, MA, June 2006

  28. Shared Memory Programming for Large Scale Machines
    Christopher Barton, Calin Cascaval, George Almasi, Yili Zheng, Montse Farreras, Siddhartha Chatterjee, Jose Nelson Amaral
    Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, PLDI 2006, Ottawa, Canada, June 2006

  29. Performance and environment monitoring for Continuous Program Optimization
    Calin Cascaval, Evelyn Duesterwald, Peter F. Sweeney, Robert W. Wisniewski
    IBM Journal of Research and Development, Volume 50, Number 2/3, March 2006

  30. Multiple Page Size Modeling and Optimization
    Calin Cascaval, Evelyn Duesterwald, Peter F. Sweeney, and Robert W. Wisniewski
    Proceedings of the Fourteenth International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), Sept. 2005, St. Louis, MO

  31. Optimizing NANOS OpenMP for the IBM Cyclops Multithreaded Architecture
    David Rodenas, Xavier Martorell, Eduard Ayguade, Jesus Labarta, George Almasi, Calin Cascaval, Jose Castanos, Jose E. Moreira
    Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05), April 2005, Denver, CO.

  32. Performance and Environment Monitoring for Whole-System Characterization and Optimization
    Robert W. Wisniewski, Peter F. Sweeney, Kartik Sudeep, Matthias Hauswirth, Evelyn Duesterwald, Calin Cascaval, and Reza Azimi
    The first Watson Conference on Interaction between Architecture, Circuits, and Compilers, Oct. 2004, Yorktown Heights, NY.

  33. Characterizing and Predicting Program Behavior and its Variability
    Evelyn Duesterwald, Calin Cascaval, and Sandhya Dwarkadas
    Proceedings of the Twelfth International Conference on Parallel Architectures and Compilation Techniques (PACT 2003), Sept. 2003, New Orleans, LA.

  34. An Overview of the Blue Gene/L System Software Organization
    George Almasi, Ralph Bellofatto, Calin Cascaval, Jose G. Castanos, Luis Ceze, Paul Crumley, C. Christopher Erway, Joseph Gagliano, Derek Lieber, Jose E. Moreira, Alda Sanomiya, and Karin Strauss
    Proceedings of the International Conference on Parallel and Distributed Computing (Euro-Par 2003) Aug. 2003, Klagenfurt, Austria.

  35. Estimating Cache Misses and Locality Using Stack Distances
    Calin Cascaval and David A. Padua
    Proceedings of the International Conference on Supercomputing (ICS 2003), June 2003, San Francisco, California, USA.

  36. Evaluation of OpenMP for the Cyclops Multithreaded Architecture
    George Almasi, Eduard Ayguade, Calin Cascaval, Jose Castanos, Jesus Labarta, Francisco Martinez, Xavier Martorell, Jose Moreira
    Proceedings of the Workshop on OpenMP Applications and Tools (WOMPAT 2003), Jun. 2003, Toronto, Canada.

  37. Full Circle: Simulating Linux Clusters on Linux Clusters
    Luis Ceze, Karin Strauss, George Almasi, Patrick Bohrer, Jose R. Brunheroto, Calin Cascaval, Jose G. Castanos, Derek Lieber, Jose E. Moreira, Alda Sanomiya, and Eugen Schenfeld
    Proceedings of the Fourth LCI International Conference on Linux Clusters, Jun. 2003, San Jose, CA.

  38. System Management in the BlueGene/L Supercomputer
    G. Almasi, L. Bachega, R. Bellofatto, J. Brunheroto, C. Cascaval, J. Castanos, P. Crumley, C. Erway, J. Gagliano, D. Lieber, P. Mindlin, J.E. Moreira, R.K. Sahoo, A. Sanomiya, E. Schenfeld, R. Swetz, M. Bae, G. Laib, K. Ranganathan, Y. Aridor, T. Domany, Y. Gal, O. Goldshmidt, E. Shmueli
    Proceedings of the Third Workshop on Massively Parallel Processing (WMPP 2003), Apr. 2003, Nice, France.

  39. An Overview of the BlueGene/L Supercomputer
    N. Adiga et al.
    In Supercomputing, Nov, 2002

  40. Dissecting Cyclops: A Detailed Analysis of a Multithreaded Architecture
    George Almasi, Calin Cascaval, Jose G. Castanos, Monty Denneau, Derek Lieber, Jose E. Moreira, and Henry S. Warren, Jr.
    Proceedings of the Workshop on Chip MultiProcessor: Processor Architecture and Memory Hierarchy Related Issues (MEDEA 2002), Sept. 2002, Charlottesville, VA

  41. Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflops Computer
    George S. Almasi, Calin Cascaval, Jose G. Castanos, Monty Denneau, Wilm Donath, Maria Eleftheriou, Mark Giampapa, Howard Ho, Derek Lieber, Jose E. Moreira, Dennis Newns, Marc Snir, Henry S. Warren, Jr.
    International Journal of Parallel Programming (IJPP), 30(4), Aug. 2002.

  42. Calculating Stack Distances Efficiently
    George Almasi, Calin Cascaval, and David A. Padua
    Proceedings of the Workshop on Memory System Performance (MSP 2002), June 2002, Berlin, Germany

  43. A survey of compiler techniques for energy efficient computing
    Calin Cascaval, Jose G. Castanos, Derek Lieber, and Jose E. Moreira
    Austin Conference on Energy-Efficient Design, Feb. 2002, Austin, TX.

  44. Evaluation of a Multithreaded Architecture for Cellular Computing
    Calin Cascaval, Jose G. Castanos, Luis Ceze, Monty Denneau, Manish Gupta, Derek Lieber, Jose E. Moreira, Karin Strauss, Henry S. Warren, Jr.
    Proceedings of the 8th International Symposium on High-Performance Computer Architecture, Feb 2002, Cambridge, MA, USA

  45. Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflop Computer
    George S. Almasi, Calin Cascaval, Jose G. Castanos, Monty Denneau, Wilm Donath, Maria Eleftheriou, Mark Giampapa, Howard Ho, Derek Lieber, Jose E. Moreira, Dennis Newns, Marc Snir, Henry S. Warren, Jr.
    Proceedings of the International Conference on Supercomputing (ICS 2001), June 2001, Sorrento, Italy

  46. Blue Gene: A vision for protein science using a petaflop supercomputer
    IBM Blue Gene team
    IBM Systems Journal, Volume 40, Number 2, 2001

  47. Compile-time Based Performance Prediction
    Calin Cascaval, Luiz DeRose, David Padua, Daniel Reed
    Proceedings of the Twelfth International Workshop on Languages and Compilers for Parallel Computing (LCPC99).

  48. MATmarks: A Shared Memory Environment for MATLAB Programming
    George Almasi, Calin Cascaval, and David A. Padua
    Poster presentation to HPDC'99.

  49. PACT - A Software Package to Manage Projects and Coordinate People
    K.J. Cleetus, Calin Cascaval, and K. Matsuzaki
    Proceedings of the fifth WETICE, Stanford, CA, June 1996, published by the IEEE Computer Society Press.

  50. Web* - A Technology to Make Information Available on the Web
    George Almasi, Anca Suvaiala, Ion Muslea, Calin Cascaval, Tad Davis, V. "Juggy" Jagannathan
    Proceedings of the fourth workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, Berkeley Springs, West Virginia, April 1995, published by the IEEE Computer Society Press.

  51. TclDii: A TCL Interface to the Orbix(TM) Dynamic Invocation Interface
    George Almasi, Anca Suvaiala, Cristian Goina, Calin Cascaval, V. "Juggy" Jagannathan
    OOPSLA 1995

  52. A Collaborative Environment for Independent Verification and Validation of Software
    Raghu Karinthi, Kankanahalli Srinivas, Sumitra Reddy, Ramana Reddy, Calin Cascaval, Walter Jackson, Srinivasan Venkatraman, Honglan Zheng
    Proceedings of the third workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, Morgantown, West Virginia, April 1994, published by the IEEE Computer Society Press.

Books Co-Edited

Issued Patents

  1. US9372836: HTML5 I-frame extension, Reshadi, Cascaval.
  2. US9171097: Memoizing web-browsing computation with DOM-based isomorphism, Ceze, Cascaval, Wang, Mahan, Dhillon, Ruotsi, Mandyam.
  3. US9092327: System and method for allocating memory to dissimilar memory devices using quality of service, De, Stewart, Cascaval, Chun.
  4. US9003380: Execution of dynamic languages via metadata extraction, Cascaval, Reshadi.
  5. US8886887: Uniform external and internal interfaces for delinquent memory operations to facilitate cache optimization, Cascaval, Gao, Martin, Mendell.
  6. US8595443: Varying a data prefetch size based upon data usage, Arimilli, Cascaval, Sinharoy, Speight, Zhang.
  7. US8572341: Overflow handling of speculative store buffers, Blundell, Cain, Cascaval, Michael.
  8. US8539486: Transactional block conflict resolution based on the determination of executing threads in parallel or in serial mode, Cain, Cascaval, Michael.
  9. US8510237: Machine learning method to identify independent tasks for parallel layout in web browsers, Cascaval, Sampson, Wang
  10. US8392694: System and method for software initiated checkpoint, Blundell, Cain, Cascaval, Michael.
  11. US8266381: Varying an amount of data retrieved from memory based upon an instruction hint, Arimilli, Cascaval, Sinharoy, Speight, Zhang.
  12. US8255913: Notification to task of completion of GSM operations by initiator node, Arimilli, Blackmore, Cascaval, Rajamony.
  13. US8255626: Atomic commit predicated on consistency of watches, Blundell, Cain, Cascaval, Michael.
  14. US8250307: Sourcing differing amounts of prefetch data in response to data prefetch requests, Arimilli, Cascaval, Sinharoy, Speight, Zhang.
  15. US8239879: Notification to task of completion of GSM operations at target node, Arimilli, Blackmore, Cascaval, Rajamony.
  16. US8136103: Combining static and dynamic compilation to remove delinquent loads, Cascaval, Gao, Kielstra, Stoodley.
  17. US8122439: Method and computer program product for dynamically and precisely discovering delinquent memory operations, Cascaval, Gao, Yotov.
  18. US7954094: Method for improving performance of executable code, Cascaval, Chatterjee, Duesterwald, Kielstra, Stoodley.
  19. US7610266: Method for vertical integrated performance and environment monitoring, Cascaval, Duesterwald, Sweeney, and Wisniewski.
  20. US7596680: System and method for encoding and decoding architecture registers, Cascaval and Chatterjee.
  21. US7380086: Scalable runtime system for global address space languages on shared and distributed memory machines, Archambault, Bolmarcich, Cascaval, Chatterjee, Eleftheriou, Mak.
  22. US7376808: Method and system for predicting the performance benefits of mapping subsets of application data to multiple page sizes, Cascaval, Duesterwald, Sweeney, Wisniewski.
  23. US7289939: Mechanism for on-line prediction of future performance measurements in a computer system, Cascaval, Duesterwald, and Dwarkadas.
  24. US7072805: Mechanism for on-line prediction of future performance measurements in a computer system, Cascaval, Duesterwald, and Dwarkadas.

Service

Steering Committee
Principles and Practice of Parallel Programming (PPoPP), Chair (2012-), Member (2011-)
Computing Frontiers (2011-2014)
Editorial
IEEE Micro Special Issue on Mobile Systems, Feb 2015
PhD Committee Member
Daniel Ahn, PhD, UIUC, 2012
Luis Ceze, PhD, UIUC, 2007
PhD Student Mentor
Luis Ceze, UIUC
Christopher (Kit) Barton, Univ. of Alberta
Karin Strauss, UIUC
Technical Program Committee
ASPLOS (2013-2016)
CGO (2004)
CPC (2009)
HotPar (2012)
ICPP (2008)
ICS (2012)
IEEE Micro Top Picks (2006, 2007)
IPDPS (2011)
LCPC (2005-2009, 2012-2015)
MICRO (2009)
PACT (2010, 2015, 2016)
PGAS (2009, 2010)
PLDI (2014)
PPoPP (2009, 2013)
SC (2007), numerous workshops
Organizing Committee
LCPC 2006, 2013 (General Chair)
SBAC-PAD 2012 (Program Vice Chair)
Computing Frontiers 2011 (General Chair)
PPoPP 2011 (General Chair)
ICPP 2011 (Program Vice Chair)
CPC 2009 (General Co-Chair)
PPoPP 2008 (Publications Chair)
PACT 2007 (Finance Chair)
PPoPP 2006 (Local Arrangements Chair)
External Review Committee
ASPLOS (2010, 2012, 2017)
ISCA (2010, 2016)
PLDI (2015)
PPoPP (2010)
Advisory Board Member
Center for Programmable Extreme-Scale Computing at Univ. of Illinois
NSF panelist
Guest Editor for IEEE Micro Special Issue on Mobile Systems.
Mentored 3 PhD students and several colleagues in IBM.
Steering Committee chair for PPoPP (2012-present)
General Chair PPoPP 2011, Computing Frontiers 2011, LCPC 2013, CPC 2009.
Program Committee Vice-Chair ICPP 2011.
Program Committee Member for ASPLOS, CGO, ICPP, ICS, IPDPS, LCPC, MICRO, PACT, PGAS, PLDI, PPoPP, SBAC-PAD, SC, and many workshops.
Organizing Committee Member for PPoPP, PACT, LCPC, and SBAC-PAD.
Panelist for NSF.
Reviewer for numerous journals and technical conferences.