Electrical Engineering and Computer Sciences


UC Berkeley


2009 Research Summary

System-level Design and Analysis of Fault Tolerant Heterogeneous Systems

Mark Lee McKelvin Jr, Claudio Pinello and Alberto L. Sangiovanni-Vincentelli

Gigascale Systems Research Center, NSF ITR CCR-0225610, University of California Mentored Research Award, and Intel

Mission-critical applications increasingly rely on electronic hardware and software that must be designed to be robust and reliable despite faults that may occur during operation. Examples can be found in the automotive, avionics, industrial process control, and medical industries. Furthermore, as the complexity of such systems increases, their heterogeneity increases as well, because a wider range of applications is being realized. Yet system designers must still meet stringent time-to-market deadlines while keeping the performance and cost of the resulting design acceptable.

Traditional approaches to the design of mission-critical systems first design the system, then analyze it for reliability, and finally exchange critical components or introduce fault-tolerant mechanisms in order to satisfy the given reliability constraints. Unfortunately, this procedure may lead to designs that are suboptimal with respect to the desired fault-tolerance and reliability properties.

In this project, we propose to address fault-tolerance and reliability properties early in the design process, at the highest level of abstraction, as system-level constraints on system performance. By addressing the design of fault-tolerant and reliable systems early in the design phase at the system level, we expect to explore the design space by automatically synthesizing architectures that yield potentially better solutions with respect to cost and performance, and to do so faster and more efficiently than traditional design approaches. In our previous work, we investigated the use of fault tree analysis to evaluate a given architecture [1,2], and we addressed the modeling of fault-tolerant systems using a well-defined computational model called Fault-Tolerant Data Flow [3].
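
As a rough illustration of what evaluating an architecture with a fault tree can involve, the sketch below computes the probability of a top-level failure event from AND and OR gates over basic events. It is not the method of [1,2]: the class names, the example architecture (two redundant controllers on a shared bus), and all failure probabilities are hypothetical, and the calculation assumes statistically independent basic events that each appear only once in the tree.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Union


@dataclass
class BasicEvent:
    """Leaf of the fault tree: a component failure (hypothetical values)."""
    name: str
    probability: float  # assumed failure probability over the mission time


@dataclass
class Gate:
    """Internal node combining the failures of its inputs."""
    kind: str  # "AND" (all inputs fail) or "OR" (at least one input fails)
    inputs: List["Node"]


Node = Union[BasicEvent, Gate]


def failure_probability(node: Node) -> float:
    """Probability of the event at `node`, assuming independent,
    non-repeated basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    probs = [failure_probability(child) for child in node.inputs]
    if node.kind == "AND":
        return prod(probs)
    if node.kind == "OR":
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate kind: {node.kind!r}")


# Hypothetical architecture: the system fails if both redundant
# controllers fail, or if the shared bus fails.
top = Gate("OR", [
    Gate("AND", [BasicEvent("controller_a", 0.01),
                 BasicEvent("controller_b", 0.01)]),
    BasicEvent("shared_bus", 0.001),
])
p_top = failure_probability(top)
print(f"top-event probability: {p_top:.7f}")
```

In this made-up example the shared bus dominates the result, which is the kind of observation that could guide where redundancy is introduced. Trees with repeated or dependent events need a different evaluation, for example one based on minimal cut sets.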

[1] M. L. McKelvin, Jr., C. Pinello, S. Kanajan, J. Wysocki, and A. Sangiovanni-Vincentelli, "Model-Based Design of Heterogeneous Systems for Fault Tree Analysis," 24th International System Safety Conference, July 2006, pp. 400-409.
[2] M. L. McKelvin, Jr., G. Eirea, C. Pinello, S. Kanajan, and A. Sangiovanni-Vincentelli, "A Formal Approach to Fault Tree Synthesis for the Analysis of Distributed Fault Tolerant Systems," Proc. 5th ACM International Conference on Embedded Software, Jersey City, NJ, September 2005, pp. 237-246.
[3] M. McKelvin, J. Sprinkle, C. Pinello, and A. Sangiovanni-Vincentelli, "Fault Tolerant Data Flow Modeling Using the Generic Modeling Environment," 12th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS), Greenbelt, MD, April 2005.