Fault Tolerant Systems: Assignment 1

Reading:

First three items in suggested reading (two papers and a web reference).


The purpose of this homework is to test your fault-free programming skills.

Exercises:

  1. A chemical plant mixes two kinds of fluids. To keep things simple, we will call them ``HOT'' and ``COLD.'' A proper mix is to be maintained for safety reasons. The flow of HOT and COLD can be controlled by issuing commands like ``INCREASE HOT,'' ``INCREASE COLD,'' ``DECREASE HOT,'' ``DECREASE COLD,'' and ``NO CHANGE.'' The ratio of HOT to COLD is monitored constantly by a ``mix-tester'' at the output of the mixer. The possible outputs produced by the mix-tester are: ``too-much HOT,'' ``just right,'' and ``too-much COLD.'' Write a monitor program that takes input from the mix-tester and generates flow control signals for HOT and COLD. Your goal is to write a program that has no design faults. The control algorithm is executed continuously.
  2. Explain the difference between fault, error, and failure, and relate them to the three-universe (physical, information, and external) model. Take three different examples from three very different application areas and show the correspondence.
  3. A fault can have five attributes. Give two examples of faults and illustrate these attributes.
  4. You have just been employed by a large software company whose major problem is that the software it supplies fails very often. Your job is to help the company improve the quality and reliability of its software products. Prepare a one-paragraph (about 10-20 lines) proposal identifying the steps you would take and how you propose to approach this problem.
  5. Practice HIMAP: Use the HIMAP program to model a system with two components connected in parallel. The system fails when both components fail. The mission time is 10 hours. You may assume that both components are identical. Use a failure rate of 0.0001 and a repair rate of 0.01 for the Module, with variation equal to 0.0 for both. Report the unreliability and MTBCF.

    To do the last problem, you will run HIMAP on a PC. First create a project directory using the "Utils" and "Create/Change Project" menu. Then create a library of the components you want to use in the project using the "Library" and "New library file" menu. This file contains the names and parameters of the components you want to use in developing the model. Components are added by selecting the "New" button. You provide a name, a short explanation of the component, and the failure and repair rates using the appropriate boxes. Keep the component names to 4 to 6 characters and use appropriate failure and repair rates (you may choose not to use repairs at first). Notice that you need only one component for this problem. The system offers default values (except for the name and explanation) and you may just select those at first. The components in library files can be edited afterwards. After the library is saved (Save and Done), create a new model file using the "Model" and "New Model" file menu. Name the file appropriately. Draw a model using one "Fbox" on the top, an "AND" gate below it, and two "Event" boxes below the AND gate, like a tree. Boxes are connected using the "Link" button. To connect two boxes, select Link and click on the "to" box and then on the "from" box (it is slightly counter-intuitive).

    You can attach an explanation to a box by selecting the "Expl" button and then clicking on the box itself. (If you provided an explanation in the library file, it will automatically appear in the Event boxes when a mapping is done from the library to the picture.) Once the picture is complete, save it (in the "Model" menu). Then select "Map". A new window will appear and the library components will be shown in one box. You can attach an instance of a library component to an event in the picture file. To do so, select a component and name that instance (any unique name of four to six characters). Keep the Redundancy level at one (the default value) and press "Associate". Now you can select an "Event" on the picture (not the AND gate or the Fbox). An association is established. You can then select another (or the same) component, name the instance differently, and associate it with the second event. You will see the associations in the top box of the "Map" window. Now press "Done" in the "Map" window and save the model again. Using the "Tools" menu, select the "Solve fault tree" command and the system will ask you for a mission time. Enter 10. You will see the reliability and unreliability numbers. Convince yourself that the numbers look appropriate.
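One way to convince yourself is a quick hand calculation. The sketch below is not part of HIMAP and says nothing about how HIMAP computes its results. It assumes exponentially distributed failures, independent components, and no repair, so it is only a rough reference point for the case where you leave repairs out. All function and variable names are made up for this illustration.

```python
import math

def component_unreliability(failure_rate, mission_time):
    """Probability that one component has failed by mission_time
    (assumes a constant failure rate and no repair)."""
    # 1 - exp(-lambda * t), written with expm1 for accuracy at small values
    return -math.expm1(-failure_rate * mission_time)

def parallel_unreliability(failure_rate, mission_time, n=2):
    """An AND gate over n identical, independent events:
    the system fails only if all n components have failed."""
    return component_unreliability(failure_rate, mission_time) ** n

def parallel_mttf_no_repair(failure_rate, n=2):
    """Mean time to system failure for n identical parallel
    components without repair: sum of 1 / (k * lambda), k = 1..n."""
    return sum(1.0 / (k * failure_rate) for k in range(1, n + 1))

FAILURE_RATE = 0.0001   # per hour, from the exercise
MISSION_TIME = 10.0     # hours, from the exercise

q = parallel_unreliability(FAILURE_RATE, MISSION_TIME)
mttf = parallel_mttf_no_repair(FAILURE_RATE)

print(f"system unreliability at {MISSION_TIME} h: {q:.4e}")
print(f"system reliability at {MISSION_TIME} h:   {1.0 - q:.9f}")
print(f"MTTF without repair: {mttf:.1f} h")
```

Under these assumptions the unreliability comes out at roughly 1e-6. With the repair rate of 0.01 included, the system can recover from a single component failure, so the unreliability reported by the tool should be no larger than this figure and the MTBCF should be larger than the no-repair mean time to failure.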

    PLAY, PRACTICE and GOOD LUCK.