










When was the last time you experienced a computer-related failure, and what
was the consequence? You may not remember, because the effect was small. But
what about a similar failure that hits your bank's computer, so that you
cannot withdraw your money? Or worse, you are a patient and your hospital's
patient monitoring system fails, or you are flying and you hear the following
message: "This is your captain speaking. We have just discovered that our
fly-by-wire system is not operational. We are investigating and will report to
you in about one hour." Maybe then you care. Maybe you would like to
know the causes of such failures and understand how to design these systems better.
Course Outline:
Dependable computer systems are required in applications where human life or
large economic stakes are involved. In this course we study the theory and
practice of designing such systems at both the hardware and software levels.
We will cover the following topics:
 | Dependability concepts: dependable system, techniques for achieving
dependability, dependability measures, fault, error, failure, faults and
their manifestation, classification of faults and failures.
 | Fault tolerant strategies: Fault detection, masking, containment,
location, reconfiguration, and recovery.
 | Fault tolerant design techniques: Hardware redundancy, software
redundancy, time redundancy, and information redundancy.
 | Testing and Design for Testability.
 | Self-checking and fail-safe circuits.
 | Information redundancy: coding techniques, error detection and
correction codes, burst error detection and correction, unidirectional codes.
 | Fault tolerance in distributed systems: Byzantine Generals problem,
consensus protocols, checkpointing and recovery, stable storage and RAID
architectures, and data replication and resiliency.
 | Dependability evaluation techniques and tools: Fault trees, Markov
chains; HIMAP tool.
 | Analysis of fault tolerant hardware and software architectures.
 | System-level fault tolerance and low-overhead high-availability
techniques.
 | Fault tolerance in real-time systems: Time-space tradeoff, fault
tolerant scheduling algorithms.
 | Fault-tolerant interconnection networks: hypercube, star graphs, and
fault-tolerant ATM switches.
 | Dependable communication: Dependable channels, survivable networks,
fault-tolerant routing.
 | Case studies of fault tolerant multiprocessor and distributed systems.
 | Reading of some of the state-of-the-art research material.
 | Anything you want to discuss.
 | Anything I may find interesting.