Citation page for article published by ACM, the First Society in Computing. © ACM, Inc.

Reliable scheduling in a TMR database system

Pittelli, Frank M.
Garcia-Molina, Hector

ACM Transactions on Computer Systems
Vol.7, No. 1 (Feb. 1989), pp. 25-60


General Terms

ALGORITHMS, DESIGN, PERFORMANCE, RELIABILITY

Categories and Subject Descriptors

C.4 : Computer Systems Organization, PERFORMANCE OF SYSTEMS, Reliability, availability, and serviceability. H.2.0 : Information Systems, DATABASE MANAGEMENT, General. D.4.1 : Software, OPERATING SYSTEMS, Process Management, Scheduling.


From Computing Reviews ...

Jason Gait

Triple modular redundant (TMR) systems replicate data and processing on three nodes. A TMR scheduler guarantees that nonfaulty nodes execute the same transactions in the same order. The authors propose a high-performance scheduler that uses premature scheduling, null transactions, and message batching to support dynamic policy changes as load varies.

A TMR scheduler guarantees that transactions submitted to perfect nodes are eventually scheduled at all perfect nodes and that transactions submitted to faulty nodes are either scheduled at all perfect nodes or never scheduled at any perfect node. The scheduler depends on a reliable broadcast algorithm that guarantees determinism and prevents faulty nodes from spoofing remote schedulers.

The scheduler prevents a faulty node from swamping the system with a stream of spoofed messages by dropping excess messages. Perfect nodes obey a backoff discipline that limits the messages they initiate. The scheduler uses a fixed delay backoff, but it appears that a variable delay could be used if needed.

A sequence number discipline allows premature scheduling when there are no faulty nodes. (A faulty node can interdict premature scheduling by mangling the sequence numbers.) Premature scheduling depends on a steady stream of transactions, so for light loads null transactions are generated to induce premature scheduling. Each transaction generates four messages, so at heavy loads transaction batching improves performance. The cumulative improvement amounts to a factor of five in throughput and response.

The authors introduce a hybrid algorithm that transitions between optimal policies at the critical points and performs optimally across the range of loads studied. They do not study fine-scale behavior at the critical points where thrashing between policies could occur. A final version of the scheduler might introduce hysteresis in transitioning between policies to avoid thrashing.