THIS IS A LEGACY PAGE WHOSE CONTENT HAS NOT BEEN UPDATED SINCE 2011, IT WILL BE UPDATED EVENTUALLY
Our Research Project’s Published Professional Documents
This page contains an annotated bibliography of (primarily) my FY 2003-2008 research projects’ documents co-authored by my research colleagues and I, which have been published (or have been accepted and are awaiting publication) in professional society journals and conference proceedings. Most are about expanding the scope and formal basis of my TUF/UA timeliness paradigm, but some are about other topics of interest to my research project— e.g., real-time failure recovery, multiprocessor scheduling, real-time P2P and Pub/Sub protocols, lock- and wait-free synchronization, software transactional memory. A few earlier papers are included because of their relevance to my recent projects.
In most cases the entry includes abstracts, my comments, and links that allow downloading the documents in Acrobat PDF format.
Anderson et al. 07
Consensus-Driven Distributable Thread Scheduling in Networked Embedded Systems
Jonathan Anderson, Binoy Ravindran, and E. Douglas Jensen
Proc. IFIP International Conference on Embedded and Ubiquitous Computing, December 2007
Abstract.We demonstrate an improved consensus-driven utility accrual scheduling algorithm (DUA-CLA) for distributable threads which execute under run-time
uncertainties in execution time, arrival models, and node crash failures. The DUA-CLA algorithm’s message complexity (O(fn)), lower time complexity bound (O(D + fd + nk)), and failure-free execution time (O(D + nk)) are established, where D is the worst-case communication delay, d is the failure detection bound, n is the number of nodes, and f is the number of failures. DUA-CLA is shown to have the “lazy-abort” property—abortion of currently-infeasible tasks is deferred until there is no possibility of completing the task on time. Further, it exhibits “schedule-safety.” Segments (and therefore, threads) proposed as
feasible for execution by a node which fails during the consensus decision will be removed from the consensus set and will not cause an otherwise-feasible segment to be excluded. These properties mark improvements over earlier strategies in common- and worst-case performance. Finally, we present our implementation of DUA-CLA in a
DRTSJ-compliant Java virtual machine, and quantitative results are presented validating fundamental properties of the algorithm.
Comments: TBD
Anderson and Jensen 06
The Distributed Real-Time Specification for Java: Status Report
Jonathan Anderson and E. Douglas JensenProc.
4th International Workshop on Java Technologies for Real-time and Embedded Systems,2006
Abstract.The Distributed Real-Time Specification for Java (DRTSJ) is under development within Sun’s Java Community Process (JCP) as Java Specification Request 50 (JSR-50), lead by the MITRE Corporation. We present the engineering considerations and design decisions settled by the Experts Group, the current and proposed form of the Reference Implementation, and a summary of open issues. In particular, we present an approach to integrating the distributable threads programming model with the Real-Time Specification for Java and discuss the ramifications for composing distributed, real-time systems in Java. The Expert Group plans to release an initial Early Draft Review (EDR)
for previewing the distributable threads abstraction in the coming months, which we describe in detail. Along with that EDR, we will make available a demonstration application from Virginia Tech, and a DRTSJ-compatible RTSJ VM from Apogee.
Update 30 March 08:After several years of designing and implementing the DRTSJ on our (mostly Jonathan Anderson’s) personal time, the DRTSJ reference implementation became functional enough to meet my research project’s critical needs. But the EDR was still incomplete. I decided in October 07 to make an unplanned diversion of some money from my research project for some people to put some funded time on the DRTSJ in hopes of getting the EDR out by December 07 and then extended to February 08. Unfortunately, being able to pay for the uniquely qualified programmers required for this effort doesn’t
mean that I can actually get their time. Jonathan Anderson was an overloaded Ph.D.student, and Yun Zhang was distracted by other work deemed higher priority by
her management. By the end of February 08 I could no longer justify the DRTSJ funds, and so I terminated the work. The reference implementation needs at most a
person-month of work, and the written spec likewise—by us, probably more if performed by the few people not already intimately familiar with the concepts and implementation. The nearly complete EDR and ancillary software are available to anyone who wishes to seek it from the JSR-50 Expert Group. Update [date]: TBD
Comments: TBD
Boucher et al. 94
Toward a Multilevel-Secure, Best-Effort Real-Time Scheduler
Peter K. Boucher, Raymond K. Clark, Ira B. Greenberg, E. Douglas Jensen, and Douglas M. Wells
Proc. Int. Conf. on Dependable Computing for Critical Applications, 1994
Abstract. Some real-time missions that manage classified data are so critical that mission failure might be more damaging to national security than compromising the data. The conflicts between computer security requirements and timeliness requirements are described in the context of large, distributed, supervisory-control systems that are intended for use in such critical missions. The Multilevel (MLS) Secure Alpha approach to addressing these conflicts is introduced. A prototype tradeoff mechanism is described, as are the results of testing the mechanism.
Comments: This paper reveals a non-obvious multilevel security (MLS) capability of time/utility functions. At one time, the NSA Orange Book required that for computer systems at the B3 level and higher, covert timing channels with bandwidths over one hundred bits per second must be eliminated (or reduced), and those with bandwidths between ten and one hundred bits per second must be audited, but covert channels with bandwidths less than ten bits per second may be tolerated. (Subsequently, people came to
realize that high-capacity covert timing channels, particularly those based on characteristics of shared hardware components, appear to be inevitable in modern trusted systems. Thus, the NSA relaxed the specification to require that a thorough search be conducted for covert timing channels.) Traditional techniques for reducing the bandwidth of covert timing channels include randomizing the scheduling—obviously a non-starter for real-time systems. This paper (highly controversial in the DoD MLS security community which prized MLS over all else at the time) describes an approach that monitors and controls the bandwidth available through the scheduler covert timing channel, using the time/utility function scheduling functionality of the Alpha real-time distributed OS [Clark et al. 93]. In particular, the scheduling covert channel bandwidth limits can be adjusted dynamically in accordance with mission goals. For example, if the destination of a flight is classified, but the classification is automatically reduced upon arrival, the resolution mechanism could be configured to ease restrictions on the scheduling covert channel bandwidth when the flight arrives within some threshold distance from the destination. These resolution rules will be protected and enforced by the Trusted Computing Base. The Secure Alpha Rapid Prototype (SARP) [ ] is an operating system rapid prototype that contains a Multilevel Secure TUF/UA Scheduler, as well as other security features, and upon which the MLS Alpha project’s feasibility demonstration was built. This project was a collaboration between Concurrent and SRI. The SARP experiment demonstrated that a covert channel bandwidth limitation mechanism can be effectively incorporated into a TUF/UA scheduler to allow enforcement of the MLS Alpha security policy’s resolution component. It was delivered and demonstrated to the USAF Rome Laboratory at the completion of the MLS Alpha project in March 1993.
Channakeshava et al. 06
Utility Accrual Real-Time Channel Establishment in Multi-hop Networks
Karthik Channakeshava, Binoy Ravindran, and E. Douglas Jensen
IEEE Transactions on Computers, 2006
Abstract. We consider Real-Time CORBA 1.2 (Dynamic Scheduling) distributable threads operating in multi-hop networks. When distributable threads are subject to time/utility function time constraints, and timeliness optimality criteria such as maximizing accrued system-wide utility is desired, utility accrual real-time channels must be established. Such channels transport messages that are generated as distributable threads transcend nodes, in a way that maximizes system-wide, message-level utility. We present (1) a localized utility accrual channel establishment algorithm called Localized Decision for Utility accrual Channel Establishment (or LocDUCE) and (2) a distributed utility accrual channel establishment algorithm called Global Decision for Utility accrual Channel Establishment (or GloDUCE). Since the channel establishment problem is NP-complete, LocDUCE and GloDUCE heuristically compute channels with LocDUCE making decisions based on local information pertaining to the node and GloDUCE making global decisions. We simulate the performance of both the algorithms and compare them with the Open Shortest Path First (OSPF) routing algorithm and the optimal algorithm. We also implement these algorithms in a prototype test-bed and experimentally compare their performance with OSPF. Our simulation and experimental measurements reveal that GloDUCE and LocDUCE accrues significantly higher utility than OSPF, and also perform close to the optimal for some cases. Furthermore, GloDUCE outperforms LocDUCE under high downstream traffic.
Comments: TBD
Channakeshava et al. 05
IP Quality of Service Support for Soft Real-Time Applications
K. Channakeshava, K. Phanse, L. A. DaSilva, B. Ravindran, S. F. Midkiff, and E. D. Jensen
Abstract. To obtain acceptable timeliness performance for emerging large-scale distributed real-time control applications operating in large IP internetworks, a scalable quality of service (QoS) architecture is needed. In this paper, we propose a scalable QoS architecture (abbreviated as RTQoS) in support of real-time systems, one that implements real-time scheduling at end-hosts and stateless QoS in the core routers. We address challenges and explore potential benefits achieved by integrating network services with real-time systems, through the use of a network testbed. Experimental evaluation demonstrates the RTQoS architecture as a promising approach for soft real-time applications that are subject to time/utility function time constraints and utility accrual optimality criteria.
Comments: TBD
Cho et al. 10
T-N Plane-Based Real-Time Scheduling for Homogeneous Multiprocessors
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Journal of Parallel and Distributed Computing, to appear 2010
Abstract. We consider optimal real-time scheduling of periodic tasks on multiprocessors —i.e., satisfying all task deadlines, when the total utilization demand does not exceed the utilization capacity of the processors. We introduce a novel abstraction for reasoning about task execution behavior on multiprocessors, called T-N plane, and present T-N plane-based real-time scheduling algorithms. We show that scheduling for multiprocessors can be viewed as scheduling on repeatedly occurring T-N planes, and feasibly scheduling on a single T-N plane results in an optimal schedule. Within a single T-N plane, we analytically show a sufficient condition to provide a feasible schedule. Based on these, we provide two examples of T-N plane-based real-time scheduling algorithms, including non-work-conserving and work-conserving approaches. Further, we establish that the algorithms have bounded overhead. Our simulation results validate our analysis of the algorithm overhead. In addition, we experimentally show that our approaches have a reduced number of task migrations among processors, compared to a previous algorithm.
Comments:TBD
Cho et al. 08
Lock-Free Synchronization for Dynamic Embedded Real-Time Systems
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
ACM Transactions on Embedded Computing Systems, to appear 2009
Abstract. We consider lock-free synchronization for dynamic embedded real-time systems that are subject to resource overloads and arbitrary activity arrivals. We model activity arrival behaviors using the unimodal arbitrary arrival model (or UAM). UAM embodies a stronger “adversary” than most traditional arrival models. We derive an upper bound on lock-free retries under the UAM with utility accrual scheduling—the first such result. We establish the tradeoffs between lock-free and lock-based sharing under UAM. These include conditions under which activities’ accrued timeliness utility is greater under lock-free than lock-based, and the consequent lower and upper bound on the total accrued utility that is possible with lock-free and lock-based sharing. We confirm our analytical results with a POSIX RTOS implementation.
Comments: This is the journal-length version of Cho et al. 06. It also was selected to appear as “On Lock-Free Synchronization for Dynamic Embedded Real-Time Software,” in the book Design, Automation, and Test in Europe: The Most Influential Papers of 10 Years, ISBN: 978-1-4020-6487-6, Rudy Lauwereins and Jan Madsen (Editors), Section 1: System Level Design, Springer-Verlag, January 2008.
Cho et al. 07b
On Scheduling Garbage Collector in Dynamic Real-Time Systems with Statistical Timeliness Assurances
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Abstract. We consider garbage collection (GC) in dynamic real-time systems. We consider the time-based GC approach of running the collector as a separate, concurrent thread, and focus on real-time scheduling to obtain assurances on mutator timing behavior, while ensuring that memory is never exhausted. We present a scheduling algorithm called GCUA. The algorithm considers mutator activities that are subject to time/utility function time constraints, variable execution time demands, the unimodal arbitrary arrival model that allows a strong adversary, and resource overloads. We establish several properties of GCUA, including probabilistically-satisfied utility lower bounds for each mutator activity, a lower bound on the system-wide total accrued utility, bounded sensitivity for the assurances to variations in mutator execution time demand estimates, and no memory exhaustion at all times. Our simulation experiments validate our analytical results and confirm the algorithm’s effectiveness and superiority.
Comments: TBD
Cho et al. 07a
Synchronization for an Optimal Real-Time Scheduling Algorithm on Multiprocessors
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE Second International Symposium on Industrial Embedded Systems
Abstract. We consider several object sharing synchronization mechanisms including lock-based, lock-free, and wait-free sharing for LNREF [Cho et al. 06e], an optimal real-time scheduling algorithm on multiprocessors. We derive LNREF’s minimum-required space cost for wait-free synchronization using the space-optimal wait-free algorithm. We then establish the feasibility conditions for lock-free and lock-based sharing under LNREF, and the concomitant tradeoffs. While the tradeoff between wait-free versus the other sharing is obvious—i.e., space and time costs—we show that the tradeoff between lock-free and lock-based sharing for LNREF hinges on the cost of the lock-free retry, blocking time under lock-based. Finally, we numerically evaluate lock-free and lock-based sharing for LNREF.
Comments: TBD
Cho et al. 07
Space-Optimal Wait-Free Real-Time Synchronization
Hyeonjoong Cho, Binoy Ravindran, Haisang Wu, and E. Douglas Jensen
IEEE Transaction on Computers, Volume 56, Number 3, pages 385-401, March 2007
Abstract. We present a wait-free protocol for the single-writer/multiple-reader problem in small-memory embedded real-time systems. We analytically establish that our protocol requires lesser (or equal) number of buffers than previously best wait-free protocols for this problem. Further, we prove that our protocol is space-optimal—the first space optimality established for wait-free protocols that consider a’ priori knowledge of preemptions. Our evaluation studies and implementation measurements using the SHaRK RTOS kernel confirm the protocol’s superiority and effectiveness.
Comments: TBD
Cho et al. 06d
An Optimal Real-Time Scheduling Algorithm for Multiprocessors
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Proc. of the 2006 Real-Time Systems Symposium
Abstract. We present an optimal real-time scheduling algorithm for multiprocessors— one that satisfies all task deadlines, when the total utilization demand does not exceed the utilization capacity of the processors. The algorithm called LLREF, is designed based on a novel abstraction for reasoning about task execution behavior on multiprocessors: the Time and Local Execution Time Domain Plane (or T-L plane). LLREF is based on the fluid scheduling model and the fairness notion, and uses the T-L plane to describe fluid schedules without using time quanta, unlike the optimal Pfair algorithm (which uses time quanta). We show that scheduling for multiprocessors can be viewed as repeatedly occurring T-L planes, and feasibly scheduling on a single T-L plane results in the optimal schedule. We analytically establish the optimality of LLREF. Further, we establish that the algorithm has bounded overhead, and this bound is independent of time quanta (unlike Pfair). Our simulation results validate our analysis on the algorithm overhead.
Comments: TBD
Cho et al. 06c
On Multiprocessor Utility Accrual Real-Time Scheduling With Statistical Timing Assurances
Hyeonjoong Cho, Binoy Ravindran, Haisang Wu, and E. Douglas Jensen
Proc. of the IFIP International Conference on Embedded And Ubiquitous Computing, 2006
Abstract. We present the first Utility Accrual (or UA) real-time scheduling algorithm for multiprocessors, called gMUA. The algorithm considers an application model where real-time activities are subject to time/utility function time constraints, variable execution time demands, and resource overloads where the total activity utilization demand exceeds the total capacity of all processors. We consider the scheduling objective of (1) probabilistically satisfying lower bounds on each activity’s maximum utility and (2) maximizing the system-wide, total accrued utility. We establish several properties of gMUA including optimal total utility (for a special case), conditions under which individual activity utility lower bounds are satisfied, a lower bound on system-wide total accrued utility, and bounded sensitivity for assurances to variations in execution time demand estimates. Our simulation experiments validate our analytical results and confirm the algorithm’s effectiveness and superiority.
Comments: TBD
Cho et al. 06b
On Utility Accrual Processor Scheduling with Wait-Free Synchronization for Embedded Real-Time Software
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Proc. of the ACM Symposium On Applied Computing, 2006
Abstract. We present the first wait-free utility accrual (UA) real-time scheduling algorithms for embedded real-time systems. UA scheduling algorithms allow application activities to be subject to time/utility function (TUF) time constraints, and optimize criteria such as maximizing the sum of the activities’ attained utilities. We present UA algorithms that use wait-free synchronization for mutually exclusively accessing shared data objects. We derive upper bounds on the maximum possible increase in activity utility with wait-free over their lock-based counterparts, while incurring the minimum possible additional space costs, and thereby establish the tradeoffs between wait-free and lock-based object sharing for UA scheduling. Our implementation measurements on a POSIX RTOS reveal that (during under-loads), the wait-free algorithms yield optimal utility for step TUFs and significantly higher utility (than lock-based) for non-step TUFs.
Comments: TBD
Cho et al. 06a
On Scheduling A Garbage Collector in Dynamic Real-Time Systems With Statistical Timeliness Assurances
Hyeonjoong Cho, C. Na, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE International Symposium on Object and component-oriented Real-time distributed Computing (ISORC), 2006
Abstract. We consider garbage collection (GC) in dynamic real-time systems. We consider the time-based GC approach of running the collector as a separate, concurrent thread, and focus on real-time scheduling to obtain assurances on mutator timing behavior, while ensuring that memory is never exhausted. We present a scheduling algorithm called GCUA. The algorithm considers mutator activities that are subject to time/utility function time constraints, variable execution time demands, the unimodal arbitrary arrival model that allows a strong adversary, and resource overloads. We establish several properties of GCUA including probabilistically-satisfied utility lower bounds for each mutator activity, a lower bound on the system-wide total accrued utility, bounded sensitivity for the assurances to variations in mutator execution time demand estimates, and no memory exhaustion at all times. Our simulation experiments validate our analytical results and confirm the algorithm’s effectiveness and superiority.
Comments: TBD
Cho et al. 06
Lock-Free Synchronization for Dynamic Embedded Real-Time Systems
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Proc. of the ACM Design Automation and Test in Europe, 2006
Abstract. We consider non-blocking synchronization for dynamic embedded real-time systems such as those that are subject to resource overloads and arbitrary activity arrivals. The multi-writer/multi-reader problem inherently occurs in such systems, when their activities must concurrently and mutually exclusive share data objects. We consider lock-free synchronization for this problem under the unimodal arbitrary arrival model (or UAM). UAM embodies a “stronger” adversary than most traditional arrival models. We establish the fundamental tradeoffs between lock-free and lock-based object sharing under UAM—the first such result. Our tradeoffs include analytical conditions under which activities’ accrued timeliness utility is greater under lock-free than lock-based, and the consequent upper bound on the increase in accrued utility. Our implementation experience on a POSIX RTOS reveals that the lock-free scheduling algorithm yields higher accrued utility, by as much as 65%, and critical time satisfactions, by as much as 80%, over lock-based.
Comments: TBD
Cho et al. 05
A Space-Optimal, Wait-Free Real-Time Synchronization Protocol
Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Proc. of the 17th Euromicro Conference on Real-Time Systems, 2005
Abstract. We present a wait-free protocol for the single-writer, multiple-reader problem in small-memory, embedded real-time systems. We analytically establish that our protocol requires lesser (or equal) number of buffers than what is needed by previously best wait-free protocols for this problem. Further, we prove that our protocol is space-optimal—the first such bound established for wait-free protocols that consider a’ priori knowledge of preemptions. Our evaluation studies and implementation measurements (from an implementation using the SHaRK real-time OS kernel) confirm the protocol’s superiority and effectiveness.
Comments: TBD
Clark et al. 04
Software Organization to Facilitate Dynamic Processor Scheduling
Raymond K. Clark, E. Douglas Jensen, and Nicolas F. Rouquette
Proc. of the IEEE Workshop on Parallel and Distributed Real-Time Systems, 2004
Abstract. The NASA Jet Propulsion Laboratory (JPL) is developing the Mission Data System (MDS) for potential use in future space missions, where it is expected to reduce the complexity and effort required to produce mission and ground support software, while also enabling greater autonomy for remotely deployed systems, including future planetary rovers. The MDS software architecture includes two levels of processor scheduling: a high-level planner and a low level execution manager. JPL and MITRE are jointly investigating the use of Time/Utility Functions to manage low-level scheduling, since they promise to enable adaptive, short term processor scheduling based on mission utility. Recent development work on MDS has focused on organizing the software so that low-level processor scheduling will be most effective. This paper describes the major organizing principles that have shaped this work.
Comments: This paper is the first public announcement of the new collaboration between the MITRE Corporation and NASA JPL to implement time/utility function time constraints and utility accrual scheduling optimality criteria in JPL’s Mission Data System [ ]. This paper summarizes our initial goal and principles for adapting the MDS software to accommodate dynamic scheduling in general and TUF/UA scheduling in particular. The work was performed in the last quarter of FY03. Our on-going work in FY04 was to include the design and implementation of the TUF/UA scheduler in the MDS, and its experimental evaluation in the JPL prototype Mars rover. That work has been deferred to FY05 due to new management at JPL making major changes to the 2009 Mars Science Lab mission, which was expected to be the first application of the MDS.
Clark et al. 02
The Distributed Real-Time Specification for Java: Overview and Status
Raymond Clark, E. Douglas Jensen, Doug Wells, and Andy Wellings
TBD
Abstract. None, see Comments below.
Comments: The Distributed Real-Time Specification for Java (DRTSJ) is being developed under Sun’s Java Community Process. It is focused on adaptively supporting application-specific predictability of end-to-end timeliness for sequentially distributed computations (e.g., chains of invocations) in dynamic distributed object systems. This presentation includes new and revised material about the objectives and direction of the DRTSJ, plus updated material on the technical approach and status. For details about some of the technical approach, see [Wellings et al 02]. The DRTSJ work slowed significantly for about 18 months due to higher priority demands for the principals at MITRE. It resumed in September 2004 due to one of the principals (Jonathan Anderson) becoming a Ph.D. student at Virginia Institute of Technology (with whom Jensen is associated. Jonathan will focus his first year on getting the DRTSJ completed, with participation by people at VA Tech, MITRE, and several other organizations.
Clark et al. 99
An Adaptive, Distributed Airborne Tracking System
(“Process the Right Tracks at the Right Time”)
Raymond Clark, E. Douglas Jensen, Arkady Kanevsky, John Maurer, Paul Wallace, Thomas Wheeler, Yun Zhang, Douglas Wells, Tom Lawrence, and Pat Hurley
Proc. of the IEEE Workshop on Parallel and Distributed Real-Time Systems, April 1999
Abstract. This paper describes a United States Air Force sponsored Advanced Technology Demonstration performed jointly by The MITRE Corporation, The Open Software Foundation Research Institute (OSF/RI), and the Digital Equipment Corporation. The project goal was to employ time/value function, utility accrual (TUF/UA), based scheduling, together with the distributed thread programming abstraction, to produce an adaptive, distributed tracking component appropriate for consideration by the E3 version of the Airborne Warning and Control System (AWACS) program. This tracker was designed to evaluate application-specific Quality of Service (QoS) metrics to quantify its tracking services in a dynamic environment and to derive scheduling parameters directly from these QoS metrics to control tracker behavior. The prototype tracker was implemented on the OSF/RI MK7-based operating system kernel [ ]. The OSF/RI MK7 microkernel began with a version of the CMU CS Mach microkernel [ ]. To that, the project added the optional native TUF/UA-based processor scheduling paradigm and a distributed thread programming abstraction introduced in the Alpha microkernel [ ] developed by Jensen’s CMU CS research program [ ]. The AWACS E3 tracker notional prototype updates all of the tracked-object records when the system is not overloaded, and gracefully degrades when it is. The prototype has performed extremely well during demonstrations to AWACS operators and tracking system designers. Quantitative results are presented.
Comments: Recently people have begun to accept the notion that QoS is a valid and valuable concept above the network UDP layer where the term originated and is most frequently used (referring to bit error rate, latency, bandwidth, jitter, etc.). In particular, QoS is proving to be an extremely effective technique for expressing application level functional and non-functional behavioral requirements and driving resource management to achieve them. However, virtually all the papers in the literature about application level QoS focus on multimedia applications, where the application level QoS properties are essentially identical to the network level QoS properties. This paper is a rare example of using QoS at the application level of a non-multimedia application—i.e., track quality and number of dropped tracks in an airborne tracking system. Since this paper was published, the application has been rewritten in Java and in RTSJ-compliant real-time Java [ ], and the performance of these implementations quantitatively evaluated against the original C++ implementation reported on in this paper. The Java version’s performance with respect to the AQoS metrics was only slightly lower than that of the C++ version.
Clark et al. 93a
Effects of Multilevel Security on Real-Time Applications
Raymond K. Clark, Douglas M. Wells, Ira B. Greenberg, E. Douglas Jensen, Peter K. Boucher, and Teresa F. Lunt
Proc. of the IEEE Computer Security and Applications Conference, 1993
Abstract. This paper presents a brief overview of a notional airborne application scenario that requires both NSA multilevel security (MLS) and real-time processing. It was used to guide decisions related to formation of the MLS policy interpretation, the operating system interface, and the system services design for a multilevel secure real-time distributed operating system (MLS RT DOS) [ ] based on the Alpha operating system kernel [ ]. We compare secure and nonsecure application designs for the scenario to assess the effects of MLS RT DOS security features on the structure of the application. The comparison reveals that the MLS features make real-time scheduling and the construction of robust applications far more difficult for this scenario.
Comments: Aside from the obvious interest this paper has for distributed real-time systems that must meet the NSA Orange Book requirements for multilevel security, it also provides insight into the application of distributed threads.
Clark et al. 93
An Architectural Overview of the Alpha Real-Time Distributed Kernel
Raymond K. Clark, E. Douglas Jensen and Franklin D. Reynolds
Proc. of the USENIX Workshop on Microkernels and other Kernel Architectures, pp 200-208, 1993
Abstract. Alpha [ ] is a non-proprietary experimental operating system kernel which began as part of Jensen’s CMU CS research program [ ]. It extends the real-time domain to encompass distributed applications, such as for telecommunications, factory automation, and defense. Distributed real-time systems are inherently asynchronous, dynamic, and non-deterministic, and yet are nonetheless mission-critical. The increasing complexity and pace of these systems precludes the historical reliance solely on human operators for assuring system dependability under uncertainty. Traditional real-time OS technology is based on attempting to assert or impose determinism of not just the ends but also the means, for centralized low-level sampled-data monitoring and control, with an insufficiency of hardware resources. Conventional distributed OS technology is primarily based on two-party client/server hierarchies for explicit resource sharing in networks of autonomous users. These two technological paradigms are special cases which cannot be combined and scaled up cost-effectively to accommodate distributed real-time systems. Alpha’s new paradigm for real-time distributed computing is founded on best-effort management of all resources directly with computation completion time constraints which are expressed as benefit functions; and multiparty, peer-structured, trans-node computations for cooperative mission management.
Comments: This is the last and best known of the papers about the Alpha real-time distributed OS kernel. Alpha was literally unique in the annals of distributed real-time OS research for the concepts it pioneered, and was even unusual in its implementation directly on the bare hardware of each multiprocessor node of the LAN (instead of the usual academic practice of layering a research OS on a UNIX or modifying a UNIX). Alpha was one of the first, and certainly by far the most ambitious and unconventional distributed real-time OS kernel. In particular, Alpha pioneered: a comprehensive scheduling framework for pluggable application-specific scheduling disciplines (earlier work by others was far more limited in the range of disciplines); the first incorporation of time/utility function (originally called time/value function) scheduling disciplines in an OS (they were devised and implemented in the mid-70’s for a classified military program [Kain 77], where they were part of the application running on bare hardware without an OS); and the distributed thread programming model. These keystone concepts and techniques which debuted in Alpha have not languished in academic obscurity as have so many other good research results—instead, they have continued to advance beyond Alpha and this book, to this very day. See the History section of this web site. For example: KSR and Concurrent Computer implemented a Version 2 of Alpha (but didn’t survive long enough to productize it); the time/utility function model as defined on this web site is more completely and correctly developed; publications of academic research on time/utility functions are increasing; versions of distributed threads called “migrating threads” were at the heart of several research OS microkernels; the Open Group’s (ne’e Open Software Foundation’s) MK7.3A operating system incorporated more sophisticated versions of Alpha’s distributed threads and scheduling framework (as did IBM’s unreleased Workplace OS for PowerPC, and DEC’s unreleased Libra ORB and OS); the Distributed Real-Time Specification for Java and OMG’s Real-Time CORBA 2/Dynamic Scheduling standard include versions of distributed threads. [Links to many of these will appear on my Real-Time Resources page as I get time.]
Clark 90
Scheduling Dependent Real-Time Activities
Raymond K. Clark
Ph.D. Thesis, CMUCS-90-155, School of Computer Science, Carnegie Mellon University, 1990.
Abstract. A real-time application is typically composed of a number of cooperating activities that must execute within specific time intervals. Since there are usually more activities to be executed than there are processors on which to execute them, several activities must share a single processor. Necessarily, satisfying the activities’ timing constraints is a prime concern in making the scheduling decisions for that processor.
Unfortunately, the activities are not independent. Rather, they share data and devices, observe concurrency constraints on code execution, and send signals to one another. These interactions can be modeled as contention for shared resources that must be used by one activity at a time. An activity awaiting access to a resource currently held by another activity is said to depend on that activity, and a dependency relationship is said to exist between them. Dependency relationships may encompass both precedence constraints and resource conflicts.
No algorithm solves the problem of scheduling activities with dynamic dependency relationships in a way that is suitable for all real-time systems. This thesis provides an algorithm, called DASA, that is effective for scheduling the class of real-time systems known as supervisory control systems.
Simulation experiments that account for the time required to make scheduling decisions demonstrate that DASA provides equivalent or superior performance to other scheduling algorithms of interest under a wide range of conditions for parameterized, synthetic workloads. DASA performs particularly well during overloads, when it is impossible to complete all of the activities.
This research makes a number of contributions to the field of computer science, including: a formal model for analyzing scheduling algorithms; the DASA scheduling algorithm, which integrates resource management with standard scheduling functions; results that demonstrate the efficacy of DASA in a variety of situations; and a simulator. In addition, this work may improve the current practices employed in designing and constructing supervisory control systems by encouraging the use of modern software engineering methodologies and reducing the amount of tuning that is required to produce systems that meet their real-time constraints while providing improved scheduling, graceful degradation, and more freedom in modifying the system over time.
Thesis Summary: Download (PDF)
Comments: This was the second of two theses my CMU CS Ph.D. students produced about time/value(née time/benefit) function scheduling. The first was Locke’s thesis, which was preceded by the Jensen et al. paper in RTSS 85—the first publicly published work on this topic since I initiated it for a classified application in 1977 [Jensen 77, Kain et al. 77]. Two significant features distinguish Locke’s and Clark’s theses: Locke’s algorithm allows almost arbitrary non-convex functions, but did not allow dependencies among processes; Clark’s algorithm allowed only downward step functions, but allowed dependencies. Clark’s algorithm, being second generation, also out-performs Locke’s in some cases. Like Doug Locke, Ray Clark returned to graduate school after working (albeit not for as many years) on industrial real-time computing systems. Hence, he too learned from the real world that traditional real-time computing concepts and technologies are severely limited, and appreciated the needs for a more general, dynamic, and adaptive model of timeliness. Ray was one of the principal designers of the Alpha real-time distributed OS (e.g., [Clark et al. 93, Maynard et al. 88, Northcutt 87]) and has continued to work with me on this topic — for example, see [Boucher et al. 94, Clark et al. 93a, Clark et al. 99].
Curley et al. 09
Recovering from Distributable Thread Failures in Real-Time Java
Ed Curley, Binoy Ravindran, Jonathan Anderson, and E. Douglas Jensen
ACM Transactions on Embedded Computing Systems, to appear
Abstract. We consider the problem of recovering from failures of distributable threads (“threads”) in distributed real-time systems that operate under run-time uncertainties including those on thread execution times, thread arrivals, and node failure occurrences. When a thread experiences a node failure, the result is broken thread having an orphan. Under a termination model, the orphans must be detected and aborted, and exceptions must be delivered to the farthest, contiguous surviving thread segment for resuming thread execution. Our application/scheduling model includes the proposed distributable thread programming model for the emerging Distributed Real-Time Specification for Java (DRTSJ), together with an exception handler model. Threads are subject to time/utility function (TUF) time constraints and an utility accrual (UA) optimality criterion. A key underpinning of the TUF/UA scheduling paradigm is the notion of “best-effort” (as opposed to UDP’s least effort) where higher importance threads are always favored over lower importance ones, irrespective of thread urgency as specified by their time constraints. We present a thread scheduling algorithm called HUA and a thread integrity protocol called TPR. We show that HUA and TPR bound the orphan cleanup and recovery time with bounded loss of the best-effort property. Our implementation experience of HUA/TPR in the Reference Implementation of the proposed programming model for the DRTSJ demonstrates the algorithm/protocol’s effectiveness.
Comments: TBD.
Curley et al. 07
On Best-Effort Real-Time Assurances for Recovering from Distributable Thread Failures
Ed Curley, Binoy Ravindran, and E. Douglas Jensen
2007 IEEE International Symposium on Object and component-oriented Real-time distributed Computing
Abstract. We consider the problem of recovering from failures of distributable threads in distributed real-time systems that operate under run-time uncertainties including those on thread execution times, thread arrivals, and node failure occurrences. When a thread encounters a node failure, it causes orphans. Under a termination model, the orphans must be detected and aborted, and exceptions must be delivered to farthest, contiguous surviving thread segment for resuming thread execution. Our application/scheduling model includes distributable threads and their exception handlers that are subject to time/utility function (TUF) time constraints and an utility accrual (UA) optimality criterion. A key underpinning of the TUF/UA scheduling paradigm is the notion of “best-effort” where high importance threads are always favored over low importance ones, irrespective of thread urgency. (This is in contrast to classical admission control models which favor feasible completion of admitted threads over admitting new ones, irrespective of thread importance.) We present a scheduling algorithm called HUA and a thread integrity protocol called TPR. We show that HUA and TPR bound the orphan cleanup and recovery time with bounded loss of the best-effort property. Our implementation experience of HUA/TPR using the emerging Reference Implementation of Sun’s Distributed Real-Time Specification for Java demonstrates the algorithm/protocol’s effectiveness.
Comments: TBD.
Curley et al. 06
Recovering from Distributable Thread Failures with Assured Timeliness in Real-Time Distributed Systems
E. Curley, J. Anderson, B. Ravindran, and E. D. Jensen
Proc. IEEE Symposium on Reliable Distributed Systems, October 2006
Abstract. We consider the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, it causes orphans—i.e., segments of distributable threads that are disconnected from the thread’s root. We consider a termination model for recovering from such failures, where the orphans must be detected and aborted, resources held by them must be released and rolled back to safe states, and exceptions must be delivered to farthest, contiguous surviving thread segment from where execution can be resumed. Since distributable threads are subject to time constraints in real-time distributed systems, such recovery must be conducted with assured timeliness. Toward this, we present 1) a real-time scheduling algorithm called AUA, and 2) a distributable thread integrity protocol called TP-TR. We show that AUA and TP-TR bound the orphan cleanup and recovery time (thereby bounding thread starvation durations), maximize total thread accrued timeliness utility, and satisfy thread mutual exclusion constraints. We implement AUA and TPTR in a real-time middleware that supports distributable threads. Our experimental studies with the implementation validate the algorithm/protocol’s time-bounded recovery property and confirm their effectiveness.
Comments: TBD.
Fahmy et al. 12
Implementing Distributable Real-Time Threads in the Linux Kernel: Programming Interface and Scheduling Support
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
ACM 27th Symposium on Applied Computing, Operating System Track
Abstract. We present an implementation of Real-Time CORBA’s distributable threads (DTs) as a first-class, end-to-end real-time programming and scheduling abstraction in the Linux kernel. We use Ingo Molnar’s PREEMPT RT kernel patch, which enables nearly complete kernel pre-emption, and add local (real-time) scheduling support to the Linux kernel, atop which we build DT scheduling support. We implement DTs using Linux’s threading capabilities. Our implementation of a suite of independent and collaborative DT schedulers confirm the effectiveness of our implementation.
Comments: TBD.
Fahmy et al. 09a
On Bounding Response Times under Software Transactional Memory in Distributed Multiprocessor Real-Time Systems
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
ACM Design, Automation, and Test in Europe, Real-Time Systems Track, April 2009
Abstract. We consider multiprocessor distributed real-time systems where concurrency control is managed using software transactional memory (or STM). For such a system, we propose an algorithm to compute an upper bound on the response time. The proposed algorithm can be used to study the behavior of systems where node crash failures are possible. We compare the result of the proposed algorithm to a simulation of the system being studied in order to determine its efficacy. The results of our study indicate that it is possible to provide timeliness guarantees for multiprocessor distributed systems programmed using STM.
Comments: TBD.
Fahmy et al. 09
Response time analysis of software transactional memory-based distributed real-time systems
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
ACM Symposium on Applied Computing (SAC), Track on Operating Systems, March 2009
Abstract. We consider distributed real-time systems where concurrency control is managed using software transactional memory (or STM). For such a method we propose an algorithm to compute an upper bound on the response time. The proposed algorithm can be used to study the behavior of systems where node crash failures are possible. We compare the result of the proposed algorithm to a simulation of the system being studied in order to determine its efficacy. The results of our study indicate that it is possible to provide timeliness guarantees for systems programmed using STM.
Comments: TBD.
Fahmy et al. 08d
On Scalable Synchronization for Distributed Embedded Real-time Systems
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
Abstract. We consider the problem of programming distributed embedded real-time systems with distributed dependencies. We show that the de facto standard of using locks and condition variables in conjunction with threads can have significant overhead and semantic difficulty and suggest alternative programming abstractions to alleviate these problems. We also discuss several alternatives for implementing these programming abstractions and discuss the algorithms and protocols needed.
Comments: TBD.
Fahmy et al. 08c
Scheduling Dependent Distributable Real-Time Threads in Dynamic Networked Embedded Systems
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
IFIP Working Conference on Distributed and Parallel Embedded Systems, September 2008
Abstract. We consider scheduling distributable real-time threads with dependencies (e.g, due to synchronization) in partially synchronous systems in the presence of node failures. We present a distributed real-time scheduling algorithm called DQBUA. The algorithm uses quorum systems to coordinate nodes’ activities when constructing a global schedule. DBQUA detects and resolves distributed deadlock in a timely manner and allows threads to access resources in order of their potential utility to the system. Our main contribution is handling resource dependencies using a distributed scheduling algorithm.
Comments: TBD. A full-length version of the paper is available as a technical report here.
Fahmy et al. 08b
On Collaborative Scheduling of Distributable Real-Time Threads in Dynamic, Networked Embedded Systems
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
Abstract. Some emerging networked embedded real-time applications have relatively long reaction time magnitudes—e.g., milliseconds to minutes. These longer execution time magnitudes allow opportunities for more computationally expensive scheduling algorithms than what is traditionally considered for device-level real-time control sub-systems. In this paper, we review recent research conducted on collaborative scheduling algorithms in such systems that are subject to dynamic behavior such as transient and sustained resource overloads, arbitrary activity arrivals, and arbitrary node failures and message loss. We show that collaborative scheduling algorithms have an advantage over non-collaborative scheduling algorithms.
Comments: TBD
Fahmy et al. 08a
Fast Scheduling of Distributable Real-Time Threads with Assured End-to-End Timeliness
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
13th International Conference on Reliable Software Technologies – Ada-Europe 2008, March 2008
Abstract. We consider networked, embedded real-time systems that operate under run-time uncertainties on activity execution times and arrivals, node failures, and message losses. We consider the distributable threads abstraction for programming and scheduling such systems, and present a thread scheduling algorithm called QBUA. We show that QBUA satisfies (end-to-end) thread time constraints in the presence of crash failures and message losses, has efficient message and time complexities, and lower overhead and superior timeliness properties than past thread scheduling algorithms. Our experimental studies validate our theoretical results, and illustrate the algorithm’s effectiveness.
Comments: TBD. A full length version is available here.
Fahmy et al. 08
Scheduling Distributable Real-Time Threads in the Presence of Crash Failures and Message Losses
Sherif Fahmy, Binoy Ravindran, and E. Douglas Jensen
ACM Symposium On Applied Computing (Track on Embedded Systems), March 2008
Abstract. We consider the problem of scheduling distributable real-time threads under run-time uncertainties including those on thread execution times, thread arrivals, node failures, and message losses. We present a distributed scheduling algorithm called ACUA that is designed under a partially synchronous model, allowing for probabilistically-described message delays. We show that ACUA satisfies thread time constraints in the presence of crash failures and message losses, is early-deciding, and has an efficient message and time complexity. The algorithm has also better “best-effort” real-time property than past thread scheduling algorithms.
Comments:
A full-length version is here.
Feizabadi et al. 05
MSA: A Memory-Aware Utility Accrual Scheduling Algorithm
S. Feizabadi, B. Ravindran, and E. D. Jensen
ACM Symposium On Applied Computing (Track on Embedded Systems), March 2005
Abstract. Whereas fairness can be the basis for general-purpose operating system scheduling policies, timeliness is the primary concern for real-time systems. As such, real-time schedulers permit uninterrupted, exclusive access to the CPU by a specific task to ensure its timely completion of execution. Only a subset of tasks, however, can satisfy their timing constraints during processor overload conditions in a real-time system. Utility accrual (UA) scheduling disciplines assure scalability and graceful performance degradation by identifying—based on a pre-defined set of criteria—the subset of tasks to be granted the heavily contended system resources. Furthermore, whereas general-purpose operating systems treat memory monolithically and indiscriminately service dynamic allocation requests, UA schedulers can benefit from special memory management considerations during memory overload conditions. MSA, the scheduling algorithm here presented is the first of its kind to treat memory as a UA scheduler-managed resource. The scheduler is made aware of memory allocation requirements of each task throughout runtime and accordingly makes appropriate CPU and resource scheduling decisions. The algorithm is well-suited for use in resource constrained embedded systems in a soft real-time environment. We have implemented MSA in a POSIX real-time operating environment and measured its performance under various load conditions. Our experimental results show overall performance gains over other memory-unaware UA scheduling algorithms during memory overload.
Comments: This paper discusses the first utility accrual algorithm that takes memory requirements into account.
Goldberg et al. 95
Adaptive Fault Resistant Systems
Jack Goldberg, Ira Greenberg, Raymond Clark, E. Douglas Jensen, Kane Kim, and Douglas M. Wells
Technical Report CSL-95-02, Computer Science Lab, SRI International, Menlo Park, CA. January 1995
Abstract. In future distributed computing systems, data-arrival rates, fault types, resource availability and user preferences among performance and dependability objectives, all may vary dynamically during operation. The report describes architectural approaches and techniques for building computer systems that dynamically adapt their structure to changing operating conditions and user preferences. The concepts of stability management and system identification from modern control theory are given digital-system interpretations. These are reflective control and real-time fault diagnosis. Several cases were studied and demonstrated, including adaptive distributed recovery blocks, and adaptive distributed thread maintenance. A formal scheme was developed for adaptive, hybrid byzantine-backup fault tolerance.
Comments: This technical report includes the first publication of techniques for adaptive maintenance of the distributed threads programming abstraction pioneered in the Alpha distributed real-time OS kernel [Clark et al. 93].
Jensen 77 in Kain et al. 77
Chapter 3, Radar Scheduling: Section 1, The Scheduling Problem
Section 7, Future Needs; Subsections 7.1.4 – 7.1.6
Mohammed G. Gouda, Yi-Wu Han, E. Douglas Jensen, Wesley D. Johnson, Richard Y. Kain (Editor)
Distributed Data Processing Technology, Vol. IV, Applications of DDP Technology to BMD: Architectures and Algorithms, Honeywell Systems and Research Center, Minneapolis, MN. September 1977.
Abstract. to be supplied
Download (PDF) not available
Comments: This is the unclassified (and now unlimited distribution) summary technical report that first mentioned Jensen’s concept of scheduling based on time/utility function time constraints. The initial intended application was the Ballistic Millisle Defense Office (BMDO)’s Ballistic Missile Defense radar—specifically, the U.S. Army Safeguard Command’s AN/FPQ-16 Perimeter Acquisition Radar System at what is now the USAF Space Command’s Cavalier Air Force Station, in Cavalier, North Dakota. The radar was scheduled by a supercomputer of that time, but the efficiency of the radar was unacceptably low. Jensen devised the concept of time/utility functions and utility accrual, and the team simulated scheduling the radar with both software and hardware implementations of these functions. The simulated radar efficiency was much higher. The BMDO anti-ballistic missile defense system was terminated about that same time. Hence, the technology was not deployed for that radar, and the detailed technical report about our research on this topic remain classified. (However, the TUF/UA paradigm was subsequently implemented in more recent radars.) When I left Honeywell’s Systems and Research Center and joined the Computer Science faculty at Carnegie Mellon University, I resuscitated the concept. Two of my students (Douglas Locke and Raymond Clark) wrote their Ph.D. theses on the topic [Locke 87][Clark 90]. CMU and General Dynamics (the Ft. Worth division, now being part of Lockheed Martin) successfully applied time/utility function scheduling to a notional coastal air defense battle management application [Maynard et al. 88] (see the public GD BM/C2 demo page). Then MITRE applied time/utility function scheduling to an AWACS airborne radar surveillance mode application with great effectiveness [Clark et al. 99]; see the public AWACS ATD page. Advancement of the theory and application (including, but not limited to radar scheduling) of time/utility functions continued for eight years at MITRE and a number of universities, especially collaboratively between MITRE and Virginia Institute of Technology (as can be seen by the many papers we have published in conference proceedings and journals). (More about this is gradually appearing on my time/utility function history page.)
Han 08a
RTRD: Real-Time and Reliable Data Delivery in Ad Hoc Networks
Kai Han, Guanhong Pei, Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE Wireless Communications and Networking Conference, March/April 2008
Abstract. In this paper, we present a reliable real-time data delivery (communication) mechanism for ad-hoc networks, called RTRD. The mechanism makes use of a proactive wireless routing protocol (DSDV) for path finding and maintenance, and timely delivers data through a’ priori bandwidth reservation. In addition, to be robust to network failures, or to deliver large data chunks, it simultaneously delivers data in multiple paths. The simulation results conducted by NS-2 validate RTRD’s effectiveness.
Comments: To be supplied.
Han 08
LRTG: Scheduling Distributed Real-Time Tasks in Unreliable and Untrustworthy Systems
Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE International Symposium on Frontiers in Networking with Applications (FINA), IEEE International Conference on Advanced Information Networking and Applications, March 2008
Abstract. We consider scheduling distributed real-time tasks in unreliable (e.g., those with arbitrary node and network failures) and untrustworthy systems (e.g., those with Byzantine node behaviors). We present a distributed real-time scheduling algorithm called LRTG. The algorithm makes two novel contributions. First, LRTG uses gossip for reliably propagating task scheduling parameters and for discovering task execution nodes. Second, the algorithm guards against potential disruption of message propagation due to Byzantine attacks using a mechanism called LASIRC. By doing so, the algorithm provides assurances on task timeliness behaviors, despite system unreliability and untrustworthiness. Our performance evaluation shows LRTG’s effectiveness.
Comments: To be supplied.
Han 07c
RTG-L: Dependably Scheduling Real-Time Distributable Threads in Large-Scale, Unreliable Networks
Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE Pacific Rim International Symposium on Dependable Computing, December 2007
Abstract. We consider scheduling real-time distributable threads in the presence of node/link failures and message losses in large-scale network systems. We present a distributed scheduling algorithm called RTG-L. The algorithm uses gossip-based communication for dynamically and dependably discovering eligible nodes. Traditionally, gossip protocols incur high message overhead. We explain that this problem is not that serious. We present a gossip-based message propagation protocol with lower message overhead. In scheduling local thread sections, RTG-L exploits slacks to optimize gossip time utilization. Thereby, it satisfies end-to-end time constraints with application-specific probabilistic assurance. Our simulation studies verify our analytical results.
Comments: To be supplied.
Han 07b
Byzantine-Tolerant Point-To-Point Information Propagation in Untrustworthy and Unreliable Networks
Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the International Conference on Network-Based Information Systems, 2007
Abstract. In a decentralized network system, an authenticated node is referred to as a Byzantine node, if it is fully controlled by a traitor or an adversary, and can perform destructive behavior to disrupt the system. Typically, Byzantine nodes together or individually attack point-to-point information propagation by denying or faking messages. In this paper, we assume that Byzantine nodes are of some intelligence—that is, they can protect themselves from being identified by authentication mechanisms, including encryption mechanisms. Therefore, a node cannot trust any other node in the system. We present an authentication-free, gossip-based application-level propagation mechanism called LASIRC, in which “healthy” nodes utilize Byzantine features to defend against Byzantine attacks. We show that LASIRC is robust against message-denying and message-faking attacks. Our experimental studies verify LASIRC’s effectiveness.
Comments: To be supplied.
Han 07a
Probabilistic, Real-Time Scheduling of Distributable Threads Under Dependencies in Ad Hoc Networks
Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE Wireless Communications and Networking Conference , 2007
Abstract. We consider scheduling distributable real-time threads that are subject to dependencies (e.g., due to mutual exclusion constraints) in ad hoc networks, in the presence of node and link failures, message losses, and dynamic node joins and departures. We present a gossip-based distributed scheduling algorithm, called RTG-D. We prove that thread blocking times under RTG-D are probabilistically bounded, thereby probabilistically bounding thread time constraint satisfactions. Our simulation results validate RTG-D’s effectiveness.
Comments: To be supplied.
Han 07
Exploiting Slack for Scheduling Dependent, Distributable Real-Time Threads in Mobile Ad Hoc Networks
Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the International Conference on Real-Time and Network Systems, pages 225-234, March 2007
Abstract. In a decentralized network system, an authenticated node is referred to as a Byzantine node, if it is fully controlled by a traitor or an adversary, and can perform destructive behavior to disrupt the system. Typically, Byzantine nodes together or individually attack point-to-point information propagation by denying or faking messages. In this paper, we assume that Byzantine nodes are of some intelligence—that is, they can protect themselves from being identified by authentication mechanisms, including encryption mechanisms. Therefore, a node cannot trust any other node in the system. We present an authentication-free, gossip-based application-level propagation mechanism called LASIRC, in which “healthy” nodes utilize Byzantine features to defend against Byzantine attacks. We show that LASIRC is robust against message-denying and message-faking attacks. Our experimental studies verify LASIRC’s effectiveness.
Comments: To be supplied.
Huang 08a
RT-P2P: A Scalable Real-Time Peer-to-Peer System with Probabilistic Timing Assurances
Fei Huang, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE/IFIP International Conference on Embedded And Ubiquitous Computing (EUC), December 2008
Abstract. We present RT-P2P, a real-time peer-to-peer (P2P) system that allows application-level end-to-end timing requirements to be satisfied in P2P systems. P2P systems are fundamentally characterized by: a large number of geographically distributed nodes that require little infrastructural support from the underlying network; an unbounded number of nodes (a permanently evolving set of nodes); and consequently no process/node with global knowledge of the system structure. Interesting features of such networks include the fact that they allow a rich set of nodes to act as relay points for other nodes, and a rich set of overlay paths (selected by peers) to be constructed. These features have traditionally made overlay routing—where end hosts have the flexibility of routing traffic to their destinations through the desired choice of intermediate overlay nodes (unlike in IP routing)—a very attractive approach for end-to-end performance optimizations in P2P networks. Key aspects of our RT-P2P infrastructure include a real-time P2P protocol, real-time communication algorithm, and analytical performance models. We analytically establish the timing properties of RT-P2P. Our simulation studies validate our analytical results.
Comments: To be supplied.
Huang 08
Integrated Real-Time Scheduling and Communication with Probabilistic Timing Assurances in Unreliable Distributed Systems
Fei Huang, Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the IEEE International Conference on Engineering of Complex Computer Systems, March/April 2008
Abstract. We consider distributed real-time systems that operate under run-time uncertainties including those on execution times and communication delays, and subject to arbitrary node failures and message losses. We present an integrated real-time scheduling and communication algorithm called Real-Time Scheduling with Reliable Data Delivery (RTSRD) that provides probabilistic end-to-end assurances on distributed task timeliness behaviors in such systems. RTSRD considers distributed tasks with end-to-end timing requirements that are expressed using time/utility functions and the optimality criterion of maximizing the total accrued utility. The algorithm decomposes end-to-end time constraints into local time constraints, and uses local slack time for node-local real-time scheduling and node-to-node real-time communication. We analytically establish RTSRD’s properties including probabilistic satisfaction of task time constraints. We also compare RTSRD with a prior algorithm called RTG-L for the same problem. Our comparisons show that RTSRD outperforms RTG-L in terms of timeliness assurances (stronger) and algorithm overhead (lower).
Comments: To be supplied.
Jensen 04b
Application QoS Based Time-Critical Machine-to-Machine Resource Management in BM/C2 Systems
E. Douglas Jensen
Proc. of the 2004 International Command and Control Research and Technology Symposium, September 14-17, 2004
Abstract. This paper summarizes a novel paradigm for expressing, enforcing, and formally reasoning about timliness of machine-to-machine resource management in battle management (BM) and command and control (C2 ) systems. Such systems are largely dynamic and asynchronous, and have time frames of O(10-1) seconds and higher. Thus they fall into a neglected gap between traditional static periodic “real-time” systems, and traditional “any time” scheduling/planning systems (e.g., for logistics). The paradigm uses application-level QoS metrics (such as track quality, circular error probable, etc.) to derive utility functions for completing tasks, and then uses those utility functions for resource management. This paradigm has been successfully employed in several experimental BM/C2 demonstration systems, and is the topic of on-going research by industry and academia.
Comments: This paper exposes the Time/Utility Function, Utility Accrual resource management paradigm to the larger international MoD BM/C2 community. Almost the same paper was accepted by the 2004 Command and Control Research and Technology Symposium for a primarily U.S. DoD audience.
Jensen 04a
Application QoS Based Time-Critical Machine-to-Machine Resource Management in BM/C2 Systems
E. Douglas Jensen
Proc. of the 2004 Command and Control Research and Technology Symposium, May 12-14, 2004
Abstract. This paper summarizes a novel paradigm for expressing, enforcing, and formally reasoning about timliness of machine-to-machine resource management in battle management (BM) and command and control (C2 )systems. Such systems are largely dynamic and asynchronous, and have time frames of O(10-1) seconds and higher. Thus they fall into a neglected gap between traditional static periodic “real-time” systems, and traditional “any time” scheduling/planning systems (e.g., for logistics). The paradigm uses application-level QoS metrics (such as track quality, circular error probable, etc.) to derive utility functions for completing tasks, and then uses those utility functions for resource management. This paradigm has been successfully employed in several experimental BM/C2 demonstration systems, and is the topic of on-going research by industry and academia.
Comments: This paper exposes the Time/Utility Function, Utility Accrual resource management paradigm to the larger DoD BM/C2 community. Almost the same paper was accepted by the International Command and Control Research and Technology Symposium for an international MoD audience.
Jensen 04
Timeliness in Mesosynchronous Real-Time Distributed Systems
E. Douglas Jensen
Proc. of the 7th IEEE International Symposium on Object-Oriented Real-Time Computing, May 12-14, 2004
Abstract. Traditional real-time computing concepts and techniques are focused on static, synchronous, relatively small-scale, mostly centralized, device-level subsystems. Many real-time systems, particularly distributed ones, are relatively large-scale, above the device level, and at least partially dynamic and asynchronous. In this paper, we call such systems “mesosynchronous.” For example, mesosynchronous systems often are found in military surveillance and force projection platforms, and in network-centric warfare (plus civilian domains). Hence the lives of both friends and foes depend on the timeliness properties of such systems being dependably acceptable according to application- and situation-specific criteria. The real-time research community has historically failed to perceive and appreciate this—admittedly difficult and domain-knowledge intensive—problem, especially for end-to-end timeliness in distributed mesosynchronous real-time systems.
Comments: This is my invited position paper for a panel session titled “Major push needed in real-time distributed computing research.” It being a position paper enabled me to include some of my viewpoints that would otherwise not fit well into a conference proceedings paper.
Jensen 03c
Application QoS-Based Time-Critical Automated Resource Management in Battle Management Systems
E. Douglas Jensen
Proc. of the IEEE Workshop on Object-Oriented Real-Time Dependable Systems, October 1-3, 2003
Abstract. This paper summarizes some of our unclassified work on concepts and techniques for performing automated run-time time-critical resource management (especially scheduling) in large scale, dynamic, control systems such as for battle management. The approach described here is based on application-level quality of service metrics, such as track quality and weapon spherical error probable. These metrics are used to derive parameters for thread time constraints in the form of time/utility functions. Threads are scheduled according to application-specific optimality criteria that seek to maximize accrued utility to the system. Two worked examples illustrate the cost-effectiveness of this approach for this class of application.
Comments: This is an update of my paper in the Proceedings. The major revision is improvements in the worked examples sections.
Jensen 03b
The Evolution of Real-Time CORBA: The Good, the Bad, and the Ugly
E. Douglas Jensen
OMG Workshop on Distributed Object Computing for Real-time and Embedded Systems, July 15, 2003
Abstract. None — see Comments below
Comments: This invited keynote presentation established a new precedent for an OMG technical meeting. Each OMG event has a sponsor, almost always a CORBA vendor. The sponsor is allowed to give a 60-minute keynote presentation. Traditionally, an employee of the sponsor gives somewhat of a marketing presentation for his company. The sponsors of this workshop were PrismTech and DARPA, so each was allowed a 30-minute presentation. PrismTech decided that instead of the traditional vendor marketing presentation, they would create a precedent and invite someone else—me—to give a non-marketing presentation. The topic they specified was “The Evolution of Real-Time CORBA;” I added “The Good, the Bad, and the Ugly.” The rationale for the topic was that as real-time CORBA has gotten increasingly popular, and consequently as the participation in this annual workshop has grown and diversified, many if not most of the attendees are relative new-comers to real-time CORBA. Most of them didn’t participate in the creation of the real-time CORBA 1.0 and 1.2 standards, much less from the beginning of the process; and thus they know little if anything about how it happened. I was a major contributor to the real-time CORBA 1.0 and 1.2 standards from the beginning, so I was PrismTech’s choice to provide a historical overview. These slides without the narrative leave a lot of material out, so eventually I’ll add some notes at least. (Doug Schmidt provided an overview of relevant DARPA programs.)
Jensen 03a
A Timeliness Paradigm for Mesosynchronous Real-Time Systems
E. Douglas Jensen
9th Embedded and Real-Time Applications and Systems Symposium, May 2003.
Abstract. None — see Comments below
Comments: This was an invited plenary presentation that covers a little of the material on this web site. It includes the first use of the term mesosynchronous. By mesosynchronous real-time systems I mean those that are in the middle ground between
|
|
|
The real world has many real-time systems at various points in this mesosynchronous middle ground. The derivation of “mesosynchronous” from “mesodynamics” also reflects that synchronous real-time computing, like classical physics, is well understood, but asynchronous real-time computing, like quantum mechanics, requires a paradigm shift.
Jensen 03
Real-Time for the Real World
E. Douglas Jensen
NASA JPL Center for Space Mission Information and Software Systems, IT Seminar Series, January 22, 2003
Abstract. None — see Comments below
Comments: This presentation is a brief condensation of some pages of this web site.
Jensen et al. 02a
Guest Editors’ Introduction to Special Section on Asynchronous Real-Time Distributed Systems
E. Douglas Jensen and Binoy Ravindran
IEEE Transactions on Computers, August 2002
Abstract. None — see Comments below
Download (PDF) — requires IEEE Xplore or IEEE Computer Society Digital Library subscription
Comments: The August 2002 issue of IEEE Transactions on Computers (ToC) [ ] contains a special section on asynchronous distributed real-time computers, edited by Binoy Ravindran and myself. It also contains a paper on that topic (involving time/utility functions) by Hegazy and Ravindran that was omitted from the special section to avoid any potential for perceived conflict of interest. (There are also two unrelated Brief Contributions.) To my knowledge, this is the first compilation of papers on this topic. It is a short compilation: the ToC has a strict page limit; and suitable papers were not easy to find because there is not yet much research on this topic, and most of that is either more empirical than theoretical (and hence at least perceived by those researchers to be inappropriate for submission to ToC) or too much more related to the theory of distributed computing than to the theory of real-time computing. This set of papers clearly illuminated the interesting fact that there are two distinct schools of thought about asynchronous distributed real-time computer systems, each firmly held and avidly defended by its proponents; we discuss this in the Guest Editors’ Introduction to the Special Section. We decided to do something unusual (perhaps unprecedented) in ToC: with the permission of the ToC Editor and the Special Section authors, we sent each author copies of the other papers and asked them to include some comparative material in their own papers. We believe that this provided for valuable insight about the topic of asynchronous distributed real-time computer systems to the readers of that ToC issue. The abstracts of the five papers (in the order in which they appear in the issue):
Shivakant Mishra, Christof Fetzer, and Flaviu Cristian, The Timewheel Group Communication System |
|
Yun Wang, Emmanuelle Anceaume, Francisco Brasileiro, FabÃola Greve, and Michel Hurfin, Solving the Group Priority Inversion Problem in a Timed Asynchronous System |
|
Paulo VerÃssimo and Antonio Casimiro, The Timely Computing Base Model and Architecture |
|
Jean-François Hermant and Garard Le Lann, Fast Asynchronous Uniform Consensus in Real-Time Distributed Systems |
|
Tamir Hegazy and Binoy Ravindran, Using Application Benefit for Proactive Resource Allocation in Asynchronous Real-Time Distributed Systems |
Jensen et al. 02
The Distributed Real-Time Specification for Java: A Status Report
E. Douglas Jensen, Andy Wellings, Ray Clark, and Doug Wells
Proc. Embedded Systems Conference/San Francisco, 2002
Abstract (slightly revised). This paper summarizes the status as of December 2001 of the Distributed Real-Time Specification for Java (DRTSJ) [ ], being developed by the JSR-50 Expert Group in Sun’s Java Community Process. The paper first defines what “real-time,” “distributed,” and “distributed real-time” mean in the context of the DRTSJ, since those terms are commonly used in widely different ways. The DRTSJ is focused on supporting predictability of end-to-end timeliness for sequentially trans-node behaviors (e.g., chains of invocations) in dynamic distributed object systems. The DRTSJ introduces the Distributed Real-Time Remote Method Invocation (RMI) [ ] model, based on the Alpha distributed real-time OS kernel’s distributed threads model [ ]. The OMG Real-Time CORBA/Dynamic Scheduling specification also employs a subset of the distributed threads model, so these two distributed real-time computing system programming model standards are relatively congruent. The DRTSJ consists of extensions to the Real-Time Specification for Java (RTSJ), Java’s RMI, and the RTSJ Java Virtual Machine [ ].
Comments: This paper contains the material from [Wellings et al. 01] plus substantial motivational material about real-time, distributed systems, and distributed real-time. For additional detail about the DRTSJ status as of December 2001, see [Wellings et al. 02]; for additional detail about the DRTSJ status as of September 2002, see [Clark et al. 02]. The DRTSJ work, including the reference implementation, was in hibernation for 18 months due to MITRE staff over commitment, but resumed in Fall 2004 and is expected to be released in Spring 2005.
Jensen 01
The Distributed Real-Time Specification for Java—An Initial Proposal
E. Douglas Jensen
Journal of Computer Systems Science and Engineering, March 2001
Abstract (slightly revised). Work has begun on the Distributed Real-Time Specification for Java (DRTSJ) [ ], as part of Sun’s Java Community Process. This paper summarizes some ideas about an initial approach to the specification. The approach is based on providing a natural and minimal mechanistic extension to Remote Method Invocation (RMI) to support the end-to-end timeliness (and other) properties of distributed—in the sense of trans-node—behaviors. These timeliness properties must be preserved for any distributed real-time computing system, regardless of its application programming model —whether RPC, mobile objects, or whatever. In particular, the proposed extension facilitates writing real-time distributed Java programs using a distributed threads programming abstraction [Jensen 00b] derived from the Alpha OS kernel’s distributed threads model [Clark et al. 93], and very similar to the distributable threads subset in the OMG specification for Real-Time CORBA 2/Dynamic Scheduling.
Comments: This paper describes the initial proposal submitted to Sun as Java Specification Request 50 [ ]. For a status report as of December 2001, see [Jensen et al. 02] and for additional detail see [Wellings et al. 02]; for additional detail about the DRTSJ status as of September 2002, see [Clark et al. 02]. The DRTSJ work, including the reference implementation, was in hibernation for 18 months due to MITRE staff over commitment, but resumed in Fall 2004 and is expected to be released in Spring 2005.
Jensen 00
Utility Functions: A General Scalable Technology for Software Execution Timeliness as a Quality of Service
Part 1: Motivation
E. Douglas Jensen
Dagstuhl Seminar 00271: Stochastic and Dynamic Real-Time Systems, 2000
Abstract. Traditional real-time computing system concepts, theories, and technologies are almost all focused on classical device-level, simple, static, centralized, subsystems for monitoring and first order control. Increasingly many of the most important real-time (i.e., application time-constraint driven) control problems are at higher levels of an enterprise, more complex, more dynamic, more distributed, and higher order control. Traditional real-time computing does not scale up to address these problems—a different paradigm is needed. This presentation intends to help motivate the need for such a new timeliness paradigm.
Comments: This is the first of a 2-part presentation. The second part is about the time/utility function, utility accrual, model of timeliness [Jensen 00a], and has been superseded by the material on this web site.
Jensen 00a
Utility Functions: A General Scalable Technology for Software Execution Timeliness as a Quality of Service
Part 2: The Utility Function Model of Timeliness
E. Douglas Jensen
Dagstuhl Seminar 00271: Stochastic and Dynamic Real-Time Systems, 2000
Abstract. This presentation is motivated by the preceding presentation [Jensen 00], and summarizes the time/utility (originally called time/value) function, utility accrual, model of timeliness.
Comments: This presentation is the second part of a 2-part presentation (for part 1 see [Jensen 00]) but has been superseded by the material on this web site.
Jensen 00b
Distributed Threads: Technology for Structuring Distributed Real-Time Software
E. Douglas Jensen
MITRE 12th Software Technology Conference, April 2000
Abstract.It can be highly effective to structure a software application to be distributed in the sense that its components reside on a multiplicity of physically dispersed computing nodes. Distributed threads provide for the end-to-end timeliness of such a distributed application to be as predictable as possible, given the characteristics of the underlying infrastructure (such as the node OS’s and the network). They are the basis for the forthcoming Distributed Real-Time Specification for Java and for the OMG Real-Time CORBA 2/Dynamic Scheduling standard.
Comments: This presentation about distributed threads is my most current one, although the typical presentation time allowed forces a lot of details to be omitted. For use cases, see the C2 Demo technical report [Maynard et al. 88], the AWACS Adaptive Tracker paper [Clark et al. 99], and the real-time multilevel security paper [Clark et al. 93a]. The USENIX 93 paper [Clark et al. 93] is still a good introduction to how distributed threads fit into a distributed real-time OS kernel.
Jensen 92
Asynchronous Decentralized Real-Time Computer Systems
E. Douglas Jensen
Proc. NATO Advanced Study Institute on Real-Time Computing, Springer-Verlag, October 1992
Abstract. A new generation of real-time computer systems performs physically and logically decentralized mission management—such as semi-autonomous entities collaboratively performing manufacturing, maintenance, combat, etc. These entities must robustly accommodate significant run-time uncertainties in their application environment and system resource state, by being dynamically adaptive. In particular, such systems have mission-critical time constraints which must be satisfied acceptably—as specified by the application—given the current application and system circumstances. Thus, they violate the static, deterministic, synchronous premises on which most conventional real-time computing concepts and techniques are founded. We are developing a different and more appropriate paradigm for timeliness in asynchronous (aperiodic), decentralized, real-time mission management computing systems. It is scalable to encompass a wide spectrum of real-time “hardness” and “softness” in a unified way, and permits the use of best-effort resource management algorithms. The progenitor of this timeliness paradigm was publicly introduced in the Alpha decentralized real-time OS kernel [ ]; and its current incarnation is being incorporated into a new real-time version of the Mach 3 kernel [ ].
Comments: TBD
Jensen et al. 85
A Time-Driven Scheduling Model for Real-Time Systems
E. Douglas Jensen, C. Douglass Locke, and Hideyuki Tokuda
Proc. of the Real-Time Systems Conference, IEEE, December 1985
Abstract. Process scheduling in real-time systems has almost invariably used one or more of three algorithms: fixed priority, FIFO, or round robin. The reasons for these choices are simplicity and speed in the operating system, but the cost to the system in terms of reliability and maintainability have not generally been assessed. This paper originates from the notion that the primary distinguishing characteristic of a real-time system is the concept that completion of a process or a set of processes has a value to the system which can be expressed as a function of time. This notion is described in terms of a time-driven scheduling model for real-time operating systems and provides a tool for measuring the effectiveness of most of the currently used process schedulers in real-time systems. Applying this model, we have constructed a multiprocessor real-time system simulator with which we measure a number of well-known scheduling algorithms such as Shortest Process Time (SPT), Deadline, Shortest Slack Time, FIFO, and a fixed priority scheduler, with respect to the resulting total system values. This approach to measuring the process scheduling effectiveness is a first step in our longer term effort to produce a scheduler which will explicitly schedule real-time processes in such a way that their execution times maximize their collective value to the system, either in a shared memory multiprocessing environment or in multiple nodes of a distributed processing environment.
Comments: This was the first publicly published work on this topic since I initiated it for a classified application in 1977 [Jensen 77, Kain et al. 77]. It was a precursor to two of my students’ Ph.D. theses—first Doug Locke’s [Locke 86] and subsequently Ray Clark’s [Clark 90]. This timeliness model was first implemented in the Alpha distributed real-time OS kernel [Clark et al. 93, Maynard et al. 88, Northcutt 87]. Beginning in 2003, I began collaborating with Binoy Ravindran and his students at Virginia Institute of Technology to continue research on this topic; see the papers here by my research colleagues, Ph.D. students, and myself.
Li et al. 06a
Utility Accrual Real-Time Resource Access Protocols with Assured Individual Activity Timeliness Behavior
Peng Li, Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
International Conference on Real-Time and Network Systems, May 2006
Abstract. We present a class of utility accrual resource access protocols for real-time embedded systems. The protocols consider application activities that are subject to time/utility function time constraints, and mutual exclusion constraints for concurrently sharing non-CPU resources. We consider the timeliness optimality criteria of probabilistically satisfying individual activity utility lower bounds and maximizing total accrued utility. The protocols allocate CPU bandwidth to satisfy utility lower bounds; activity instances are scheduled to maximize total utility. We establish the conditions under which utility lower bounds are satisfied.
Comments. comment text.
Li et al. 06
A Utility Accrual Scheduling Algorithm for Real-Time Activities With Mutual Exclusion Resource Constraints
Peng Li, Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
IEEE Transactions on Computers, 2006
Abstract. This paper presents a uni-processor real-time scheduling algorithm called the Generic Utility Scheduling algorithm (which we will refer to simply as GUS). GUS solves a previously open real-time scheduling problem—scheduling application activities that have time constraints specified using arbitrarily shaped time/utility functions, and have mutual exclusion resource constraints. A time/utility function is a time constraint specification that describes an activity’s utility to the system as a function of that activity’s completion time [ ]. Given such time and resource constraints, we consider the scheduling objective of maximizing the total (e.g., summed) utility that is accrued by the completion of all activities. Since this problem is NP-hard, GUS heuristically computes schedules with a polynomial-time cost of O(n3) at each scheduling event, where n is the number of activities in the ready queue. We evaluate the performance of GUS through simulation and by an actual implementation on a real-time POSIX-compliant [ ] operating system. Our simulation studies and implementation measurements reveal that GUS performs close to, if not better than, the existing algorithms for the cases that they apply. Furthermore, we analytically establish several properties of GUS.
Comments. comment text.
Li et al. 05a
Stochastic Utility Accrual Real-Time Scheduling with Task- Level and System-Level Timeliness Assurances
Peng Li, Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen
International Symposium on Object-oriented Real-time distributed Computing, May 2005
Abstract. Heuristic algorithms have enjoyed increasing interest and success in the context of Utility Accrual (UA) scheduling. However, few analytical results, such as bounds on task-level and system-level accrued utilities are known. In this paper, we propose the S-UA algorithm that can provide probabilistic bounds on task-level accrued utilities. Lower bound on system-level accrued utility ratio is also derived and maximized by S-UA.
Comments. comment text.
Li et al. 05
Utility Accrual Resource Access Protocols with Assured Timeliness Behavior for Real-Time Embedded Systems
Peng Li, Binoy Ravindran, and E. Douglas Jensen
ACM SIGPLAN/SIGBED 2005 Conference on Languages, Compilers, and Tools for Embedded Systems , June 2005
Abstract. We present a class of utility accrual resource access protocols for real-time embedded systems. The protocols consider real-time application activities that are subject to time/utility function time constraints, and mutual exclusion constraints for concurrently accessing shared, non-CPU resources. We consider the timeliness optimality criteria of probabilistically satisfying individual activity utility lower bounds and maximizing total accrued utility. The protocols use an approach, where CPU bandwidth is allocated to activities to satisfy utility lower bounds, and activity instances are scheduled to maximize total utility. We analytically establish upper bounds on resource access blocking times, and establish the conditions under which utility lower bounds are satisfied.
Comments. We have previously published UA scheduling algorithms that allow concurrent sharing of resources. However, these algorithms do not provide any timeliness assurances on individual activity behavior, only on collective behavior. On the other hand, we have previously published UA scheduling algorithms that provide timeliness assurances on individual activities that provide probabilistically assured satisfaction of lower bounds on individual activities’ accrued utilities. However, these algorithms do not allow concurrent resource sharing. Thus, prior UA scheduling algorithms address two extremes: (1) those that allow concurrent resource sharing, but provide no timeliness assurances on individual activity behavior; and (2) those that provide timeliness assurances on individual activity behavior, but without allowing concurrent resource sharing. There have been no UA algorithms that bridge these extremes by providing individual activity timeliness assurances under concurrent resource sharing. We solve this exact problem in this paper.
Li et al. 04c
Utility Accrual Scheduling of Distributable Threads: The Tempus Approach
Peng Li, Binoy Ravindran, and E. Douglas Jensen
Eighth Annual Workshop on High Performance Embedded Computing, September 2004
Abstract. We have developed the Tempus middleware that supports Distributed Threads as a programming and scheduling abstraction for system-wide, end-to-end scheduling in real-time distributed systems. Distributed Threads in Tempus can be subject to time constraints including those specified using arbitrarily-shaped TUFs and timeliness optimality criteria including maximizing accrued utility.
Comments. This is a brief overview of the Tempus testbed at Virginia Tech. Tempus provides the Real-Time CORBA 1.2 distributable threads programming model, and the Real-Time CORBA 1.2 scheduling framework into which arbitrary scheduling disciplines can be plugged. Tempus allows experimentation with scheduling distributable threads— especially using time/utility functions and utility accrual disciplines—using the Real-Time CORBA 1.2 API’s without requiring all the rest of Real-Time CORBA.
Li et al. 04b
Adaptive Time-Critical Resource Management Using Time/Utility Functions: Past, Present, and Future
Peng Li, Binoy Ravindran, and E. Douglas Jensen
28th International Computer Software and Applications Conference (COMPSAC), September 2004
Abstract. Time/utility function time constraints (TUFs) and utility accrual (UA) scheduling optimality criteria, constitute, arguably, the most effective and broadest approach for adaptive, dynamic time-critical resource management. A TUF, which is a generalization of the classical deadline constraint, specifies the utility of completing an application activity as an application- or situation-specific function of that activity’s completion time. With TUF time constraints, timeliness optimality criteria can be specified in terms of accrued (e.g., summed) activity utilities. This paper overviews past and recent advances on adaptive resource management for dynamic time-critical systems using UA algorithms. Emerging challenges and new research directions are also identified.
Comments: This extended abstract provides a brief overview of the history of time/utility functions and utility accrual scheduling disciplines from their invention by Jensen at Honeywell for radar scheduling, through their initial exposition by two of Jensen’s Ph.D. students at CMU, to the present major advances by a collaboration between Virginia Institute of Technology and the MITRE Corporation, to a glimpse of some of the next steps. The History section of this web site provides more, and more current, information.
Li et al. 04a
Utility Accrual Real-Time Scheduling with Probabilistically Assured Timeliness Performance
Peng Li, Binoy Ravindran, and E. Douglas Jensen
Proc. of the 2004 International Workshop on Probabilistic Analysis Techniques for Real-Time and Embedded Systems, in conjunction with the 4th International Conference on Embedded Software, September 2004
Abstract. We present time/utility function (TUF) algorithms that provide probabilistic assurance on timeliness behavior. A TUF, which is a generalization of the classical deadline constraint, specifies the utility of completing an application activity as a function of that activity’s completion time. The algorithms consider a stochastic model where activity execution times and arrivals are probabilistically described. Further, activity time constraints are specified using TUFs. We consider the dual optimization objective of probabilistically satisfying application specified lower bounds on individual activity utility, and maximizing system-wide total utilities. We present algorithms that achieve this dual objective.
Comments. This paper briefly summarizes the first steps toward a major generalization of the TUF/utility accrual scheduling paradigm. Until this work, all extant UA algorithms sought only maximal accrued utility. So, for example, two tasks T1 and T2 might be scheduled such that they contributed utilities U1 and U2 accruing (typically summing) to U12. This schedule might result in task T3 contributing utility U3 < U12 or even preclude task T3 from being scheduled at all. Although this is to be expected, and indeed is the intent of the optimality criterion, it is not always what is desired. Suppose that the maximum possible utilities of T1 and T2 are less than that of T3—say 4, 5, and 8, respectively. If the utility magnitudes were assigned to reflect the relative importance of the tasks, a more appropriate scheduling optimality criterion might take these maximum utilities into account—e.g., with weighting. But there is a more expressive and powerful generalization of TUF/UA scheduling that encompasses not just this case but many others. The first generalization is a 2-criterion approach: allow lower bounds to be specified for each TUF’s attained utility and for the accrued utility; the scheduling discipline must provide for the appropriate application-specific trade-off between these two criteria since in general they will not both be satisfied optimally. The second generalization is to make the first generalization stochastic: in addition to specifying the lower bounds on utilities, specify lower bounds for probabilities that the utility lower bounds will be satisfied. This paper addresses a subset of both these generalizations: for non-increasing TUF’s, an algorithm has been devised which allows probabilistic satisfaction of task utility lower bounds, while also maximizing accrued utility. Theorems are proved about the timeliness properties of this algorithm. A next step toward these two generalizations—accommodating probabilistic accrued utility—plus additional TUF/UA scheduling advances, appears in Peng Li’s Ph.D. thesis. Probabilistic lower bounds on individual and accrued utilities are also accommodated—along with utility-based management of system energy—by an algorithm described in two papers by Wu et al., Wu et al. 04b and Wu et al. 04c.
Li et al. 04
Scheduling Distributable Real-Time Threads in Tempus Middleware
Peng Li, Binoy Ravindran, Hyeonjoong Cho, and E. Douglas Jensen
Proc. of the 10th International Conference on Parallel and Distributed Systems, July 2004
Abstract.We present the Tempus real-time middleware. Tempus supports the OMG [ ] Real-Time CORBA 2.0’s [ ] distributable threads (DTs) [ ] as a native programming and scheduling abstraction for system-wide, end-to-end scheduling in real-time distributed systems. DTs in Tempus can be subject to time constraints including those specified using time/utility functions (TUFs) [ ], timeliness optimality criteria including maximizing accrued utility, and resource constraints including mutual exclusion constraints. Tempus schedules DTs by propagating their scheduling parameters, as they transit node boundaries. Node-local scheduling algorithms use the propagated parameters to construct local schedules and resolve resource dependencies for local timeliness optimization, toward approximate, system-wide timeliness optimization. Tempus uses an application-level scheduling framework for node-local TUF scheduling on real-time POSIX-compliant [ ] operating systems. Our experimental measurements demonstrate the effectiveness of the middleware in scheduling DTs.
Comments: Tempus is the first publicly available middleware that supports TUF/UA scheduling of distributable threads. Although the OMG Real-Time CORBA 1.2 (ne’e 2.0) specification provides distributable threads and supports TUF/UA scheduling, at this time there are no available implementations that include both of these features (although we know of at least two in progress).
Locke 86
Best-Effort Decision Making for Real-Time Scheduling
C. Douglass Locke
Ph.D. Thesis, CMUCS-86-134, Department of Computer Science, Carnegie Mellon University, 1986 [Locke 86].
Abstract. The objective of this research was to develop a new approach to making scheduling decision in real-time computer systems for such demanding control applications as spacecraft control and industrial automation. The approach used applies not only to scheduling process execution time, but also to scheduling other resources such as physical and logical messages on a communications medium.
Real-time systems differ fundamentally from non-real-time systems in several ways; most significantly in that the performance metric of interest is not throughput, but rather “response time” in the sense of completing the most critical tasks according to application-defined time constraints. Stated another way, we define a real-time system to be one in which there is an application environment defined value to the system for completing a process at a particular time, and in which assertions about this value as a function of time form an essential part of the correctness of the computation.
In response to the proliferation of expensive, fragile systems that are prone to unpredictable behavior and failure under overload conditions, we have defined a scheduling algorithm which explicitly utilizes the time-varying completion value of a process to determine the sequence in which a set of competing processes should be executed. Using these functions, the scheduler attempts to make scheduling decisions which maximize the total value to the system. Processes are assumed to be independent and preemptible, and to exhibit a stochastic processing time. The computational environment consists of one or more processors executing a single set of processes.
Having evaluated this scheduler in a real-time system simulator, we conclude that in a heavily loaded real-time system (either a transient or persistent load), this time-driven, value-function controlled algorithm provides a consistently high total value for a large number of task sets. The algorithm produced this high total value under extremely widely varying load conditions, process execution times and distributions, execution time knowledge, numbers of processors, periodic and aperiodic processes, either steady or intermittent load levels, and with a wide variety of either homogeneous or non-homogeneous value functions.
Comments: This was the first of two theses my CMU CS Ph.D. students produced about time/utility (ne’e time-value) function scheduling. Locke’s thesis was preceded by the Jensen et al. paper in RTSS 85 — the first publicly published work on this topic since I created it for a classified application in 1977 [Jensen 77] [Kain et al. 77]. Douglas Locke was an unusual Ph.D. student—and unusual in the real-time computing research community—in that he performed his research after having had more than two decades of industrial experience building non-trivial real-time computing systems for DoD applications. That experience in the real world enabled him to perceive the liabilities of traditional real-time computing concepts and technologies, and the need for a new, more general, dynamic, and adaptive, timeliness model. Locke’s algorithm was first implemented in the Alpha distributed real-time OS kernel [Clark et al. 93, Maynard et al. 88, Northcutt 87]. The second of the two theses was Raymond Clark’s. Two significant features distinguish Locke’s and Clark’s theses: Locke’s algorithm allows almost arbitrary non-convex functions, but did not allow dependencies among processes; Clark’s algorithm allowed only downward step functions, but allowed dependencies. Clark’s algorithm, being second generation, also out-performs Locke’s in some cases.
Maynard et al. 88
An Example Real-Time Command, Control, and Battle Management Application for Alpha
David P. Maynard, Samuel E. Shipman, Raymond K. Clark,
J. Duane Northcutt, Russell B. Kegley, Betsy A. Zimmerman,
Peter J. Keleher
Archons Project TR-88121, CMU Computer Science Dept., December 1988
Abstract. This report describes the design, implementation, and evaluation of a demonstration battle management/command and control (BM/C2) system developed jointly by Jensen’s Archons project researchers at Carnegie Mellon University and by technical staff members at General Dynamics Corporation/Ft. Worth Division. Over the course of a few months, this team defined a realistic air defense scenario, designed a distributed, real-time BM/C2 system for this scenario, and developed a prototype implementation of the design using an early version of the Alpha operating system. In addition, a support environment was constructed to exercise the application. The primary goals of the application effort were to demonstrate the decentralized, real-time technology incorporated in the Alpha operating system and to evaluate that technology in a realistic context. To that end, the application exercises the distribution, real-time, and survivability mechanisms provided by Alpha. By transferring Alpha technology from a research environment into an industrial setting, Archons researchers were able to obtain feedback from professional designers and system builders. This feedback validated the belief that the programming model and key support mechanisms provided by Alpha are well-suited to the design and construction of large, distributed, real-time control applications.
Comments: This technical report documents a non-trivial application of the Alpha real-time distributed OS kernel [Clark et al. 93, Northcutt 87]. It illustrates the effective use of both time/utility functions and distributed threads. Another effective use of time/utility functions in a non-trivial application can be seen in the AWACS tracking application [Clark et al. 99]. These two demonstrations were the precursor to a number of successful technology transitions to DoD classified systems.
Na et al. 06
Garbage Collector Scheduling in Dynamic, Real-Time Multiprocessor Systems
C. Na, H. Cho, B. Ravindran, and E. D. Jensen
Proc. IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, August 2006
Abstract. We consider garbage collection in dynamic, multiprocessor real-time systems, and present a scheduling algorithm called GCMUA. The algorithm considers mutator activities that are subject to time/utility function time constraints, stochastic execution-time and memory demands, and overloads. We establish that GCMUA probabilistically lower bounds each mutator activity’s accrued utility, lower bounds the system-wide total accrued utility, and upper bounds the timing assurances—sensitivity to variations in mutator execution-time and memory demand estimates. Our simulation experiments validate our analytical results and confirm GCMUA’s effectiveness and superiority.
Comments:TBD
Northcutt 87
Mechanisms for Reliable Distributed Real-Time Operating Systems — The Alpha Kernel
J. Duane Northcutt
Academic Press, 1987. Out of print, available used.
Abstract. To be supplied
Comments: This book is essentially the thesis by J. Duane Northcutt, one of my CMU Ph.D. students. [To be continued…]
Pei et al. 08a
Self-organizing and Self-reconfigurable Event Routing in Ad Hoc Networks with Causal Dependency Awareness
Guanhong Pei, Binoy Ravindran, and E. Douglas Jensen
ACM Transactions on Autonomous and Adaptive Systems, December 2008
Abstract. Publish/subscribe (P/S) [ ]is a communication paradigm of growing popularity for information dissemination in large-scale distributed systems. The weak coupling between information producers and consumers in P/S systems is attractive for loosely coupled and dynamic network infrastructures such as ad hoc networks. However, achieving end-to-end timeliness and reliability properties when P/S events are causally dependent is an open problem in ad hoc networks. In this paper, we present, evaluate benefits of, and compare with past work, an architecture design that can effectively support timely and reliable delivery of events and causally related events in ad hoc environments, and especially in mobile ad hoc networks (MANETs). With observations from both realistic application model and simulation experiments, we reveal causal dependencies among events and their significance in a typical use notional system. We also examine and propose engineering methodologies to further tailor an event-based system to facilitate its self-reorganizing capability and self-reconfiguration. Our design features a two-layer structure, including novel distributed algorithms and mechanisms for P/S tree construction and maintenance. The trace-based experimental simulation studies illustrate our design’s effectiveness in both cases with and without causal dependencies.
Comments:This article is based on the paper that appeared as “On A Self-organizing MANET Event Routing Architecture with Causal Dependency Awareness,” Guanhong Pei, Binoy Ravindran, and E.D. Jensen, Second IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO), Venice, October 20-24, 2008.
Pei et al. 08
On A Self-organizing MANET Event Routing Architecture with Causal Dependency Awareness
Guanhong Pei, Binoy Ravindran, and E. Douglas Jensen
IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO), October 2008. (Full version of the paper is available as a Technical Report.)
Abstract. Publish/subscribe (P/S) [ ] is a communication paradigm of growing popularity for information dissemination in large scale distributed systems. The strong decoupling between information producers and consumers in P/S systems is attractive for loosely coupled and dynamic network infrastructures such as mobile ad hoc networks (MANETs). However, achieving end-to-end timeliness and reliability properties when P/S events are causally dependent is an open problem in MANETs. In this paper, we propose an architecture design for event routing in MANET that can effectively support timely and reliable event delivery with awareness of event causal dependencies. Our design features a two layer structure, including novel distributed algorithms and mechanisms for P/S tree construction and maintenance with self-organization capabilities. Our simulation-based experimental studies illustrate the architecture’s effectiveness.
Comments: To be supplied.
Ravindran et al. 07b
Assured-Timeliness Integrity Protocols for Distributable Real-Time Threads with in Dynamic Distributed Systems
Binoy Ravindran, Edward Curley, Jonathan Anderson, and E. Douglas Jensen
Abstract. Networked embedded systems present unique challenges for system designers composing distributed applications with dyanmic, real-time, and resilience requirements. We consider the problem of recovering from failures of distributable threads with assured timeliness in dynamic systems with overloads, and node and (permanent/transient) network failures. When a distributable thread encounters a failure that prevents its timely execution, the thread must be terminated. Thread termination involves detecting and aborting thread orphans, and delivering exceptions to the farthest, contiguous surviving thread segment for possible execution resumption. Thread termination operations must optimize system-wide timeliness. We present a scheduling algorithm called HUA and two thread integrity protocols called D-TPR and W-TPR. We show that they bound the orphan cleanup and recovery time with bounded loss of the best-effort property—i.e., high importance threads are always favored over low importance ones (for feasible completion), irrespective of thread urgency. Our implementation experience using the emerging Reference Implementation of Sun’s Distributed Real-Time Specification for Java (DRTSJ) [ ] demonstrates the algorithm/protocol’s effectiveness.
Comments: To be supplied.
Ravindran et al. 07a
On Distributed Real-Time Scheduling in Networked Embedded Systems in the Presence of Crash Failures
Binoy Ravindran and E. Douglas Jensen
Abstract. We consider the problem of scheduling distributable real-time threads in networked embedded systems that operate under run-time uncertainties including those on thread execution times, thread arrivals, and node failure occurrences. We present a distributed scheduling algorithm called CUA. We show that CUA satisfies thread time constraints in the presence of crash failures, is early-deciding, has an efficient message complexity of O(fn) (where f is the number of crashes that actually occur and n is the number of nodes), and is time-optimal with a time lower bound of O(D+fd+nk) (where D is the message delay upper bound, d is the failure detection bound, and k is the maximum number of threads). In crash-free runs, the algorithm constructs schedules within O(D + nk), and yields optimal total utility if nodes are also not overloaded. The algorithm is also “best-effort” (as opposed to UDP’s least effort) in that a high importance thread that may arrive at any time has a very high likelihood for feasible completion (in contrast to classical admission control algorithms which favor feasible completion of admitted threads over admitting new ones, irrespective of thread importance).
Comments: To be supplied.
Ravindran et al. 07
On Scheduling Exception Handlers in Dynamic Real-Time Systems
Binoy Ravindran, Ed Curley, and E. Douglas Jensen
International Conference on Embedded Software and Systems, Springer LNCS 4523, pages 510-529, May 2007
Abstract. We consider the problem of scheduling exception handlers in real-time systems that operate under runtime uncertainties including those on execution times, activity arrivals, and failure occurrences. The application/scheduling model includes activities and their exception handlers that are subject to time/utility function (TUF) time constraints and an utility accrual (UA) optimality criterion. A key underpinning of the TUF/UA
scheduling paradigm is the notion of “best-effort” (as opposed to UDP’s least effort) where high importance activities are always favored over low importance ones, irrespective of activity urgency. (This is in contrast to classical admission control models which favor feasible completion of admitted activities over admitting new ones, irrespective of activity
importance.) We consider a transactional style activity execution paradigm, where handlers that are released when their activities fail (e.g., due to time constraint violations) abort the failed activities after performing recovery actions. We present a scheduling algorithm called Handler-assured Utility accrual Algorithm (or HUA) for scheduling activities and their handlers. We show that HUA’s properties include bounded-time
completion for handlers and bounded loss of the best-effort property. Our implementation experience on a RTSJ (Real-Time Specification for Java) Virtual Machine demonstrates the algorithm’s effectiveness.
Comments: To be supplied.
Ravindran et al. 05
On Recent Advances in Time/Utility Function Scheduling and Resource Management
Binoy Ravindran, E. Douglas Jensen, and Peng Li
International Symposium on Object-oriented Real-time distributed Computing, May 2005
Abstract. We argue that the key underpinning of the current state-of-the real-time practice—the priority artifact—and that of the current state-of-the real-time art— deadline-based timeliness optimality—are entirely inadequate for specifying timeliness objectives, for reasoning about timeliness behavior, and for performing resource management that can dependably satisfy timeliness objectives in many dynamic real-time systems. We argue that time/utility functions and the utility accrual scheduling paradigm provide a more generalized, adaptive, and flexible approach. Recent research in the utility accrual paradigm have significantly advanced the state-of-the-art of that paradigm. We survey these advances.
Comments: During the years from 2003 through 2004, a MITRE/Virginia Tech research team performing collaborative research on time/utility functions and utility accrual based scheduling made major advances in the state of the art as documented in eighteen published scholarly papers. This paper summarizes those advances into a single reference. The team has many additional papers still in the reviewing process and to be written, but has not included information about them in this paper. The collaboration lasted eight years and produced 86 papers published in IEEE and ACM journals and conference proceedings—listed in this section of my web site.
Ravindran et al. 04
The Case for TUFs and UA Scheduling in RT UML Profile: A Real-Time Scheduling/Operating System Perspective
Binoy Ravindran, Peng Li, Haisang Wu, E. Douglas Jensen, and Shahrooz Feizabadi
Workshop on the Usage of the UML Profile for Scheduling, Performance and Time, IEEE Real-Time and Embedded Technology and Applications Symposium, May 2004
Abstract. This position paper makes the case for incorporating time/utility functions (TUFs) and the paradigm of utility accrual (UA) real-time scheduling in the planned, updated version of the UML Profile for Schedulability, Performance, and Time. The case is made by arguing that the key underpinning of the current state-of-the real-time practice— the priority artifact—and that of the current state-of-the real-time art—deadline-based timeliness optimality—are grossly inadequate for specifying application timeliness objectives, for reasoning about timeliness behavior, and for performing resource management that can dependably satisfy timeliness objectives in many large-scale, dynamic real-time systems. We argue that TUFs and utility accrual scheduling provide a more generalized, adaptive, and exible approach. Further, new research on utility accrual scheduling have significantly advanced the state-of-the-art of that paradigm. We survey these recent advances to provide the rationale for our case.
Comments: TUF/UA scheduling is now included as an issue in the Finalization Task Force for OMG’s UML Profile for Modeling Quality of Service and Fault Tolerance Characteristics and Mechanisms. The OMG process states that “A Finalization Task Force is responsible for drafting the changes that turn an Adopted Specification into an Available Specification. An Issue is recorded against an adopted specification when someone in industry wishes to bring before the FTF one of the following: a recommended technical change (e.g., resolving technical error or omission or introducing enhancement); an editorial change.”
Wellings et al. 02
A Framework for Integrating the Real-Time Specification for Java and Java’s Remote Method Invocation
Andy Wellings, Raymond Clark, E. Douglas Jensen, and Doug Wells
Proc. of the 5th IEEE International Symposium on Object Oriented Real-Time Distributed Computing, April 2002
Abstract. The Distributed Real-Time Specification for Java (DRTSJ) [ ] is being developed under Sun’s Java Community Process [ ]. It is focused on supporting predictable, end-to-end timeliness for sequentially distributed computations (e.g., chains of invocations) in dynamic distributed object systems. This paper reports on an investigation to integrate and extend the existing Real-Time Specification for Java (RTSJ) [ ] and Java’s Remote Method Invocation facility [ ] to provide the basis for the DRTSJ.
Comments: This paper is a summary of one of the work products produced by the authors of the DRTSJ [ ]. From experience we’ve discovered that it requires the readers to have a fairly good understanding of the RTSJ [ ], Java [ ], RMI [ ], and real-time distributed computing—especially of the Distributed Thread model from the Alpha distributed real-time OS kernel, and of the issues that programming model addresses [Clark 93][Jensen 00b]. This ISORC paper was preceded by a very short identically named paper at the Work in Progress session of the Real-Time Systems Symposium (RTSS), 2001 [Wellings et al. 01]. Concurrently with this ISORC paper was a paper at the Embedded Systems Conference/San Francisco 2002 [Jensen 02] which included the material from the RTSS paper plus substantial motivational material about real-time and distributed systems and real-time distributed systems.
Wu et al. 07a
Utility Accrual, Real-Time Scheduling Under the Unimodal Arbitrary Arrival Model with Energy Bounds
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
IEEE Transactions on Computers, October 2007
Abstract. In this paper, we consider timeliness and energy optimization in battery-powered, dynamic, embedded real-time systems, which must remain functional during an operation/mission with a bounded energy budget. We consider application activities that are subject to time/utility function time constraints, statistical assurance requirements on timeliness behavior, and an energy budget, which cannot be exceeded at run-time. To account for the inevitable variability in activity arrivals in dynamic systems, we describe arrival behaviors using the unimodal arbitrary arrival model (or UAM). For such a model, we present a DVS (dynamic voltage scaling)-based, CPU scheduling algorithm called Energy-Bounded Utility Accrual Algorithm (or EBUA). Since the scheduling problem is intractable, EBUA allocates CPU cycles, scales clock frequency, and heuristically computes schedules using statistical estimates of cycle demands, in polynomial-time. We analytically establish EBUA’s properties including satisfaction of energy bounds, statistical assurances on individual activity timeliness behavior, optimal timeliness during under-loads, and bounded time for mutually exclusively accessing shared non-CPU resources. Our simulation experiments validate our analytical results and illustrate the algorithm’s effectiveness and superiority over past algorithms.
Comments: This is the extended and significantly improved journal version of the conference paper Wu et al. 06a.
Wu et al. 07
Utility Accrual Real-Time Scheduling Under Variable Cost Functions
Haisang Wu, Umut Balli, Binoy Ravindran, and E. Douglas Jensen
IEEE Transactions on Computers, March 2007
Abstract. We present a utility accrual real-time scheduling algorithm called CIC-VCUA, for tasks whose execution times are functions of their starting times (and potentially other factors). We model such variable execution times using variable cost functions (VCFs). The algorithm considers application activities that are subject to time/utility function time constraints, execution times described using VCFs, and mutual exclusion constraints on concurrent sharing of non-CPU resources. We consider the two-fold scheduling objective of (1) assure that the maximum interval between any two consecutive, successful completions of job instances in an activity must not exceed the activity period (an application-specific objective), and (2) maximizing the system’s total accrued utility, while satisfying mutual exclusion resource constraints. Since the scheduling problem is intractable, CIC-VCUA is a polynomial-time heuristic algorithm. The algorithm statically computes worst-case task sojourn times, dynamically selects tasks for execution based on their potential utility density, and completes tasks at specific times. We establish that CIC-VCUA achieves optimal timeliness during under-loads, and tightly upper bounds inter- and intra-task completion times. Our simulation experiments confirm the algorithm’s effectiveness and superiority.
Comments: This is the longer and significantly improved journal version of Wu et al. 05b.
Wu et al. 06a
Utility Accrual, Real-Time Scheduling with Energy Bounds
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
Proc. ACM Symposium On Applied Computing, 2006
Abstract. In this paper, we consider timeliness and energy optimization in battery-powered, dynamic embedded real-time systems, which must remain functional during an operation/mission with a bounded energy budget. We consider application activities that are subject to time/utility function time constraints, statistical assurance requirements on timeliness behavior, and an energy budget, which cannot be exceeded at run-time. To account for the inevitable variability in activity arrivals in dynamic systems, we describe arrival behaviors using the unimodal arbitrary arrival model. For such a model, we present a CPU scheduling algorithm, called the Energy-Bounded Utility Accrual Algorithm (or EBUA). EBUA is a polynomial-time algorithm that satisfies energy bounds, and provides statistical assurances on individual activity timeliness behavior. We analytically establish several timeliness properties of EBUA. Further, our simulation experiments confirm the algorithm’s effectiveness and superiority.
Comments: TBD.
Wu et al. 06
Energy-Efficient, Uspan style=”font-family: Verdana; font-size: x-small;”tility Accrual Scheduling under Resource Constraints for Mobile Embedded Systems
Haisang Wu, Binoy Ravindran, E. Douglas Jensen, and Peng Li
ACM Transactions on Embedded Computing Systems, V5N3, August 2006
Abstract. We present an energy-efficient, utility accrual, real-time scheduling algorithm called ReUA. ReUA considers an application model where activities are subject to time/utility function time constraints, mutual exclusion constraints on shared non-CPU resources, and statistical performance requirements on individual activity timeliness behavior. The algorithm targets mobile embedded systems where system-level energy consumption is also a major concern. For such a model, we consider the scheduling objectives of (1) satisfying the statistical performance requirements and (2) maximizing the system-level energy efficiency, while respecting resource constraints. Since the problem is NP-hard, ReUA allocates CPU cycles using statistical properties of application cycle demands, and heuristically computes schedules with a polynomial-time cost. We analytically establish several timeliness and non-timeliness properties of the algorithm. Further, our simulation experiments illustrate ReUA’s effectiveness and superiority.
Comments: TBD.
Wu et al. 05b
Utility Accrual Real-Time Scheduling Under Variable Cost Functions
Haisang Wu, Umut Balli, Binoy Ravindran, and E. Douglas Jensen
Abstract. We present an utility accrual real-time scheduling algorithm called VCUA, for tasks whose execution times are functions of their starting times. We model such variable execution times with variable cost functions (or VCF). The algorithm considers application activities that are subject to time/utility function time constraints, VCFs, and the multi-criteria scheduling objective of assuring that the maximum interval between any two consecutive, successful completion of jobs of a task must not exceed a specified bound, and maximizing the system’s total utility. Since the scheduling problem is intractable, VCUA off-line selects tasks based on their potential utility density, and dynamically promotes jobs to accrue more utility, in polynomial-time. We establish that VCUA achieves optimal timeliness during under-loads, and identify the conditions under which timeliness assurances hold. Our simulation experiments illustrate VCUA’s superiority.
Comments: TBD.
Wu et al. 05a
Time/Utility Function Decomposition Techniques for Utility Accrual Scheduling Algorithms in Real-Time Distributed Systems
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
IEEE Transactions on Computers, Volume 54, Number 9, pages 1138 – 1153, September 2005
Abstract. We consider Real-Time CORBA 1.2’s distributable threads (DTs), whose time constraints are specified using time/utility functions (TUFs), operating in legacy environments. In legacy environments, system node resources — both physical and logical — are shared among time-critical DTs and local applications that may also be time-critical. Hence, DTs that are scheduled using their propagated TUFs, as mandated by Real-Time CORBA 1.2’s Case 2 approach, may suffer performance degradation, if a node utility accrual (UA) scheduler achieves higher locally accrued utility by giving higher eligibility to local threads than to DTs. To alleviate this, we consider decomposing TUFs of DTs into “sub-TUFs” for scheduling segments of DTs. We present decomposition techniques called UT, SCEQF, SCALL, OPTCON, and TUFS, which are specific to different classes of UA algorithms, such as those that use utility density, and those that use deadline as their key decision metric. Our experimental studies show that OPTCON and TUFS perform best for utility density-based UA algorithms, and SCEQF and SCALL perform best for deadline-based UA algorithms.
Comments: TBD.
Wu et al. 05
Energy-Efficient Scheduling with Statistically-Assured Performance for Dynamic Real-Time Systems
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
Proc. Design, Automation, and Test in Europe, March 2005
Abstract. In the context of battery-powered dynamic embedded real-time systems, the constraints of energy and timeliness should be addressed. In this paper, we consider repeatedly occurring application activities with the unimodal arbitrary arrival model (UAM) to describe the inevitable variability of activity (e.g., task) arrivals in such systems. The activities are subject to time/utility function time constraints and statistical performance requirements. For battery powered embedded systems, system-level energy consumption is also a primary concern. We present a scheduling algorithm, called Energy-efficient Utility Accrual Algorithm* (or EUA*). EUA* considers utility maximization under energy constraints during system overloads, and considers the dual criterion problem, minimizing energy while achieving the given utility within the given time constraint, during under-loads. We analytically establish several timeliness properties of the algorithm. Further, our simulation experiments illustrate EUA’s effectiveness.
Comments: See the Comments for the first [Wu et al. 04c] and second [Wu et al. 04d] papers in this series. Deterministically recurring task models, such as the periodic task model, are not appropriate for describing the inevitable variability of activity arrivals in dynamic real-time applications. Thus, in this paper, we extend the task arrival model from a periodic model to a unimodal arrival model (UAM), which describes the maximal number of a task’s instance arrivals during any sliding time window. We consider independent tasks with non-increasing TUFs. The algorithm here is EUA*, and it considers utility maximization under energy constraints during system overloads, and addresses the dual optimality criterion problem of minimizing energy while achieving the given utility within the given time constraint, during under-loads.
Wu et al. 04d
Energy Efficient, Utility Accrual Scheduling under Resource Constraints for Mobile Embedded Systems
Haisang Wu, Binoy Ravindran, E. Douglas Jensen, and Peng Li
Proc. Fourth ACM International Conference on Embedded Software, September 2004
Abstract. We present an energy-efficient, utility accrual, real-time scheduling algorithm called the Resource-constrained Energy-Efficient Utility Accrual Algorithm (or ReUA). ReUA considers an application model where activities are subject to time/utility function (TUF) time constraints, resource dependencies including mutual exclusion constraints, and statistical performance requirements including activity (timeliness) utility bounds that are probabilistically satisfied. Further, ReUA targets mobile embedded systems where system-level energy consumption is also a major concern. For such a model, we consider the scheduling objectives of (1) satisfying the statistical performance requirements; and (2) maximizing the system-level energy efficiency. At the same time, resource dependencies must be respected. Since the problem is NP-hard, ReUA makes resource allocations using statistical properties of application cycle demands and heuristically computes schedules with a polynomial-time cost. We analytically establish several timeliness and non-timeliness properties of the algorithm. Further, our simulation experiments illustrate the algorithm’s effectiveness.
Comments:
See the Comments for the first paper in this series [Wu et al. 04c]. In this paper, we extend the work of Wu et al. 04c to non-increasing TUFs, and consider tasks with resource dependencies. We describe our Resource-constrained, Energy-efficient Utility Accrual algorithm (ReUA) in this paper. ReUA has the same timeliness properties as the EUA algorithm in Wu et al. 04c. In this paper, we additionally establish its non-timeliness properties, such as deadlock-freedom and correctness.
An extended journal version of Wu et al 04c and Wu et al 04d has been submitted for publication. In that, we provide more complete experimental results, and analytically establish more timeliness and non-timeliness properties, such as sufficiency on probabilistic satisfaction of timeliness utility lower bounds, lower-bounded system-wide utility, bounded blocking time for resource access, deadlock-freedom, and correctness.
Wu et al. 04c
CPU Scheduling for Statistically-Assured Real-Time Performance and Improved Energy-Efficiency
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
Proc. IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, September 2004
Abstract. We present an energy-efficient, utility accrual, real-time scheduling algorithm called the Resource-constrained Energy-Efficient Utility Accrual Algorithm (or ReUA). ReUA considers an application model where activities are subject to time/utility function (TUF) time constraints, resource dependencies including mutual exclusion constraints, and statistical performance requirements including activity (timeliness) utility bounds that are probabilistically satisfied. Further, ReUA targets mobile embedded systems where system-level energy consumption is also a major concern. For such a model, we consider the scheduling objectives of (1) satisfying the statistical performance requirements; and (2) maximizing the system-level energy efficiency. At the same time, resource dependencies must be respected. Since the problem is NP-hard, ReUA makes resource allocations using statistical properties of application cycle demands and heuristically computes schedules with a polynomial-time cost. We analytically establish several timeliness and non-timeliness properties of the algorithm. Further, our simulation experiments illustrate the algorithm’s effectiveness.
Comments: The two major contributions of this paper are: generalization of the conventional utility accrual optimality criterion to include probabilistic lower bounds on individual and accrued utilities (see the Comment for Li et al. 04a); and the addition of a second utility-based optimality criterion, maximizing the system-level energy efficiency.
Thus far we have published three papers on energy-efficient utility accrual scheduling for embedded real-time systems [Wu et al. 04c][Wu et al 04d][Wu et al. 05]; an extended version of the first two of these papers has been submitted for publication, and currently a fifth paper is in preparation for a conference. These papers reflect our sequence of advances in utility accrual/energy-efficiency scheduling: from downward step TUFs to non-increasing TUFs; from periodic tasks to tasks with the unimodal arrival model; from independent tasks to tasks with resource dependencies; from unlimited energy budget to energy-bounded systems. Usually the latter papers are extending the earlier ones in some aspects of embedded real-time systems.
In all the papers, we consider system-level energy consumption instead of only the CPU’s energy consumption. We adopt Martinâ’s energy consumption model, where each system component’s energy consumption is individually modeled and aggregated to obtain system-level energy consumption. To better account for uncertainties in activity behaviors, we consider a stochastic model, where activity execution demand is stochastically expressed using mean and variance. Our scheduling objectives are (1) provide statistical assurances on individual activity timeliness behavior, including application-defined, probabilistic, lower bounds on individual activity utility; and (2) maximize system-level energy efficiency.
This is our first paper published on this topic. We describe our Energy-efficient Utility Accrual algorithm (EUA) in this paper, to deal with independent periodic tasks whose time constraints are specified using binary-valued, downward step TUFs. We analytically establish several timeliness properties of EUA in this paper. These include timeliness optimality during under-loads, and assurances on timeliness behavior including lower bounds on individual activity utility and system-wide collective utility.
Wu et al. 04b
Utility Accrual Scheduling under Arbitrary Time/Utility Functions and Multiunit Resource Constraints
Haisang Wu, Binoy Ravindran, E. Douglas Jensen, and Umut Balli
Proc. of the 10th Real-Time and Embedded Computing Systems and Applications, August 2004
Abstract. We present a uni-processor real-time scheduling algorithm called Resource-contrainted Utility Accrual algorithm (or RUA). RUA considers an application model, where activities can be subject to arbitrarily-shaped time/utility function (TUF) time constraints and resource constraints including mutual exclusion under a multi-unit resource request model. For such a model, we consider the scheduling objective of maximizing the total utility accrued by all activities. This problem was previously open. Since the problem is NP-hard, RUA heuristically computes schedules with a polynomial-time cost. We analytically establish several timeliness and non-timeliness properties of the algorithm, including upper bound on blocking time (under multi-unit request model) and deadlock-freedom. We also implement RUA on a POSIX-compliant real-time OS and conduct experimental comparisons with other TUF scheduling algorithms that address a subset of RUA’s model. Our implementation measurements show that RUA performs generally better than, or as good as, other TUF algorithms for the applicable cases.
Comments: This paper extends the authors’ previous work to include multi-unit resources.
Wu et al. 04a
Utility Accrual Scheduling Under Joint Utility and Resource Constraints
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
Proc. of the 7th IEEE International Symposium on Object-Oriented Real-Time Distributed Computing, May 2004
Abstract. We extend time/utility functions and utility accrual model with the concept of joint utility functions (or JUFs) that allow an activity’s utility to be described as a function of the completion times of other activities and their progress. We also specify the concept of progressive utility that generalizes the previously studied imprecise computational model, by describing an activity’s utility as a function of its progress. Given such an extended utility accrual model, we consider the scheduling criterion of maximizing the weighted sum of completion time, progressive, and joint utilities. We present an algorithm called the Combined Utility Accrual algorithm (or CUA) for this criterion. Experimental measurements with an implementation of CUA on a POSIX-compliant real-time OS illustrate the effectiveness of JUFs in a class of applications of interest to us.
Comments: TBD.
Wu et al. 04
On the Joint Utility Accrual Model
Haisang Wu, Binoy Ravindran, and E. Douglas Jensen
Proc. of the 12th International Workshop on Parallel and Distributed Real-Time Systems, April 2004
Abstract. We extend Jensen’s time/utility functions and utility accrual model with the concept of joint utility functions (or JUFs) that allow an activity’s utility to be described as a function of the completion times of other activities and their progress. We also specify the concept of progressive utility that generalizes the previously studied imprecise computational model, by describing an activity’s utility as a function of its progress. Given such an extended utility accrual model, we consider the scheduling criterion of maximizing the weighted sum of completion time, progressive, and joint utilities. We present an algorithm called the Combined Utility Accrual algorithm (or CUA) for this criterion. Experimental measurements with an implementation of CUA on a POSIX-compliant real-time OS illustrate the effectiveness of JUFs in a class of applications of interest to us.
Comments: This paper is part of the workshop’s special session on TUF/UA scheduling. Although its main contribution is to introduce the concept of joint utility functions, it also shows that Liu’s “imprecise computation” [ ] model is a special case of our progressive utility model.
Zhang et al. 08
RTQG: Real-Time Quorum-based Gossip Protocol for Unreliable Networks
Bo Zhang, Kai Han, Binoy Ravindran, and E. Douglas Jensen
Proc. of the 12th IEEE International Conference on Availability, Reliability and Security, March 2008
Abstract. We consider scheduling real-time tasks in the presence of message loss and Byzantine node failures in unreliable networks. We present scheduling algorithms called RTQG and RTQG-B. The algorithms use quorum-based gossip communication strategies for dynamically and dependably discovering eligible nodes. Compared with its predecessors, our protocol exhibits better performance. RTQG utilizes quorum systems to limit the range of each gossip round. Using the intersection property of quorum systems, RTQG has advantages in message propagation and robustness to Byzantine node failures. Our simulation studies verify our analytical results.
Comments: TBD.
Next: About Me