Outline of Research Activities

John A. Miller

May 3, 1999

Department of Computer Science
University of Georgia
Athens, Georgia 30602-7404

Web-Based Simulation (with JSIM)

In its ideal form, Web-based simulation should allow simulation models as well as simulation results to be as readily distributable and composable as today's Web documents. The rapid advances in Web technology, most notably Java, are helping to make this a possibility. Support for executable Web content, universal portability, component technology, and standard high-level packages for accessing databases and producing graphical user interfaces are important enablers of Web-based simulation. Component-based software can be used to develop highly modular simulation environments supporting high reusability of software components.

Because of the potentially large scope of Web-based simulation, greater demands are placed on simulation environments. They should support rapid visual model development, access to local and remote databases, techniques for executing models or federations of models in a variety of ways, and embedding of simulation within larger systems.

The use of component technology in the JSIM Web-based simulation environment allows simulation models to be treated as components that can be dynamically assembled to build model federations. Through the use of Java Beans, JSIM's environment is built up from reusable software components that can be dynamically assembled using visual development tools. Model beans can be linked to form model federations and may be linked to environment beans to control their execution and save their results in databases. JSIM also allows simulation inputs and outputs to be dynamically linked to database systems, which makes storage of simulation results easy and flexible. The use of JDBC makes it easy to switch between several database management systems. Finally, JSIM provides simple and uniform access to simulation results based on the notion of query driven simulation: users simply ask for information, and it is up to the system to figure out how to get it, e.g., by accessing databases and/or running simulation models.
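
The linking of model beans into a federation can be illustrated with the standard Java Beans bound-property mechanism. The following is only a minimal sketch under stated assumptions: the names FederationSketch, ModelBean, linkTo and runFederation are hypothetical and are not part of JSIM, each "model" merely scales its input, and a list stands in for the results database.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these class or method names come from JSIM.
class FederationSketch {

    /** A toy "model bean" that publishes each output as a bound-property event. */
    static class ModelBean {
        private final PropertyChangeSupport support = new PropertyChangeSupport(this);
        private final double scale;

        ModelBean(double scale) {
            this.scale = scale;
        }

        void addPropertyChangeListener(PropertyChangeListener listener) {
            support.addPropertyChangeListener(listener);
        }

        /** Stand-in for running a model: the "simulation" just scales its input. */
        void accept(double input) {
            double output = input * scale;
            // Passing null as the old value makes the event fire even for repeated outputs.
            support.firePropertyChange("output", null, output);
        }

        /** Links this bean's output to another bean's input. */
        void linkTo(ModelBean downstream) {
            addPropertyChangeListener(evt -> downstream.accept((Double) evt.getNewValue()));
        }
    }

    /** Assembles a two-model federation at run time and returns the collected results. */
    static List<Double> runFederation(double input) {
        ModelBean first = new ModelBean(2.0);
        ModelBean second = new ModelBean(0.5);
        List<Double> results = new ArrayList<>(); // stand-in for a results database

        first.linkTo(second);
        second.addPropertyChangeListener(evt -> results.add((Double) evt.getNewValue()));

        first.accept(input);
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runFederation(10.0));
    }
}
```

Because the beans know each other only through listener registration, the same models could be rewired into a different federation without changing their code, which is the property the component approach relies on.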

Find out more.

Simulation Environments: Query Driven Simulation (QDS)

This work involves the application of advanced database technologies, such as Object-Relational (OR) and Object-Oriented (OO) Database Management Systems (DBMSs), to the problem of managing not only simulation data but simulation models as well. A novel feature of the QDS approach is that it gives the system the appearance of a database. Consequently, end users find QDS systems easy to understand and easy to use. Two loosely coupled prototype QDS systems have been developed. The first prototype coupled the SIMODULA simulation system with a toy object-relational database system called OBJECTR. The second prototype coupled JSIM with any object-relational database system (e.g., Oracle 8) using JDBC. One tightly coupled prototype was built on top of an object-oriented database system called Active KDL.

Traditional databases can retrieve data, while advanced databases, such as Active KDL, can also infer or derive information beyond that which is stored. By utilizing a Database Programming Language (DBPL), simulation models become a component of a database to be queried and/or executed. If the database system has a sufficiently powerful meta-data facility which governs query processing, then it can be used to decide, based on the user's preferences, what strategy should be applied to provide an adequate and timely answer. For example, should data simply be retrieved, should rules be applied, and/or should simulation models be instantiated (parameterized and/or composed) and executed to generate additional information from the simulation runs? This will all occur transparently to the user, except for the effects on response time. The user simply formulates queries without worrying whether or not the simulation data exists. It is the job of the system to provide an appropriate answer, hence the name Query Driven Simulation.
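
The retrieve-or-simulate decision described above can be sketched as follows. This is an illustrative sketch only, not the design of any of the prototypes: the class QueryDrivenSketch and its methods are hypothetical, an in-memory map stands in for the database, and the "model" passed in by the caller stands in for a real simulation run (the usage in main assumes the analytic M/M/1 mean time in system, 1 / (mu - lambda) with mu = 1).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch of query driven simulation; not the API of JSIM or Active KDL.
class QueryDrivenSketch {
    private final Map<Double, Double> store = new HashMap<>(); // stand-in for a results table
    private final DoubleUnaryOperator model;                   // stand-in for a simulation model
    private int modelRuns = 0;

    QueryDrivenSketch(DoubleUnaryOperator model) {
        this.model = model;
    }

    /** Answers a query: retrieve the stored result if it exists, otherwise run the model and store it. */
    double query(double parameter) {
        Double stored = store.get(parameter);
        if (stored != null) {
            return stored; // the data already exists, so simple retrieval suffices
        }
        modelRuns++;
        double result = model.applyAsDouble(parameter);
        store.put(parameter, result); // later queries become plain retrievals
        return result;
    }

    int getModelRuns() {
        return modelRuns;
    }

    public static void main(String[] args) {
        // Assumed stand-in model: M/M/1 mean time in system with service rate 1 (valid for lambda < 1).
        QueryDrivenSketch qds = new QueryDrivenSketch(lambda -> 1.0 / (1.0 - lambda));
        System.out.println(qds.query(0.5)); // runs the model
        System.out.println(qds.query(0.5)); // answered from the store
        System.out.println("model runs: " + qds.getModelRuns());
    }
}
```

From the caller's side both queries look identical; only the response time would reveal whether a model had to be executed.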

Find out more.

High Performance Protocols for Database Transaction Management

This work involves the development of new protocols for database transaction management. The basis for these new protocols is the Time Warp protocol developed by David Jefferson for parallel and distributed simulation. In 1986, he presented a paper at the Data Engineering conference which adapted Time Warp to database transaction management. This work has extended Jefferson's work on Time Warp by developing several variants, proving their correctness, and studying their performance. In addition, a Hybrid protocol that combines the best features of Time Warp and Multiversion Timestamp Ordering has been developed. This work has also addressed performance issues, such as the elimination of cascaded rollbacks from the Hybrid protocol, reducing message traffic by using lazy cancellation and reducing state saving overhead. Extensive simulation and analysis of these new protocols as well as traditional concurrency control protocols (e.g., Two-Phase Locking, Timestamp Ordering, Kung & Robinson Optimism, Multiversion Timestamp Ordering, and Multiversion Two-Phase Locking) have been carried out. Current results show that in systems with sufficient parallelism, the Time Warp and Hybrid protocols can exhibit excellent performance. Recovery protocols (e.g., Pessimistic, Paranoid, Realistic, and Optimistic) have also been studied both independently (enforcing just recoverability) and in combination with concurrency control protocols (enforcing both serializability and recoverability). Modeling techniques used include Simulation Models, Markov Models, and Queueing Network Models.
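
The optimistic idea behind Time Warp can be illustrated for a single data object: operations are applied as they arrive, the prior state is saved before each one, and an operation that arrives with an earlier timestamp than work already done (a straggler) forces a rollback and re-execution. The sketch below is a deliberately simplified illustration and not one of the protocols developed in this work; the names TimeWarpSketch, Op and receive are hypothetical, and anti-messages, lazy cancellation, and commit or fossil collection are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical, simplified sketch of optimistic execution with rollback on one object.
class TimeWarpSketch {

    /** A timestamped update to the object's integer state. */
    static final class Op {
        final int timestamp;
        final IntUnaryOperator update;

        Op(int timestamp, IntUnaryOperator update) {
            this.timestamp = timestamp;
            this.update = update;
        }
    }

    private final List<Op> processed = new ArrayList<>();        // kept in timestamp order
    private final List<Integer> savedStates = new ArrayList<>(); // state before processed.get(i)
    private int state;
    private int rollbacks = 0;

    TimeWarpSketch(int initialState) {
        this.state = initialState;
    }

    /** Applies an operation optimistically; a straggler triggers rollback and re-execution. */
    void receive(Op op) {
        int pos = processed.size();
        while (pos > 0 && processed.get(pos - 1).timestamp > op.timestamp) {
            pos--;
        }
        if (pos == processed.size()) {
            apply(op); // arrived in timestamp order, no rollback needed
            return;
        }
        rollbacks++;
        state = savedStates.get(pos); // restore the state saved before the first later operation
        List<Op> undone = new ArrayList<>(processed.subList(pos, processed.size()));
        processed.subList(pos, processed.size()).clear();
        savedStates.subList(pos, savedStates.size()).clear();
        apply(op);
        for (Op later : undone) {
            apply(later); // re-execute the rolled-back operations in timestamp order
        }
    }

    private void apply(Op op) {
        savedStates.add(state); // state saving: the overhead the text mentions reducing
        processed.add(op);
        state = op.update.applyAsInt(state);
    }

    int getState() {
        return state;
    }

    int getRollbacks() {
        return rollbacks;
    }

    public static void main(String[] args) {
        TimeWarpSketch object = new TimeWarpSketch(0);
        object.receive(new Op(1, v -> v + 10));
        object.receive(new Op(3, v -> v * 2));
        object.receive(new Op(2, v -> v + 5)); // straggler
        System.out.println(object.getState() + " after " + object.getRollbacks() + " rollback(s)");
    }
}
```

The final state equals what a serial execution in timestamp order would produce, which is the correctness condition this kind of protocol has to preserve.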

Find out more.

Web-Based Workflow Management Systems

This work involves research and development on advanced Workflow Management Systems (WfMSs) (distributed workflow, Web-based workflow, transactional workflow, adaptive workflow and collaborative workflow as well as advanced security, monitoring and repository services). The LSDIS Lab (Sheth, Miller and Kochut) at the University of Georgia has developed multiple implementations of the METEOR workflow model: ORBWork, NEOWork and WebWork. (The total amount of code developed exceeds 100,000 lines.) They have been installed and tested/used at several locations: Microelectronics and Computer Technology Consortium (MCC), Connecticut Healthcare Research and Education Foundation (CHREF), Advanced Technology Institute (ATI), Naval Research Lab (NRL), Medical College of Georgia (MCG), Bellcore, and Boeing. All implementations will interoperate and work from a common graphical designer. Each has its own particular strong points. Both ORBWork and NEOWork are CORBA based. NEOWork emphasizes high reliability, while ORBWork emphasizes fully distributed and adaptive workflows. WebWork emphasizes ease of development of workflow applications, installation and use.

Since WebWork uses Web technology, free compilers, and interfaces with common databases, most organizations need not purchase any base technology in order to run it. Furthermore, it has the following desirable characteristics. WebWork supports the creation of flexible workflows that include both manual and automated tasks, transactional and non-transactional tasks, as well as sophisticated, yet straightforward, ways of coordinating the overall execution of the tasks. WebWork is designed for heterogeneous distributed environments spanning multiple organizations. Client access is universal since all that is required is a Web browser. WebWork can communicate with a variety of DBMSs. In addition, it has facilities for wrapping legacy applications. WebWork provides recovery mechanisms that have low overhead, yet that we believe to be effective. We have classified ten different types of errors or failures that WebWork deals with. WebWork supports rapid workflow application development in which all workflow-oriented code is automatically generated. Default screens are also generated. In addition, stubs for application tasks are generated so that a workflow application can be prototyped and tested with no coding required. WebWork also interoperates with InfoSleuth, an agent-based information access facility developed by MCC, one of our partners in our NIST-funded HIIT/HITECC project.

Find out more.

Query Languages and Tools for XML Documents

The eXtensible Markup Language (XML) is a new standard that supports data exchange on the World-Wide Web. It is sophisticated enough that complex real-world structures and relationships may be captured. Thus, it can be used as a universal format for data interchange. For Web-based systems or applications (e.g., JSIM, WebWork), preliminary work is under way to store/exchange data objects formatted as XML documents. In order to handle the potentially large collections of XML documents associated with these applications, modern Database Management Systems (DBMSs) may be used for efficient storage and retrieval. The current state of the art in database management is represented by the Object-Oriented (OO) and Object-Relational (OR) DBMSs, each with its own query language: Object Query Language (OQL, v.2) and Structured Query Language (SQL3), respectively. These database systems are capable of storing Web documents. However, even though Web documents (in this case XML documents) can be stored in these databases, they do not conform to either the relational or the object-oriented model; rather, they are better represented with a semistructured data model. In addition, it is anticipated that many future computer users will be familiar with XML syntax. Consequently, query languages and tools based on XML are being developed.
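
To give a feel for querying data held as an XML document, the sketch below uses XPath through the standard javax.xml.xpath package. XPath is used here purely as a stand-in; it is not the query language being developed in this work. The class name XmlQuerySketch, the element and attribute names (results, run, model, tellers, meanWait), and the sample values are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: XPath stands in for an XML query language; the data is made up.
class XmlQuerySketch {

    /** Made-up simulation results formatted as an XML document. */
    static final String SAMPLE =
            "<results>"
          + "<run model='bank' tellers='2'><meanWait>4.5</meanWait></run>"
          + "<run model='bank' tellers='3'><meanWait>1.2</meanWait></run>"
          + "<run model='shop' tellers='2'><meanWait>7.0</meanWait></run>"
          + "</results>";

    /** Evaluates an XPath expression against the document and returns the text of each matching node. */
    static List<String> query(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid query or document: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Mean waits of all runs of the (hypothetical) bank model.
        System.out.println(query(SAMPLE, "/results/run[@model='bank']/meanWait"));
    }
}
```

The query navigates the document's nested structure directly instead of relying on a fixed relational or object schema, which is what makes a semistructured model a natural fit.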

Find out more.

Repository Management

One may view a repository as a value-added database for storing enterprise meta-data (including database schema and application specifications as well as database and program models). Beyond this aspect, some repositories may have an active role in controlling enterprise operations. This work involved the design and prototype implementation of a repository management system tailored to the needs of the Westinghouse Savannah River Company (WSRC). This effort utilized both the Oracle RDBMS and the Oracle CASE Tools. The advantages of using Object-Oriented or Object-Relational DBMSs were also addressed. In addition, a major report was developed to assist WSRC in procuring repository technology and integrating it into their existing facilities. Issues addressed involved defining repository technology, determining its role within WSRC, developing a checklist of required features/services, and approaches to designing, populating and administering their repository.

Find out more.

Functional Object-Oriented Database Systems

This work was the result of a close collaboration between Drs. Potter, Kochut and myself. We began with KDL, a Schema Definition Language (SDL) developed by Potter and Kerschberg in 1986. KDL added knowledge representation capabilities to a core based on the Functional Data Model of Shipman and Kerschberg. Our motivation was to add a Query Language (QL). In addition, to allow applications to be coded in a compatible language (avoiding an impedance mismatch), we added a Database Programming Language (DBPL). As part of the process of defining the semantics and implementing the system, we chose to modify the original KDL schema specification language. Since we wanted the new system to support a variety of applications, and in particular support query driven simulation, the database programming language needed to have most of the constructs available in general-purpose programming languages. However, we wished to retain the non-procedural character of the query language within the database programming language. Therefore, in designing the new language, Active KDL, we chose to follow both the functional and object-oriented paradigms.

In this work we also analyzed several important design issues: (1) providing consistency between the three sublanguages, the SDL, QL and DBPL; (2) properties of inheritance lattices; (3) advantages of declarative languages; (4) clashes between functional and object-oriented paradigms; (5) defining class operators or language constructs that allow ad-hoc queries to be formulated (much as they are in relational systems); (6) support for active objects as well as passive (ordinary) objects; (7) making the query language closed so that the result of one query can be input to another query or indeed stored as a class; (8) formally defining modeling primitives (or abstraction mechanisms) such as generalization, specialization, aggregation and association, and in particular making the definitions conceptually compatible with class operators; (9) how to express (to some extent) expert-system-like if-then rules in a functional programming language; (10) the use of meta-data to tailor the system to the needs of applications and the preferences of users, and to serve as a repository to keep track of data, knowledge and applications (or models); and (11) adding historical extensions.

Find out more.

Performance Evaluation of Genetic Algorithms

This work largely grew out of a successful summer (1989) of research at Oak Ridge National Laboratory by Dr. Potter. He had obtained some promising preliminary results when applying Genetic Algorithms to the NP-Hard problem of Multiple Fault Diagnosis (MFD). Genetic Algorithms are a stochastic search technique modeled on the theory of evolution. Because of their robust nature, they have been found to be very useful for problems with irregular state spaces (they are less likely to get stuck at local optima than traditional search or optimization techniques). For the last few years, we have empirically analyzed the reliability (percentage of cases in which an optimal solution (diagnosis) is found) of these Genetic Algorithms. We have compared their speed and reliability with several other heuristics and an exact algorithm. Unfortunately, on the larger cases the exact algorithm, exhaustive search, becomes infeasible (one case ran for months on a Sun workstation). The most significant result that we have obtained is that the Genetic Algorithms by themselves exhibit consistently good reliability and run in the order of seconds or minutes, and when they are combined with local improvement operators the reliability approaches 100% without a dramatic increase in runtime.

Later, genetic algorithms were applied to a much harder problem, that of finding the length of the longest snake in a hypercube of dimension d. At the time the answer was only known for d less than or equal to 6. After running an efficient exact algorithm on several workstations for several months, we found the solution for d = 7 to be 50. We know of no answer for d = 8 or more, but have used genetic algorithms to find long snakes for these higher dimensions.
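
For readers unfamiliar with the snake problem: a snake is a path in the hypercube in which any two nodes that are not consecutive on the path are also not adjacent in the cube, and its length is the number of edges. The sketch below is a plain exhaustive depth-first search, feasible only for very small d; it is not the efficient exact algorithm or the genetic algorithm used in this work, and the class and method names (SnakeSearch, longestSnake) are hypothetical.

```java
// Hypothetical sketch: brute-force search for the longest snake in a d-dimensional hypercube.
class SnakeSearch {

    /** Returns the length (number of edges) of the longest snake; practical only for small d. */
    static int longestSnake(int d) {
        int nodes = 1 << d;
        boolean[] inPath = new boolean[nodes];
        int[] pathNeighbors = new int[nodes]; // how many path nodes each node is adjacent to
        mark(0, d, inPath, pathNeighbors, +1); // by symmetry every snake can start at node 0
        return extend(0, 0, d, inPath, pathNeighbors);
    }

    private static int extend(int head, int length, int d, boolean[] inPath, int[] pathNeighbors) {
        int best = length;
        for (int bit = 0; bit < d; bit++) {
            int next = head ^ (1 << bit); // neighbors differ from the head in exactly one bit
            // A legal extension is off the path and adjacent to no path node except the head.
            if (!inPath[next] && pathNeighbors[next] == 1) {
                mark(next, d, inPath, pathNeighbors, +1);
                best = Math.max(best, extend(next, length + 1, d, inPath, pathNeighbors));
                mark(next, d, inPath, pathNeighbors, -1); // backtrack
            }
        }
        return best;
    }

    /** Adds (delta = +1) or removes (delta = -1) a node from the path and updates adjacency counts. */
    private static void mark(int node, int d, boolean[] inPath, int[] pathNeighbors, int delta) {
        inPath[node] = delta > 0;
        for (int bit = 0; bit < d; bit++) {
            pathNeighbors[node ^ (1 << bit)] += delta;
        }
    }

    public static void main(String[] args) {
        for (int d = 1; d <= 4; d++) {
            System.out.println("d = " + d + ": longest snake has length " + longestSnake(d));
        }
    }
}
```

The search space grows so quickly with d that this brute-force approach stops being practical after a few dimensions, which is why heuristic methods such as genetic algorithms become attractive for larger d.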

Find out more.