Department of Computer Science
University of Georgia
Athens, Georgia 30602-7404
 My research has principally involved the specialties of database systems, simulation and workflow. These are often studied in the context of parallel and distributed systems (including the Web in recent years).
A connecting theme of this work is modeling. In simulation, models of the dynamic behavior of systems are developed to study or improve the performance, reliability or functionality of actual or proposed systems. In databases, vital enterprise information is modeled and stored. Workflow unites enterprise information with enterprise dynamics (business processes). From a modeling point of view, it includes elements from both databases and simulation. These three areas are highly synergistic. Some of the synergies my work has explored are the following:
In addition to exploring these synergies, my work has also contributed to each specialty individually.
My recent research is highlighted by the following five papers.
 
In its ideal form, Web-based simulation should allow simulation
models as well as simulation results to be as readily
distributable and composable as today's Web documents.
The rapid advances in Web technology, most notably Java,
are helping to make this a possibility.
Support for executable Web content, universal portability,
component technology, and standard high-level packages
for accessing databases and producing graphical user
interfaces are important enablers of Web-based simulation.
Component-based software can be used to develop highly
modular simulation environments supporting high reusability
of software components.
Because of the potentially large scope of Web-based simulation,
greater demands are placed on simulation environments.
They should support rapid visual model development,
access to local and remote databases,
techniques for executing models or federations
of models in a variety of ways, and
embedding of simulation within larger systems.
The use of component technology in the JSIM Web-based
simulation environment allows simulation models to be treated
as components that can be dynamically assembled to build
model federations.
Through the use of Java Beans, JSIM's environment is built up
from reusable software components that can be dynamically
assembled using visual development tools.
Model beans can be linked to form model federations and may
be linked to environment beans to control their execution and
save their results in databases.
JSIM also allows simulation inputs and outputs
to be dynamically linked to database systems,
which makes storage of simulation results easy and flexible.
The use of JDBC makes it easy to switch between several
database management systems.
Finally, JSIM provides simple and uniform access to simulation
results based on the notion of query driven simulation.
The users simply ask for information and it is up to the system
to figure out how to get it,
e.g., by accessing databases and/or running simulation models.
Web-Based Simulation (with JSIM)
This work involves the application of advanced database
technologies, such as Object-Relational (OR) and
Object-Oriented (OO) Database Management Systems (DBMSs),
to the problem not only of managing simulation data,
but managing simulation models as well.
A novel feature of the QDS approach is that it
gives the system the appearance of a database.
Consequently, end users find QDS systems easy to understand
and easy to use.
Two loosely coupled prototype QDS systems have been developed:
The first prototype coupled the SIMODULA simulation system with
a toy object-relational database system called OBJECTR.
The second prototype coupled JSIM with any object-relational
database system (e.g., Oracle 8) using JDBC.
One tightly coupled prototype was built on top of an
object-oriented database system called Active KDL.
Traditional databases can retrieve data,
while advanced databases, such as Active KDL,
can also infer or derive information beyond that
which is stored.
By utilizing a Database Programming Language (DBPL),
simulation models become a component of a database
to be queried and/or executed.
If the database system has a sufficiently powerful
meta-data facility which governs query processing,
then it can be used to decide, based on the user's preferences,
what strategy should be applied to provide an adequate
and timely answer.
For example, should data simply be retrieved,
should rules be applied, and/or should simulation
models be instantiated (parameterized and/or composed)
and executed to generate additional information
from the simulation runs.
This will all occur transparently to the user,
except for the effects on response time.
The user simply formulates queries without
worrying whether or not the simulation data exists.
It is the job of the system to provide an appropriate
answer, hence the name Query Driven Simulation.
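The retrieve-or-simulate decision described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `QueryDrivenStore`, its in-memory map (standing in for the results database), and the single-parameter model function are all hypothetical and are not taken from JSIM, SIMODULA or Active KDL.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch of query driven simulation: answer a query from stored
// results when possible, otherwise run a model and store what it produces.
class QueryDrivenStore {
    // Stands in for the results database; a real system would use a DBMS.
    private final Map<Double, Double> storedResults = new HashMap<>();
    // Stands in for a simulation model: maps one input parameter to one output.
    private final DoubleUnaryOperator model;
    // Counts how often the model actually had to be executed.
    private int simulationRuns = 0;

    QueryDrivenStore(DoubleUnaryOperator model) {
        this.model = model;
    }

    // Returns the result for the given parameter, simulating only on a miss.
    double query(double parameter) {
        Double stored = storedResults.get(parameter);
        if (stored != null) {
            return stored; // plain retrieval, no simulation needed
        }
        simulationRuns++;
        double result = model.applyAsDouble(parameter); // generate the missing data
        storedResults.put(parameter, result);           // keep it for later queries
        return result;
    }

    int simulationRuns() {
        return simulationRuns;
    }
}
```

Asking the same question twice runs the model once; the caller cannot tell the two cases apart except by response time, which is the transparency property described above.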
Simulation Environments: Query Driven Simulation (QDS)
This work involves the development of new protocols
for database transaction management.
The basis for these new protocols is the Time Warp
protocol developed by David Jefferson for parallel and
distributed simulation.
In 1986, he presented a paper at the Data Engineering
conference which adapted Time Warp to database
transaction management.
This work has extended Jefferson's work on Time Warp
by developing several variants, proving their
correctness, and studying their performance.
In addition, a Hybrid protocol that combines the best features of
Time Warp and Multiversion Timestamp Ordering
has been developed.
This work has also addressed performance issues,
such as the elimination of cascaded rollbacks from the Hybrid
protocol, reducing message traffic by using lazy cancellation
and reducing state saving overhead.
Extensive simulation and analysis of these new protocols
as well as traditional concurrency control protocols
(e.g., Two-Phase Locking, Timestamp Ordering,
Kung & Robinson Optimism, Multiversion Timestamp Ordering,
and Multiversion Two-Phase Locking) have been carried out.
Current results show that in systems with sufficient parallelism,
the Time Warp and Hybrid protocols can exhibit excellent performance.
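For readers unfamiliar with the baseline protocols, the read and write rules of textbook Multiversion Timestamp Ordering can be sketched as follows for a single data item. This is a minimal sketch, not the Hybrid or Time Warp protocol developed in this work: the class `MvtoItem` is hypothetical, transaction timestamps are assumed to be positive and unique, and commit handling (and therefore recoverability) is left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch of Multiversion Timestamp Ordering for one data item.
// Names are illustrative; timestamps are assumed to be positive and unique.
class MvtoItem {
    // One version: its value and the largest timestamp that has read it.
    private static final class Version {
        int value;
        long maxReadTs;

        Version(int value, long maxReadTs) {
            this.value = value;
            this.maxReadTs = maxReadTs;
        }
    }

    // Versions keyed by the timestamp of the transaction that wrote them.
    private final TreeMap<Long, Version> versions = new TreeMap<>();

    MvtoItem(int initialValue) {
        versions.put(0L, new Version(initialValue, 0L)); // initial version at timestamp 0
    }

    // A read never aborts: it sees the latest version written at or before ts.
    int read(long ts) {
        Version v = versions.floorEntry(ts).getValue();
        v.maxReadTs = Math.max(v.maxReadTs, ts);
        return v.value;
    }

    // A write is rejected if a younger transaction has already read the
    // version this write would supersede; the writer must then restart.
    boolean write(long ts, int value) {
        Map.Entry<Long, Version> entry = versions.floorEntry(ts);
        if (entry.getValue().maxReadTs > ts) {
            return false;
        }
        if (entry.getKey() == ts) {
            entry.getValue().value = value; // transaction overwrites its own version
        } else {
            versions.put(ts, new Version(value, ts));
        }
        return true;
    }
}
```

Because old versions are retained, a late reader is served an older version instead of being aborted; only writes that arrive after a conflicting younger read are rejected.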
Recovery protocols (e.g., Pessimistic, Paranoid,
Realistic, and Optimistic) have also been
studied both independently (enforcing just recoverability)
and in combination with concurrency control protocols
(enforcing both serializability and recoverability).
Modeling techniques used include Simulation Models,
Markov Models, and Queueing Network Models.
High Performance Protocols for Database Transaction Management
The state of the art for genome databases has been
relational database technology.
Because of the complex nature of what is to be stored
and how it is to be manipulated,
Object-Relational (OR) or Object-Oriented (OO) databases
suit geneticists' needs much better.
We have developed OR and OO databases to support genetic
and physical mapping as well as DNA sequencing.
Recently, the applications that update and retrieve
from genome databases have been organized into
workflows.
These workflows were demonstrated at an NSF site
visit (Feb 1999) as part of a large proposal for
an NSF Science and Technology Center (STC).
The demo utilized the Oracle 8 DBMS and
the METEOR:WebWork WfMS.
Genomic Information Systems
This work involves research and development on
advanced Workflow Management Systems (WfMSs)
(distributed workflow, Web-Based workflow,
transactional workflow, adaptive workflow
and collaborative workflow as well as advanced
security, monitoring and repository services).
The LSDIS Lab (Sheth, Miller and Kochut) at the
University of Georgia has developed
multiple implementations of the METEOR workflow model:
ORBWork, NEOWork and WebWork.
(The total amount of code developed exceeds 100,000 lines.)
They have been installed and tested/used at several locations:
Microelectronics and Computer Technology Corporation (MCC),
Connecticut Healthcare Research and Education Foundation (CHREF),
Advanced Technology Institute (ATI),
Naval Research Lab (NRL),
Medical College of Georgia (MCG),
Bellcore, and
Boeing.
All implementations will interoperate and work off of
a common graphical designer.
Each has its own particular strong points.
Both ORBWork and NEOWork are CORBA based.
NEOWork emphasizes high reliability,
while ORBWork emphasizes fully distributed and
adaptive workflows.
WebWork emphasizes ease of workflow application development,
installation and use.
Since it uses Web technology, free compilers,
and interfaces with common databases,
most organizations need not purchase any base technology
in order to run WebWork.
Furthermore, it has the following desirable characteristics:
WebWork supports the creation of flexible workflows that
include both manual and automated tasks, transactional and
non-transactional tasks as well as sophisticated, yet
straightforward, ways of coordinating the overall execution
of the tasks.
WebWork is designed for heterogeneous distributed environments
spanning multiple organizations.
Client access is universal since all that is required is a
Web browser.
WebWork can communicate with a variety of DBMSs.
In addition, it has facilities for wrapping legacy applications.
WebWork provides low-overhead yet, we believe, effective
recovery mechanisms.
We have classified ten different types of errors or failures
that WebWork deals with.
WebWork supports rapid workflow application development in
which all workflow oriented code is automatically generated.
Default screens are also generated.
In addition, stubs for application tasks are generated
so that a workflow application can be prototyped and
tested with no coding required.
WebWork also interoperates with InfoSleuth, which is an agent-based
information access facility developed by MCC,
one of our partners in our NIST-funded HIIT/HITECC project.
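The idea of prototyping a workflow from generated task stubs and a coordination specification can be illustrated with a small Java sketch. It is an assumption-laden toy, not the METEOR model or WebWork's generated code: the class `WorkflowSketch`, its `addTask` and `run` methods, and the simple "run after predecessors finish" rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a workflow is a set of named tasks plus, for each
// task, the tasks that must complete before it may start.
class WorkflowSketch {
    private final Map<String, Runnable> tasks = new LinkedHashMap<>();
    private final Map<String, List<String>> predecessors = new LinkedHashMap<>();

    // Registers a task; the body may be an empty stub while prototyping.
    void addTask(String name, Runnable body, String... after) {
        tasks.put(name, body);
        predecessors.put(name, Arrays.asList(after));
    }

    // Runs every task once all of its predecessors have completed and
    // returns the order in which the tasks were executed.
    List<String> run() {
        List<String> done = new ArrayList<>();
        while (done.size() < tasks.size()) {
            boolean progressed = false;
            for (String name : tasks.keySet()) {
                if (!done.contains(name) && done.containsAll(predecessors.get(name))) {
                    tasks.get(name).run();
                    done.add(name);
                    progressed = true;
                }
            }
            if (!progressed) {
                throw new IllegalStateException("cyclic or missing dependency");
            }
        }
        return done;
    }
}
```

With empty stubs such as `() -> {}` as task bodies, the coordination logic can be exercised before any application code is written, which is the spirit of the stub generation described above.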
Web-Based Workflow Management Systems
The eXtensible Markup Language (XML) is a new standard that supports data
exchange on the World-Wide Web.
It is sophisticated enough that complex real-world structures and
relationships may be captured.
Thus, it can be used as the universal format for data interchange.
For Web-based systems or applications (e.g., JSIM, WebWork),
preliminary work is under way to store/exchange data objects
formatted as XML documents.
In order to handle the potentially large collections of XML documents
associated with these applications, modern Database Management Systems (DBMSs)
may be used for efficient storage and retrieval.
The current state-of-the-art in database management is represented by
the Object-Oriented (OO) and Object-Relational (OR) DBMSs each with
their own query languages, Object Query Language (OQL, v.2) and
Structured Query Language (SQL3), respectively.
These database systems are capable of storing Web documents.
However, even though Web documents (in this case XML documents) can be
stored in these databases, they do not conform to either the relational or
the object-oriented model; rather, they are better represented with a
semistructured data model.
In addition, it is anticipated that many future computer users will be
familiar with XML syntax.
Consequently, query languages and tools based on XML are being developed.
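To give a flavor of querying XML directly, the sketch below evaluates a path expression over a small XML document using only the standard Java library. Note the substitutions: XPath is used here as a stand-in for the XML query languages under development, and the class `XmlQuerySketch`, the sample document, and its element names (`results`, `run`, `meanWait`) are invented for illustration and do not reflect any JSIM or WebWork format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical sketch: select parts of an XML document with a path expression.
class XmlQuerySketch {
    // Invented sample data: two simulation runs with one statistic each.
    static final String SAMPLE =
        "<results>"
        + "<run model=\"bank\"><meanWait>4.2</meanWait></run>"
        + "<run model=\"shop\"><meanWait>1.5</meanWait></run>"
        + "</results>";

    // Returns the text content of every node matched by the XPath expression.
    static List<String> select(String xml, String xpathExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(xpathExpr, doc, XPathConstants.NODESET);
            List<String> matches = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                matches.add(nodes.item(i).getTextContent());
            }
            return matches;
        } catch (Exception e) {
            // Parsing and XPath errors are checked exceptions; wrap them for brevity.
            throw new RuntimeException(e);
        }
    }
}
```

A call such as `XmlQuerySketch.select(XmlQuerySketch.SAMPLE, "/results/run[@model='bank']/meanWait")` selects by structure and attribute value without first mapping the document onto relational tables or object classes.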
Query Languages and Tools for XML Documents
One may view a repository as a value-added database for storing
enterprise meta-data (including database schema and application
specifications as well as database and program models).
Beyond this aspect, some repositories may have an active role
in controlling enterprise operations.
This work involved the design and prototype implementation of a repository
management system tailored to the needs of the Westinghouse Savannah River
Company (WSRC).
This effort utilized both the Oracle RDBMS and the Oracle CASE Tools.
The advantages of using Object-Oriented or Object-Relational DBMSs were
also addressed.
In addition, a major report was developed to assist WSRC in procuring
repository technology and integrating it into their existing facilities.
Issues addressed involved defining repository technology, determining its
role within WSRC, developing a checklist of required features/services,
and approaches to design, populate and administer their repository.
Repository Management