Research Day 2009 - Accepted Posters

1. Web Based Interface for Numerical Simulations of Nonlinear Evolution Equations

Ankit Jain

In Computational Science and Parallel Computing research, model equations have been developed to assist in solving problems in science and engineering. Such equations have aided researchers in developing methods used in the study of weather prediction, optical fiber communication systems, water waves, etc. Researchers often wish to further develop numerical methods and to make these model equations, their numerical simulations, and the resulting plots accessible to users through the Internet. In this poster, we present a web based graphical user interface for numerical simulation of nonlinear evolution equations such as the nonlinear Schrodinger (NLS), NLS with periodic dispersion, and modified coupled NLS (CNLS) equations. Sequential and parallel algorithms for each equation were implemented on sequential and multiprocessor machines.

2. Evaluating Notations for the Software Engineering of Concurrency: Verifying Properties of Student Code Using FSP and LTSA

Zhe Zhao (George)

Our research focuses on the evaluation of various notations for modeling concurrent software. We seek to compare the usability of certain types of UML (Unified Modeling Language) diagrams when used by students as an aid for developing concurrent software. In order to compare the benefits of different notations, we must have a method for evaluating the correctness of the code that a student generates. Here, we apply FSP (Finite State Processes) modeling and LTSA (Labeled Transition System Analyzer) to the problem of evaluating student solutions to concurrent programming problems.

3. Aerial Viewer: A framework for air-based video surveillance

Tapan Patwardhan

Many initiatives have been undertaken to find efficient ways to detect motion in surveillance cameras. For many years, research on motion detection assumed a stationary camera, but the focus has now shifted towards detecting motion from a non-stationary camera. The goal of motion detection is to segment the image into background and a number of foreground regions. When the camera is not stationary, the motion of the camera must be compensated for in order to create a background model. Video surveillance using motion detection has diverse applications. It is used by law enforcement agencies as a security measure and by the military to track ground activity. An important application pertains to recovery in emergency situations, wherein regions struck by natural calamities such as tornadoes or floods, or the scenes of hostage situations, would be monitored from the air. An aerial camera could detect activity in disaster affected areas and assist the involved agencies in conducting directed rescue operations or investigations. We present a framework for a system containing a group of aerial cameras performing air-based surveillance of emergency affected regions.

4. RNATOPS-W: A Web Server for RNA Structure Searches of Genomes

Yingfeng Wang

RNATOPS-W is a web server to search sequences for RNA secondary structures including pseudoknots. The server accepts an annotated RNA multiple structural alignment as a structural profile and genomic or other sequences to search. It is built upon RNATOPS, a command line C++ software package for the same purpose, in which filters to speed up search are manually selected. RNATOPS-W improves upon RNATOPS by adding the function of automatic selection of a hidden Markov model (HMM) filter and also a friendly user interface for selection of a substructure filter by the user. In addition, RNATOPS-W complements existing RNA secondary structure search web servers that either use built-in structure profiles or are not able to detect pseudoknots. RNATOPS-W inherits the efficiency of RNATOPS in detecting large, complex RNA structures.

5. Fast and accurate search for ncRNA in genomes by their structures (including pseudoknots)

Zhibin Huang

Searching genomes for non-coding RNAs (ncRNAs) by their secondary structure has become an important goal for bioinformatics. For pseudoknot-free structures, ncRNA search can be effective based on the covariance model and CYK-type dynamic programming. However, the computational difficulty in aligning an RNA sequence to a pseudoknot has prohibited fast and accurate search of arbitrary RNA structures. Our work, RNATOPS, introduced a graph model for RNA pseudoknots and proposed to solve the structure sequence alignment by graph optimization. Given k candidate regions in the target sequence for each of the n stems in the structure, we could compute a best alignment in time O(k^t n) based upon a tree width t decomposition of the structure graph. However, to turn this method into programs that can routinely perform fast yet accurate RNA pseudoknot searches, we need novel heuristics to ensure that, without degrading the accuracy, only a small number of stem candidates need to be examined and a tree decomposition of a small tree width can always be found for the structure graph. Test results show that RNATOPS can do fast searches on prokaryotic and eukaryotic genomes for specific RNA structures of medium to large sizes, including pseudoknots, with high sensitivity and high specificity.

6. Modification of LLREF Scheduling Algorithm

Vijaykant R. Nadadur

LLREF (Largest Local Remaining Execution First) is an optimal real time scheduling algorithm for multiprocessors. This algorithm emphasizes the local execution rather than the deadline. To understand LLREF, it is essential to understand some of the basic terminology associated with it.

TL Plane: The TL plane is the graphical representation of the Time and Local execution time domain.

Critical moment: The critical moment is the first sub-event time at which more than M tokens simultaneously hit the NLLD (no local laxity diagonal).

Event B: Event B occurs when a selected token hits the bottom of the TL plane.

Event C: Event C occurs when a non-selected token hits the diagonal of the TL plane.

LLREF sorts the tasks by their local execution to schedule them. In our research, we modified this sort step so as to reduce the overhead associated with context switching. We also observed that, with some modification, we can handle the scheduling of sporadic tasks, for which we introduced a new event, Event A.
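
The sort step mentioned above can be illustrated with a minimal sketch. It assumes each task is a hypothetical (name, local_remaining) pair and that m processors are available. It shows only a baseline selection of the m tasks with the largest local remaining execution time, not the modified sort developed in this work.

```python
def select_tasks(tasks, m):
    """Pick up to m tasks with the largest local remaining execution time.

    tasks is a list of (name, local_remaining) pairs; this representation
    is an assumption made for illustration only.
    """
    # Largest local remaining execution first; ties are broken by name
    ranked = sorted(tasks, key=lambda t: (-t[1], t[0]))
    # Tasks with no local work left are never selected
    return [name for name, remaining in ranked[:m] if remaining > 0]
```

For example, with tasks a, b and c holding 2.0, 5.0 and 3.5 units of local remaining execution and m = 2, the sketch selects b and c.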

7. Adaptive Message Clustering in Distributed Agent Based Simulation Systems

Abhishek Gupta

A Large Scale Agent Based Simulation involves the exchange of messages, which may saturate the network and may impact the scalability and performance of the system. Clustering several messages into one to reduce this traffic is sometimes called piggybacking. This project is based on Large Scale Distributed Simulation (Agent Based System) and is developed on SASSY, a hybrid simulator that provides an agent-based API on a PDES kernel. We are currently working on a message clustering and de-clustering algorithm to improve the performance of distributed agent based simulations. Potential applications include simulation of biological systems such as ants, bees, schools of fish and multi-robot systems.

8. Individual Decision Making in Human-Machine Multiagent Settings

Xia Qu

In decision making, POMDP (Partially Observable Markov Decision Process) is a model that is used to generate optimal policies when the agent cannot directly observe the environment. However, in multi-agent settings, POMDP cannot provide a satisfying result since it just treats the other agents as noise. I-POMDP (Interactive POMDP) is a framework that extends POMDP to multi-agent settings by introducing agent models into the state space. Compared to POMDP, I-POMDP agents need to maintain beliefs over models of other agents in addition to beliefs over physical states. In I-POMDP, recursive reasoning on models of other agents' actions is used to choose the optimal actions. Previous research shows that humans do not tend to ascribe recursive thinking to others, and that they cannot always take the optimal actions. However, in a sequential fixed-sum, two-player alternating-move game with complete and perfect information, recursive reasoning (what do I think that you think that I think...) is exhibited by subjects during their strategic decision making process. Hence, we can use a psychologically plausible and empirically informed I-POMDP to model the human decision process.

9. Automatic Link Adaptation: Stopping Clickbots

Douglas Brewer

The Pay-Per-Click business model of online advertising is vulnerable to fraudulent clicks capable of exhausting an advertiser's budget as well as atrophying the publisher's client base. We provide a way to reliably detect various types of Clickbot accesses to an advertiser's website. We automatically modify the pages on a website with Decoy Link Design Adaptation and random links. Decoy Link Design Adaptation is reliably able to identify Clickbots that must choose links at random based on some criterion, and replacing all links on a page with random links allows us to stop Clickbots that follow a script of links. Even as Clickbot authors escalate their Clickbots to ever more human-like behavior, these methods still provide an effective detection mechanism.

10. Finding Relationships on the Web

Meghana Viswanath

Finding relationships between entities hidden within documents on the web is a research problem that has lacked a satisfactory solution for quite some time. One very intuitive method that has been worked on is to build and populate an ontology, which was done in the SemDis project in the LSDIS lab at UGA. The ontology can then be traversed to find relationships. Though very intuitive, populating an ontology to find relationships is an expensive task. Another approach, used by researchers at the IBM T. J. Watson Research Center, did not use any ontological knowledge. Their system found document pairs that are likely to contain relationships between two entities by assigning weights to terms and identifying the possible connecting terms. We propose to combine both of these ideas and build a system that can find relationships between entities in web documents on the fly. The system will take the two entities and form a set of query strings that will be given to the Google search engine. We propose to use an existing ontology to help form these query strings by finding aliases for the entities. The system will make use of the Unstructured Information Management Architecture (UIMA) to perform semantic analysis on the web documents and return the set of documents that contain relationships between the two entities.

11. Adapting Hierarchical Web Service Compositions Using the Value of Changed Information

John Harney

Environments in which Web service compositions (WSC) operate are often dynamic. We address the problem of which service to query for up-to-date information in order to adapt a hierarchical WSC, given that queries are not free. Previously, the value of changed information (VOC) has been proposed to select those services for querying whose revised non-functional information is expected to bring about the most change in the composition. VOC requires a distribution over the possible values of volatile parameters of the WSs. In this paper, we present an approach for utilizing VOC in the context of a WSC composed of WSs and lower level WSCs, which induces a natural hierarchy over the composition. Because parameters of composite WSs are not directly available, we aggregate these from parameters of component WSs and derive a distribution over the possible parameter values from the distributions for the WSs. We demonstrate empirically that our VOC based querying method for WSC adaptation performs better on average than other querying strategies.

12. Partitioned Scheduling with Fewer Processors

Charulakshmi Vijayagopal

We consider the partitioned Earliest Deadline First (EDF) scheduling of real time periodic tasks on identical processors in a multiprocessor. We characterize our task sets by two parameters: the maximum utilization, Umax, and the maximum ratio between consecutive task utilizations, gamma. For a given Umax and gamma we have developed a novel method for determining the maximum number of required processors, M(Umax, gamma). It is guaranteed that any task set with maximum utilization <= Umax and utilization ratio <= gamma can be partitioned onto M(Umax, gamma) processors. Compared to the current state of the art, our method requires as many as 35% fewer processors.
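
As background for the partitioning problem above, the following minimal sketch shows a generic first-fit decreasing heuristic, which is not the method of this work. It assumes implicit-deadline periodic tasks, for which EDF on a single processor meets all deadlines exactly when the total utilization on that processor is at most 1. The function name and the small floating-point tolerance are illustrative assumptions.

```python
def first_fit_partition(utilizations, processors):
    """Assign task utilizations to processors; return the bins, or None on failure.

    Generic first-fit decreasing sketch, assuming implicit-deadline periodic
    tasks scheduled by EDF on each processor.
    """
    loads = [0.0] * processors
    bins = [[] for _ in range(processors)]
    # Placing the heaviest tasks first tends to pack more tightly
    for u in sorted(utilizations, reverse=True):
        for i in range(processors):
            # Uniprocessor EDF test: total utilization must not exceed 1
            if loads[i] + u <= 1.0 + 1e-9:
                loads[i] += u
                bins[i].append(u)
                break
        else:
            return None  # this task fits on no processor
    return bins
```

Calling first_fit_partition([0.6, 0.5, 0.4, 0.3], 2) places the tasks on two processors, while three tasks of utilization 0.6 cannot be partitioned onto two processors and yield None.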

13. Prediction of non-coding RNAs using Machine Learning Techniques

Vasim Mahamuda

The term non-coding RNA (ncRNA) is commonly employed for RNA that does not encode a protein, but this does not mean that such RNAs do not contain information nor have function. Although it has been generally assumed that most genetic information is transacted by proteins, recent evidence suggests that the majority of the genomes of mammals and other complex organisms is in fact transcribed into ncRNAs, many of which are alternatively spliced and/or processed into smaller products. These ncRNAs include microRNAs and snoRNAs (many if not most of which remain to be identified), as well as likely other classes of yet-to-be-discovered small regulatory RNAs, and tens of thousands of longer transcripts, most of whose functions are unknown. These RNAs (including those derived from introns) appear to comprise a hidden layer of internal signals that control various levels of gene expression in physiology and development, including chromatin architecture/epigenetic memory, transcription, RNA splicing, editing, translation and turnover. The number of ncRNAs encoded within the human genome is unknown; however recent transcriptomic and bioinformatic studies suggest the existence of thousands of ncRNAs. Since most of the newly identified ncRNAs have not been validated for their function, it is possible that many are non-functional. RNA regulatory networks may determine most of our complex characteristics, play a significant role in disease and constitute an unexplored world of genetic variation both within and between species.

14. Optimal & Power Aware Scheduling Algorithm for DVS Multiprocessors

Mikul Bhatt

This work presents an effective scheduling algorithm for Dynamic Voltage Scaling (DVS) multiprocessors to minimize power consumption. Energy consumption and battery lifetime are nowadays major constraints in the design of mobile embedded systems. Hence, it is important to design a scheduler that minimizes power consumption. Because of the dynamic nature of DVS multiprocessors, it is hard to optimize a schedule so that it meets all the deadlines in a hard real-time system with the minimum power consumption. Implementation of this schedule will reduce power consumption and improve the battery life in mobile embedded real-time systems.

15. A Comparative Study on Modeling Concurrency

Zhen Li

Designing and maintaining multi-threaded software are complex tasks. Modeling can support users in performing these tasks. We compare the benefits of UML 2.0 state diagrams and sequence diagrams for comprehension and implementation tasks through a controlled study. The comparative study uses a between-subjects, pre-test/post-test design to compare the performance of a group that uses UML 2.0 state diagrams as an aid in software comprehension and implementation tasks with a group that uses UML 2.0 sequence diagrams as an aid in these same tasks. In the study, participants will attend a training session to prepare them with key concepts and terminology. After that, they will be divided into two equivalent groups based on their performance on the pre-test. Between the pre-test and the post-test, both groups will attend a lecture during which an instructor reviews the use of both UML state diagrams and UML sequence diagrams as aids to the comprehension of concurrent programs. In the post-test, one group will receive UML state diagrams to assist their problem solving while the other group will receive sequence diagrams. Results of this study will enable instructors to make better selections of instructional materials when teaching about concurrency.

16. Autonomous Decision Making for Unmanned Aerial Vehicles in Multi-Agent Settings

Ekhlas Sonu

Unmanned Aerial Vehicles (UAVs) have potential uses in numerous fields, such as fighting forest fires and warfare. Particularly in the field of warfare reconnaissance, UAVs have to function in stochastic, sequential, partially observable and multi-agent settings. The presence of hostile agents makes the task of a UAV even more difficult. We use Interactive POMDPs (I-POMDPs) to model the UAVs so that they can make rational decisions by modeling other agents present in the environment. The poster gives a brief idea of how POMDPs and I-POMDPs are solved to give the optimal policy for the agent.

17. A Reconfigurable Interconnection Network for Supercomputers

Arash Jalal Zadeh Fard

To address the ever increasing need for high performance computing in applications such as bioinformatics, computational biology, and drug design, parallel machine architectures composed of thousands of processors will have to be used. The main overhead associated with parallel computers is the interconnection network of the system. We believe that the Reconfigurable Multi-Ring interconnection Network (RMRN) can potentially be an ideal system for building computers having a large number of processors. Fixed topologies for interconnection networks lead to an inevitable tradeoff between the need for low network diameter and the need to limit the number of inter-processor communication links. RMRN attempts to address this tradeoff. In this novel approach, each node has a bounded degree of connectivity, but the network diameter is restricted by allowing the network to reconfigure itself into different configurations. We plan to modify the RMRN topology so that programmers can implement their applications in a transparent manner. The first step is implementing a simulator that can measure the communication performance of different topologies. The results of simulation can be used to assess the efficiency of RMRN in comparison to other commonly used approaches. It would also be a test-bed for our further investigation of interconnection networks. We also plan to design an adaptive version of RMRN in which time-slot periods among the possible configurations will be adapted to the computation pattern of applications.

18. Classification of web pages based on ontologies and machine learning techniques

Chandana Kaza

The growing use of search engines has made them a main area of research. My research helps to further improve search over various topics according to user interests, and to categorize the results using ontologies and machine learning techniques. An ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain. It is a "formal, explicit specification of a shared conceptualization". It is a model for describing the world that consists of a set of types, properties, and relationship types. There is also generally an expectation that there is a close resemblance between the real world and the features of the model in an ontology. Machine learning is the subfield of artificial intelligence that is concerned with the design and development of algorithms that allow computers to improve their performance over time based on data. I intend to use Naive Bayes classification to further classify the retrieved pages into various categories and create a taxonomy of these web pages. The classifier will be trained with various retrieved documents, and documents will be classified according to the keywords they contain. A wrapper will be created to post-process the retrieved documents using ontologies and Naive Bayes classification.
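
The keyword-based classification step described above can be sketched as a small multinomial Naive Bayes classifier with add-one smoothing. This is a minimal illustration under assumed inputs (a list of (category, text) pairs and whitespace tokenization); the function names and the example categories are hypothetical and are not taken from the project.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect the counts Naive Bayes needs from (category, text) pairs."""
    class_docs = Counter()              # number of documents per category
    word_counts = defaultdict(Counter)  # per-category keyword counts
    vocab = set()
    for category, text in docs:
        words = text.lower().split()
        class_docs[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log posterior for the text."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best, best_score = None, -math.inf
    for category in sorted(class_docs):
        total_words = sum(word_counts[category].values())
        # Log prior of the category
        score = math.log(class_docs[category] / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = category, score
    return best
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once documents contain more than a handful of keywords.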

19. Integrating Behavioral Trust in Web Service Compositions

Sharon Myrtle Paradesi

Web service compositions traditionally utilize the functional and quality-of-service parameters of candidate services to decide which services to include in the composition. However, users of a service often form an opinion of that service. User-driven composition techniques will form compositions that likely behave as stated in practice, and are better received by the users. We formalize a model for trust in a WS, which meets many of our intuitions about trustworthy Web services. Furthermore, we show how we may derive trust for compositions from trust models of individual services. We conclude by presenting and evaluating a novel framework, called Wisp, that utilizes the trust models and, in combination with any WS composition tool, chooses compositions to deploy that are deemed most trustworthy.

20. Distributed Event Notification System for Mobile Communities

Jianxia Chen

Today mobile devices are ubiquitous. This results in the natural formation of mobile communities, which are professional, interest-based, or even ad-hoc communities of mobile phone users. They are characterized by humans both publishing to and consuming from each other via a telecom infrastructure. Our project is to build distributed system support for event notification in mobile communities. Current event notification systems simply deliver relevant messages to subscribers. They do not address scenarios in which published data contains noise, messages contain causal relationships, or events contain properties that require persistence. It is necessary to significantly improve current event notification systems. As the pub-sub paradigm is incorporated in real-world mobile event notification applications with human participants, the noise in the event stream is almost a given. The noise can take many forms, including redundant, incomplete, inaccurate, and even malicious event messages. This paper explores the distributed computing issues involved in handling event streams with redundant and incomplete messages. Given a distributed broker overlay-based pub-sub system, we present our ideas for (1) aggregating event information scattered across multiple messages generated by different publishers and (2) eliminating redundant event messages. Key to our approach is the concept of an event-gatherer - a designated broker in the routing graph that acts as a proxy sink for all messages of a particular event located at the graph center of the corresponding routing tree. This paper proposes a novel decentralized algorithm to find this graph center. Early results show that the proposed scheme typically reduces the message load by over 60% with less than 25% time overhead to the subscriber.

21. Characterizing User Groups Derived through the Clustering of Web Access Logs

Naveed Ahmed

We address the problem of determining the characteristics of groups of web site users. Site administrators make use of this information in designing and refining a site to best meet user needs, and in evaluating the impact of site changes on particular user groups. While the problem of clustering user behavior is well known in Data Mining, good methods for user group profiling, capturing the essential characteristics of the users in those groups, are less well-defined in this domain. We propose to characterize user behavior using a richer set of attributes than the bag of URLs employed in other approaches. Specifically, we intend to make use of frequency and order of page visits, as well as staying time at particular pages. Further, we will investigate the use of the Longest Common Subsequence (series of pages visited) and RTOVs (Refined, Transaction-Oriented Views) for their efficacy at capturing essential user characteristics. Domain experts for production web sites, site administrators with whom we collaborate, will perform manual profiling of the user groups for our data set. In addition, we will perform statistical analyses of user attributes to determine statistically significant differences between groups. We will use the results of these analyses to evaluate our approaches to profiling.
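
To make the Longest Common Subsequence attribute concrete, the sketch below computes one LCS of two page-visit sequences with the standard dynamic programming table. The page names in the usage note are invented for illustration, and the sketch says nothing about how the LCS would be combined with the other attributes in this work.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two page-visit sequences."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j]
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover the pages themselves
    result = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]
```

For two sessions visiting home, products, cart, checkout and home, search, products, checkout, the shared ordered pages are home, products, checkout.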

22. GlycoBrowser: A Tool for Contextual Visualization of Biological Data and Pathways Using Ontologies

Matthew Eavenson

GlycoBrowser is a dynamic biological pathway exploration and visualization tool. Its capabilities include incremental navigation of pathways, as well as overlaying relevant experimental data about glycans and reactions within a pathway. The use of ontologies not only allows dynamic construction of pathways, but also facilitates more accurate validation of the information retrieved. Because of the complex nature of glycan structures and the difficulty involved in interpreting the associated data, GlycoBrowser is tailored especially to suit the needs of glycobiologists.

23. A Dynamic Deferred Preemption Algorithm For Reducing Overhead

Chiahsun Ho (Alex)

Preemption causes system overhead due to context switching and cache misses. Deferring preemption may allow us to avoid some preemptions, thereby reducing overhead. We present a dynamic algorithm that defers preemption without causing deadlines to be missed. Our goal is to reduce overhead due to context switching and cache misses as much as possible. For any given task set, this algorithm calculates the maximum length of a preemption deferral that causes no deadline misses, thereby reducing the overhead due to context switching.

24. Music Composition with Artificial Intelligence

Tomasz Oliwa

Music composition systems based on Time Delay Neural Networks (TDNN) and on Genetic Algorithms (GA) will be presented here. The TDNN approach acquires musical knowledge by inductive learning and is able to infer information about the key elements and the style of the studied musical material. The GA approach uses objective measures for its composite fitness functions. The output of both music composition systems is a musical score (with possibly multiple instruments) and actual music in the MIDI format.

25. Unmanned Aerial Vehicle (UAV) Simulator

Uthayasanker Thayasivam

This poster describes the ongoing development of a simulation tool which is capable of simulating a UAV (Unmanned Aerial Vehicle) airframe and its controller. It is important to simulate a UAV's behavior since UAVs are expensive and their environment is highly unpredictable. Most of the simulators are either expensive or difficult to customize. The goal of this simulator is to provide a platform for testing and tuning UAV control algorithms and to foresee the behavior of the ongoing UAV prototype in different possible scenarios. Most importantly, the simulation environment can be enhanced as ground software to control and monitor the UAV. The control algorithm can be easily embedded in the onboard autopilot. The simulator is capable of executing a flight plan specified in FLIPS while stabilizing the craft. The simulation shows adequate autopilot performance in both longitudinal and lateral controllers.

26. Semantically Enhanced Long-Running Query Processing

Mustafa Veysi Nural

In this work, we address the handling of long-running (continuous) queries by exploiting the semantics of documents. Keyword-based systems offer low precision because, by its nature, this type of querying does not allow every optimization. Therefore, we enhance documents by annotating entities, certain types of relationships, etc., and make this implicit information accessible. We use RDF to store these semantic annotations so that they can be queried easily with SPARQL. We also address issues in integrating these pieces together.

27. Event Information Extraction from SMS using Contextual Data

Kishor Mahajan

Mobile devices are pervasive. Due to tremendous improvements in technology, mobile devices can be used beyond point-to-point communication. One example is social networking through Twitter. The increasing popularity of short text messages (SMS) enables us to develop novel applications such as an Event Notification/Management system. The application is based on the publish/subscribe model of communication. Users interested in particular services subscribe to those services and receive all the information about related events. In this particular application, the publisher sends a short text message, which is processed within the system for noise removal, disambiguation, redundancy elimination, and event completion. After processing, the formatted message is sent to all the interested subscribers. The challenges in this processing are huge and need to be worked on in modules. This particular topic focuses on extracting event information from a short message and completing the event.

28. Ontology subsumption and Categorization

Arpan Sharma

The Semantic Web envisions making web content machine processable, rather than just readable or consumable by human beings. This is accomplished by the use of ontologies, which involve entities and their relationships in different domains. As the computing world moves towards the Semantic Web, people are interested in ontologies, which form its backbone. As ontologies come into existence, ontology evaluation is gaining importance: it allows the developer to discover areas of improvement, to understand the faults in the ontology created, and to compare it with other ontologies in the domain in order to understand how well it describes the domain semantically. Ontologies of the same domain can also be compared through Ontology Alignment, which is the process of determining correspondences between concepts.

For effective alignment of ontologies, computing equivalent elements is only one aspect of comparing them and is not sufficient on its own. Subsumption relations play an important role as well. Another way of comparing two ontologies is to find out how a particular ontology fits into a larger one (e.g. Wikipedia). By finding the degree or measure of similarity of a given target ontology with a huge ontology like Wikipedia, the target ontology can also be categorized into a particular domain even without prior knowledge of the target ontology. The proposed approach compares a given target ontology with Wikipedia, fits it into Wikipedia, and categorizes the target ontology into a particular domain of knowledge.

29. Video Caching in Resource-Constrained Mobile Multimedia Environments

Hari Devulapally

Previous work has resulted in the implementation of a video personalization server (VPS) [1], an HLV encoding scheme for mobile multimedia [2] and a Number-of-Common-Clients-Size (NCCS) video caching scheme [1]. The architecture developed is a client-server model with multiple caches in between. A novel video personalization server and cache architecture has been developed, which can efficiently disseminate personalized video to multiple resource-constrained clients. The video personalization server uses an automatic video segmentation and video indexing scheme based on semantic video content, and generates personalized videos based on the client's content preferences and resource constraints. A novel cache design with a novel cache replacement algorithm and multi-stage client request aggregation, specifically well suited for caching personalized video files generated by the personalization server, has been implemented. The video personalization server and cache architecture is well suited for personalized video dissemination, with low client-experienced latency, to resource-constrained multimedia-enabled mobile devices such as mobile phones, PDAs and pocket PCs. The caching scheme developed, NCCS, is compared with other popular caching schemes such as LFU and LRU. The parameters that we have taken into consideration are hit ratio, byte hit ratio and client-side latency. For an effective caching scheme, the hit ratio and byte hit ratio should be high, whereas the latency should be low.
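
The hit ratio and byte hit ratio named above can be illustrated by replaying a request trace through a cache. The sketch below uses LRU, one of the baseline schemes mentioned, not NCCS; the function name, the trace, and the video sizes are assumptions made for illustration.

```python
from collections import OrderedDict


def simulate_lru(requests, sizes, capacity):
    """Replay requests through an LRU cache; return (hit_ratio, byte_hit_ratio).

    requests is a non-empty list of video ids, sizes maps each id to its
    size, and capacity is the cache size in the same unit.
    """
    cache = OrderedDict()  # video id -> size, least recently used first
    used = hits = hit_bytes = total_bytes = 0
    for video in requests:
        size = sizes[video]
        total_bytes += size
        if video in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(video)  # mark as most recently used
            continue
        if size > capacity:
            continue  # a video larger than the cache is never stored
        # Evict least recently used videos until the new one fits
        while used + size > capacity:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[video] = size
        used += size
    return hits / len(requests), hit_bytes / total_bytes
```

The hit ratio counts requests served from the cache, while the byte hit ratio weights each hit by the size of the video, so a scheme that keeps many small videos can score well on the first metric and poorly on the second.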

30. SLAM: Simultaneous Localization and Mapping in Robotics

Anousha Mesbah

Just like humans with different levels of uncertainty about the surrounding world, robots are uncertain about their environments. A robot with a low battery or inaccurate sensors will get noisy observations of the environment and will therefore make imperfect decisions. In robotics, SLAM is a probabilistic technique that is used by a robot to localize itself in an unknown environment. Researchers take different approaches to implementing SLAM, many of which use Bayes' theorem as the fundamental step for calculating the state of a robot at a certain point in time. One of these approaches is particle filtering, an estimation technique based on simulation. It estimates a set of state hypotheses based on noisy observations. The purpose of this project is to simulate the first floor of the Boyd building, and for a robot to maneuver around the simulated environment and localize itself in the map using particle filtering.
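
The predict, weight and resample cycle of a particle filter can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the project's simulator: the world is a one-dimensional corridor rather than a building floor, the robot moves one unit per step, the sensor reports a noisy position, and all names and noise parameters are hypothetical.

```python
import math
import random

def particle_filter_1d(true_start, steps, n_particles=1000, world=100.0,
                       motion_sigma=0.5, sensor_sigma=1.0, seed=0):
    """Localize a robot on a 1D corridor; returns (estimate, true_position)."""
    rng = random.Random(seed)
    true_pos = true_start
    # Initially the robot could be anywhere: spread hypotheses uniformly.
    particles = [rng.uniform(0.0, world) for _ in range(n_particles)]

    for _ in range(steps):
        # The real robot moves one unit with noise, then senses its position.
        true_pos += 1.0 + rng.gauss(0.0, motion_sigma)
        z = true_pos + rng.gauss(0.0, sensor_sigma)

        # Predict: apply the same noisy motion model to every particle.
        particles = [p + 1.0 + rng.gauss(0.0, motion_sigma) for p in particles]

        # Update: weight each particle by the likelihood of the observation.
        weights = [math.exp(-0.5 * ((z - p) / sensor_sigma) ** 2)
                   for p in particles]
        if sum(weights) == 0.0:
            weights = [1.0] * n_particles   # fall back to uniform weights

        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)

    estimate = sum(particles) / n_particles
    return estimate, true_pos
```

Calling `particle_filter_1d(20.0, 30)` returns the mean of the surviving particles together with the simulated true position. The weighting step is where Bayes' theorem enters: each hypothesis is reweighted by how well it explains the noisy observation.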

31. GAMSA: Genetic Algorithm for Multiple Stem Alignment

Sal LaMarca

Aligning multiple sequences, structures and stems of RNA from different species is computationally intensive. When RNA sequences include pseudoknots, the alignment challenge is made more difficult. We introduce the Genetic Algorithm for Multiple Stem Alignment, or GAMSA, a stochastic, evolutionary algorithm that aligns RNA stems that may contain pseudoknots. GAMSA is able to multiply align stems that may or may not be present in certain sequences. Due to the complex nature of pseudoknots, many alignment algorithms do not align sequences with pseudoknots and there is not much known about sequences of RNA that contain pseudoknots. GAMSA can be used to align different sequences of RNA with pseudoknots and to find similarities between these sequences so their structures can be further researched.

32. Computational Prediction of Novel Telomerase RNAs in Yeast Genomes

Dong Zhang

Telomerase RNA (TR) is a vital component of the telomerase enzyme that ensures the complete replication of chromosome ends. Experimental identification of TRs in RNA sequence data is computationally expensive and the field lacks a general solution to the alignment problem because TR sequences diverge across many eukaryotes. Computational prediction of TRs can help narrow the number of plausible candidates for further experimental validations. The common core structure found in all known TRs may form the basis for a prediction. However, the structure contains a recently identified triple helix as well as structural elements divergent across many eukaryotes, beyond the capability of existing RNA structure profiling techniques. The structure-based prediction of TRs thus remains challenging.

33. Hybrid Layered Video Encoding for Power Constrained Devices

Naveen Kumar Aitha

Video playback is a computationally intensive task. Since the battery life of a mobile device decreases with time, it is desirable to have a layered-video representation, which adapts itself dynamically to the available battery-life in the device. In this paper, we propose a Hybrid Layered Video (HLV) encoding scheme that comprises both a content-aware, multi-layered video texture and a generative sketch-based video. Different combinations of texture and sketch result in six different states of the video. Experiments have shown that different states require different amounts of power, thus rendering HLV effective for playback of video in mobile devices.

34. MACE: A Dynamic Caching Framework for Mashups

Osama M. Al-Haj Hassan

A recent surge of popularity has established mashups as an important category of Web 2.0 applications. Mashups are essentially Web services that are often created by end-users. They aggregate and manipulate data from sources around the World Wide Web. Surprisingly, there are very few studies on the scalability and performance of mashups. In this paper, we study caching as a vehicle for enhancing the scalability and the efficiency of mashups. Although caching has long been used to improve the performance of Web services, mashups pose some unique challenges that necessitate a more dynamic approach to caching. Towards this end, we present MACE - a cache specifically designed for mashups. The MACE framework offers three technical contributions. First, we introduce a model for representing mashups and analyzing their performance. Second, we propose an indexing scheme that enables efficient reuse of cached data for newly created mashups. Finally, we describe a novel caching policy that considers the costs and benefits of caching data at various stages of different mashups and selectively stores data that is most effective in improving system scalability. We also report the results of experiments that quantify the performance of the MACE system.

35. Ontology Browser

Phani Rohit Mullangi

An ontology is a description of a set of concepts and the relationships between them within a specific domain. Each concept has various attributes. Furthermore, a concept may have from one to several thousand instances. Visual representation of ontologies helps users analyze and comprehend the information better. As a result, there is a growing need for ontology visualization. Unlike existing graph visualization techniques that display the entire graph and then filter out irrelevant data, our browser provides tools to incrementally explore and visualize relevant data of RDF ontologies as needed, which allows it to handle larger ontologies. A key idea of our approach is also to implement contextual visualization of ontologies, i.e. to render nodes as logical entities during the exploration of an ontology with the help of user-defined graph patterns.

36. Location Recognition Using Multilayered Fingerprints

B. J. Wimpey

We outline a framework that allows an autonomous mobile robot system to recognize and map locations in an environment. Many approaches to this problem use the Scale Invariant Feature Transform (SIFT) as a component of their solution. SIFT is a powerful machine vision technique, but it performs poorly on uniformly colored or low-contrast objects. We therefore set out to complement SIFT with other machine vision techniques to provide a robust location recognition system that creates better fingerprints of locations.

37. GlycoVault Database (for Glycomics project)

Sumedha Ganjoo

The principal purpose of GlycoVault is to support the research of glycobiologists in collecting and analyzing data about glycans. In particular, it provides a foundation for the integration and visualization of knowledge and data. GlycoVault provides a means of storing and retrieving data to support glycomics research at the Complex Carbohydrates Research Center (CCRC) at the University of Georgia. These data include quantitative Real-Time Polymerase Chain Reaction (qRTPCR) data as well as basic glycomics data such as biologically relevant parameters. GlycoVault consists of databases, ontologies and data files in various formats that are integrated by a sophisticated organizational structure and accessed through a comprehensive yet easy-to-use Application Programming Interface (API). The API facilitates the development of methods for querying the knowledge and exporting the results in formats (such as XML) that can be readily digested by external applications. The current research focuses on enabling an efficient content repository for GlycoVault that supports a wide range of data formats, e.g., relational tables and Excel files. A further goal is to provide an easy way to access data using REST-based Web services.

38. Semantics and Services Enabled Problem-Solving Environment for T. cruzi

Amir Asiaee

This research has a two-pronged thrust. The first thrust concerns the Semantics-driven Query Interface for Raw Biological Data project, which aims to utilize state-of-the-art semantic technologies for effective querying of multiple databases through the creation of a suite of ontologies modeling multiple aspects of the T. cruzi research domain. The second thrust focuses on natural language query processing to enhance the T. cruzi user interface. A key idea is to analyze a question in natural language to discover entities and the relationships between them, build RDF triples from them, and execute queries over a knowledge base.

39. SemanticQA: Exploiting Semantic Associations for Cross-Document Question Answering

Samir Tartir

As more data is being semantically annotated, researchers across multiple disciplines typically rely on semantic repositories that contain large amounts of data in the form of ontologies as a compact source of information. A primary concern facing researchers is the lack of easy-to-use interfaces for data retrieval, due to the need for special-purpose query languages or applications. In addition, the knowledge in these repositories might not be comprehensive or up-to-date for several reasons, e.g., the discovery of new information in a field after the repositories were created. In this research, we introduce an enhanced version of our SemanticQA system that allows users to query semantic data repositories using natural language questions. If a user question cannot be answered solely from the ontology, SemanticQA detects the failing components and attempts to answer the missing components utilizing web documents. It then plugs in partial answers incrementally until it reaches the full answer to the whole question, which might involve repeating the same process if other parts fail.

40. Efficient Execution of Learned K-nearest neighbor Queries Using Clustering with Caching

Jaim Ahmed

We introduce a new algorithm for K-nearest neighbor queries that uses clustering and caching to improve performance. The main idea is to reduce the distance computation cost between the query point and the data points in the data set. We use a divide-and-conquer approach. First, we divide the training data into clusters based on similarity between the data points in terms of Euclidean distance. Next we use linearization for faster lookup. The data points in a cluster can be sorted based on their similarity (measured by Euclidean distance) to the center of the cluster. Fast search data structures such as the B-tree can be utilized to store data points based on their distance from the cluster center and perform fast data search. The B-tree algorithm is good for range search as well. We also show that we achieve a performance boost using B-tree based data caching with guards.
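
The idea of sorting each cluster by distance to its center can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm evaluated in the poster: cluster centers are supplied by the caller, a sorted Python list searched with `bisect` stands in for the B-tree, the caching and guard mechanism is omitted, and the function names are hypothetical. Pruning relies on the triangle inequality: a point p in a cluster with center c can lie within radius r of query q only if |d(q, c) - d(p, c)| <= r.

```python
import bisect
import heapq
import math

def build_index(points, centers):
    """Assign each point to its nearest center, then sort every cluster
    by distance to that center. Returns one (keys, pts) pair per center."""
    buckets = [[] for _ in centers]
    for p in points:
        d, i = min((math.dist(p, c), i) for i, c in enumerate(centers))
        buckets[i].append((d, p))
    index = []
    for bucket in buckets:
        bucket.sort()
        index.append(([d for d, _ in bucket], [p for _, p in bucket]))
    return index

def knn(query, k, centers, index):
    """Exact k-nearest-neighbor search; returns [(distance, point), ...]."""
    best = []  # max-heap on distance, stored as (-distance, point)
    order = sorted(range(len(centers)),
                   key=lambda i: math.dist(query, centers[i]))
    for i in order:
        keys, pts = index[i]
        dqc = math.dist(query, centers[i])
        # Current search radius: distance to the k-th best point so far.
        r = -best[0][0] if len(best) == k else math.inf
        # Range search on the sorted keys: only this ring can hold closer points.
        lo = bisect.bisect_left(keys, dqc - r)
        hi = bisect.bisect_right(keys, dqc + r)
        for p in pts[lo:hi]:
            dist = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-dist, p))
            elif dist < -best[0][0]:
                heapq.heapreplace(best, (-dist, p))
    return sorted((-nd, p) for nd, p in best)
```

Visiting the nearest clusters first shrinks the radius early, so later clusters contribute only a narrow slice of their sorted lists. Distance computations are then spent on that slice instead of the whole data set.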

41. Cloud Computing

Guannan Wang

Personal computers have provided tremendous advantages. Unfortunately, they also have some disadvantages, such as requiring users to manage various software installations, configurations and updates. Outsourcing these platform issues to a platform provider is therefore an attractive solution. Under this computing model, users move their data and applications to a remote "Cloud" and then access them in a simple and pervasive way via the Internet. As a result, Cloud computing has emerged as a new computing paradigm which aims to provide reliable, customized and QoS-guaranteed dynamic computing environments for end-users. Toward this goal, this project focuses on integrating an application on two of the most popular cloud computing platforms and then comparing the two integration methods.
