CATS Spring 2003




Jonathan Myers

Counting Pairs of Sequences by Size of Longest Common Subsequence


A polynomial time algorithm is presented for calculating the number of pairs of sequences of given lengths m and n (over a fixed alphabet) having a given length k for their longest common subsequence. The intended application is finding the most significant overlaps among DNA fragments as a first step in assembly.






Bob Robinson

Counting Feynman Diagrams


A Feynman diagram D of the type considered has a vertex set U of cardinality 2n for some n > 0, along with n undirected V-lines forming a perfect matching on U and 2n directed G-lines forming a permutation on U. Here n is called the order of D. One of the G-lines is designated as the root of D. If D is connected and cannot be disconnected by removing some two G-lines it is called irreducible. The number C(n) of nonisomorphic connected Feynman diagrams of order n has been known under various guises for at least 50 years. However the number I(n) of those which are irreducible appears to be new. The study of C(n) and I(n) is motivated by research which aims to combine Monte Carlo summation techniques with self-consistent high-order Feynman diagram expansions to computationally solve interacting fermion models in quantum physics.

A recently simplified approach to calculating the exact numbers C(n) and I(n) is presented. As time permits related results on the asymptotic behavior of C(n) and I(n) will be discussed, along with methods for generating canonical Feynman diagrams.

The main part of the talk will be an expanded version of a presentation at the ALICE '03 workshop; a detailed abstract is available in PostScript or PDF form.

Note: The research reported is being carried out for the NSF project "ITR/ACS: Stochastic summation of high-order Feynman graph expansions", led by Prof. H.-B. Schuttler of the UGA Physics Dept. (PI) with the speaker and others as co-PIs.






Bob Robinson

Generating Feynman Diagrams -- an Update


An overview and update on generating Feynman diagrams (FDs) is presented. There have been three previous seminars this academic year on generating and counting FDs; for the abstracts, see 26 Aug 2002, 11 Nov 2002, and 27 Jan 2003.

In this talk we present the basics of CAT generation algorithms for canonical FDs and for all labeled FDs, then discuss related open problems. Here CAT stands for "Constant Amortized Time".

Note: The research reported is being carried out for the NSF project "ITR/ACS: Stochastic summation of high-order Feynman graph expansions", led by Prof. H.-B. Schuttler of the UGA Physics Dept. (PI) with the speaker and others as co-PIs.






Rod Canfield

The Cauchy Integral Formula and Enumeration


Cauchy (1789-1857) is credited with discovering the Cauchy Integral Formula. When a function f(z) is defined by a power series

f(z) = \sum_{n=0}^{\infty} a_n z^n,

(z varies over complex numbers) the CIF allows you to express the coefficient a_n in terms of the numbers f(z) as z varies around a circle.

This is a useful technique in combinatorial enumeration, analytic number theory, mathematical physics, etc etc.

The two-part lectures will be intended to make the student, who is assumed to be familiar with complex numbers but NOT to have ever taken a course in complex analysis, an expert on the technique.

These lectures are tied in to the enumeration of bipartite graphs, some research done jointly with Brendan McKay. There was a lecture last Fall about a computer program for calculating said numbers. That lecture is independent of this one, but the overall goal is to work towards presenting the results of that research.






Sue Whitesides

School of Computer Science, McGill University, Montreal

Special Joint CATS/Geometry Seminar

Embedding Problems for Paths and Cycles with Direction Constrained Edges*


We determine the reachability properties of the embeddings in 3D of a directed path, in the graph theoretic sense, whose edges have each been assigned a desired direction (East, West, North, South, Up, or Down) but no length. We ask which points of 3D can be reached by the terminus of an embedding of such a path, by choosing appropriate positive lengths for the edges, if the embedded path starts at the origin, does not intersect itself and respects the directions assigned to its edges. Similarly, we ask which graph theoretic cycles have physical realizations, without self-intersections, that respect the given direction constraints.

These problems arise in the context of extending planar graph embedding techniques and VLSI rectilinear layout techniques from 2D to 3D. We give combinatorial characterizations that yield linear time recognition and layout algorithms.

All are welcome. No special background is assumed.

* joint work with G. Di Battista (U. Roma III), G. Liotta (U. Perugia), and A. Lubiw (U. Waterloo)






Andreas Voigt

How Feynman Diagrams Help to Resolve Mysteries in Physics


The interacting fermion problem is of fundamental importance in a wide range of physics research areas. It includes fields as diverse as electronic structure theory of solids, strongly correlated electron physics, quantum chemistry and the theory of nuclear matter. Especially the strange and still unrevealed nature of the high temperature superconductors has attracted a great deal of attention and remains still unsolved.

I will give an introduction into the problem and an overview about the ongoing project to combine a Monte Carlo summation techniques with a self-consistent high-order Feynman diagram expansions. I will present the basics steps necessary to carry out the task: the formulation of the model Hamiltonians and the use of Feynman diagrams to calculate basic physical quantities like the self-energy. Some interesting results for the Anderson impurity model will be presented.






Aaron Windsor

Upper Tail Estimates and Their Applications


Given a random variable X, an important function is E[X], X's expected value. Typically, computations of E[X] are straightforward. In many applications, however, it's necessary to know a lot more about X than just E[X]. In particular, is X usually close to E[X]? If so, how close and how often? If X can be decomposed into the sum of smaller random variables, this is where the upper and lower tail probabilities are useful. Roughly, the lower tail is the probability that X falls below its expectation by a lot and the upper tail is the probability that X is way above its expectation In the case that X is the sum of independent indicator random variables, the Chernoff bound will give bounds for the upper and lower tail that are asymptotically equivalent. When X is the sum of small products of indicator random variables, there is an analogue of the Chernoff bound for the lower tail, but no matching upper tail bound exists in general. Recently, interesting methods to deal with the upper tail have been developed and applied in Combinatorics and Computer Science. We'll survey some of these bounds, including Azuma's Inequality, Talagrand's Inequality, and Kim-Vu Concentration, as well as some purely combinatorial techniques, as time permits. Most of the background for this talk comes from the recent paper "The Infamous Upper Tail" by Svante Janson and Andrzej Rucinski.






UGA Computer Science Department

Graduate Student Orientation


The orientation program will take place in room 328 Boyd, 3:35-5:35 PM on Monday, March 10.






Yuanxin Liu

Dept. of Computer Science

University of North Carolina at Chapel Hill

Special Joint CATS/Geometry Seminar

Testing Homotopy for Paths in the Plane


We present an efficient algorithm to test if two given paths are homotopic; that is, whether they wind around obstacles in the plane in the same way. For paths specified by $n$ line segments with obstacles described by $n$ points, several standard ways achieve quadratic running time. For simple paths, our algorithm runs in $O(n\log{n})$ time, which we show is tight. For self-intersecting paths, the problem is related to Hopcroft's problem; our algorithm runs in $O(n^(2/3)\log{n})$ time.

Reference: Sergio Cabello, Yuanxin Liu, Andrea Mantler, and Jack Snoeyink. Testing homotopy for paths in the plane. Discrete and Computational Geometry, Special Issue on the 2002 Symposium on Computational Geometry. To appear.






Andrea Mantler

Dept. of Computer Science

University of North Carolina at Chapel Hill

Special Joint CATS/Geometry Seminar

Ununfoldable Polyhedra with Convex Faces


Unfolding a convex polyhedron into a simple planar polygon is a well-studied problem. We study the limits of unfoldability by studying nonconvex polyhedra with the same combinatorial structure as convex polyhedra. In particular, we give two examples of polyhedra, one with 24 convex faces and one with 36 triangular faces, that cannot be unfolded by cutting along edges. We further show that such a polyhedron can indeed be unfolded if cuts are allowed to cross faces. Finally, we prove that "open" polyhedra with triangular faces may not be unfoldable no matter how they are cut.

Reference: Marshall Bern, Erik Demaine, David Eppstein, Eric Kuo, Andrea Mantler, and Jack Snoeyink. Ununfoldable polyhedra with convex faces. Computational Geometry Theory and Applications, Special Issue on the 4th CGC Workshop on Computational Geometry, 24(2):51-62, February 2003.






Peter Dadam and Manfred Reichert

Dept. Databases and Information Systems (DBIS)

University of Ulm, GERMANY

Towards a New Dimension for Process-Aware Information Systems


In the future, success or failure of an enterprise will more and more depend on its ability to flexibly and quickly react to changes at the market, the development, or the manufacturing side. This means that the flow of work within a department, within a company, or even across companies may have to be quickly adapted. To meet this challenge enterprises are developing a growing interest in supporting their business processes more effectively and in streamlining their computer applications such that they behave "process-oriented".

In principle, workflow (WF) technology offers a promising approach for this. It allows to define processes explicitly, to integrate application components, and to deliver a state-aware control service. However, current WF technology is jumping much too short. After being implemented, processes are rather inflexible and later changes in the process schema cannot be mapped to running instances. These and other weaknesses limit the applicability of WF technology significantly and will lead to big headaches for its users.

The target of the ADEPT project is to develop the fundamentals for a WF technology which makes process-aware applications easy to implement and which is much more flexible than today's systems. Very challenging in this context is to achieve this in an efficient manner and without violating consistency and robustness.

In the first part, Peter Dadam will describe the demands for adequate WF technology and illustrate the "technological vision" we are trying to make reality in our research. In the second part, Manfred Reichert will discuss the technological approaches taken in the ADEPT project in order to meet these goals. Among other things, he will present the developed framework for dynamic WF changes, which enables both, the quick and correct propagation of WF type changes to in-progress WF instances and the ad-hoc adaptation of single WF instances.

The presentation will be complemented by a live demonstration of the experimental ADEPT workflow engine.


Peter Dadam is full professor at the University of Ulm and director of the DBIS department. Before he was director of the department for Advanced Information Management (AIM) at the IBM Heidelberg Science Center where he managed the AIM-P project on advanced database technology. Current research areas include cooperative information systems, WF management, and database technology and its use in advanced application areas.

Manfred Reichert is assistant professor in the DBIS department at the University of Ulm. He finished his PhD thesis on flexible WF management in May 2000. Current research topics include enterprise-wide and cross-organizational workflows, enterprise application integration and workflow, and different aspects related to WF technology.






Tarsem Purewal

Minimal Enclosings of Group Divisible Block Designs


The birth of combinatorial design theory can be traced back to Euler's discovery of the Latin Square over 200 years ago. Since that time, Balanced Incomplete Block Designs have become one of the most studied type of design in combinatorial design theory. This can be attributed to the variety of applications they have in the design of statistical experiments. This talk will review the concept of Balanced Incomplete Block Designs, and then introduce a generalization - Group Divisible Designs (where 'Group' refers to an element in a partition of objects). Necessary conditions for the existence of such designs and when such conditions are sufficient will be discussed. The existence problem of minimal enclosings of Group Divisible Designs will be introduced, as will necessary conditions for such enclosings. A few simple examples of minimal enclosings for small Group Divisible Designs will be shown. Generalizations of such enclosings and open problems will be discussed as time permits. No knowledge of design theory or combinatorics will be assumed. An elementary knowledge of some concepts from graph theory might be helpful.






Rod Canfield

Locally Restricted Compositions*


Compositions $n=a_1+a_2+\cdots$, $a_k>0$, have been studied classically. More recently, compositions with the local restriction $a_k\ne a_{k+1}$ (Carlitz compositions) have been studied by various authors. We consider the compositions with more general local-nonequality restrictions, including multiline compositions. We obtain recursions, bounds on growth rate, and other properties of a randomly selected restricted composition.

-----------------------------------

* joint work with Ed Bender at UCSD






Jizhen Zhao

Parameter Estimation for Stochastic Grammar Modeling of RNA Pseudoknots*


Stochastic grammar models of RNA pseudoknots have been introduced to automate RNA pseudoknot prediction. However, the accurate estimation of the probability parameters of such grammar models from training sequences is difficult because of the inherent context-sensitivity in these grammars. In particular, existing algorithms for parameter estimation, such as Inside-outside, are applicable only to stochastic context-free grammar (SCFG) models for RNA stem-loop structures.

We introduce a new parameter estimation algorithm Upward-downward for the stochastic grammar model of RNA pseudoknots developed recently. The algorithm generalizes inside and outside probabilities for pseudoknot substructures by propagating probabilities of crossing double helices upward and downward in the derivation tree to accomplish the computation of inside and outside probabilities. The algorithm is a non-trivial extension of Inside-outside to RNA pseudonot models which was heretofore not available.

*This is joint research with Drs. Liming Cai and Russell Malmberg. The paper is HERE.






Congzhou He

Memory-Efficient Pseudoknot Prediction with Stochastic Grammar Modeling


The prediction of RNA pseudoknotted structure is computationally intractable due to the structural complexity of crossing nucleotide base pairs. Almost all existing prediction algorithms entail O(n^4) memory space, making it unrealistic to predict pseudoknots for RNA molecules of even moderate length. We use techniques that reduce the resource requirements significantly to O(n^2) in memory space without substantial sacrifice in running time, and which avoid an exhaustive search for crossing helices in pseudoknots. Experiments conducted on baterial tmRNA demonstrate that the improved algorithm with O(n^2) space requirement achieves the same prediction accuracy as the optimal prediction algorithm with O(n^4) space requirement.






Junfeng Qu

Bellman-Ford Algorithm and Arbitrage Opportunity


The use of computers in the finance industry has been marked with controversy lately as programmed trading -- designed to take advantage of extremely small fluctuations in prices -- has been outlawed at many Wall Street firms. The ethics of computer programming is a fledgling field with many thorny issues. The presentation tries to use the Bellman-Ford algorithm to discover the arbitrage opportunity in currency exchange. A tentative solution is proposed with further suggestions given.



 CATS