Dr. Suchi Bhandarkar
Graduate Students
Aparna Kadakia
Yash Warke
The emergence of multimedia information
systems has created a need for analysis, integration and organization of
non-traditional data such as digitized images, video and audio. One of
the greatest challenges in the design of multimedia information systems
is overcoming the difficulty of rapidly and reliably extracting "key"
information from images, video and audio streams, which can then be used
for rapid browsing and indexing of the underlying information. In this
project we focus on the analysis of compressed (MPEG) video data
with the intent of extracting suitable key information.
Video parsing, or scene/shot change
detection, is commonly used to extract key frames from a
video stream. These key frames are then used for rapid video browsing and
automatic annotation and indexing of video streams
to support content-based query access to large video databases. The video
parsing operation is primarily domain independent, i.e., no assumptions
are made about the semantics of the video or its underlying theme. Video
parsing, therefore, is a crucial first step that precedes domain-dependent
analysis of the video. Due to the large amount of data involved, video
streams are often compressed for efficient transmission and storage. Video
parsing techniques that are capable of processing compressed video data
directly have a considerable advantage in terms of execution time and memory
requirements over those that require full-frame decompression. Consequently,
this project focuses on the design and implementation of video parsing
and video analysis techniques that are capable of processing compressed
data directly.
The figure shows the motion vectors used in the algorithm.
The research thus far has resulted
in the design and implementation of a video parsing algorithm for MPEG
(compressed) video data. The algorithm integrates motion cues and chrominance/luminance
cues from the video data. Motion cues are in the form of motion vectors
which are derived from the motion compensation information encoded in the
MPEG stream. Chrominance and luminance cues are derived from the DC images
in the MPEG stream. The algorithm detects abrupt changes (cuts or shot
boundaries), gradual scene changes (fades and dissolves) and camera motion
parameters such as pans and zooms. Experimental results on MPEG video showed
the algorithm to be fast (~60 frames per second on a 170 MHz SUN UltraSPARC1
workstation) and accurate.
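
As a rough illustration of the chrominance/luminance cue (this is not the algorithm from the publications listed below), the following Python sketch flags abrupt cuts by comparing luminance histograms of consecutive DC images. The function names, the 32-bin histogram and the 0.5 threshold are assumptions chosen for illustration. DC images are simulated here by averaging 8x8 blocks of decoded frames; a compressed-domain parser would instead read the DC coefficients, which are proportional to these block means, directly from the MPEG stream.

```python
import numpy as np

def dc_image(frame, block=8):
    """Simulate a DC image by averaging each block x block tile of a frame."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block          # drop partial tiles at the edges
    tiles = frame[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3))

def histogram_difference(a, b, bins=32):
    """Normalised L1 distance between two luminance histograms, in [0, 1].
    Assumes both images have the same size and values in [0, 256)."""
    ha, _ = np.histogram(a, bins=bins, range=(0, 256))
    hb, _ = np.histogram(b, bins=bins, range=(0, 256))
    return np.abs(ha - hb).sum() / (2.0 * a.size)

def detect_cuts(frames, threshold=0.5):
    """Return indices i where frame i starts a new shot (hypothetical threshold)."""
    dcs = [dc_image(f) for f in frames]
    return [i for i in range(1, len(dcs))
            if histogram_difference(dcs[i - 1], dcs[i]) > threshold]

# Synthetic example: five dark frames followed by five bright frames.
dark = [np.full((64, 64), 40.0 + i) for i in range(5)]
bright = [np.full((64, 64), 200.0 + i) for i in range(5)]
cuts = detect_cuts(dark + bright)
print(cuts)  # [5]
```

A single fixed threshold like this only catches abrupt changes. Gradual transitions such as fades and dissolves spread the difference over many frames and would need additional logic, for example accumulating differences over a window.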
The figure shows the functionality of the algorithm.
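
To make the idea of recovering camera motion parameters from motion vectors more concrete, here is a minimal Python sketch. It assumes a simple three-parameter model, v = t + s (p - c), where p is a macroblock centre, c is the centre of the vector field, t is a translation (pan or tilt) and s is a scale term (zoom), and fits it by least squares. The model, the function names and the thresholds are illustrative assumptions, not the method used in the project. The sign of the recovered parameters also depends on how motion vectors are defined when read from the stream.

```python
import numpy as np

def fit_camera_motion(positions, vectors):
    """Least-squares fit of v = t + s * (p - c) to a motion vector field.

    positions: (N, 2) macroblock centres (x, y)
    vectors:   (N, 2) motion vectors (vx, vy)
    Returns (t, s): 2-vector translation and scalar zoom term.
    """
    positions = np.asarray(positions, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    centred = positions - positions.mean(axis=0)   # c assumed to be the field centre
    n = len(positions)
    A = np.zeros((2 * n, 3))
    A[0::2, 0] = 1.0               # x rows: tx + s * (x - cx)
    A[0::2, 2] = centred[:, 0]
    A[1::2, 1] = 1.0               # y rows: ty + s * (y - cy)
    A[1::2, 2] = centred[:, 1]
    b = vectors.reshape(-1)        # interleaved as [vx0, vy0, vx1, vy1, ...]
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params[:2], params[2]

def classify(t, s, pan_thresh=1.0, zoom_thresh=0.01):
    """Label the dominant camera motion using hypothetical thresholds."""
    if abs(s) > zoom_thresh:
        return "zoom"
    if np.linalg.norm(t) > pan_thresh:
        return "pan"
    return "static"

# Synthetic fields on a grid of 16x16 macroblock centres.
ys, xs = np.mgrid[8:240:16, 8:352:16]
pos = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
pan_field = np.tile([4.0, 0.0], (len(pos), 1))     # uniform shift of 4 pixels
zoom_field = 0.05 * (pos - pos.mean(axis=0))       # vectors radiate from the centre

print(classify(*fit_camera_motion(pos, pan_field)))   # pan
print(classify(*fit_camera_motion(pos, zoom_field)))  # zoom
```

Real motion vector fields are noisy and include vectors from independently moving objects, so a practical estimator would likely need outlier rejection before or during the fit.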
Current work in this project deals
with key frame generation and representation in video sequences, in particular
the generation of video mosaics from compressed video, the annotation of mosaics
with motion-based information, and the extraction of indexing
attributes.
Publications
S.M. Bhandarkar and A.A. Khombadia, Motion based Parsing
of Compressed Video, Proc. IEEE Intl. Wkshp. Multimedia Database
Mgmt. Sys., Dayton, Ohio, August 5-7, 1998, pp. 80-87.
A.A. Khombadia, A Rapid MPEG Navigator, MS Thesis,
Dept. of Computer Science, University of Georgia, 1997.
Y.S. Warke, Integrated Parsing of Compressed Video, MS
Thesis, Dept. of Computer Science, University of Georgia, 1998.