Overview
In 1994, an ambitious project in the multimedia domain was started
at the University of Mannheim under the guidance of Prof. Dr. W.
Effelsberg. We realized that multimedia applications using continuous
media like video and audio data absolutely require access to semantic
contents of these media types in a manner similar to that for textual
and numerical data. Imagine a situation for textual media in which
large digital collections of books, reports, articles etc. exist but
nobody is able to search for pertinent keywords. Content analysis of
continuous data, especially of video data, is currently based mainly on
manual annotations. This implies that the searchable content is reduced
to the annotated content, which usually does not contain the required
information. The aim of the MoCA project is therefore to extract
structural and semantic content of videos automatically.
Over the past years, various applications have been implemented,
and the project has focused on the analysis of movie
material such as can be found on TV, in cinemas and in video-on-demand
databases. This has provided access to a large amount of input data for
our algorithms. The algorithms developed for video and audio analysis
thus concentrate on movie material. However, they are also applicable
to general video and audio material.
Analysis features developed and used within the MoCA project fall
into four different categories:
- features of single pictures (frames), such as brightness, colors and text,
- features of frame sequences, such as motion and video cuts,
- features of the audio track, such as audio cuts and loudness, and
- combinations of features from the three classes above, e.g. to extract scenes.
The first two are usually regarded together and called video features.
We have implemented a large number of well-known and new features in
all categories. Details can be found in our publications.
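
To make the first two feature categories more concrete, the following is a minimal sketch of one single-frame feature (mean brightness) and one frame-sequence feature (hard-cut detection by comparing gray-level histograms of consecutive frames). It is an illustration only and not the MoCA implementation: the frame representation (nested lists of gray values from 0 to 255), the function names, the number of histogram bins and the threshold are all assumptions chosen for this example.

```python
from typing import List, Sequence

# Assumed representation: a frame is a grid of grayscale pixels in 0..255.
Frame = Sequence[Sequence[int]]


def mean_brightness(frame: Frame) -> float:
    """Single-frame feature: average gray value over all pixels."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized gray-level histogram with `bins` equal-width bins."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for p in row:
            # Map 0..255 onto bin indices 0..bins-1.
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Frame-sequence feature: report index i if a hard cut is assumed
    between frame i-1 and frame i.

    The distance is half the L1 difference of the two normalized
    histograms, so it lies in [0, 1]. The threshold is a hypothetical
    value and would need tuning on real material.
    """
    cuts = []
    for i in range(1, len(frames)):
        h_prev = gray_histogram(frames[i - 1])
        h_cur = gray_histogram(frames[i])
        distance = 0.5 * sum(abs(a - b) for a, b in zip(h_prev, h_cur))
        if distance > threshold:
            cuts.append(i)
    return cuts


if __name__ == "__main__":
    dark = [[10] * 4 for _ in range(4)]
    bright = [[240] * 4 for _ in range(4)]
    print(mean_brightness(dark))
    print(detect_cuts([dark, dark, bright, bright]))
```

A histogram comparison of this kind reacts to abrupt changes in overall image content but not to gradual transitions such as fades or dissolves, which need other measures.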