Students
John Thompson, Music
Mary Li, Elec & Comp Engineering
Lance Putnam, Media Arts & Tech
Jim Kleban, Elec & Comp Engineering
Faculty Advisors
Xinding Sun, Elec & Comp Engineering
JoAnn Kuchera-Morin, Media Arts & Tech
B.S. Manjunath, Elec & Comp Engineering
Abstract
The goal of this project is to design an intelligent system that extends the traditional musical instrument and the conventional performance style. Specifically, a human flutist will interact with an Intelligent Virtual Being (IVB) much as they would with another human player. This requires non-intrusive sensors for perceiving the actions of the human player and a system for detecting gestures or patterns in the raw sensor data and mapping them onto a predetermined musical composition. Specific gestures are to be recognized to cue events and sequences and to provide continuous control over sound transformation processes. Another major challenge is to create a system that is as easily transportable, simple to set up, and robust as a traditional musical instrument is for a performer.
The technical focus of the IVB will be on developing pattern recognition (PR) algorithms to convert raw audio and video samples into performance cues. On the computer vision side, selecting the proper PR techniques is of prime importance since many are domain specific; this involves implementing selected techniques and evaluating their performance within the system. Given the constraints of the computer hardware and the goal of real-time response, the 2D image sequences, i.e. the raw pixel-level information, must be reduced to compact, representative features, and these features should be invariant to environmental changes (lighting, people, and hardware). To begin with, the IVB will undergo a supervised learning process to visually recognize predefined gestures, including entrance and exit cues.
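Purely as an illustration (no recognition technique has been selected yet; the function names, grid size, thresholds, and the nearest-neighbor matching below are all assumptions made for the sketch), a supervised gesture recognizer of this kind might be structured as follows: each short frame sequence is reduced to a compact motion-energy feature vector, and new sequences are labeled by comparison against stored training examples.

import numpy as np

def motion_features(frames, grid=(4, 4)):
    """Reduce a sequence of grayscale frames (T x H x W) to a compact
    feature vector: mean absolute frame-to-frame difference, pooled over
    a coarse spatial grid and normalized to reduce sensitivity to the
    overall lighting level."""
    frames = np.asarray(frames, dtype=np.float32)
    diffs = np.abs(np.diff(frames, axis=0)).mean(axis=0)  # H x W motion energy
    gh, gw = grid
    h, w = diffs.shape
    cells = diffs[: h - h % gh, : w - w % gw]
    cells = cells.reshape(gh, h // gh, gw, w // gw).mean(axis=(1, 3)).ravel()
    norm = np.linalg.norm(cells)
    return cells / norm if norm > 0 else cells

class GestureClassifier:
    """Minimal supervised (nearest-neighbor) recognizer for predefined
    gestures such as 'entrance_cue' and 'exit_cue'."""
    def __init__(self):
        self.examples = []  # list of (feature_vector, label) pairs

    def train(self, frames, label):
        self.examples.append((motion_features(frames), label))

    def classify(self, frames, reject_threshold=0.5):
        if not self.examples:
            return None
        f = motion_features(frames)
        d, label = min((np.linalg.norm(f - ex), lab) for ex, lab in self.examples)
        # Reject sequences that are too far from any training example.
        return label if d < reject_threshold else None

Whatever PR technique ultimately evaluates best would replace the nearest-neighbor matcher; the overall structure, frames in, compact invariant features out, labels from supervised training, is what the paragraph above describes.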
The audio aspects of the IVB involve converting the flutist's sound to spectral data, storing that data in a buffer, analyzing the waveform and spectral data to extract higher-level features, and performing transformations on the spectral data based on the combined audio features and a predetermined score. The prominent features to be extracted are pitch, amplitude dynamics, and voice/noise presence, and they will be used to trigger events from the score. Ideally, during a real musical performance, the IVB will accurately classify new information against its prior knowledge base and respond in real time, for example by initiating a synthesis process.
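As a minimal sketch of this audio path (the actual analysis and synthesis engine is still to be designed; the feature estimates, thresholds, and score entries below are illustrative assumptions), a buffered block of samples could be reduced to pitch, amplitude, and a crude voice/noise measure, then matched against cue conditions from a predetermined score:

import numpy as np

def analyze_frame(samples, sample_rate=44100):
    """Extract simple per-frame features from a buffered block of audio:
    amplitude (RMS), pitch (peak of the magnitude spectrum), and a crude
    voice/noise measure (fraction of spectral energy near the peak)."""
    window = samples * np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(window))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    peak = int(np.argmax(spectrum[1:]) + 1)  # skip the DC bin
    pitch = float(freqs[peak])
    amplitude = float(np.sqrt(np.mean(samples ** 2)))
    lo, hi = max(peak - 2, 0), peak + 3
    tonality = float(spectrum[lo:hi].sum() / (spectrum.sum() + 1e-12))
    return {"pitch": pitch, "amplitude": amplitude, "tonality": tonality}

# Hypothetical score: each cue fires when the player reaches a target
# pitch (Hz) at sufficient amplitude with a tonal (non-noise) sound.
score = [
    {"name": "section_2_entrance", "pitch": 440.0, "min_amp": 0.05},
    {"name": "start_granulation",  "pitch": 587.3, "min_amp": 0.05},
]

def check_cues(features, score, next_cue, pitch_tolerance=10.0):
    """Advance through the score when the current features match the next cue;
    return the new score position and the name of any triggered event."""
    if next_cue >= len(score):
        return next_cue, None
    cue = score[next_cue]
    if (abs(features["pitch"] - cue["pitch"]) < pitch_tolerance
            and features["amplitude"] > cue["min_amp"]
            and features["tonality"] > 0.5):
        return next_cue + 1, cue["name"]
    return next_cue, None

In a real-time version, something like analyze_frame would run once per audio buffer, and triggered event names would be handed to the synthesis process named in the abstract.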
By studying the interaction of human musicians, we hope to give a virtual performer some of the intelligence demonstrated in those interactions. Eventually, we hope to teach the virtual performer to derive other structural features of the musical interaction from analysis of the sensor space. At the core of our concern is building a system that is robust enough to provide deterministic results in the context of serious music performance. On a more general level, musical interaction confronts the multi-sensory input, analysis, and mapping of a complex array of human communication signals, as well as the multi-modal output of the resulting media content. Studying musical human-computer interaction thus provides a rich source for understanding subtle and expressive human communication and for cross-disciplinary exploration of human-computer interaction techniques.