This project investigated the automation of the annotation of electro-acoustic music through the application of machine learning methods. Electro-acoustic music is a contemporary form of electronic composition and production of music that originated in the 1940s. It is made with electronic technology, using synthesized sounds or prerecorded sounds from nature and the studio, which are often extensively processed and altered. Compared to the analysis of instrumental or vocal music, annotation of electro-acoustic music is both more challenging and less developed. There are no "pre-segmented" discrete units like notes, there is no score, and there is no universally established system for analysis. Although musicology has developed various sets of tools for the analysis of electro-acoustic music, the tediousness of manual annotation has prevented the application of these theories to a larger body of music. On the other hand, Music Information Retrieval has developed a rich repertoire of machine learning algorithms for the analysis of music, including methods that can be used for automatic annotation. Our essential result is that machine learning methods can indeed be used for the annotation of electro-acoustic music, but only in an interactive setting. Only the integration of a human analyst into the workflow makes it possible to sidestep the seeming impasse that the lack of ground truth in the annotation of electro-acoustic music presents.
As automatic speech recognition (ASR) systems so far do not exhaust the potential of automatic text processing and semantic technology, the INSPIRATION project aims at modelling and supporting document creation processes aided by speech recognition. The first application area is the production of medical reports.
Feasibility Study: Interactive Entertainment of Elder Persons with Intelligent and Emotional Personality Agents
In most listings of the benefits of virtual butlers or other technical companions for elderly people, one aspect is missing: games. Games can fulfill at least two tasks. First, they can entertain and amuse people and thereby make them happier. Second, games can train emotional and cognitive capabilities. In this study we investigated which games could be used, which role an intelligent and emotional agent could play, and how the equipment should function and look in order to be accepted by elderly people.
One important means of natural human-computer interaction is (spoken) language, so for a variety of applications it is essential to have high quality speech synthesis for different languages. The outcome of this project will be high quality synthetic voices, which allow a computer to "speak" in different Viennese dialects/sociolects. Since the sources of these voices are pieces taken from actual human speech, the synthetic voices will sound very natural and close to human speech. With this technology it is possible to realize many applications in domains ranging from education and tourism to art. A mobile sample application, a Viennese district guide capable of various dialects or variants, is also developed within the project. In the research part of the project, efficient methods are investigated for developing synthetic voices for languages that are variants of other languages. Furthermore, it is necessary to employ methods for switching or shifting between the standard language and dialectal variants, because such mixing corresponds to the everyday language use of many speakers. User tests are conducted to evaluate the quality of the synthetic voices and of the relevant sample applications.
The rapidly growing amount of music available in digital form via internet or digital libraries calls for entirely new computer-based methods for analysing, describing, distributing, and presenting music. The currently emerging research and application field known as Music Information Retrieval (MIR) is a direct response to that need. Over the past years, our research group at the Austrian Research Institute for Artificial Intelligence (OFAI) has accumulated substantial expertise in intelligent music processing.
The project RASCALLI aims at the development of Responsive Artificial Situated Cognitive Agents that Live and Learn on the Internet. RASCALLI represent a growing class of cooperative agents that do not have a physical presence, but nevertheless are equipped with major ingredients of cognition, including situated correlates of physical embodiment, to become adaptive, cooperative and self-improving in a certain environment (the Internet) given certain tasks. Their task-based processing of Web content requires an action-based model of interpretative perception. Because of the size and importance of their memory, special attention is paid to the associative structuring of the acquired information based on interests and experience, and to models of an active, permanently structure-creating and restructuring memory. With RASCALLI we aim at artificial agents that are able to combine human and computer skills in such a way that both kinds of abilities can be optimally employed for the benefit of the human user.
The amount of digital information is constantly increasing, and search engines are still the main means of accessing this information. However, recommender systems are well on their way to becoming a remedy for this unsatisfying situation. Especially in m- and e-commerce, recommender systems have already achieved a high level of attention and application. Unfortunately, current recommender technology suffers from two major shortcomings. Recommender systems require a huge amount of expertise and handcrafting for modelling an application domain, and their recommendations rely on the behaviour and opinions of those users active in the system. This is a poor resource compared to the abundance of opinions and ratings available on the internet. With the project SEMPRE, we aim at developing semantic technology for exploiting the rich and dynamic resource of factual information and human opinions available on the internet.
The rapidly growing amount of music available in digital form via internet or digital libraries calls for entirely new computer-based methods for analysing, describing, distributing, and presenting music. The currently emerging research and application field known as Music Information Retrieval (MIR) is a direct response to that need. Over the past years, our research group has accumulated substantial expertise in intelligent music processing. The goal of this project is to develop our know-how and methods further along three specific lines, to the point where they can be used as a basis for commercially relevant application projects.
Multi-sensory Autonomous Cognitive Systems Interacting with Dynamic Environments for Perceiving and Using Affordances
The main objective of the MACS project was to explore and exploit the concept of affordances for the design and implementation of autonomous mobile robots acting goal-directedly in a dynamic environment. The claim was to develop affordance-based control as a method for robotics. Technically speaking, a prototypical affordance-based architecture has been developed and implemented on a mobile robot, KURT3D. For providing simple manipulation capabilities, KURT3D is equipped with a magnetic gripper crane arm. The experiments have been performed in the simulator MACSim and in a real demonstrator scenario.