OFAI

Technical Reports - Query Results

629 reports found
Reports are sorted by descending number

OFAI-TR-2017-01 ( 628kB PDF file)

One Million Posts: A Data Set of German Online Discussions

Dietmar Schabus, Marcin Skowron, Martin Trapp

In this paper we introduce a new data set consisting of user comments posted to the website of a German-language Austrian newspaper. Professional forum moderators have annotated 11,773 posts according to seven categories they considered crucial for the efficient moderation of online discussions in the context of news articles. In addition to this taxonomy and annotated posts, the data set contains one million unlabeled posts. Our experimental results using six methods establish a first baseline for predicting these categories. The data and our code are available for research purposes from https://ofai.github.io/million-post-corpus.

Keywords: Language Resources, Natural Language Processing, Text Classification

Citation: Schabus D., Skowron M., Trapp M.: One Million Posts: A Data Set of German Online Discussions. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2017-01,


OFAI-TR-2016-10 ( 116kB PDF file)

An Empirical Analysis of Hubness in Unsupervised Distance-Based Outlier Detection

Arthur Flexer

Outlier detection is the task of automatic identification of unknown data not covered by training data (e.g. a previously unknown class in classification). We explore outlier detection in the presence of hubs and anti-hubs, i.e. data objects which appear to be either very close or very far from most other data due to a problem of measuring distances in high dimensions. We compare a classic distance based method to two new approaches, which have been designed to counter the negative effects of hubness, on six high-dimensional data sets. We show that mainly anti-hubs pose a problem for outlier detection and that this can be improved by using a hubness-aware approach based on re-scaling the distance space.

Keywords: Outlier detection, Hubness, Curse of dimensionality, Evaluation

Citation: Flexer A.: An Empirical Analysis of Hubness in Unsupervised Distance-Based Outlier Detection, in Proceedings of 4th International Workshop on High Dimensional Data Mining (HDM), in conjunction with the IEEE International Conference on Data Mining (IEEE ICDM 2016), Barcelona, Spain, 2016.
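
For orientation, a distance-based outlier score of the kind this abstract compares against can be sketched as follows. The abstract does not say which classic method the paper uses, so the k-th-nearest-neighbour distance here is an assumption, as are the function name `knn_outlier_scores` and the synthetic data.

```python
import numpy as np

def knn_outlier_scores(train, test, k=5):
    """Score each test point by the distance to its k-th nearest training point.

    Assumed baseline for illustration only; larger scores mean "more outlying".
    """
    # Pairwise Euclidean distances, shape (n_test, n_train).
    diffs = test[:, None, :] - train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # k-th smallest distance in each row (index k-1 after a partial sort).
    return np.partition(dists, k - 1, axis=1)[:, k - 1]

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 50))            # data covered by training
inlier = rng.normal(size=(1, 50))             # same distribution as train
outlier = rng.normal(loc=5.0, size=(1, 50))   # shifted, i.e. "unknown" data
scores = knn_outlier_scores(train, np.vstack([inlier, outlier]), k=5)
```

A detector would flag test points whose score exceeds a chosen threshold. The hubness-aware variants described in the abstract would re-scale the distances before this step.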


OFAI-TR-2016-09 ( 198kB PDF file)

Hubness aware outlier detection for music genre recognition

Arthur Flexer

Outlier detection is the task of automatic identification of unknown data not covered by training data (e.g. a new genre in genre recognition). We explore outlier detection in the presence of hubs and anti-hubs, i.e. data objects which appear to be either very close or very far from most other data due to a problem of measuring distances in high dimensions. We compare a classic distance based method to two new approaches, which have been designed to counter the negative effects of hubness, on two standard music genre data sets. We demonstrate that anti-hubs are responsible for many detection errors and that this can be improved by using a hubness-aware approach.

Keywords: music information retrieval, outlier detection, hubness, curse of dimensionality, genre recognition

Citation: Flexer A.: Hubness aware outlier detection for music genre recognition, in Proceedings of the 19th International Conference on Digital Audio Effects (DAFx-16), pp. 69-75, 2016.


OFAI-TR-2016-08 ( 159kB PDF file)

Data-Driven Identification of Dialogue Acts in Chat Messages

Dietmar Schabus, Brigitte Krenn, Friedrich Neubarth

We present an approach to classify chat messages into dialogue acts, focusing on questions and directives ("to-dos"). Our multi-lingual system uses word lexica, a specialized tokenizer and rule-based shallow syntactic analysis to compute relevant features, and then trains statistical models (support vector machines, random forests, etc.) for dialogue act prediction. The classification scores we achieve are very satisfactory on question detection and promising on to-do detection, on English and German data collections.

Keywords: NLP for user-generated content, text categorization, analysis and generation of conversations

Citation: Schabus D., Krenn B., Neubarth F.: Data-Driven Identification of Dialogue Acts in Chat Messages. In Proceedings of the 13th Conference on Natural Language Processing (KONVENS), Bochum, Germany, pp. 236-241, 2016.


OFAI-TR-2016-07

The Virtual Biographer. Theoretical Background and First Experiments

Sabine Payr, Marcin Skowron, Robert Trappl

(82 pages)

Citation: Payr S., Skowron M., Trappl R.: The Virtual Biographer. Theoretical Background and First Experiments. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2016-07, 2016.


OFAI-TR-2016-06 ( 182kB PDF file)

Mutual proximity graphs for music recommendation

Arthur Flexer, Jeff Stevens

We present mutual proximity graphs, which are an extension of mutual k-nearest neighbor (knn) graphs, and are able to avoid hub vertices having abnormally high connectivity. We apply this new approach in a music recommendation system based on an incrementally constructed knn graph. We show that mutual proximity graphs yield much better connected graphs with better reachability compared to knn graphs and mutual knn graphs.

Keywords: Music recommendation, hubness, k-nearest neighbor graphs, mutual proximity

Citation: Flexer A., Stevens J.: Mutual proximity graphs for music recommendation, Proceedings of the 9th International Workshop on Machine Learning and Music, Riva del Garda, Italy, 2016.
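
As background, the mutual k-nearest-neighbour graph that the abstract extends can be sketched as below. This shows only the baseline construction, not the paper's mutual proximity graph. The function names and the toy one-dimensional data are assumptions made for illustration.

```python
import numpy as np

def knn_adjacency(D, k):
    """Directed kNN graph: A[i, j] is True if j is among the k nearest neighbours of i."""
    n = D.shape[0]
    D = D.astype(float).copy()
    np.fill_diagonal(D, np.inf)          # a point is not its own neighbour
    nn = np.argsort(D, axis=1)[:, :k]    # indices of the k nearest neighbours per row
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True
    return A

def mutual_knn_adjacency(D, k):
    """Undirected mutual kNN graph: keep an edge only if both directions exist."""
    A = knn_adjacency(D, k)
    return A & A.T

# Toy data: four points on a line (hypothetical values, chosen to avoid ties).
points = np.array([0.0, 1.0, 2.5, 10.0])
D = np.abs(points[:, None] - points[None, :])
A = knn_adjacency(D, k=1)
M = mutual_knn_adjacency(D, k=1)
```

In the toy example only points 0 and 1 choose each other, so the mutual graph keeps a single undirected edge and leaves the other vertices isolated. This kind of poor connectivity is what the abstract's reachability comparison is about.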


OFAI-TR-2016-05 ( 1901kB PDF file)

Centering versus Scaling for Hubness Reduction

Roman Feldbauer, Arthur Flexer

Hubs and anti-hubs are points that appear very close or very far to many other data points due to a problem of measuring distances in high-dimensional spaces. Hubness is an aspect of the curse of dimensionality affecting many machine learning tasks. We present the first large scale empirical study to compare two competing hubness reduction techniques: scaling and centering. We show that scaling consistently reduces hubness and improves nearest neighbor classification, while centering shows rather mixed results. Support vector classification is mostly unaffected by centering-based hubness reduction.

Keywords: Curse of dimensionality, Hubness, Empirical evaluation, SVM, k-NN classification

Citation: Feldbauer R., Flexer A.: Centering versus Scaling for Hubness Reduction, in Proceedings of the 25th International Conference on Artificial Neural Networks (ICANN'16), Barcelona, Spain, 2016.


OFAI-TR-2016-04

The Nexus of Science, Technology and Arts: Music, Dance, Drama, Films, Games, etc.

Robert Trappl

(invited presentation)

Keywords: Interdisciplinary research, Artificial Intelligence, The Arts

Citation: Trappl R.: The Nexus of Science, Technology and Arts: Music, Dance, Drama, Films, Games, etc., presented at the Symposium co-organised by the EC and the Biennale di Venezia, "Creation at the Nexus of Science Technology and Art", November 3-4, 2015, Ca’ Giustinian (San Marco), Venice, Italy. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2016-04,


OFAI-TR-2016-03

Designing a cross-media serious game to support the treatment of childhood obesity

Simon Mayr, Paolo Petta, Christiane Eichenberg, Brigitte Sindelar, Lev Ledit, Markus Schott

Keywords: Cross-media serious game, Aquamorra, Childhood obesity, Design criteria

Citation: Mayr S., Petta P., Eichenberg C., Sindelar B., Ledit L., Schott M. (eds.): Designing a cross-media serious game to support the treatment of childhood obesity, ISRII 8th Scientific Meeting: Technologies for a digital world: Improving health across the lifespan. 7-9 April 2016, Seattle, WA, USA, 2016.


OFAI-TR-2016-02 ( 683kB PDF file)

A serious game to treat childhood obesity

Simon Mayr, Lev Ledit, Paolo Petta, Christiane Eichenberg, Brigitte Sindelar

Serious games employ video game technology to convey serious content, facilitate learning, or initiate behavioral change. A common approach is to combine standard game mechanics with linear content expected to deliver the intended message. In contrast, we advocate an approach centered on player decisions and subjective experience, referring to innovative examples of learning by experience, both analog and digital. We present the design decisions underlying Aquamorra, a serious game to support the treatment of childhood obesity in the light of this approach.

Keywords: Serious game, Obesity therapy, Game mechanics

Citation: Mayr S., Ledit L., Petta P., Eichenberg C., Sindelar B.: A serious game to treat childhood obesity, IEEE SeGAH 2016, 4th International Conference on Serious Games and Applications for Health, May 11-13, 2016, Orlando, FL, USA, IEEE, 2016


OFAI-TR-2016-01

Rendering Expressive Performances of Musical Pieces Through Sampling from Generative Probabilistic Models

Carlos Eduardo Cancino Chacón, Maarten Grachten

The Basis Modeling (BM) framework is a state-of-the-art model for musical expression that has been used in both analysis and rendering of expressive music performances. In their current form, these models are deterministic, and thus, given a trained model, there is only one possible performance that can be generated for a given piece. By using a Bayesian framework, it is possible to produce a probabilistic interpretation of the models, and then generate performances by sampling from their predictive distributions. In this report we provide detailed derivations of the predictive distributions for both the linear and non-linear versions of the BM approach.

Citation: Cancino Chacón C., Grachten M.: Rendering Expressive Performances of Musical Pieces Through Sampling from Generative Probabilistic Models. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2016-01,


OFAI-TR-2015-04

Strategies for Conceptual Change in Convolutional Neural Networks

Maarten Grachten, Carlos Eduardo Cancino Chacón

A remarkable feature of human beings is their capacity for creative behavior, referring to their ability to react to problems in ways that are novel, surprising, and useful. Transformational creativity is a form of creativity where the creative behavior is induced by a transformation of the actor's conceptual space, that is, the representational system with which the actor interprets its environment. In this report, we focus on ways of adapting systems of learned representations as they switch from performing one task to performing another. We describe an experimental comparison of multiple strategies for adaptation of learned features, and evaluate how effectively each of these strategies realizes the adaptation, in terms of the amount of training, and in terms of their ability to cope with restricted availability of training data. We show, among other things, that across handwritten digits, natural images, and classical music, adaptive strategies are systematically more effective than a baseline method that starts learning from scratch.

Citation: Grachten M., Cancino Chacón C.: Strategies for Conceptual Change in Convolutional Neural Networks. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2015-04,


OFAI-TR-2015-03 ( 999kB PDF file)

Improving visualization of high-dimensional music similarity spaces

Arthur Flexer

Visualizations of music databases are a popular form of interface allowing intuitive exploration of music catalogs. They are often based on lower dimensional projections of high dimensional music similarity spaces. Such similarity spaces have already been shown to be negatively impacted by so-called hubs and anti-hubs. These are points that appear very close or very far to many other data points due to a problem of measuring distances in high-dimensional spaces. We present an empirical study on how this phenomenon impacts three popular approaches to compute two-dimensional visualizations of music databases. We also show how the negative impact of hubs and anti-hubs can be reduced by re-scaling the high dimensional spaces before low dimensional projection.

Keywords: Music information retrieval, Hubness, Visualization, Dimensionality reduction, High-dimensional data analysis

Citation: Flexer A.: Improving visualization of high-dimensional music similarity spaces, 16th International Society for Music Information Retrieval Conference, Malaga, Spain, 2015.


OFAI-TR-2015-02 ( 235kB PDF file)

The impact of hubness on music recommendation

Arthur Flexer

We review the impact of hubness, a general problem of machine learning in high-dimensional spaces, on music recommendation. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists. After reviewing the theory concerning the hubness phenomenon, we present methods which are able to decisively diminish hubness and its adverse effects in music and general multimedia datasets.

Keywords: hubness, music information retrieval, music recommendation, curse of dimensionality

Citation: Flexer A.: The impact of hubness on music recommendation, Machine Learning for Music Discovery Workshop at the 32nd International Conference on Machine Learning, Lille, France, 2015.
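
The effect described in this abstract is commonly quantified through the k-occurrence of each object, i.e. how often it appears in the nearest-neighbour lists of the others, with the skewness of that distribution used as a hubness score. The sketch below follows this common definition. The function names, the Euclidean distance and the synthetic Gaussian data are assumptions, not taken from the report.

```python
import numpy as np

def k_occurrence(X, k=5):
    """Count how often each row of X appears among the k nearest neighbours of the other rows."""
    n = X.shape[0]
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                       # exclude self-matches
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]     # k nearest neighbours per row (unordered)
    return np.bincount(nn.ravel(), minlength=n)

def skewness(counts):
    """Third standardized moment of the k-occurrence distribution."""
    c = counts.astype(float)
    return float(((c - c.mean()) ** 3).mean() / c.std() ** 3)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))            # hypothetical high-dimensional data
counts = k_occurrence(X, k=5)
hubness = skewness(counts)                 # larger positive values indicate stronger hubness
n_antihubs = int((counts == 0).sum())      # objects that never show up in any neighbour list
```

Objects with a very large count are the hubs that would be recommended over and over again. Objects with a count of zero are the anti-hubs that never appear in a recommendation list.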


OFAI-TR-2015-01 ( 398kB PDF file)

The Unbalancing Effect of Hubs on K-medoids Clustering in High-Dimensional Spaces

Dominik Schnitzer, Arthur Flexer

Unbalanced cluster solutions are affected by very different cluster sizes, with some clusters being very large while others contain almost no data. We demonstrate that this phenomenon is connected to 'hubness', a recently discovered general problem of machine learning in high dimensional data spaces. Hub objects have a small distance to an exceptionally large number of data points, and anti-hubs are far from all other data points. In an empirical study of K-medoids clustering we show that hubness gives rise to very unbalanced cluster sizes resulting in impaired internal and external evaluation indices. We compare three methods which reduce hubness in the distance spaces and show that with the balancing of the clusters evaluation indices improve. This is done using artificial and real data sets from diverse domains.

Keywords: Clustering, Hubness, Curse of dimensionality

Citation: Schnitzer D., Flexer A.: The Unbalancing Effect of Hubs on K-medoids Clustering in High-Dimensional Spaces, in Proceedings of the International Joint Conference on Neural Networks, 2015.


OFAI-TR-2014-13 ( 366kB PDF file)

Restricted Boltzmann Machine Derivations

Jan Schlüter

This document gives detailed derivations for the central quantities of Restricted Boltzmann Machines (RBMs): The conditional distributions of visible and hidden units, and the log likelihood gradient with respect to the model parameters. It handles the standard Bernoulli-Bernoulli RBM (with binary visible and hidden units) as well as different formulations of the Gaussian-Bernoulli RBM (with real-valued visible units). It is not meant as a general introduction to RBMs, but as a supplement helping to follow the mathematics.

Keywords: Restricted Boltzmann Machines

Citation: Schlüter J.: Restricted Boltzmann Machine Derivations. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2014-13,
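
For readers who want the quantities named in this abstract at a glance, the standard results for the Bernoulli-Bernoulli case are summarized below. The notation is an assumption and may differ from the report: v and h are the binary visible and hidden vectors, W the weight matrix, b and c the visible and hidden biases, and \sigma the logistic sigmoid.

```latex
% Standard Bernoulli-Bernoulli RBM; symbols are assumed, not taken from the report.
\begin{align}
E(v, h) &= -b^\top v - c^\top h - v^\top W h \\
p(h_j = 1 \mid v) &= \sigma\Big(c_j + \sum_i v_i W_{ij}\Big) \\
p(v_i = 1 \mid h) &= \sigma\Big(b_i + \sum_j W_{ij} h_j\Big) \\
\frac{\partial \log p(v)}{\partial W_{ij}}
  &= v_i \, p(h_j = 1 \mid v) - \sum_{v'} p(v') \, v'_i \, p(h_j = 1 \mid v')
\end{align}
```

The first term of the gradient depends only on the observed data. The second is an expectation under the model distribution and is what makes exact training intractable for large models.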


OFAI-TR-2014-12

Bayesian linear basis models with gaussian priors for musical expression

Carlos Eduardo Cancino Chacón, Maarten Grachten, Gerhard Widmer

We present a probabilistic linear basis model for musical expression. The model is an extension of prior work by Grachten and Widmer [Grachten and Widmer, 2012] and is a generalization of the work by Grachten et al. [Grachten et al., 2014]. By assuming the prior distribution of the model parameters to be Gaussian with arbitrary mean and covariance, this model allows for specifying musical knowledge, and for modeling multiple distinct performances of the same piece. We show that in its current state, the model performs at least on a par with the original approach.

Citation: Cancino Chacón C. E., Grachten M., Widmer G.: Bayesian linear basis models with gaussian priors for musical expression. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2014-12,


OFAI-TR-2014-11 ( 368kB PDF file)

Authoring vs. Configuring Affective Agents for Interactive Storytelling

Stefan Rank, Steve Hoffmann, Hans-Georg Struck, Ulrike Spierling, Simon Mayr, Paolo Petta

Autonomous characters in interactive storytelling can be supported by using affective agent architectures. The configuration of most current tools for controlling agents is, however, implementation-specific and not tailored to the needs of authors. Based on a literature review, a questionnaire evaluation of authors' preferences for character creation, and a case study of an author's conceptualization of this process, we investigate the different methods of configuration available in current agent architectures, reviewing discrepancies and matches. Given these relations, promising approaches to configuration are identified, based on initial inner states, 'global' parameters of characters, libraries of stock characters, and selections of backstory experiences.

Keywords: Affective Characters, Interactive Storytelling, Authoring

Citation: Rank S., Hoffmann S., Struck H.-G., Spierling U., Mayr S., Petta P.: Authoring vs. Configuring Affective Agents for Interactive Storytelling, Applied Artificial Intelligence, 28(6):629-645, 2014.


OFAI-TR-2014-10 ( 34kB PDF file)

Tracking expressive performances with linear and non-linear timing models

Gerald Golka, Werner Goebl

Sensori-motor synchronization (SMS) describes the ability of humans to rhythmically coordinate movements to external stimuli. SMS can be viewed from two theoretical perspectives: According to the dynamical systems theory, SMS involves non-linear phase and period adjustments to a set of coupled internal oscillators. The information-processing approach posits linear phase and period correction of the internal timekeeper. Models derived from these theories are commonly tested in non-musical tapping experiments. In music performance tempo fluctuations are used to increase expressivity which makes tracking of musical sequences more difficult for these models. In order to improve the tracking capabilities, we propose and test an extension to both models by introducing piece-specific tempo expectations. This extension is tested on a corpus of expressively performed music. A set of symbolic performance data containing excerpts of Chopin's piano etude Op. 10 No. 3 performed by 22 professional pianists comprises the test corpus. Tempo expectations are modeled using local inter-onset intervals (averaged across performances). We test four different models: 2 (model types, linear/non-linear) x 2 (information on tempo expectations, y/n). Two methods are used: first, we optimize each model to fit individual performances so that the lowest possible timing errors (between predicted and actual note onset times) are achieved. Second, we run a crossvalidation experiment to test the generalization capabilities of the models. For this purpose each model is optimized to fit a subset (60%) of the performances and the evaluation is done on the remaining performances. This procedure is repeated multiple times with random assignments to the optimization/evaluation subsets. The experiments are currently under way. Preliminary results suggest that adding information on typical expressive timing patterns improves model predictions. 
We plan to further test these models in a musical synchronization task in which a human co-performs with a virtual duet partner driven by the extended models.

Citation: Golka G., Goebl W.: Tracking expressive performances with linear and non-linear timing models, in Proceedings of the 13th International Conference on Music Perception and Cognition, Seoul, South Korea, 2014.


OFAI-TR-2014-09 ( 32kB PDF file)

Effects of musical expertise on audiovisual integration: Instrument-specific or generalisable?

Laura Bishop, Werner Goebl

During ensemble performance, musicians exchange auditory and visual signals that can help in synchronising with each other's actions. Musicians must integrate corresponding auditory and visual signals accurately to make use of them. Precision in audiovisual integration improves with increasing perceptual-motor expertise, perhaps because experts are better able to predict when the auditory effects of observed actions should occur. Performance expertise has been found to have a greater effect than visual expertise on musicians' prediction abilities during synchronisation tasks, with performers better able to synchronise with actions that fall within their own motor repertoires than with actions they have only ever observed. It is unclear whether the effects of expertise on audiovisual integration are likewise instrument-specific or generalisable across instruments, however. The present study investigated the potential instrument-specific effects of expertise on audiovisual integration. Expertise in playing a particular instrument was hypothesised to facilitate prediction of observed actions, increasing sensitivity to audiovisual asynchrony in that instrumental context. Ten-second clips were extracted from audio-video recordings of clarinet, piano, and violin performances, and presented to highly-skilled clarinettists, pianists, and violinists. Clips either maintained the audiovisual synchrony present in the original recording or were modified so that the video led or lagged behind the audio. Participants indicated as quickly as possible whether the audio and video channels in each clip were synchronised. Sensitivity to audiovisual asynchrony was assessed for each expertise group/stimulus instrument pairing by evaluating the mean point of subjective synchrony and the mean range of asynchronies most often rated as synchronised (i.e. temporal integration window; TIW). 
Though participants across expertise groups detected asynchronies most readily in piano and least readily in violin stimuli, pianists performed significantly better for piano than for clarinet or violin. A relationship between musical training and TIW was also observed with data pooled across stimuli. Thus, sensitivity to audiovisual asynchrony improved generally with increasing expertise, and only pianists showed facilitation for their own instrument. Sensitivity to audiovisual asynchrony was affected by musical training and the nature of sound-producing movements observed. The results suggest that, to some extent, the effects of performance expertise can be instrument-specific, though they may generalise across instrumental contexts more readily during audiovisual asynchrony detection tasks than during synchronisation tasks, when overt, precisely-timed movements are required.

Citation: Bishop L., Goebl W.: Effects of musical expertise on audiovisual integration: Instrument-specific or generalisable?, in Proceedings of the 13th International Conference on Music Perception and Cognition, Seoul, South Korea, 2014.


OFAI-TR-2014-08 ( 1904kB PDF file)

Context-specific effects of musical expertise on audiovisual integration

Laura Bishop, Werner Goebl

Ensemble musicians exchange auditory and visual signals that can facilitate interpersonal synchronisation. Musical expertise improves how precisely auditory and visual signals are perceptually integrated and increases sensitivity to asynchrony between them. Whether expertise improves sensitivity to audiovisual asynchrony in all instrumental contexts or only in those using sound-producing gestures that are within an observer's own motor repertoire is unclear. This study tested the hypothesis that musicians are more sensitive to audiovisual asynchrony in performances featuring their own instrument than in performances featuring other instruments. Short clips were extracted from audio-video recordings of clarinet, piano, and violin performances and presented to highly-skilled clarinettists, pianists, and violinists. Clips either maintained the audiovisual synchrony present in the original recording or were modified so that the video led or lagged behind the audio. Participants indicated whether the audio and video channels in each clip were synchronised. The range of asynchronies most often endorsed as synchronised was assessed as a measure of participants' sensitivities to audiovisual asynchrony. A positive relationship was observed between musical training and sensitivity, with data pooled across stimuli. While participants across expertise groups detected asynchronies most readily in piano stimuli and least readily in violin stimuli, pianists showed significantly better performance for piano stimuli than for either clarinet or violin. These findings suggest that, to an extent, the effects of expertise on audiovisual integration can be instrument-specific; however, the nature of the sound-producing gestures that are observed has a substantial effect on how readily asynchrony is detected as well.

Keywords: audiovisual integration, action prediction, music ensembles

Citation: Bishop L., Goebl W.: Context-specific effects of musical expertise on audiovisual integration, Frontiers in Cognitive Science, 5, 1123, 2014.


OFAI-TR-2014-07 ( 598kB PDF file)

Exploring Inter- and Intra-speaker Variability in Multi-modal Task Descriptions

Stephanie Schreitter, Brigitte Krenn

In natural human-human task descriptions, the verbal and the non-verbal parts of communication together comprise the information necessary for understanding. When robots are to learn tasks from humans in the future, the detection and integrated interpretation of both of these cues is decisive. In the present paper, we present a qualitative study on essential verbal and non-verbal cues by means of which information is transmitted during explaining and showing a task to a learner. In order to collect a respective data set for further investigation, 16 (human) teachers explained to a human learner how to mount a tube in a box with holdings, and six teachers did this to a robot learner. Detailed multi-modal analysis revealed that in both conditions, information was more reliable when transmitted via verbal and gestural references to the visual scene and via eye gaze than via the actual wording. In particular, intra-speaker variability in wording and perspective taking by the teacher potentially hinders understanding of the learner. The results presented in this paper emphasize the importance of investigating the inherently multi-modal nature of how humans structure and transmit information in order to derive respective computational models for robot learners.

Citation: Schreitter S., Krenn B.: Exploring Inter- and Intra-speaker Variability in Multi-modal Task Descriptions. In Proceedings of the 17th IEEE International Symposium on the Robot and Human Interactive Communication, (Ro-Man 2014), 2014.


OFAI-TR-2014-06 ( 150kB PDF file)

On inter-rater agreement in audio music similarity

Arthur Flexer

One of the central tasks in the annual MIREX evaluation campaign is the "Audio Music Similarity and Retrieval (AMS)" task. Songs which are ranked as being highly similar by algorithms are evaluated by human graders as to how similar they are according to their subjective judgment. By analyzing results from the AMS tasks of the years 2006 to 2013 we demonstrate that: (i) due to low inter-rater agreement there exists an upper bound of performance in terms of subjective gradings; (ii) this upper bound has already been achieved by participating algorithms in 2009 and not been surpassed since then. Based on this sobering result we discuss ways to improve future evaluations of audio music similarity.

Keywords: music information retrieval, audio similarity, evaluation, rater agreement

Citation: Flexer A.: On inter-rater agreement in audio music similarity, Proceedings of the 15th International Society for Music Information Retrieval Conference, Taipei, Taiwan, 2014.


OFAI-TR-2014-05 ( 575kB PDF file)

Design Pattern Canvas: Towards Co-Creation of Unified Serious Game Design Patterns

Gregor Žavcer, Simon Mayr, Paolo Petta

(Best Poster award) We introduce the Design Pattern Canvas, a visual tool for alignment and decomposition of game designer activities.

Keywords: design methodology, design patterns, serious games, co-creation

Citation: Žavcer G., Mayr S., Petta P.: Design Pattern Canvas: Towards Co-Creation of Unified Serious Game Design Patterns, to appear in: Camilleri V., Dingli A., Montebello M. (eds.) Proceedings of the Sixth International Conference on Virtual Worlds and Games for Serious Applications: VS-Games 2014, September 9-12, 2014, Malta, EU, IEEE Press.


OFAI-TR-2014-04 ( 135kB PDF file)

The Trauma Treatment Game: Scientific, methodological and ethical challenges in developing Serious Games for Psychotherapy

Simon Mayr, Wolfgang Hörleinsberger, Paolo Petta

Serious games deliver interactive worlds in support of a wide range of application areas. Given the current paucity of scientific empirical studies in game-based psychotherapy, we address scientific and methodological challenges and their implications for the design of serious games for this domain. We do so in the context of a comprehensive multistage design process that preceded the final game concept of the "Trauma Treatment Game", a serious game to support individualised interventions for children aged eight to twelve suffering from trauma.

Keywords: Serious games, Methodology, Ethics, Psychotherapy

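Citation: Mayr S., Hörleinsberger W., Petta P.: The Trauma Treatment Game: Scientific, methodological and ethical challenges in developing Serious Games for Psychotherapy, to appear in: Camilleri V., Dingli A., Montebello M. (eds.) Proceedings of the Sixth International Conference on Virtual Worlds and Games for Serious Applications: VS-Games 2014, September 9-12, 2014, Malta, EU.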


OFAI-TR-2014-03 ( 120kB PDF file)

Choosing the Metric in High-Dimensional Spaces Based on Hub Analysis

Dominik Schnitzer, Arthur Flexer

To avoid the undesired effects of distance concentration in high-dimensional spaces, previous work has already advocated the use of fractional ℓp norms instead of the ubiquitous Euclidean norm. Closely related to concentration is the emergence of hub and anti-hub objects. Hub objects have a small distance to an exceptionally large number of data points while anti-hubs lie far from all other data points. The contribution of this work is an empirical examination of concentration and hubness, resulting in an unsupervised approach for choosing an ℓp norm by minimizing hubs while simultaneously maximizing nearest neighbor classification.

Keywords: Hubness, Concentration of distances, High-dimensional data analysis

Citation: Schnitzer D., Flexer A.: Choosing the Metric in High-Dimensional Spaces Based on Hub Analysis, in Proceedings of the 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 2014.
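
To make the family of distance functions in this abstract concrete, the sketch below computes the ℓp distance for integer and fractional p. The function name `lp_distance` and the example vectors are assumptions. The selection procedure itself (choosing p by analysing hubs) is not reproduced here.

```python
import numpy as np

def lp_distance(x, y, p):
    """l_p distance (sum_i |x_i - y_i|^p)^(1/p); for 0 < p < 1 this is a 'fractional' norm.

    Note: for p < 1 the triangle inequality no longer holds, so this is not a metric.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float((np.abs(x - y) ** p).sum() ** (1.0 / p))

a, b = [0.0, 0.0], [3.0, 4.0]
d_euclid = lp_distance(a, b, 2.0)    # Euclidean distance
d_manhattan = lp_distance(a, b, 1.0) # Manhattan distance
d_frac = lp_distance(a, b, 0.5)      # fractional norm with p = 0.5
```

A selection loop along the lines of the abstract would evaluate a hubness measure for several candidate values of p and keep the one with the least hubness. The details of that criterion are in the report.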


OFAI-TR-2014-02 ( 344kB PDF file)

Improving Neighborhood-Based Collaborative Filtering by Reducing Hubness

Peter Knees, Dominik Schnitzer, Arthur Flexer

For recommending multimedia items, collaborative filtering (CF) denotes the technique of automatically predicting a user's rating or preference for an item by exploiting item preferences of a (large) group of other users. In traditional memory-based (or neighborhood-based) recommenders, this is accomplished by, first, selecting a number of similar users (or items) and, second, combining their ratings into a single user's predicted rating for an item. Strategies for both defining similarity (i.e., to identify nearest neighbors) and for combining ratings (i.e., to weight their impact) have been extensively studied and even resulted in inconsistent findings. In this paper, we investigate the effects of the high dimensionality of user-item matrices on the quality of memory-based movie rating prediction. By examining several publicly available real-world CF data sets, we show that the step of nearest neighbor selection is affected by the phenomena of similarity concentration and hub occurrence due to high-dimensional data spaces and the class of similarity measures used. To mitigate this, we adapt a normalization technique called mutual proximity that has been shown to reduce these effects in classification tasks. Finally, we show that removing hubs and incorporating normalized similarity values into the neighbor weighting step leads to increased rating prediction accuracy, observable on all examined data sets in terms of lowered error measure (RMSE).

Keywords: Hubness, Collaborative Filtering, Concentration of distances

Citation: Knees P., Schnitzer D., Flexer A.: Improving Neighborhood-Based Collaborative Filtering by Reducing Hubness, ACM International Conference on Multimedia Retrieval (ICMR), 2014.


OFAI-TR-2014-01 ( 201kB PDF file)

A Case for Hubness Removal in High-Dimensional Multimedia Retrieval

Dominik Schnitzer, Arthur Flexer, Nenad Tomasev

This work investigates the negative effects of hubness on multimedia retrieval systems. Because of a problem of measuring distances in high-dimensional spaces, hub objects are close to an exceptionally large part of the data while anti-hubs are far away from all other data points. In the case of similarity-based retrieval, hub objects are retrieved over and over again while anti-hubs are nonexistent in the retrieval lists. We investigate textual, image and music data and show how re-scaling methods can avoid the problem and decisively improve the overall retrieval quality. The observations of this work suggest making hubness analysis an integral part of building a retrieval system.

Keywords: multimedia, information retrieval, hubness, curse of dimensionality

Citation: Schnitzer D., Flexer A., Tomasev N.: A Case for Hubness Removal in High-Dimensional Multimedia Retrieval, Proceedings of the 36th European Conference on Information Retrieval (ECIR), 2014.


OFAI-TR-2013-05 ( 409kB PDF file)

Phenomena in conveying information during oral task descriptions

Stephanie Schreitter, Brigitte Krenn

A robot has to deal with a broad variety of information conveyed via verbal and non-verbal channels to be able to observe and listen to a task presented by a human teacher. We have collected a small corpus of human-human dyads to investigate how information is presented through verbal and/or visual channels. Apart from the characteristics of spoken language, the qualitative analysis of the data shows: (i) broad variation in wording regarding objects and actions, as well as omissions of lexical referents, (ii) patterns of use of verbal references and/or communicative gestures for directing the attention of the learner, (iii) a temporal structuring of the task by verbal means for all teachers, and (iv) the use of generic "you" for most of the teachers.

Citation: Schreitter S., Krenn B.: Phenomena in conveying information during oral task descriptions, Workshop on Embodied Communication of Goals and Intentions collocated with ICSR 2013, Bristol, United Kingdom, October, 2013.


OFAI-TR-2013-04 ( 411kB PDF file)

Can Shared Nearest Neighbors Reduce Hubness in High-Dimensional Spaces?

Arthur Flexer, Dominik Schnitzer

'Hubness' is a recently discovered general problem of machine learning in high dimensional data spaces. Hub objects have a small distance to an exceptionally large number of data points, and anti-hubs are far from all other data points. It is related to the concentration of distances which impairs the contrast of distances in high dimensional spaces. Computation of secondary distances inspired by shared nearest neighbor (SNN) approaches has been shown to reduce hubness and concentration and there already exists some work on direct application of SNN in the context of hubness in image recognition. This study applies SNN to a larger number of high dimensional real world data sets from diverse domains and compares it to two other secondary distance approaches (local scaling and mutual proximity). SNN is shown to reduce hubness but less than other approaches and, contrary to its competitors, it is only able to improve classification accuracy for half of the data sets.

Keywords: Machine Learning, High-dimensional data, Hubness, Curse of dimensionality

Citation: Proceedings of 1st International Workshop on High Dimensional Data Mining (HDM), in conjunction with the IEEE International Conference on Data Mining (IEEE ICDM 2013), Dallas, Texas


OFAI-TR-2013-03 ( 409kB PDF file)

Phenomena in conveying information during oral task descriptions

Stephanie Schreitter, Brigitte Krenn

A robot has to deal with a broad variety of information conveyed via verbal and non-verbal channels to be able to observe and listen to a task presented by a human teacher. We have collected a small corpus of human-human dyads to investigate how information is presented through verbal and/or visual channels. Apart from the characteristics of spoken language, the qualitative analysis of the data shows: (i) broad variation in wording regarding objects and actions, as well as omissions of lexical referents, (ii) patterns of use of verbal references and/or communicative gestures for directing the attention of the learner, (iii) a temporal structuring of the task by verbal means for all teachers, and (iv) the use of generic "you" for most of the teachers.

Citation: Schreitter S., Krenn B.: Phenomena in conveying information during oral task descriptions. In Proceedings of the 1st Workshop on Embodied Communication of Goals and Intentions collocated with ICSR 2013, October 27, 2013.


OFAI-TR-2013-02 ( 403kB PDF file)

Corpus annotation employing a cognitive framework of incremental language understanding

Stephanie Schreitter, Brigitte Krenn

With the overall goal to enable a robot to learn the connection between the sensory-motor and language levels in a task-driven context, we developed annotation guidelines to account for the multimodal complexity of oral task-oriented communication. In this paper, we present the results of utilizing a theoretical framework of embodied language comprehension in humans. An annotation scheme was developed for task-oriented multimodal interaction on a small corpus comprising 20 short dialogues of one human explaining a task to another human.

Keywords: multimodal corpora, embodied language processing, oral communication

Citation: Schreitter S., Krenn B.: Corpus annotation employing a cognitive framework of incremental language understanding, in Proceedings of the 9th Workshop on Multimodal Corpora collocated with IVA 2013, Edinburgh, Scotland, September 1, 2013.


OFAI-TR-2013-01 ( 169kB PDF file)

Using mutual proximity for novelty detection in audio music similarity

Arthur Flexer, Dominik Schnitzer

Mutual proximity rescales distance spaces to avoid negative effects of the curse of dimensionality. It results in probabilistic estimates of the proximity of data objects. We use these probabilities directly for novelty detection, i.e. the automatic identification of unknown data not covered by training data (e.g. a new genre in genre classification). Comparing this new approach with a distance based detection method we demonstrate improved performance on a standard music data set.

Keywords: music information retrieval, outlier detection, hubness, curse of dimensionality

Citation: Flexer A., Schnitzer D.: Using mutual proximity for novelty detection in audio music similarity, in Proceedings of the 6th International Workshop on Machine Learning and Music, Prague, Czech Republic, 2013.
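
To make the idea of rescaling distances into proximity estimates more concrete, here is a minimal sketch of an empirical variant of mutual proximity. For a pair (x, y), it counts the fraction of other objects that are farther from x than y is and, at the same time, farther from y than x is. The function name and the normalization by n - 2 (excluding x and y themselves) are assumptions made for this illustration; the report may use a different estimator or normalization, and its novelty detection step is not shown.

```python
def mutual_proximity(D):
    """Empirical mutual proximity from a symmetric distance matrix D (list of lists).

    Returns a matrix of values in [0, 1]; higher means x and y are mutually close.
    Illustrative sketch only, O(n^3) and meant for small inputs.
    """
    n = len(D)
    if n < 3:
        raise ValueError("need at least three objects")
    mp = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            # Count objects j that lie beyond y as seen from x AND beyond x as seen from y.
            count = sum(
                1
                for j in range(n)
                if j != x and j != y and D[x][j] > D[x][y] and D[y][j] > D[y][x]
            )
            mp[x][y] = mp[y][x] = count / (n - 2)
    return mp


if __name__ == "__main__":
    pts = [0.0, 1.0, 3.0, 10.0]  # toy one-dimensional data
    D = [[abs(a - b) for b in pts] for a in pts]
    for row in mutual_proximity(D):
        print(row)
```

A distance-like quantity can be obtained as 1 minus the returned value.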


OFAI-TR-2012-17 ( 281kB PDF file)

The Relation of Hubs to the Doddington Zoo in Speaker Verification

Dominik Schnitzer, Arthur Flexer, Jan Schlueter

In speaker verification systems there exists the well-known phenomenon of speakers which are very problematic to verify and have been given various metaphoric animal names. Our work connects this so-called 'Doddington zoo' and the animals of the whole 'biometric menagerie' to the problem of 'hubs' in high dimensional data spaces, which was recently the topic of a number of publications in the machine learning literature. Due to a general problem of measuring distances in high dimensional data spaces, hub objects emerge which have a high similarity to a large number of data items. This is a novel aspect of the 'curse of dimensionality' which adversely affects classification and identification performance. In a series of experiments we try to understand the 'Doddington zoo' problem with respect to the notions of hubs and anti-hubs.

Keywords: Speaker Verification, Hubs, Normalization, Machine Learning

Citation: Schnitzer D., Flexer A., Schlüter J.: The Relation of Hubs to the Doddington Zoo in Speaker Verification. Technical Report, Proceedings of the 21st European Signal Processing Conference (EUSIPCO'2013), September 9-13, Marrakech, Morocco, 2013.


OFAI-TR-2012-16 ( 1946kB PDF file)

Structure and stability of online chat networks built on emotion-carrying links

Vladimir Gligorijevic, Marcin Skowron, Bosiljka Tadic

High-resolution data of online chats are studied as a physical system in the laboratory in order to quantify collective behavior of users. Our analysis reveals strong regularities characteristic of natural systems with additional features. In particular, we find self-organized dynamics with long-range correlations in user actions and persistent associations among users that have the properties of a social network. Furthermore, the evolution of the graph and its architecture with specific k-core structure are shown to be related with the type and the emotion arousal of exchanged messages. Partitioning of the graph by deletion of the links which carry high arousal messages exhibits critical fluctuations at the percolation threshold.

Keywords: Publications List Interact, Social structure emerges in online chats, Users associate by emotion-carrying messages, Physics and computer science reveal new dimension of user behaviors

Citation: Gligorijevic V., Skowron M., Tadic B.: Structure and stability of online chat networks built on emotion-carrying links. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-16,


OFAI-TR-2012-15 ( 380kB PDF file)

Local and Global Scaling Reduce Hubs in Space

Dominik Schnitzer, Arthur Flexer, Markus Schedl, Gerhard Widmer

"Hubness" has recently been identified as a general problem of high dimensional data spaces, manifesting itself in the emergence of objects, so-called hubs, which tend to be among the k nearest neighbors of a large number of data items. As a consequence many nearest neighbor relations in the distance space are asymmetric, that is, object y is amongst the nearest neighbors of x but not vice versa. The work presented here discusses two classes of methods that try to symmetrize nearest neighbor relations and investigates to what extent they can mitigate the negative effects of hubs. We evaluate local distance scaling and propose a global variant which has the advantage of being easy to approximate for large datasets and of having a probabilistic interpretation. Both local and global approaches are shown to be effective especially for high-dimensional datasets, which are affected by high hubness. Both methods lead to a strong decrease of hubness in these datasets, while at the same time improving properties like classification accuracy. We evaluate the methods on a large number of public machine learning datasets and synthetic data. Finally we present a real-world application where we are able to achieve significantly higher retrieval quality.

Keywords: local and global scaling, shared near neighbors, hubness, classification, curse of dimensionality, nearest neighbor relation

Citation: Schnitzer D., Flexer A., Schedl M., Widmer G.: Local and Global Scaling Reduce Hubs in Space, Journal of Machine Learning Research, 13(Oct):2871-2902, 2012.
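
The notion of hubs as objects that appear among the k nearest neighbors of many other items can be sketched with a k-occurrence count. The code below is a minimal illustration using squared Euclidean distances and brute-force search. The function names are hypothetical, and using the skewness of the k-occurrence distribution as a hubness score is a commonly used convention assumed here, not something stated in the abstract above.

```python
def k_occurrence(points, k):
    """For each point, count how often it appears in other points' k-nearest-neighbor lists."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Squared Euclidean distance from point i to every other point, sorted ascending.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Sample skewness (third standardized moment); large positive values suggest hubs."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (3.0,), (10.0,)]  # toy data
    occ = k_occurrence(pts, k=1)
    print(occ, skewness(occ))
```

In this toy example the last point is nobody's nearest neighbor, which is the asymmetry the abstract describes: it has a nearest neighbor, but the relation does not hold in the other direction.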


OFAI-TR-2012-14 ( 14136kB PDF file)

Evolving Topology on the Network of Online Chats

Vladimir Gligorijevic, Marcin Skowron, Bosiljka Tadic

Large amounts of data collected at Web portals contain valuable information for studying human behavior in online communication. Recently a powerful methodology was developed to study the emergence of the collective emotional behaviors of Blog users, by combining the methods of statistical physics of complex systems with machine-learning techniques for text analysis. Mapping the high-resolution data onto a suitable network structure is the starting point of this approach, on which the quantitative analysis within graph theory is based. In this work we use the network mapping approach to analyse users' collective behaviors in online chats. Specifically, having in mind the character of the dynamics in IRC channels, we analyse the evolution of the network that emerges via user contacts and, in particular, the evolution of specific topology features of this network over successive time windows.

Keywords: Publications List Interact, Social and Information Networks, Data Analysis, Text Analysis, Online Communication, Affective Computing

Citation: Gligorijevic V., Skowron M., Tadic B.: Evolving Topology on the Network of Online Chats. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-14,


OFAI-TR-2012-13 ( 377kB PDF file)

Affect Listeners - From dyads to group interactions with affective dialog systems

Marcin Skowron, Stefan Rank

Affect Listeners are applied as tools for studying the role of emotions in online communication. They need to interact both in dyads as well as in group settings with multiple users. In this paper, we present the evolution of such affective dialog systems from a focus on dyadic interaction to multi-party interaction on chat networks. Starting from experiments on the use of these dialog systems in virtual dyadic settings, we outline the requirements, design and implementation decisions necessary to apply the systems to affective interactions with multiple users. Finally, we introduce two realisations of Interactive Affective Bots designed for such interaction scenarios that integrate modelling of individuals and groups as part of their decision mechanism.

Keywords: Publications List Interact, affective dialog system, affective human-computer interactions, agent control architecture

Citation: Skowron M., Rank S.: Affect Listeners - From dyads to group interactions with affective dialog systems. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-13,


OFAI-TR-2012-12 ( 1110kB PDF file)

Entropy-growth-based model of emotionally charged online dialogues

Julian Sienkiewicz, Marcin Skowron, Georgios Paltoglou, Janusz Holyst

We analyze emotionally annotated massive data from IRC (Internet Relay Chat) and model the dialogues between its participants by assuming that the driving force for the discussion is the entropy growth of emotional probability distribution. This process is claimed to be correlated to the emergence of the power-law distribution of the discussion lengths observed in the dialogues. We perform numerical simulations based on the noticed phenomenon obtaining a good agreement with the real data. Finally, we propose a method to artificially prolong the duration of the discussion that relies on the entropy of emotional probability distribution.

Keywords: Publications List Interact, Computation and Language, Social and Information Networks, Data Analysis, Statistics and Probability, Physics and Society

Citation: Sienkiewicz J., Skowron M., Paltoglou G., Holyst J.: Entropy-growth-based model of emotionally charged online dialogues. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-12,


OFAI-TR-2012-11 ( 762kB PDF file)

Unsupervised Feature Learning for Speech and Music Detection in Radio Broadcasts

Jan Schlueter, Reinhard Sonnleitner

Detecting speech and music is an elementary step in extracting information from radio broadcasts. Existing solutions either rely on general-purpose audio features, or build on features specifically engineered for the task. Interpreting spectrograms as images, we can apply unsupervised feature learning methods from computer vision instead. In this work, we show that features learned by a mean-covariance Restricted Boltzmann Machine partly resemble engineered features, but outperform three hand-crafted feature sets in speech and music detection on a large corpus of radio recordings. Our results demonstrate that unsupervised learning is a powerful alternative to knowledge engineering.

Keywords: Music Information Retrieval

Citation: Schlueter J., Sonnleitner R.: Unsupervised Feature Learning for Speech and Music Detection in Radio Broadcasts, in Proceedings of the 15th International Conference on Digital Audio Effects (DAFx-12), York, UK, 2012.


OFAI-TR-2012-10 ( 190kB PDF file)

Putting the User in the Center of Music Information Retrieval

Markus Schedl, Arthur Flexer

Personalized and context-aware music retrieval and recommendation algorithms ideally provide music that perfectly fits the individual listener in each imaginable situation and for each of her information or entertainment needs. Although first steps towards such systems have recently been presented at ISMIR and similar venues, this vision is still far away from being a reality. In this paper, we investigate and discuss literature on the topic of user-centric music retrieval and reflect on why the breakthrough in this field has not been achieved yet. Given the different areas of expertise of the authors, we shed light on why this topic is a particularly challenging one, taking a psychological and a computer science view. Whereas the psychological point of view is mainly concerned with proper experimental design, the computer science aspect centers on modeling and machine learning problems. We further present our ideas on aspects vital to consider when elaborating user-aware music retrieval systems, and we also describe promising evaluation methodologies, since accurately evaluating personalized systems is a notably challenging task.

Keywords: Music Information Retrieval, Evaluation, User studies

Citation: Schedl M., Flexer A.: Putting the User in the Center of Music Information Retrieval, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR'12), Porto, Portugal, October 8th-12th, 2012.


OFAI-TR-2012-09

Interactive Entertainment of Elder Persons using Intelligent and Emotional Software Agents

Lisa Szugfil, Robert Trappl

This project tried to broaden the scope of classical digital games for elderly people by developing a game which takes social and emotional aspects into account, gives elderly people the possibility to bring their own experience into the game and puts cognitive training into context. A modified version of the classical memory game was developed, in which a human played against an emotional software agent. An experiment with eighteen participants (mean age = 84.33 years) examined the influence of the game-type on the perception of and the interaction with the software agent. Furthermore the perception of the playing speed of the counter player was investigated. The results showed significantly more comments towards the software agent when playing a personalized memory game than when playing the classical memory game. In addition, the mirrored game speed of the software agent was evaluated as being faster than the human player's own playing speed but also as optimal by the participants.

Keywords: digital game, elderly people, cognitive training in context, memory, software agent, emotions, playing speed, table top

Citation: Szugfil L., Trappl R.: Interactive Entertainment of Elder Persons using Intelligent and Emotional Software Agents. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-09,


OFAI-TR-2012-08 ( 1056kB PDF file)

The Hippocampal-Entorhinal Complex performs Bayesian Localization and Error Correction

T Madl, S Franklin, K Chen, D Montaldi, R Trappl

The mammalian brain updates representations of spatial location with self-motion cues, a process referred to as path integration. Since self-motion information is inherently inexact and subject to neuronal noise, this process leads to errors, which would accumulate over time if not corrected by sensory information about the environment. In this paper, we propose that the hippocampal-entorhinal complex, the major neuronal correlate representing spatial information, corrects such errors by integrating self-motion information and sensory information about the environment in a Bayes-optimal manner. Based on theoretical arguments as well as empirical data, we propose that hippocampal place cells are able to encode probability distributions and uncertainties of allocentric spatial location, and to use them for Bayesian inference to improve the accuracy of the location representation using different sources of information. We hypothesize about possible neuronal correlates of the components and processes required for such inference. Unlike most previously suggested error correction and spatial cue integration mechanisms, we not only provide a plausible neuronal basis for these mechanisms but also generate concrete predictions from our hypotheses and substantiate them with empirical data. We describe a computational model performing Bayesian localization in arbitrary two-dimensional environments in a biologically plausible way, and use it to replicate neuronal recording data as well as behaviour data in published studies in order to strengthen our claims. Our ideas tie in with a growing body of research suggesting that the brain might behave like a Bayesian machine (the Bayesian brain hypothesis [1]), and provides empirical evidence suggesting that it might employ Bayesian processes on the level of neuronal implementation.

Citation: Madl T., Franklin S., Chen K., Montaldi D., Trappl R.: The Hippocampal-Entorhinal Complex performs Bayesian Localization and Error Correction. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-08,
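
As a simplified illustration of what combining self-motion and sensory information "in a Bayes-optimal manner" can mean, the sketch below fuses two independent one-dimensional Gaussian position estimates by precision weighting. This is a textbook result, not the computational model described in the report; the function name and the example numbers are assumptions made for illustration.

```python
def fuse_gaussians(mu_a, var_a, mu_b, var_b):
    """Combine two independent Gaussian estimates of the same quantity.

    Returns the mean and variance of the product of the two Gaussians
    (the posterior under a flat prior). The fused variance is never larger
    than either input variance.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    precision = 1.0 / var_a + 1.0 / var_b
    # Each estimate is weighted by its precision (inverse variance).
    mean = (mu_a / var_a + mu_b / var_b) / precision
    return mean, 1.0 / precision


if __name__ == "__main__":
    # Hypothetical numbers: a noisy path-integration estimate and a sharper sensory cue.
    path_integration = (0.0, 4.0)
    sensory_cue = (10.0, 1.0)
    print(fuse_gaussians(*path_integration, *sensory_cue))
```

The fused estimate lies closer to the more reliable cue, which is the qualitative behavior the error-correction argument relies on.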


OFAI-TR-2012-07 ( 290kB PDF file)

Persistent Empirical Wiener Estimation With Adaptive Threshold Selection For Audio Denoising

Kai Siedenburg

Exploiting the persistence properties of signals leads to significant improvements in audio denoising. This contribution derives a novel denoising operator based on neighborhood smoothed, Wiener filter like shrinkage. Relations to the sparse denoising approach via thresholding are drawn. Further, a rationale for adapting the threshold level to a performance criterion is developed. Using a simple but efficient estimator of the noise level, the introduced operators with adaptive thresholds are demonstrated to act as attractive alternatives to the state of the art in audio denoising.

Keywords: Audio denoising

Citation: Siedenburg K.: Persistent Empirical Wiener Estimation With Adaptive Threshold Selection For Audio Denoising, Proceedings of the 9th Sound and Music Computing Conference (SMC 2012), Copenhagen, Denmark, 2012.


OFAI-TR-2012-06 ( 139kB PDF file)

A MIREX meta-analysis of hubness in audio music similarity

Arthur Flexer, Dominik Schnitzer, Jan Schlueter

We use results from the 2011 MIREX "Audio Music Similarity and Retrieval" task for a meta analysis of the hub phenomenon. Hub songs appear similar to an undesirably high number of other songs due to a problem of measuring distances in high dimensional spaces. Comparing 17 algorithms we are able to confirm that different algorithms produce very different degrees of hubness. We also show that hub songs exhibit less perceptual similarity to the songs they are close to, according to an audio similarity function, than non-hub songs. Application of the recently introduced method of "mutual proximity" is able to decisively improve this situation.

Keywords: Music Information Retrieval, Hubs

Citation: Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR'12), Porto, Portugal, October 8th-12th, 2012


OFAI-TR-2012-05 ( 1818kB PDF file)

Constructing high-level perceptual audio descriptors for textural sounds

Thomas Grill

This paper describes the construction of computable audio descriptors capable of modeling relevant high-level perceptual qualities of textural sounds. These qualities - all metaphoric bipolar and continuous constructs - have been identified in previous research: high-low, ordered-chaotic, smooth-coarse, tonal-noisy, and homogeneous-heterogeneous, covering timbral, temporal and structural properties of sound. We detail the construction of the descriptors and demonstrate the effects of tuning with respect to individual accuracy or mutual independence. The descriptors are evaluated on a corpus of 100 textural sounds against respective measures of human perception that have been retrieved by use of an online survey. Potential future use of perceptual audio descriptors in music creation is illustrated by a prototypic sound browser application.

Keywords: Music Information Retrieval, Audio descriptor, Perception

Citation: Grill T.: Constructing high-level perceptual audio descriptors for textural sounds, Proceedings of the 9th Sound and Music Computing Conference (SMC 2012), pp. 486-493, Copenhagen, Denmark, 2012


OFAI-TR-2012-04 ( 1973kB PDF file)

Visualization of perceptual qualities in textural sounds

Thomas Grill, Arthur Flexer

We describe a visualization strategy that is capable of efficiently representing relevant perceptual qualities of textural sounds. The general aim is to develop intuitive screen-based interfaces representing large collections of sounds, where sound retrieval shall be much facilitated by the exploitation of cross-modal mechanisms of human perception. We propose the use of metaphoric sensory properties that are shared between sounds and graphics, constructing a meaningful mapping of auditory to visual dimensions. For this purpose, we have implemented a visualization using tiled maps, essentially combining low-dimensional projection and iconic representation. To prove the suitability we show detailed results of experiments having been conducted in the form of an online survey. Potential future use in music creation is illustrated by a prototypic sound browser application.

Keywords: Music Information Retrieval, Visualization, Perception

Citation: Grill T., Flexer A.: Visualization of perceptual qualities in textural sounds, Proceedings of the International Computer Music Conference (ICMC 2012), Ljubljana, Slovenia, 2012


OFAI-TR-2012-03 ( 1633kB PDF file)

Emotional persistence in online chatting communities

Antonios Garas, David Garcia, Marcin Skowron, Frank Schweitzer

How do users behave in online chatrooms, where they instantaneously read and write posts? We analyzed about 2.5 million posts covering various topics in Internet relay channels, and found that user activity patterns follow known power-law and stretched exponential distributions, indicating that online chat activity is not different from other forms of communication. Analysing the emotional expressions (positive, negative, neutral) of users, we revealed a remarkable persistence both for individual users and channels. That is, despite their anonymity, users tend to follow social norms in repeated interactions in online chats, which results in a specific emotional “tone” of the channels. We provide an agent-based model of emotional interaction, which recovers qualitatively both the activity patterns in chatrooms and the emotional persistence of users and channels. While our assumptions about agents' emotional expressions are rooted in psychology, the model allows testing different hypotheses regarding their emotional impact in online communication.

Keywords: Publications List Interact, applied physics, statistical physics, modelling and theory, text analysis, online communication, affective computing

Citation: Nature - Scientific Reports, 2, 402, doi:10.1038/srep00402


OFAI-TR-2012-02 ( 337kB PDF file)

Creativity in Configuring Affective Agents for Interactive Storytelling

Stefan Rank, Steve Hoffmann, Hans-Georg Struck, Ulrike Spierling, Paolo Petta

Affective agent architectures can be used as control components in Interactive Storytelling systems for artificial autonomous characters. Creative authoring for such systems then involves configuration of these agents that translate part of the creative process to the system's runtime, necessarily constrained by the capabilities of the specific implementation. Using a framework for presenting configuration options based on literature review; a questionnaire evaluation of authors' preferences for character creation; and a case study of an author's conceptualisation of the creative process, we categorise available and potential methods for configuring affective agents in existing systems regarding creative exploration. Finally, we present work-in-progress on exemplifying the different options in the ActAffAct system.

Keywords: Creativity, Authoring, Interactive Storytelling, Affective Characters

Citation: Rank S., Hoffmann S., Struck H., Spierling U., Petta P. (2012) Creativity in Configuring Affective Agents for Interactive Storytelling. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-02; to appear in: Proceedings, International Conference on Computational Creativity, May 30-June 1, University College Dublin, Dublin, Ireland, 2012.


OFAI-TR-2012-01 ( 399kB PDF file)

Towards the development of a conceptual framework for an applied theory of problem structuring for complex agents: Questions to Luhmann's Social System Theory

Karl Neumayer, Paolo Petta

This extended abstract provides a snapshot of the current status of our efforts aimed at the development of a principled approach to corporate strategy consulting. This research is motivated by the need to improve the quality of strategic decision making of enterprises as complex agents. To this end, we take a step back and propose a paradigmatic reconceptualisation of the foundations of decision making in terms of processes underlying Problem Structuring, with implications in particular for the identity of complex agents, the notion of rationality, as well as the shaping of decision processes. The two interrelated main components are the transpersonal Weinhaus conceptual modelling framework and a structured method for the development, implementation, and verification of sound interventions. A key guideline is our aim to enable the identification of relevant, practical, and verifiable interventions. Against this body of work, we can formulate a number of candidate questions to Social Systems Theory to discuss at the Symposium, so as to: critically review our achievements and ascertain the scope of applicability of our model, identify directions and means of improvements, look for answers to open challenges, and understand the potential for a reformulation in Social Systems Theory terms.

Keywords: Corporate strategy consulting, Enterprise modelling, Transpersonal modelling, Action theory, Theory of social systems, Agent-based modelling, Business modelling

Citation: Neumayer K., Petta P.: Towards the development of a conceptual framework for an applied theory of problem structuring for complex agents: Questions to Luhmann's Social System Theory. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2012-01.
(Extended version of the extended abstract appearing in the proceedings of the 21st European Meeting on Cybernetics and Systems Research (EMCSR 2012), April 10-13, Vienna, Austria (EU), BCSSS Bertalanffy Center for the Study of Systems Science, Vienna, Austria (EU))


OFAI-TR-2011-25 ( 497kB PDF file)

On the Nature of Engineering Social Artificial Companions

Dirk Heylen, Rieks op den Akker, Mark ter Maat, Paolo Petta, Stefan Rank, Dennis Reidsma, Job Zwiers

The literature on social agents has put forward a number of requirements that social agents need to fulfill. In this paper we analyze the kinds of reasons and motivations that lie behind the statement of these requirements. In a second part of the paper, we look at how one can go about engineering the social agents. We introduce a general language in which to express dialogue rules and some tools that support the development of dialogue systems.

Keywords: Social Robots, Artificial Companions, Cognitive Engineering, Software Engineering

Citation: Heylen D., op den Akker R., Maat M.ter, Petta P., Rank S., Reidsma D., Zwiers J. (2011) On the nature of engineering social artificial companions, Applied Artificial Intelligence 25(6):549-574. DOI: 10.1080/08839514.2011.587156


OFAI-TR-2011-24 ( 139kB PDF file)

ICIDS 2011 Workshop: Sharing Interactive Digital Storytelling Technologies

Nicolas Szilas, Thomas Boggini, Paolo Petta

This workshop was organized around three types of contributions:
  • Technology providers, with contributions developed by their research labs or companies available for sharing;
  • Software integrators, with visions on how to technically organize the sharing of IDS-related components and with success and flop stories of community processes;
  • Users, with needs and intentions to use third-party IDS components and middleware within their own scientific, product, and/or artistic development efforts.
     
Online site: http://icids.org/sharing

Keywords: Interactive Digital Storytelling, Component technologies, Software Sharing

Citation: Szilas N., Boggini T., Petta P. (2011) ICIDS 2011 Workshop: Sharing Interactive Digital Storytelling Technologies, in: M. Si et al. (Eds.): ICIDS 2011, LNCS 7069, Springer, Berlin / Heidelberg, pp. 366-367.
doi: 10.1007/978-3-642-25289-1_52.


OFAI-TR-2011-23 ( 186kB PDF file)

Specification of an Open Architecture for Interactive Storytelling

Nicolas Szilas, Thomas Boggini, Monica Axelrad, Paolo Petta, Stefan Rank

This article introduces OPARIS, an OPen ARchitecture for Interactive Storytelling, which aims at facilitating and fostering the integration of various and heterogeneous Interactive Storytelling components. It is based on a modular decomposition of functionalities and a specification of the various messages that different modules exchange with each other.

Keywords: Interactive Digital Storytelling, Interactive Narrative, Software Architecture, Narrative Engine, Behaviour Engine, Animation Engine

Citation: Szilas N., Boggini T., Axelrad M., Petta P., Rank S. (2011) Specification of an Open Architecture for Interactive Storytelling, in: M. Si et al. (Eds.): ICIDS 2011, LNCS 7069, Springer, Berlin / Heidelberg, pp. 330-333. doi: 10.1007/978-3-642-25289-1_41


OFAI-TR-2011-22 ( 395kB PDF file)

A Music Engine for Interactive Drama

Nicolas Szilas, Marcos Aristides, Paolo Petta

A number of Interactive Drama prototypes have been created during the last decade. These Artificial Intelligence-based systems usually aim at enabling the user to drive the story as the main character. Despite the acknowledged role of sound and music in visual narrative, almost none of these prototypes includes interactive background music. In this paper, a Music Engine for the IDtension narrative engine is proposed that is able to adapt in real time to the current user's actions and narrative states. In the lineage of the branching music approach developed in some video games, the Music Engine being developed in Max/MSP uses a pre-composed graph-based score to enrich the whole interactive narrative experience. In particular, the reactivity of the Music Engine is aimed at corroborating the user's subjective feeling of agency, and thereby at enhancing the experience of the Interactive Drama system's main components (user interface, narrative engine, and the theatre) as an integrated whole.

Keywords: Interactive Drama, Interactive Digital Storytelling, Music Engine

Citation: Nicolas Szilas, Marcos Aristides and Paolo Petta (2011) A Music Engine for Interactive Drama, in: Licinio Roque and Valter Alves (eds.) Proceedings of the 6th Audio Mostly Conference: A Conference on Interaction with Sound (AM '11), Coimbra, Portugal, ACM Press, 2011. Poster.


OFAI-TR-2011-21 ( 271kB PDF file)

A survey of research work in computer science and cognitive science dedicated to the modeling of reactive human behaviors

Stéphane Donikian, Paolo Petta

Modeling believable autonomous agents needs to take into account many different aspects from very different disciplines, ranging from cognitive psychology to mechanics. In this paper, we focus on research work dedicated to the modeling of human decision in a reactive way, a domain in-between the biomechanical motion control of the activity and the rational and social background which motivates and shapes the execution of such activities. We cover models of reactive human behaviors introduced in computer science and cognitive science, assessing and comparing them from the application-oriented perspective of modeling credible real-time virtual anthropomorphic actors.

Keywords: Reactive behaviour modeling, Virtual humans, Behavioral animation, Human cognition

Citation: Donikian S., Petta P. (2011) A survey of research work in computer science and cognitive science dedicated to the modeling of reactive human behaviors, Computer Animation & Virtual Worlds 22(5):445-455. doi: 10.1002/cav.375


OFAI-TR-2011-20 ( 343kB PDF file)

Music Similarity Estimation with the Mean-Covariance Restricted Boltzmann Machine

Jan Schlueter, Christian Osendorfer

Existing content-based music similarity estimation methods largely build on complex hand-crafted feature extractors, which are difficult to engineer. As an alternative, unsupervised machine learning allows features to be learned empirically from data. We train a recently proposed model, the mean-covariance Restricted Boltzmann Machine, on music spectrogram excerpts and employ it for music similarity estimation. In k-NN based genre retrieval experiments on three datasets, it clearly outperforms MFCC-based methods, beats simple unsupervised feature extraction using k-Means and comes close to the state-of-the-art. This shows that unsupervised feature extraction poses a viable alternative to engineered features.

Keywords: Music Information Retrieval, Music Similarity, Boltzmann Machine

Citation: Schlueter J., Osendorfer C.: Music Similarity Estimation with the Mean-Covariance Restricted Boltzmann Machine, Proceedings of the 10th International Conference on Machine Learning and Applications (ICMLA 2011), Honolulu, USA, 2011.


OFAI-TR-2011-19 ( 522kB PDF file)

Advantages of nonstationary Gabor transforms in beat tracking

Andre Holzapfel, Gino Angelo Velasco, Nicki Holighaus, Monika Doerfler, Arthur Flexer

In this paper the potential of using the nonstationary Gabor transform for beat tracking in music is examined. Nonstationary Gabor transforms are a generalization of the short-time Fourier transform, which allow flexibility in choosing the number of bins per octave, while retaining a perfect inverse transform. In this paper, it is evaluated whether these properties can lead to improved beat tracking in music signals, thus presenting an approach that introduces recent findings in mathematics to music information retrieval. For this, both nonstationary Gabor transforms and the short-time Fourier transform are integrated into a simple beat tracking framework. Statistically significant improvements are observed on a large dataset, which motivates integrating the nonstationary Gabor transform into state-of-the-art approaches for beat tracking and tempo estimation.

Keywords: beat tracking, nonstationary Gabor transform, music information retrieval

Citation: Holzapfel A., Velasco G., Holighaus N., Doerfler M., Flexer A.: Advantages of nonstationary Gabor transforms in beat tracking. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-19, 2011


OFAI-TR-2011-18

An Interdisciplinary VR-architecture for 3D chatting with non-verbal communication

Stephane Gobron, Junghyun Ahn, Quentin Silvestre, Daniel Thalmann, Stefan Rank, Marcin Skowron, Georgios Paltoglou, Michael Thelwall

The communication between avatar and agent has already been treated from different but specialized perspectives. In contrast, this paper gives a balanced view of every key architectural aspect: from text analysis to computer graphics, the chatting system and the emotional model. Non-verbal communication, such as facial expression, gaze, or head orientation, is crucial to simulate realistic behavior, but is still an aspect neglected in the simulation of virtual societies. In response, this paper aims to present the necessary modularity to allow virtual humans (VH) conversation with consistent facial expression, either between two users through their avatars, between an avatar and an agent, or even between an avatar and a Wizard of Oz. We believe such an approach is particularly suitable for the design and implementation of applications involving VH interaction in virtual worlds. To this end, three key features are needed to design and implement this system, entitled 3D-emoChatting. First, a global architecture that combines components from several research fields. Second, a real-time analysis and management of emotions that allows interactive dialogues with non-verbal communication. Third, a model of a virtual emotional mind called emoMind that allows the simulation of individual emotional characteristics. To conclude the paper, we briefly present the basic description of a user-test which is beyond the scope of the present paper.

Keywords: Three-Dimensional Graphics and Realism, Virtual Reality, Natural Language Processing, Text Analysis, Publications List Interact

Citation: In the Proceedings of Joint Virtual Reality Conference of EuroVR, EGVE 2011


OFAI-TR-2011-17

Model of emotional dialogues based on entropy model

Julian Sienkiewicz, Marcin Skowron, Georgios Paltoglou, Janusz Holyst

We process emotionally annotated (negative, neutral, positive) data from Internet Relay Chat (IRC), extracting 90,000 one-to-one dialogues between users. Statistical analysis shows that, regardless of the length of the dialogue measured in the number of comments, each dialogue ends when the probabilities of finding positive and neutral emotional value equalize. Moreover, the entropy of the emotional probabilities distribution increases with the dialogue evolution. Additionally, we observe clustering of comments with the same emotional value. Based on entropy growth and emotional clustering, we construct a model of dialogues that reproduces the characteristics of the real data and we show how it can be useful for an automatic moderator in this medium.

Keywords: online communication, social norms, modelling, Publications List Interact

Citation: In the Proceedings of European Conference on Complex Systems, ECCS 2011
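
Illustration for TR-2011-17: the abstract refers to the entropy of the distribution of emotional values growing as a dialogue evolves. The sketch below shows one possible reading, assuming Shannon entropy (in bits) computed per comment position across a set of dialogues. The label encoding (-1, 0, 1) and the function names are hypothetical and are not taken from the report.

```python
import math
from collections import Counter


def emotion_entropy(labels):
    """Shannon entropy (bits) of the empirical distribution of emotion labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_per_position(dialogues):
    """Entropy of the label distribution at each comment position.

    `dialogues` is a non-empty list of label sequences; labels are assumed
    to be -1 (negative), 0 (neutral) or 1 (positive). Only dialogues long
    enough to reach a given position contribute to that position.
    """
    max_len = max(len(d) for d in dialogues)
    result = []
    for i in range(max_len):
        labels_at_i = [d[i] for d in dialogues if len(d) > i]
        result.append(emotion_entropy(labels_at_i))
    return result


if __name__ == "__main__":
    toy_dialogues = [[1, 1, 0], [1, 0, -1], [1, -1, 1]]
    print(entropy_per_position(toy_dialogues))
```

On real data, a rising sequence of values from entropy_per_position would correspond to the entropy growth described in the abstract; the toy input here only demonstrates the computation.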


OFAI-TR-2011-16

Emotional communication patterns in online chat communities

Antonios Garas, David Garcia Becerra, Frank Schweitzer, Marcin Skowron

Real-time communication tools, like chats or instant message services, have an increasing popularity for both casual and professional interactions. These platforms allow us to study the patterns of human behavior in emotional communication through the messages written by people participating in these conversations. We study a dataset composed of logs from Internet Relay Chat (IRC) channels, in which a large number of users share comments in real time. We use a sentiment analysis tool to extract the emotions expressed in each comment, classified as positive, negative, or neutral. Statistical analysis of the time delays between user messages shows that people in chatrooms follow similar principles as in other means of communication. Furthermore, the time dynamics of the whole conversation shows the presence of long-range correlations. We analyze the persistence of the emotional expression of individuals, i.e. how likely they are to express a similar sentiment as in the previous message. We find that most of the users behave in an emotionally persistent way, while some of them have a more random or anti-persistent behavior. Even under the presence of this heterogeneity, there is persistence in the channel as a whole. These results indicate the presence of social norms of emotional expression, as well as the existence of social links between users under the fast communication of a chatroom. We have developed and analyzed a model for the exchange of emotions online that shows the same features as the IRC data, where collective emotions emerge from individual behavior and communication.

Keywords: online communication, social norms, agent-based modeling, Publications List Interact

Citation: In Proceedings of the International Society for Research on Emotion Conference, ISRE 2011


OFAI-TR-2011-15

CyberEmotions: the good, the bad, the neutral - effect of an affective profile in dialog system-user online conversations

Marcin Skowron, Stefan Rank

Emotionally driven online behavior is traceable in a wide range of human communication processes on the Internet. Here the sum of individual emotions of a large number of users, with their interconnectivity and complex dynamics, influences the formation, evolution and breaking-up of online communities. Our research concentrates on the basic communication process between two conversants. Such interactions constitute a foundation for the modeling of more complex, multi-agent communication processes. Using artificial conversational entities (dialog systems), we investigate the role of emotions in online, natural language based communication. Our previous work demonstrated that a dialog system's capability to establish an emotional connection, and further to conduct a realistic and enjoyable dialog, was on par with the results obtained in a Wizard of Oz setting. In this talk, we present findings from recent experiments on the effect of conversant affective profiles (i.e., positive, negative, neutral) and their influence on the communication processes. The results demonstrate that the affective profile to a large extent determines the assessment of users' emotional connection and enjoyment from the interaction, while it does not significantly influence the perception of core capabilities of the dialog systems, i.e. dialog coherence and dialog realism. The emotional changes experienced by the participants during the online interactions were correlated with the type of the artificial system's affective profile and induced changes to various aspects of the conducted dialogs, e.g., timing, communication style, users' expressions of affective states.

Keywords: Publications List Interact, affective dialog system, affective human-computer interactions, agent control architecture

Citation: In Proceedings of International Society for Research on Emotion Conference, ISRE 2011.


OFAI-TR-2011-14 ( 279kB PDF file)

Using Mutual Proximity to Improve Content-Based Audio Similarity

Dominik Schnitzer, Arthur Flexer, Markus Schedl, Gerhard Widmer

This work introduces Mutual Proximity, an unsupervised method which transforms arbitrary distances to similarities computed from the shared neighborhood of two data points. This reinterpretation aims to correct inconsistencies in the original distance space, like the hub phenomenon. Hubs are objects which appear unwontedly often as nearest neighbors in predominantly high-dimensional spaces. We apply Mutual Proximity to a widely used and standard content-based audio similarity algorithm. The algorithm is known to be negatively affected by the high number of hubs it produces. We show that without a modification of the audio similarity features or inclusion of additional knowledge about the datasets, applying Mutual Proximity leads to a significant increase of retrieval quality: (1) hubs decrease and (2) the k-nearest-neighbor classification rates increase significantly. The results of this paper show that taking the mutual neighborhood of objects into account is an important aspect which should be considered for this class of content-based audio similarity algorithms.

Keywords: Music Information Retrieval, Audio Similarity Classification, Hubs

Citation: Schnitzer D., Flexer A., Schedl M., Widmer G.: Using Mutual Proximity to Improve Content-Based Audio Similarity. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-14, 2011
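
Illustration for TR-2011-14: the abstract describes Mutual Proximity as turning distances into similarities based on the shared neighborhood of two data points. The sketch below shows a simple empirical-count variant of that idea, in which the similarity of x and y is the fraction of objects that lie farther from both x and y than x and y lie from each other. This is an assumption made for illustration; the report itself may use a different estimator (for example one based on modeled distance distributions), and the function name is hypothetical.

```python
def mutual_proximity(dist):
    """Empirical Mutual Proximity for a symmetric distance matrix.

    `dist` is a list of lists with dist[x][y] == dist[y][x].
    Returns a matrix of similarities in [0, 1].
    """
    n = len(dist)
    # Diagonal set to 1.0 purely by convention chosen for this sketch.
    mp = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            d = dist[x][y]
            # Count objects that are farther than d from both x and y.
            shared = sum(
                1 for j in range(n) if dist[x][j] > d and dist[y][j] > d
            )
            mp[x][y] = mp[y][x] = shared / n
    return mp


if __name__ == "__main__":
    # Three points on a line at positions 0, 1 and 10.
    toy = [[0, 1, 10],
           [1, 0, 9],
           [10, 9, 0]]
    for row in mutual_proximity(toy):
        print(row)
```

In the toy example the two nearby points receive a non-zero similarity because the third point is far from both, while pairs involving the distant point receive zero.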


OFAI-TR-2011-13

Affect Bartender - Affective Cues and Their Application in a Conversational Agent

Marcin Skowron, Georgios Paltoglou

This paper presents methods for the detection of textual expressions of users' affective states and explores an application of these affective cues in a conversational system -- Affect Bartender. We also describe the architecture of the system, core system components and a range of developed communication interfaces. The application of the described methods is illustrated with examples of dialogs conducted with experiment participants in a Virtual Reality setting.

Keywords: Affective Interactions, Conversational Agent, Textual Affect Sensing, Sentiment Classification, Publications List Interact

Citation: Skowron M., Paltoglou G.: Affect Bartender - Affective Cues and Their Application in a Conversational Agent. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-13, 2011


OFAI-TR-2011-12

Talking with affective dialog systems: Extending the analysis of affect in online communication

Marcin Skowron

Recent advances in computer technology extend the capacities of autonomous systems to detect and categorise the textual expressions of affective states. On the one hand, this enables large‐scale studies of the role of affect in the ICT mediated communication processes. On the other hand, it supports the development of interactive systems that can sense and take into account information on the users' sentiment to manage the communication flow. In our talk, we introduce affective dialog systems and autonomous interactive bots that provide supplemental means for analysing the role of affect in synchronous online communication. In particular, such systems enable direct querying of users about their sentiments and affective states towards a set of entities, events and processes and for conducting the follow‐up, task‐oriented dialogs to gain additional insights, e.g., on the users' background, motivations and expectations. The themes that are used by the system are provided manually or acquired from the online news feeds, e.g., Reuters News. We describe how “hot topics” can be automatically detected and incorporated in the dialog, to acquire information on users' affective responses to topics of interest. The presented systems are deployable in various interaction settings, e.g., virtual online worlds, providing an access to user groups that are often not actively engaged in discussions in the other, online communication channels such as blogs and newsgroups. In this talk we present the insights from experiments on the system‐users interactions, conducted in 3D virtual reality settings, in which the affective dialog system managed the verbal aspects of a virtual human communication. The system performance, in terms of its capability to: i.) generate a realistic dialog, ii.) provide an enjoyable experience for the users and iii.) establish an emotional connection with the experiment participants, matched the results obtained in the Wizard‐of‐Oz settings, i.e., unseen human operator who controls the conversation of a virtual character.

Keywords: dialog system, affective computing, HCI, Publications List Interact

Citation: Skowron M.: Talking with affective dialog systems: Extending the analysis of affect in online communication. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-12, 2011


OFAI-TR-2011-11

Effect of affective profile on communication patterns and affective expressions in interactions with a dialog system

Marcin Skowron, Mathias Theunis, Stefan Rank, Anna Borowiec

Interlocutors' affective profile and character traits play an important role in interactions. In the presented study, we apply a dialog system to investigate the effects of the affective profile on user-system communication patterns and users' expressions of affective states. We describe the data-set acquired from experiments with the affective dialog system, the tools used for its annotation and findings regarding the effect of affective profile on participants' communication style and affective expressions.

Keywords: affective profile, dialog system, affective computing, HCI, Publications List Interact

Citation: Skowron M., Theunis M., Rank S., Borowiec A.: Effect of affective profile on communication patterns and affective expressions in interactions with a dialog system. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-11, 2011


OFAI-TR-2011-10

The good, the bad and the neutral: affective profile in dialog system-user communication

Marcin Skowron, Stefan Rank, Mathias Theunis, Julian Sienkiewicz

We describe the use of affective profiles in a dialog system and its effect on participants' perception of conversational partners and experienced emotional changes in an experimental setting, as well as the mechanisms for realising three different affective profiles and for steering task-oriented follow-up dialogs. Experimental results show that the system's affective profile determines the rating of chatting enjoyment and user-system emotional connection to a large extent. Self-reported emotional changes experienced by participants during an interaction with the system are also strongly correlated with the type of applied profile. Perception of core capabilities of the system, realism and coherence of dialog, are only influenced to a limited extent.

Keywords: affective dialog system, affective profile, conversational agent, affective computing, HCI, Publications List Interact

Citation: Skowron M., Rank S., Theunis M., Sienkiewicz J.: The good, the bad and the neutral: affective profile in dialog system-user communication. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-10, 2011


OFAI-TR-2011-09

Sentiment analysis of informal textual communication in cyberspace

Georgios Paltoglou, Stephane Gobron, Marcin Skowron, Mike Thelwall, Daniel Thalmann

The ability to correctly identify the existence and polarity of emotion in informal, textual communication is a very important part of a realistic and immersive 3D environment where people communicate with one another through avatars or with an automated system. Such a feature would provide the system the ability to realistically represent the mood and intentions of the participants, thus greatly enhancing their experience. In this paper, we study and compare a number of approaches for detecting whether a textual utterance is of objective or subjective nature and in the latter case detecting the polarity of the utterance (i.e. positive vs. negative). Experiments are carried out on a real corpus of social exchanges in cyberspace and general conclusions are presented.

Keywords: Opinion Mining, Sentiment Analysis, Conversational Systems, Virtual Reality, Virtual Human, Emotional Profile, Publications List Interact

Citation: Paltoglou G., Gobron S., Skowron M., Thelwall M., Thalmann D.: Sentiment analysis of informal textual communication in cyberspace. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-09, 2011


OFAI-TR-2011-08 ( 2498kB PDF file)

Identification of perceptual qualities in textural sounds using the repertory grid method

Thomas Grill, Arthur Flexer, Stuart Cunningham

This paper is about exploring which perceptual qualities are relevant to people listening to textural sounds. Knowledge about those personal constructs shall eventually lead to more intuitive interfaces for browsing large sound libraries. By conducting mixed qualitative-quantitative interviews within the repertory grid framework, ten bi-polar qualities are identified. A subsequent web-based study yields measures for inter-rater agreement and mutual similarity of the perceptual qualities based on a selection of 100 textural sounds. Additionally, some initial experiments are conducted to test standard audio descriptors for their correlation with the perceptual qualities.

Keywords: textural audio, auditory perception, verbal description, personal constructs, repertory grid, machine listening

Citation: Grill T., Flexer A., Cunningham S.: Identification of perceptual qualities in textural sounds using the repertory grid method. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-08, 2011


OFAI-TR-2011-07 ( 244kB PDF file)

Improving tempo-sensitive and tempo-robust descriptors for rhythmic similarity

Andre Holzapfel, Arthur Flexer, Gerhard Widmer

For the description of rhythmic content of music signals usually features are preferred that are invariant in presence of tempo changes. In this paper it is shown that the importance of tempo depends on the musical context. For popular music, a tempo-sensitive feature is improved on multiple datasets using analysis of variance, and it is shown that also a tempo-robust description profits from the integration into the resulting processing framework. Important insights are given into optimal parameters for rhythm description, and limitations of current approaches are indicated.

Keywords: Music Information Retrieval, Tempo, Rhythm

Citation: Holzapfel A., Flexer A., Widmer G.: Improving tempo-sensitive and tempo-robust descriptors for rhythmic similarity. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-07, 2011


OFAI-TR-2011-06 ( 12592kB PDF file)

On automated annotation of acousmatic music

Volkmar Klien, Thomas Grill, Arthur Flexer

This paper presents an inquiry concerning the feasibility of using existing methods from the field of Music Information Retrieval (MIR) for automated annotation of acousmatic music. Thorough discussion and appraisal of the meaning and role of annotation in this context leads to the conclusion that: (i) full automation is not possible due to the lack of a "ground truth" and the absence of semantic comprehension on the side of the computer, (ii) MIR can nevertheless play a valuable role by providing human annotators with tools for interactive annotation. We present two possible approaches to interactive annotation applied to compositions of acousmatic music, namely John Chowning's Turenas and Denis Smalley's Wind Chimes. We also discuss the possible impact of such semi-automatic annotation on the theoretical coverage and practice of acousmatic music.

Keywords: Acousmatic music, Music information retrieval, Annotation

Citation: Klien V., Grill T., Flexer A.: On automated annotation of acousmatic music, Journal of New Music Research, Volume 41, Issue 2, pages 153-173, 2012


OFAI-TR-2011-05 ( 214kB PDF file)

No peanuts! Affective Cues for the Virtual Bartender

Marcin Skowron, Hannes Pirker, Stefan Rank, Georgios Paltoglou, Junghyun Ahn, Stephane Gobron

The aim of this paper is threefold: it explores methods for the detection of affective states in text, it presents the usage of such affective cues in a conversational system and it evaluates its effectiveness in a virtual reality setting. Valence and arousal values, used for generating facial expressions of users' avatars, are also incorporated into the dialog, helping to bridge the gap between textual and visual modalities. The system is evaluated in terms of its ability to: i) generate a realistic dialog, ii) create an enjoyable chatting experience, and iii) establish an emotional connection with participants. Results show that user ratings for the conversational agent match those obtained in a Wizard of Oz setting.

Keywords: Conversational System, Virtual Agent, Dialog System, Affective Computing, HCI, System Evaluation, Publications List Interact

Citation: Skowron M., Pirker H., Rank S., Paltoglou G., Ahn J., Gobron S.: No peanuts! Affective Cues for the Virtual Bartender. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-05, 2011


OFAI-TR-2011-04 ( 69kB PDF file)

Virtual Agent Modeling in the RASCALLI Platform

Christian Eis, Marcin Skowron, Brigitte Krenn

The RASCALLI platform is both a runtime and a development environment for virtual systems augmented with cognition. It provides a framework for the implementation and execution of modular software agents. Due to the underlying software architecture and the modularity of the agents, it allows the parallel execution and evaluation of multiple agents. These agents might be all of the same kind or of vastly different kinds or they might differ only in specific (cognitive) aspects, so that the performance of these aspects can be effectively compared and evaluated.

Keywords: Cognitive Agents, Agent Modeling and Evaluation, Publications List Interact

Citation: Eis C., Skowron M., Krenn B.: Virtual Agent Modeling in the RASCALLI Platform. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-04, 2011


OFAI-TR-2011-03 ( 57kB PDF file)

Adaptive Mind Agent

Brigitte Krenn, Marcin Skowron, Gregor Sieber, Erich Gstrein, Joerg Irran

We present the Adaptive Mind Agent, an intelligent virtual agent that is able to actively participate in a real-time, dynamic environment. The agent is equipped with a collection of processing tools that form the basis of its perception from and action on the environment consisting of web documents, URLs, RSS feeds, domain-specific knowledge bases, other accessible virtual agents and the user. How these predispositions are finally shaped into unique agent behaviour depends on the agent's abilities to learn through actual interactions, in particular the abilities: (i) to memorize and evaluate episodes comprising the actions the agent had performed on its environment in the past depending on its perceptions of the user requests and its interpretation of the user's feedback reinforcing or inhibiting a certain action; (ii) to dynamically develop user-driven interest and preference profiles through memorizing and evaluating the user clicks on selected web pages.

Keywords: Virtual Agent, HCI, System Adaptation, Publications List Interact

Citation: Krenn B., Skowron M., Sieber G., Gstrein E., Irran J.: Adaptive Mind Agent. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-03, 2011


OFAI-TR-2011-02 ( 860kB PDF file)

Affect Bartender - Affective Cues and Their Application in a Conversational Agent

Marcin Skowron, Georgios Paltoglou

This paper presents methods for the detection of textual expressions of users’ affective states and explores an application of these affective cues in a conversational system – Affect Bartender. We also describe the architecture of the system, core system components and a range of developed communication interfaces. The application of the described methods is illustrated with examples of dialogs conducted with experiment participants in a Virtual Reality setting.

Keywords: Affective Interactions, Conversational Agent, Dialog System, Textual Affect Sensing, Sentiment Classification, Publications List Interact

Citation: Skowron M., Paltoglou G.: Affect Bartender - Affective Cues and Their Application in a Conversational Agent. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-02, 2011


OFAI-TR-2011-01 ( 282kB PDF file)

Sentiment analysis of informal textual communication in cyberspace

Georgios Paltoglou, Stephane Gobron, Marcin Skowron, Mike Thelwall, Daniel Thalmann

The ability to correctly identify the existence and polarity of emotion in informal, textual communication is a very important part of a realistic and immersive 3D environment where people communicate with one another through avatars or with an automated system. Such a feature would provide the system the ability to realistically represent the mood and intentions of the participants, thus greatly enhancing their experience. In this paper, we study and compare a number of approaches for detecting whether a textual utterance is of objective or subjective nature and in the latter case detecting the polarity of the utterance (i.e. positive vs. negative). Experiments are carried out on a real corpus of social exchanges in cyberspace and general conclusions are presented.

Keywords: Sentiment Classification, Affective Computing, Publications List Interact

Citation: Paltoglou G., Gobron S., Skowron M., Thelwall M., Thalmann D.: Sentiment analysis of informal textual communication in cyberspace. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2011-01, 2011


OFAI-TR-2010-14 ( 795kB PDF file)

On Computing Morphological Similarity of Audio Signals

Martin Gasser, Arthur Flexer, Thomas Grill

Most methods to compute content-based similarity between audio samples are based on descriptors representing the spectral shape or the texture of the audio signal only. This paper describes an approach based on (i) the extraction of spectro-temporal profiles from audio and (ii) non-linear alignment of the profiles to calculate a distance measure.

Keywords: Music Information Retrieval

Citation: Gasser M., Flexer A., Grill T.: On Computing Morphological Similarity of Audio Signals. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2010-14, 2010


OFAI-TR-2010-13 ( 371kB PDF file)

A Fast Audio Similarity Retrieval Method for Millions of Music Tracks

Dominik Schnitzer, Arthur Flexer, Gerhard Widmer

We present a filter-and-refine method to speed up nearest neighbor searches with the Kullback-Leibler divergence for multivariate Gaussians. This combination of features and similarity estimation is of special interest in the field of automatic music recommendation and it is widely used to compute music similarity. However, the non-vectorial features and a non-metric divergence make using it with large corpora difficult, as standard indexing algorithms cannot be used. This paper proposes a method for fast nearest neighbor retrieval in large databases which relies on the above approach. At its core, the method rescales the divergence and uses a modified FastMap implementation to speed up nearest-neighbor queries. Overall the method accelerates the search for similar music pieces by a factor of 10 - 30 and yields recall values of 95 - 99% compared to a standard linear search.

Keywords: Music Information Retrieval, Audio Indexing, Music Recommendation

Citation: Schnitzer D., Flexer A., Widmer G.: A Fast Audio Similarity Retrieval Method for Millions of Music Tracks, to appear in: Multimedia Tools and Applications, 2011.


OFAI-TR-2010-12 ( 1005kB PDF file)

Re-texturing the sonic environment

Thomas Grill

This paper examines modeling the acoustic environment (i.e. soundscape) with respect to its textural qualities. This is explored in the context of an audiovisual installation which captures the external environment and re-synthesizes a corresponding, but nonetheless potentially differing immersive audiovisual environment from a given sound and image corpus in the exhibition space. In order to establish the association between sonic structures of the external and internal domains, a perceptually grounded, compact and real-time capable method for modeling sound textures based on amplitude fluctuation patterns is devised and evaluated.

Keywords: Sonic Arts, Music Information Retrieval

Citation: Grill T.: Re-texturing the sonic environment. Proceedings of the 5th Audio Mostly Conference: A Conference on Interaction with Sound, pp. 42-48, 2010.


OFAI-TR-2010-11 ( 6388kB PDF file)

Limitations of interactive music recommendation based on audio content

Arthur Flexer, Martin Gasser, Dominik Schnitzer

We present a study on the limitations of an interactive music recommendation service based on automatic computation of audio similarity. Songs which are, according to the audio similarity function, similar to very many other songs and hence appear unwantedly often in recommendation lists keep a significant proportion of the audio collection from being recommended at all. This problem is studied in-depth with a series of computer experiments including analysis of alternative audio similarity functions and comparison with actual download data.

Keywords: Music Information Retrieval, Music Recommendation, Audio Content

Citation: Flexer A., Gasser M., Schnitzer D.: Limitations of interactive music recommendation based on audio content, Proceedings of the 5th Audio Mostly Conference: A Conference on Interaction with Sound, pp. 96-102, 2010.


OFAI-TR-2010-10 ( 1251kB PDF file)

Islands of Gaussians: The Self Organizing Map and Gaussian Music Similarity Features

Dominik Schnitzer, Arthur Flexer, Gerhard Widmer, Martin Gasser

Multivariate Gaussians are of special interest in the MIR field of automatic music recommendation. They are used as the de facto standard representation of music timbre to compute music similarity. However, standard algorithms for clustering and visualization are usually not designed to handle Gaussian distributions and their attached metrics (e.g. the Kullback-Leibler divergence). Hence to use these features the algorithms generally handle them indirectly by first mapping them to a vector space, for example by deriving a feature vector representation from a similarity matrix. This paper uses the symmetrized Kullback-Leibler centroid of Gaussians to show how to avoid the vectorization detour for the Self Organizing Maps (SOM) data visualization algorithm. We propose an approach so that the algorithm can directly and naturally work on Gaussian music similarity features to compute maps of music collections. We show that by using our approach we can create SOMs which (1) better preserve the original similarity topology and (2) are far less complex to compute, as the often costly vectorization step is eliminated.

Keywords: Music Information Retrieval, Self Organizing Map, Gaussian

Citation: Schnitzer D., Flexer A., Widmer G., Gasser M.: Islands of Gaussians: The Self Organizing Map and Gaussian Music Similarity Features, Proceedings of the Eleventh International Society for Music Information Retrieval Conference (ISMIR 2010), 2010.


OFAI-TR-2010-09 ( 434kB PDF file)

Towards automated annotation of acousmatic music

Volkmar Klien, Thomas Grill, Arthur Flexer

At the Austrian Research Institute for Artificial Intelligence (OFAI) we are currently undertaking a two year research project entitled "Towards Automatic Annotation of Electroacoustic Music" investigating the possibilities and potential obstacles for finding (partial) solutions to problems related to computer assisted annotation of electroacoustic music. Setting aside technological issues pertaining to the relevant fields of signal processing and music information retrieval, the paper at hand aims at outlining the reasons behind our choice of Smalley's theory of spectromorphology (SM) as our conceptual background, issues pertaining to the role of the annotated score, the formalisation of spectromorphology for automation as well as potential limitations. Given that neither the manual annotation of acousmatic music nor the technical implementation thereof can be seen as straightforward matters, research in this area is still at a very basic level, making fully automatic and even fully functional semiautomatic annotation of electroacoustic sound a long-term research goal.

Keywords: Acousmatic music, Annotation, Machine Learning

Citation: Klien V., Grill T., Flexer A.: Towards automated annotation of acousmatic music. Proceedings of the Electronic Music Studies Network Conference 2010 (EMS'10), Shanghai, China, 2010.


OFAI-TR-2010-08 ( 918kB PDF file)

Sparse Regression in Time-Frequency Representations of Complex Audio

Monika Doerfler, Gino Velasco, Arthur Flexer, Volkmar Klien

Time-frequency representations are commonly used tools for the representation of audio and in particular music signals. From a theoretical point of view, these representations are linked to Gabor frames. Frame theory yields a convenient reconstruction method making post-processing unnecessary. Furthermore, using dual or tight frames in the reconstruction, we may resynthesize localized components from so-called sparse representation coefficients. Sparsity of coefficients is directly reinforced by the application of an ℓ1-penalization term on the coefficients. We introduce an iterative algorithm leading to sparse coefficients and demonstrate the effect of using these coefficients in several examples. In particular, we are interested in the ability of a sparsity-promoting approach to separate components with overlapping analysis coefficients in the time-frequency domain. We also apply our approach to the problem of auditory scene description, i.e. source identification in a complex audio mixture.

Keywords: Signal Processing, Audio, Sparsity, Annotation

Citation: Doerfler M., Velasco G., Flexer A., Klien V.: Sparse Regression in Time-Frequency Representations of Complex Audio. Proceedings of the 7th Sound and Music Computing Conference (SMC'10), Barcelona, Spain, 2010.


OFAI-TR-2010-07 ( 252kB PDF file)

Hubs and Orphans - an Explorative Approach

Martin Gasser, Arthur Flexer, Dominik Schnitzer

In audio based music similarity, a well known effect is the existence of hubs, i.e. songs which appear similar to many other songs without showing any meaningful perceptual similarity. We show that this effect depends on the homogeneity of the samples under consideration. We compare three small sound collections (consisting of polyphonic music, environmental sounds, and samples of individual musical instruments) with regard to their hubness. We find that the collection consisting of cleanly recorded musical instruments produces the smallest hubs, whereas hubness increases with inhomogeneity of the audio signals. We also investigate how well the three data sets can be mapped into a 2D visualization space by a dimensionality reduction algorithm based on Multidimensional Scaling.

Keywords: Audio Similarity, Music Recommendation, Hubs

Citation: Gasser M., Flexer A., Schnitzer D.: Hubs and Orphans - an Explorative Approach. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2010-07, 2010


OFAI-TR-2010-06 ( 371kB PDF file)

Combining features reduces hubness in audio similarity

Arthur Flexer, Dominik Schnitzer, Martin Gasser, Tim Pohle

In audio based music similarity, a well known effect is the existence of hubs, i.e. songs which appear similar to many other songs without showing any meaningful perceptual similarity. We verify that this effect also exists in very large databases (>250000 songs) and that it even gets worse with growing size of databases. By combining different aspects of audio similarity we are able to reduce the hub problem while at the same time maintaining a high overall quality of audio similarity.

Keywords: music similarity, hubs

Citation: Flexer A., Schnitzer D., Gasser M., Pohle T.: Combining features reduces hubness in audio similarity. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2010-06, 2010


OFAI-TR-2010-05 ( 816kB PDF file)

Companions, Virtual Butlers, Assistive Robots: Empirical and Theoretical Insights for Building Long-Term Social Relationships

Dirk Heylen, Brigitte Krenn, Sabine Payr

Robots and agents are becoming increasingly prominent in everyday life, e.g. as companions, user interfaces to smart homes, household robots, or for lifestyle reassurance. In these roles, they have to interact with their users in a complex social world, and must build and maintain long-term relationships with them. A symposium at EMCSR 2010 dealt with theoretical and empirical research on long-term relationships of humans with humans, animals, and machines that show complex interactive behaviours, and with methodologies to create knowledge about interaction with companions, virtual butlers and assistive robots. This technical report brings together the five papers presented at this symposium.

Keywords: Companions, Long-term Relationships, Virtual Butlers, Assistive Robots

Citation: Heylen, D., Krenn, B., Payr, S. (eds.). 2010. Companions, Virtual Butlers, Assistive Robots: Empirical and Theoretical Insights for Building Long-Term Social Relationships. In Trappl R. (ed.): Cybernetics and Systems 2010, Austrian Society for Cybernetic Studies, Vienna, Austria, pp. 539-570.
Also available as OFAI technical report OFAI-TR-2010-05, Austrian Research Institute for Artificial Intelligence of the Austrian Society for Cybernetic Studies, Vienna, Austria.


OFAI-TR-2010-04 ( 483kB PDF file)

Because we are all falling down. Physics, gestures and relative realities

Volkmar Klien, Thomas Grill, Arthur Flexer

Relative Realities, a media installation by Volkmar Klien realized in co-operation with Thomas Grill, investigates how movement and its perception relate across different media, modes and spaces. The installation's sonic aspects are generated in real-time in accordance with the installation's physical movement. This generative music engine is conceptualized and based on physical movement rather than established 'musical' parameters. Using a virtual physics (e.g. gaming) engine and physical interaction between objects, a flow of energy is created, which is then wired dynamically to sonic parameters on high (structural) as well as low (timbral) temporal levels. Physical modeling in this context is applied not to synthesize sound but objects' behavior, to create and control flows of energy. Even though the following article describes an artistic approach to the issues at stake, we hope that it will allow for cross-pollination between analysis and practice in the sonic arts.

Keywords: Sonic arts, Electroacoustic music

Citation: Klien V., Grill T., Flexer A.: Because we are all falling down. Physics, gestures and relative realities. Proceedings of the International Computer Music Conference (ICMC'10), New York City, NY, USA, 2010.


OFAI-TR-2010-03 ( 211kB PDF file)

Docking Agent-based Simulation of Collective Emotion to Equation-based Models and Interactive Agents

Stefan Rank

The creation and demise of e-communities is greatly influenced by emergent emotional phenomena that warrant study above the level of individuals: collective emotions. Our requirements for an agent-based simulation of these phenomena are constrained both by the data available from online communities as well as by the scenarios of use for the simulation. In this paper, we consider both the relation to mathematical models developed in parallel as well as the use of the simulation as decision support for interactive conversational systems. We show how both of these 'docking' attempts inform the simulation design and contribute to it.

Keywords: Agent-based Simulation, Emotion, Collective Emotions

Citation: Rank S.: Docking Agent-based Simulation of Collective Emotion to Equation-based Models and Interactive Agents. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2010-03, 2010


OFAI-TR-2010-02 ( 162kB PDF file)

Designing an Agent-based Simulation of Collective Emotions

Stefan Rank

The creation and demise of e-communities is strongly influenced by emergent emotional phenomena that warrant study above the level of individuals: collective emotions. For an agent-based simulation of these phenomena, our requirements are constrained both by the data available as well as by the scenarios of use for the simulation. In this paper, we consider both the relation to mathematical models developed in parallel as well as the use of the simulation as decision support for interactive conversational systems. We show how both of these 'docking' attempts inform and contribute to the simulation design.

Keywords: Agent-based Simulation, Emotion, Collective Emotions

Citation: Rank S.: Designing an Agent-based Simulation of Collective Emotions. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2010-02, 2010


OFAI-TR-2010-01 ( 136kB PDF file)

Effects of Album and Artist Filters in Audio Similarity Computed for Very Large Music Databases

Arthur Flexer, Dominik Schnitzer

In audio based music recommendation, a well known effect is the dominance of songs from the same artist as the query song in recommendation lists. We verify that this effect also exists in very large databases (> 250000 songs). Since our data set contains multiple albums from individual artists, we can also show that the album effect is relatively bigger than the artist effect.

Keywords: Music Information Retrieval, Audio Similarity

Citation: Computer Music Journal, Vol. 34, No. 3: 20-28, 2010


OFAI-TR-2009-06 ( 1117kB PDF file)

Contextual Slow Feature Extraction Framework

Raphael Deimel

The paper presents an agent-based framework for investigating a class of learning algorithms that exploit temporal correlation in sensor signals. They are referred to as Slow Feature Extraction (SFE) methods, such as Slow Feature Analysis (SFA) (Wiskott and Sejnowski 2002) or spike-timing dependent neural plasticity (Körding and König 2001). The framework provides the notion of a Context within the agent, which can be utilized to suppress or affirm certain Slow Features when analysing sensor data with SFE methods. The paper presents several possible modifications to a basic slowness criterion as used by the Slow Feature Analysis algorithm. Simulations with a contextualized version of SFA (cSFA) show increased robustness of feature extraction in the face of different action patterns. The framework is further shown to naturally provide a hierarchical organisation of SFE methods and a formal description of multisensory settings, useful for investigating Correlative Learning.

Keywords: Slow Feature Analysis, Agent Based Computing, Context, Correlative Learning

Citation: Deimel R.: Contextual Slow Feature Extraction Framework. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2009-06, 2009


OFAI-TR-2009-05 ( 175kB PDF file)

Using Domain Knowledge to Improve Automatic Speech Recognition: Correcting Errors in Prescriptions of Medications

Stephanie Schreitter, Alexandra Klein, Johannes Matiasek, Harald Trost

We present an approach to improving automatic speech recognition (ASR) for the creation of medical reports by analyzing hypotheses in the word graph based on background knowledge. Our application area is prescriptions of medications, which are a frequent source of misrecognitions: In a sample of 123 reports, we found that no less than about a third of the active substances or trade names and dosages were recognized incorrectly. In about 25% of these errors, the correct string of words was contained in the word graph, a significant potential for improvement. To realize this potential, we have built a knowledge base of medications based on information contained in the Unified Medical Language System (UMLS). This knowledge base contains trade names, active substances, strengths and dosages. Based on this representation, we generate a variety of linguistic realizations for prescriptions. Whenever an inconsistency in a prescription is encountered in the best path of the word graph, the system searches for alternative paths which contain valid linguistic realizations of prescriptions consistent with the knowledge base. If such a path exists, a new concept edge with a better score is added to the word graph, resulting in a higher plausibility for this reading. The concept edge can be used for rescoring the word graph to obtain a new best path. A preliminary evaluation led to encouraging results: in about half of the cases where the word graph contained the correct variant, the correction was successful.

Keywords: Semantic rescoring, Speech recognition, Error correction

Citation: Schreitter S., Klein A., Matiasek J., Trost H.: Using Domain Knowledge to Improve Automatic Speech Recognition: Correcting Errors in Prescriptions of Medications. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2009-05, 2009


OFAI-TR-2009-04 ( 2kB PDF file)

A Fast Audio Similarity Retrieval Method for Millions of Music Tracks

Dominik Schnitzer, Arthur Flexer, Gerhard Widmer

We present a filter-and-refine method to speed up acoustic audio similarity queries which use the Kullback-Leibler divergence as similarity measure. The proposed method rescales the divergence and uses a modified FastMap implementation to accelerate nearest-neighbor queries. Overall the method accelerates the search for similar music pieces by a factor of 10 - 30 compared to a linear scan but still offers high recall values (relative to a linear scan) of 95 - 99%. We show how the proposed method can be used to query several million songs for their acoustic neighbors very fast while producing almost the same results that a linear scan over the whole database would return. We present a working prototype implementation which is able to process similarity queries on a 2.5 million songs collection in about half a second on a standard CPU.

Keywords: Music Information Retrieval, Indexing, Audio Similarity

Citation: Schnitzer D., Flexer A., Widmer G.: A Fast Audio Similarity Retrieval Method for Millions of Music Tracks. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2009-04, 2009


OFAI-TR-2009-03 ( 168kB PDF file)

A Filter-and-Refine Indexing Method for Fast Similarity Search in Millions of Music Tracks

Dominik Schnitzer, Arthur Flexer, Gerhard Widmer

We present a filter-and-refine method to speed up acoustic audio similarity queries which use the Kullback-Leibler divergence as similarity measure. The proposed method rescales the divergence and uses a modified FastMap implementation to accelerate nearest-neighbor queries. The search for similar music pieces is accelerated by a factor of 10 - 30 compared to a linear scan but still offers high recall values (relative to a linear scan) of 95 - 99%. We show how the proposed method can be used to query several million songs for their acoustic neighbors very fast while producing almost the same results that a linear scan over the whole database would return. We present a working prototype implementation which is able to process similarity queries on a collection of 2.5 million songs in about half a second on a standard CPU.

Keywords: Music Information Retrieval, Indexing

Citation: Schnitzer D., Flexer A., Widmer G.: A Filter-and-Refine Indexing Method for Fast Similarity Search in Millions of Music Tracks, in Proceedings of the 10th International Society for Music Information Retrieval Conference (ISMIR'09), Kobe, Japan, 2009.


OFAI-TR-2009-02 ( 1574kB PDF file)

FM4 Soundpark: Audio-based Music Recommendation in Everyday Use

Martin Gasser, Arthur Flexer

We present an application of content-based music recommendation techniques within an online community platform targeted at an audience interested mainly in independent and alternative music. The web platform's goals will be described, the problems of content management approaches based on daily publishing of new music tracks will be discussed, and we will give an overview of the user interfaces that have been developed to simplify access to the music collection. Finally, the adoption of content-based music recommendation tools and new user interfaces to improve user acceptance and recommendation quality will be justified by detailed user access analyses.

Keywords: Music Information Retrieval, Music Recommendation, Audio Similarity, Visualization

Citation: Proceedings of the 6th Sound and Music Computing Conference, Porto, Portugal, 2009.


OFAI-TR-2009-01 ( 60kB g-zipped PostScript file,  80kB PDF file)

Album and Artist Effects for Audio Similarity at the Scale of the Web

Arthur Flexer, Dominik Schnitzer

In audio based music recommendation, a well known effect is the dominance of songs from the same artist as the query song in recommendation lists. We verify that this effect also exists in a very large data set at the scale of the world wide web (>250000). Since our data set contains multiple albums from individual artists, we can also show that the album effect is relatively bigger than the artist effect.

Keywords: Music Information Retrieval, Audio Similarity

Citation: Proceedings of the 6th Sound and Music Computing Conference, Porto, Portugal, 2009.


OFAI-TR-2008-16 ( 156kB PDF file)

Identifying Segment Topics in Medical Dictations

Johannes Matiasek, Jeremy Jancsary, Alexandra Klein, Harald Trost

In this paper, we describe the use of lexical and semantic features for topic classification in dictated medical reports. First, we employ SVM classification to assign whole reports to coarse work-type categories. Afterwards, text segments and their topic are identified in the output of automatic speech recognition. This is done by assigning work-type-specific topic labels to each word based on features extracted from a sliding context window, again using SVM classification utilizing semantic features. Classifier stacking is then used for a posteriori error correction, yielding a further improvement in classification accuracy.

Citation: Matiasek J., Jancsary J., Klein A., Trost H.: Identifying Segment Topics in Medical Dictations. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-16, 2008


OFAI-TR-2008-15 ( 46kB PDF file)

The IRIS Network of Excellence: Integrating Research in Interactive Storytelling

Marc Cavazza, Stéphane Donikian, Marc Christie, Ulrike Spierling, Nicolas Szilas, Peter Vorderer, Tilo Hartmann, Christoph Klimmt, Elisabeth André, Ronan Champagnat, Paolo Petta, Patrick Olivier

Interactive Storytelling is a major endeavour to develop new media which could offer a radically new user experience, with a potential to revolutionise digital entertainment. European research in Interactive Storytelling has played a leading role in the development of the field, and this creates a unique opportunity to strengthen its position even further by structuring collaboration between some of its main actors. IRIS (Integrating Research in Interactive Storytelling) aims at creating a virtual centre of excellence that will be able to progress the understanding of fundamental aspects of Interactive Storytelling and the development of corresponding technologies.

Keywords: Interactive Storytelling, Interactive Narrative, Narrative Formalisms, Planning, Authoring Tools, Character Animation, Camera Control

Citation: to appear in: Spierling U., Szilas N. (eds.): Proceedings of the First Joint International Conference on Interactive Digital Storytelling, November 26-29, 2008, Erfurt, Germany, EU; Lecture Notes in Computer Science, Springer-Verlag Heidelberg.


OFAI-TR-2008-14 ( 69kB PDF file)

Mismatch interpretation by semantics-driven alignment

Martin Huber, Jeremy Jancsary, Alexandra Klein, Johannes Matiasek, Harald Trost

This paper describes a method for the alignment of automatically recognized speech transcripts with formatted documents manually derived from the speech recognition results. Novel features of our alignment method are a parametrizable scoring function, an intelligent tokenization system drawing on domain knowledge, and semantic comparisons. The field of application is dictated medical reports processed by automated speech recognition.

Keywords: alignment, semantic similarity

Citation: Huber M., Jancsary J., Klein A., Matiasek J., Trost H.: Mismatch interpretation by semantics-driven alignment. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-14, 2008


OFAI-TR-2008-13 ( 64kB PDF file)

Semantics-based Automatic Literal Reconstruction Of Dictations

Jeremy Jancsary, Alexandra Klein, Johannes Matiasek, Harald Trost

This paper describes a method for the automatic literal reconstruction of dictations in the domain of medical reports. The raw output of an automatic speech recognition system and the final report edited by a professional medical transcriptionist serve as input to the reconstruction algorithm. Reconstruction is based on automatic alignment between the speech recognition result and the edited report. Based on an ontology (i.e. UMLS) and lexical resources (i.e. WordNet and an inventory of spoken variants for each concept), semantic representations are assigned to terms and phrases. Alignment takes into account semantic similarity scores, based on the similarity between semantic representations of the two sources, and phonetic similarity scores. This paper explains how the speech recognition output is compared and aligned to the edited written documents and how the two different input sources are complementary for the task of reconstructing a literal transcript.

Keywords: semantic similarity, alignment

Citation: Jancsary J., Klein A., Matiasek J., Trost H.: Semantics-based Automatic Literal Reconstruction Of Dictations. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-13, 2008


OFAI-TR-2008-12 ( 463kB PDF file)

Revealing the Structure of Medical Dictations with Conditional Random Fields

Jeremy Jancsary, Johannes Matiasek, Harald Trost

Automatic processing of medical dictations poses a significant challenge. We approach the problem by introducing a statistical framework capable of identifying types and boundaries of sections, lists and other structures occurring in a dictation, thereby gaining explicit knowledge about the function of such elements. Training data is created semi-automatically by aligning a parallel corpus of corrected medical reports and corresponding transcripts generated via automatic speech recognition. We highlight the properties of our statistical framework, which is based on conditional random fields (CRFs) and implemented as an efficient, publicly available toolkit. Finally, we show that our approach is effective both under ideal conditions and for real-life dictation involving speech recognition errors and speech-related phenomena such as hesitation and repetitions.

Keywords: medical dictations, conditional random fields, structure recognition, text segmentation

Citation: Jancsary J., Matiasek J., Trost H.: Revealing the Structure of Medical Dictations with Conditional Random Fields. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-12, 2008


OFAI-TR-2008-11 ( 50kB PDF file)

Interaction Based Knowledge Acquisition and Exchange Using Grounded Symbols - Transferring Affordance Based Learning to Virtual Agents

Joerg Irran, Brigitte Krenn

The presented work introduces the basis for a new generation of virtual agents that are able to assist their users in finding, retrieving and processing data. Since higher level cognitive processes, however, still remain the domain of humans, these virtual agents can be seen, in a metaphorical sense, as “dogs” being able to provide their skills to humans after a training/learning process guided by the user. The aim of our research is to combine the power of both sides, computers and humans, to realize virtual agents that provide capable assistance to their users with a certain degree of autonomous and flexible behaviour but still being guided by humans. In our approach, we do not attempt to mimic human cognition. Rather, we enable the agents to learn via self-experience, from positive and negative feedback by the user, and from communication with other agents of their kind using grounded and agreed upon symbols. The design of the agents is inspired by insights from embodied cognition - in particular from affordance-based robotics - that are transferred to a virtual context.

Keywords: publication, NLU, agents, affordance, robotics

Citation: Irran J., Krenn B.: Interaction Based Knowledge Acquisition and Exchange Using Grounded Symbols - Transferring Affordance Based Learning to Virtual Agents. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-11, 2008


OFAI-TR-2008-10 ( 28kB PDF file)

Acquisition and Exchange of Knowledge - From Real to Virtual Embodiment

Joerg Irran, Gregor Sieber, Marcin Skowron, Brigitte Krenn

Today's computer power enables us to create software agents that can process large amounts of data in very short time. Higher level cognitive processes, however, still remain the domain of humans. The aim of our research is to combine the power of both sides, to realize virtual agents that provide capable assistance to their users. In our approach we do not attempt to mimic human cognition. Rather, we enable the agents to learn via self-experience, from positive and negative feedback by the user, and from communication with other agents of their kind using grounded and agreed upon symbols. The design of the agents is inspired by insights from embodied cognition - in particular from affordance-based robotics - that are transferred to a virtual context.

Keywords: publications, NLU, Rascalli, affordances, learning, communication, agents

Citation: Irran J., Sieber G., Skowron M., Krenn B.: Acquisition and Exchange of Knowledge - From Real to Virtual Embodiment. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-10, 2008


OFAI-TR-2008-09 ( 42kB PDF file)

Functional Mark-up for Behaviour Planning: Theory and Practice

Brigitte Krenn, Gregor Sieber

We approach the discussion of requirements for an FML from a high-level perspective on communication and the current state of developments in ECA communication. From a general point of view, questions arise such as: Who is communicating to whom, and in which socio-cultural and situational context? What is the overall interaction history of the communication partners, and what is the history of the ongoing dialogue? What is the intention of the communication and what is its content? Transferring these questions to the ECA domain leads, at the least, to questions of modelling the virtual character's persona, including some notion of personality and emotion, and of modelling the communication act itself, be it in terms of real-time action and response or in terms of generating a complete dialogue scene in one go. Our goal is mainly to come up with open questions and core topics regarding a possible scope of an FML given the current state of the art in ECA communication. From a practical point of view, we start from a narrowed-down perspective on modelling the communication partners and the communication act. In section 2, we give a brief outline of the current state of ECA development and its implications for the creation of a commonly used mark-up or representation language at the interface of intent and behaviour planning. We propose a set of person characteristics and aspects of communication acts that need to be considered in the specification of a functional mark-up language. This is followed by a discussion of some basic building blocks relevant for the computation of communicative events (section 3). In section 4, we finally point out that one of the main challenges of FML lies in finding a trade-off between detailed semantic descriptions and interoperability of system components. We round off our considerations with some words of caution regarding the feasibility and desirability of a clear-cut separation between intent and behaviour planning.

Keywords: publications, NLU, Rascalli

Citation: Krenn B., Sieber G.: Functional Mark-up for Behaviour Planning: Theory and Practice. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-09, 2008


OFAI-TR-2008-08 ( 295kB PDF file)

A Content-Based User-Feedback Driven Playlist Generator and its Evaluation in a Real-World Scenario

Martin Gasser, Elias Pampalk, Martin Tomitsch

The Simple Playlist Generator (SPG) is a software audio player with integrated playlist generation capabilities. It combines purely content-based playlist generation with an iterative playlist refinement process based on user feedback. In this paper we briefly describe the system, and we present a systematic user evaluation of the software. Our results indicate that content-based playlist generation bears a lot of potential in the domain of personal music collections. Users very quickly adapted to the idea of playlists based on music similarity and were satisfied with the quality of automatically generated playlists.

Keywords: user evaluation, playlist generation, user feedback, audio similarity

Citation: Gasser M., Pampalk E., Tomitsch M.: A Content-Based User-Feedback Driven Playlist Generator and its Evaluation in a Real-World Scenario. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-08, 2008


OFAI-TR-2008-07 ( 198kB PDF file)

An Implementation of a Simple Playlist Generator Based on Audio Similarity Measures and User Feedback

Elias Pampalk, Martin Gasser

This paper presents an implementation of a simple playlist generator. An audio-based music similarity measure and simple heuristics are used to create playlists given minimum user input. The ultimate goal of this work is to conduct a field study, i.e., to run the system on the users' personal collection and study the usage behavior over a longer period of time. The functions include, for example, allowing the user to control the variance of the playlists in terms of how often the same song or songs from the same artists are repeated.

Keywords: content-based, music similarity, user feedback, playlist generation

Citation: Pampalk E., Gasser M.: An Implementation of a Simple Playlist Generator Based on Audio Similarity Measures and User Feedback. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-07, 2008


OFAI-TR-2008-06 ( 201kB PDF file)

Automatic Reduction of MIDI Files Preserving Relevant Musical Content

Søren Tjagvad Madsen, Rainer Typke, Gerhard Widmer

Retrieving music from large digital databases is a demanding computational task. The cost for indexing and searching depends not only on the computational effort of measuring musical similarity, but also heavily on the number and sizes of files in the database. One way to speed up music retrieval is to reduce the search space by removing redundant and uninteresting material in the database. We propose a simple measure of 'interestingness' based on music complexity, and present a reduction algorithm for MIDI files based on this measure. It is evaluated by comparing reduction ratios and the correctness of retrieval results for a query by humming task before and after applying the reduction.

Keywords: interestingness, music complexity, reduction, MIDI

Citation: Madsen S., Typke R., Widmer G.: Automatic Reduction of MIDI Files Preserving Relevant Musical Content. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-06, 2008


OFAI-TR-2008-05 ( 328kB PDF file)

Game-Based Development of Collaboration Competences

Sabine Payr, Bernhard Jung, Juan Martinez-Miranda, Paolo Petta

The subject of collaboration has attracted attention in research areas including management, organizational dynamics and education, mainly because effective collaboration dynamics are fundamental to learning, knowledge exchange, and development/innovation processes in a wide variety of contexts. Collaboration competences are emerging as a key condition for productive and sustainable innovation and learning processes. The work presented in this paper is part of an international effort aimed at improving the understanding of factors inhibiting effective collaboration dynamics and at developing collaboration competences through training. An integral part of the training settings are simulation games addressing collaboration challenges at the organizational, group and interpersonal levels. This paper focuses on the design and implementation of the embodied virtual character for the interpersonal level game, its theoretical foundations and its deployment and evaluation in learning situations.

Keywords: Serious Games, Blended Learning, Soft Skills, Collaboration, Virtual Character

Citation: Payr S., Jung B., Martinez-Miranda J., Petta P. (2008) Game-Based Development of Collaboration Competences. in: Proceedings of ED-MEDIA 2008, World Conference on Educational Multimedia, Hypermedia & Telecommunications, June 30-July 4, 2008, Vienna, Austria (EU), Association for the Advancement of Computing in Education, Chesapeake VA USA, 4554-4563.


OFAI-TR-2008-04 ( 752kB g-zipped PostScript file,  345kB PDF file)

StreamCatcher: Integrated Visualization of Music Clips and Online Audio Streams

Martin Gasser, Arthur Flexer, Gerhard Widmer

We propose a content-based approach to explorative visualization of online audio streams (e.g., web radio streams). The visualization space is defined by prototypical instances of musical concepts taken from personal music collections. Our system shows the relation of prototypes to each other and generates an animated visualization that places representations of audio streams in the vicinity of their most similar prototypes. Both computation of music similarity and visualization are formulated for online real time performance. A software implementation of these ideas is presented and evaluated.

Keywords: Music Information Retrieval, Visualization, Multidimensional Scaling

Citation: Gasser M., Flexer A., Widmer G.: StreamCatcher: Integrated Visualization of Music Clips and Online Audio Streams. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-04, 2008


OFAI-TR-2008-03 ( 57kB g-zipped PostScript file,  133kB PDF file)

Playlist Generation using Start and End Songs

Arthur Flexer, Dominik Schnitzer, Martin Gasser, Gerhard Widmer

A new algorithm for automatic generation of playlists with an inherent sequential order is presented. Based on a start and end song it creates a smooth transition allowing users to discover new songs in a music collection. The approach is based on audio similarity and does not require any kind of meta data. It is evaluated using both objective genre labels and subjective listening tests. Our approach allows users of the website of a public radio station to create their own digital "mixtapes" online.

Keywords: Music Information Retrieval, Music recommendation, Pattern Recognition

Citation: Flexer A., Schnitzer D., Gasser M., Widmer G.: Playlist Generation using Start and End Songs. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-03, 2008


OFAI-TR-2008-02 ( 382kB PDF file)

Computational Framework for and the Realization of Cognitive Agents Providing Intelligent Assistance Capabilities

Marcin Skowron, Joerg Irran, Brigitte Krenn

The scope of the presented research covers virtual agents providing intelligent assistance capabilities for accessing and processing information from the Internet, domain-specific databases and knowledge repositories. They receive natural language inputs and communicate findings to their users via a set of task-oriented interfaces. Cognitive agents are conceived to evolve in response to changes in the interests, needs and preferences of their users and to alterations in their environment. We present an architecture for virtual embodied cognitive agents, and a computational framework that allows their modular and flexible creation, based on a set of components. The framework supports the creation of an environment for multiple agents and provides communication mechanisms used to share knowledge between the agents. The exemplary assembly of these building blocks to realize smart assistance applications further demonstrates the platform's capacity to support development, instantiation and evaluation of collaborative cognitive agents.

Keywords: Cognitive Agent, Intelligent User Interface, Intelligent Assistance, Modularity, Bottom-up and Top-down control

Citation: Skowron M., Irran J., Krenn B.: Computational Framework for and the Realization of Cognitive Agents Providing Intelligent Assistance Capabilities, Proceedings of the 18th European Conference on Artificial Intelligence (ECAI'2008), July 21-25 2008, Patras, Greece, 2008.


OFAI-TR-2008-01 ( 9004kB PDF file)

AT2AI-6: From Agent Theory to Agent Implementation. Working Notes.

Bernhard Jung, Fabien Michel, Alessandro Ricci, Paolo Petta

Working Notes of the 6th International Workshop AT2AI-6: From Agent Theory to Agent Implementation. Workshop at the Seventh International Conference on Autonomous Agents and Multiagent Systems AAMAS 2008, May 12-16, Estoril, Portugal, EU.

Keywords: Multi-agent Systems, AAMAS Workshop, Agent Theory, Agent Implementation

Citation: Jung B., Michel F., Ricci A., Petta P.: AT2AI-6: From Agent Theory to Agent Implementation. Working Notes. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2008-01, 2008


OFAI-TR-2007-11 ( 394kB PDF file)

Phonetic Segmentation of the GEMEP-Corpus: Applying Forced Alignment on Emotional Speech

Hannes Pirker

This report documents the efforts of applying MFCC based Hidden Markov Models for the task of phonetic segmentation of emotional speech. The samples of emotional speech were taken from the Geneva Multimodal Emotion Portrayals (GEMEP) corpus. This multimodal corpus of acted emotional utterances provides data with highly uniform and controlled lexical content, and thus offers a promising basis for further systematic studies, especially on the acoustic properties of emotional speech as well as on the temporal relationship between speech, gestures and facial expressions. The phonetic segmentation on the level of phonemes described in this report offers a solid basis for all kinds of further investigations of temporal properties of multimodal emotional data. The report provides a description of the technical lay-out of the automatic alignment procedure, observations on peculiarities of the data and an evaluation of the obtained quality of the segmentation.

Keywords: Automatic Alignment, Phonetic Segmentation, Emotional Speech

Citation: Pirker H.: Phonetic Segmentation of the GEMEP-Corpus: Applying Forced Alignment on Emotional Speech. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-11, 2007


OFAI-TR-2007-10

The Intermediary Agent's Brain: Supporting Learning to Collaborate at the Inter-Personal Level

Juan Martinez, Bernhard Jung, Sabine Payr, Paolo Petta

This paper discusses the design of the Intermediary Agent's brain, the control module of an embodied conversational virtual peer in a simulation game aimed at providing learning experiences regarding the dynamics of collaboration at the inter-personal (IP) level. We derive the overall aims of the game from theoretical foundations in collaboration theory and pedagogical theory and related requirements for the virtual peer; present the overall modular design of the system; and then detail the design perspectives and the interplay of the related operationalised concepts leading to the control architecture of the Intermediary Agent, that is realised as a simple cognitive appraisal process driven by direct and indirect effects of the mission-oriented and social interactions of players and agent on the agent's level of trust in its human peers. We conclude with coverage of related work and key challenges ahead.

Keywords: Interface Agents, Collaboration, Human-Computer Interaction, Situated Learning

Citation: Martinez J., Jung B., Payr S., Petta P.: The Intermediary Agent's Brain: Supporting Learning to Collaborate at the Inter-Personal Level. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-10, 2007


OFAI-TR-2007-09 ( 272kB PDF file)

Designing Criteria-Driven Scheduling as Integrated Service for IEEE-FIPA Compliant Multi-Agent Infrastructures

Christoph Hermann, Bernhard Jung, Paolo Petta

The Foundation for Intelligent Physical Agents (FIPA) provides a rich set of standards for implementing industrial scale multi-agent infrastructures. Despite its manifold possibilities for achieving coordinated action execution based on interaction protocols, it lacks direct support of multi-agent planning and scheduling for goal directed action execution. In this paper, we discuss a design strategy to integrate Design-to-Criteria (DTC) scheduling using the Framework for Task Analysis, Environment Modeling and Simulation (TAEMS) for explicit partial modelling of coordination issues into FIPA infrastructures, as represented by the Java Agent DEvelopment Framework (JADE). Following the concept of "coordination as a service", we exploit the infrastructural facilities of the FIPA multi-agent platform, and re-use FIPA interaction protocols for exchange of partial TAEMS structures, as well as for committing to action execution.

Keywords: Multi-agent Systems, TAEMS, Java Agent DEvelopment Framework (JADE), Design-to-Criteria Scheduling (DTC), Coordination

Citation: Hermann C., Jung B., Petta P.: Designing Criteria-Driven Scheduling as Integrated Service for IEEE-FIPA Compliant Multi-Agent Infrastructures. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-09, 2007


OFAI-TR-2007-08

Learning to anticipate a temporarily hidden moving object

Achim Lewandowski

In this paper we provide a robot with the ability to anticipate the location of reappearance of a moving target, usually a ball, which is temporarily hidden behind a wall. The images taken by the robot are simplified and transformed into a smaller number of sector views, whereby each sector is assigned to one of the states 'target', 'wall' or 'background'. Based on the current observed sector view an action sequence is started, which is either followed until a different sector view has occurred or a pre-defined number of steps has elapsed. We need to add a simple memory, if the object is allowed to pass behind the wall from both ends. The goal for the robot is to attain a distance to the object, which is lower than a given threshold, possibly in the smallest number of steps. During training rewards are derived from the average number of steps needed to reach a new sector view and a matrix of estimated transition probabilities is updated with every transition. Bumping into walls is penalized. Accessible states which belong to views with the so far observed maximum number of target sectors are declared as goal states. We alter the transition and the reward matrices to allow the application of known optimization algorithms to find a path to the goal states. We demonstrate the applicability of our algorithm with simulation experiments.

Keywords: Robotics, Anticipation

Citation: Lewandowski A.: Learning to anticipate a temporarily hidden moving object. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-08, 2007
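
The abstract of TR-2007-08 mentions a matrix of estimated transition probabilities that is updated with every transition. A minimal sketch of such a running estimate is given below. It is not the report's implementation: the class name `TransitionModel`, the encoding of a sector view as a tuple of 'target' / 'wall' / 'background' labels, and the action names are assumptions made for illustration only.

```python
from collections import defaultdict


class TransitionModel:
    """Count-based estimate of P(next_view | view, action) (hypothetical helper)."""

    def __init__(self):
        # counts[(view, action)][next_view] = number of observed transitions
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, view, action, next_view):
        """Record one observed transition."""
        self.counts[(view, action)][next_view] += 1

    def probability(self, view, action, next_view):
        """Relative frequency of next_view after (view, action); 0.0 if unseen."""
        row = self.counts.get((view, action))
        if not row:
            return 0.0
        return row.get(next_view, 0) / sum(row.values())


# Illustrative sector views with three sectors each (assumed encoding).
hidden = ("background", "wall", "background")
visible = ("background", "wall", "target")

model = TransitionModel()
model.update(hidden, "wait", visible)
model.update(hidden, "wait", visible)
model.update(hidden, "wait", hidden)
print(model.probability(hidden, "wait", visible))
```

Keeping raw counts rather than normalized probabilities makes each update a constant-time increment; the probabilities are derived only when queried.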


OFAI-TR-2007-07 ( 647kB PDF file)

Anticipatory Behaviour based on Prediction of Image Sequences

Achim Lewandowski, Patrick M. Poelz, Georg Dorffner

In our scenario an object moves in an environment with obstacles. We would like to provide a robot with anticipatory behaviour. The target is often not visible and the robot should learn to anticipate the location where the object will most likely reappear. The observations are coded as a sequence of views and therefore no physical motion model is needed for the moving object. We apply Prediction Fractal Machines (PFM) as well as Variable Length Markov Models (VLMM) to predict the continuation of the sequence of views. We present results of simulations and real world experiments.

Keywords: Robotics, Anticipation, Image processing, Prediction Fractal Machines, VLMM

Citation: Lewandowski A., Poelz P., Dorffner G.: Anticipatory Behaviour based on Prediction of Image Sequences. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-07, 2007


OFAI-TR-2007-06 ( 283kB PDF file)

Using Particle Filters to Anticipate the Location of Reappearance of a Temporarily Hidden Target

Achim Lewandowski, Patrick M. Poelz, Georg Dorffner

A robot and a moving target act in a known environment. Obstacles, predominantly walls, may hinder free view and motion. The task of the robot is to anticipate a possible location of reappearance if the target is currently hidden. We assume in our approach that the belief about the current state of the target can be expressed as a time-dependent probability function, which itself is represented by means of particle filters. The robot knows the dynamics of the target or is at least allowed, during a training phase, to learn the one-step forecast of the state of the target, including the behavior near a wall. We show how to forecast the state of the target, how to use the robot’s observations for an update of the current belief, and finally how to take advantage of the approach by predicting a possible location of reappearance to which the robot might be sent.

Keywords: Robotics, Anticipation, Particle Filters

Citation: Lewandowski A., Poelz P., Dorffner G.: Using Particle Filters to Anticipate the Location of Reappearance of a Temporarily Hidden Target. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-06, 2007
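
The forecast and update steps named in the abstract of TR-2007-06 can be illustrated with a very small one-dimensional particle filter. This is only a sketch under strong assumptions that are not taken from the report: the target moves along a line with a known constant velocity, a wall hides the interval [2, 6], and the only observation is that the target is currently not visible, so particles outside the hidden interval are discarded and the rest are resampled. All function names are hypothetical.

```python
import random


def forecast(particles, velocity, noise, rng):
    """One-step forecast: shift every particle and add Gaussian motion noise."""
    return [x + velocity + rng.gauss(0.0, noise) for x in particles]


def update(particles, is_consistent, rng):
    """Keep particles consistent with the observation and resample to the same size."""
    survivors = [x for x in particles if is_consistent(x)]
    if not survivors:
        # The observation contradicts every particle; keep the prior belief.
        return particles
    return [rng.choice(survivors) for _ in particles]


def expected_position(particles):
    """Mean of the particle set, used here as a point estimate of the belief."""
    return sum(particles) / len(particles)


def behind_wall(x):
    """Assumed observation model: the target is unseen, so it must be in [2, 6]."""
    return 2.0 <= x <= 6.0


rng = random.Random(0)
particles = [2.0] * 500  # target last seen entering the hidden region at x = 2
for _ in range(3):
    particles = forecast(particles, velocity=1.0, noise=0.1, rng=rng)
    particles = update(particles, behind_wall, rng)
print(expected_position(particles))
```

In this toy setting the belief drifts towards the far end of the wall, which is where a robot would be sent to wait for the target to reappear.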


OFAI-TR-2007-05 ( 557kB PDF file)

From ActAffAct to BehBehBeh: Increasing Affective Detail in a Story-World

Stefan Rank, Paolo Petta

Story-worlds are virtual worlds inhabited by synthetic characters that provide an environment in which users participate actively in the creation of a narrative. Implementation approaches range from plot-driven to character-based. Character-based approaches require synthetic agents with autonomy and personality. Affective agent architectures are used to construct such autonomous personality agents, and computational models of emotion are seen as a prerequisite for the required emotional and social competences. The present paper reports on ongoing work based on the experiences gained in earlier work, in particular TABASCO and ActAffAct (Acting Affectively affecting Acting).

Keywords: story-world, interactive narrative, affective agent architectures, disgust

Citation: Rank S., Petta P. (2007): From ActAffAct to BehBehBeh: Increasing Affective Detail in a Story-World. In Cavazza M., Donikian S. (eds.), Virtual Storytelling: Fourth International Conference (ICVS 2007), St.Malo, France, EU, December, 2007. Proceedings, Springer-Verlag Berlin Heidelberg, pp.206-209.
Report: Rank S., Petta P. (2007): From ActAffAct to BehBehBeh: Increasing Affective Detail in a Story-World. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2007-05.


OFAI-TR-2007-04 ( 234kB g-zipped PostScript file,  384kB PDF file)

Independent Component Analysis of EEG recorded during two-person game playing

Arthur Flexer, Scott Makeig

We report a study on two-person game playing involving simultaneous EEG recording from both subjects. Independent Component Analysis is used for identifying activities of individual cortical EEG sources. Activity of a midline fronto-central component is identified in four of five subjects. This component accounts for the P300 waveform whose amplitude varies depending on the success in the gaming situation.

Keywords: EEG, Game Theory, Independent Component Analysis

Citation: Flexer A., Makeig S.: Independent Component Analysis of EEG recorded during two-person game playing, to appear in: Applied Artificial Intelligence, 2007.


OFAI-TR-2007-03 ( 143kB PDF file)

Building a computational model of emotion based on parallel processes and resource management

Stefan Rank

Dramatic story-worlds require software agents with emotional competences. I propose as building blocks for a computational model of emotion explicitly bounded resources and concurrently active processes acquiring and using the resources. A set of objectives for the implementation of such a model is presented based on limitations identified in earlier approaches towards the same goal. The proposed method for achieving these objectives involves incremental modelling of a growing collection of emotional episodes, with a clear delineation of technically necessary simplifications.

Keywords: affective agent architectures, computational modelling, embodiment, design criteria

Citation: Rank S. (2007): Building a computational model of emotion based on parallel processes and resource management. In Cowie R., Rosis F.de (eds.), Proceedings of the Doctoral Consortium, The Second International Conference on Affective Computing and Intelligent Interaction, September 12-14, 2007, Lisbon, Portugal, pp.102-109.


OFAI-TR-2007-02 ( 986kB PDF file)

Basing artificial emotion on process and resource management

Stefan Rank, Paolo Petta

Executable computational process models of emotion are based on specific sets of modelling primitives. Motivated by the requirements of a specific scenario and concepts used by emotion theories, we propose as building blocks explicitly bounded resources and concurrent processes acquiring and using them. Our approach is intended for the incremental modelling of a growing collection of emotional episodes, with a clear delineation of technically necessary simplifications of the natural phenomena. An episode of disgust is used to discuss the approach, which is realised using real-time cooperative microthreading technology.

Keywords: affective agent architectures, appraisal theories, computational modelling, design criteria, disgust, embodiment, real-time systems

Citation: Rank S., Petta P. (2007): Basing artificial emotion on process and resource management. In Paiva A. et al. (eds.), Proceedings of the 2nd International Conference on Affective Computing and Intelligent Interaction, September 12-14, 2007, Lisbon, Portugal, Springer-Verlag Berlin Heidelberg, pp.350-361.


OFAI-TR-2007-01 ( 45kB g-zipped PostScript file,  54kB PDF file)

A closer look on artist filters for musical genre classification

Arthur Flexer

Musical genre classification is the automatic classification of audio signals into user defined labels describing pieces of music. A problem inherent to genre classification experiments in music information retrieval research is the use of songs from the same artist in both training and test sets. We show that this does not only lead to over-optimistic accuracy results but also selectively favours particular classification approaches. The advantage of using models of songs rather than models of genres vanishes when applying an artist filter. The same holds true for the use of spectral features versus fluctuation patterns for preprocessing of the audio files.

Keywords: music information retrieval, audio classification, genre classification, artist filter

Citation: Flexer A.: Musical Genre Classification and the Artist Filter Effect, in Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR'07), Vienna, Austria, 2007.
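
The artist filter discussed in TR-2007-01 amounts to splitting a collection so that no artist contributes songs to both the training and the test set. The following sketch shows one simple way to do that; the function name `artist_filtered_split`, the tuple layout `(song_id, artist, genre)` and the example data are assumptions for illustration and do not come from the report.

```python
import random


def artist_filtered_split(songs, test_fraction=0.3, seed=0):
    """Split songs into (train, test) so that the two sets share no artist.

    songs: list of (song_id, artist, genre) tuples (assumed layout).
    test_fraction applies to the number of artists, not the number of songs.
    """
    artists = sorted({artist for _, artist, _ in songs})
    rng = random.Random(seed)
    rng.shuffle(artists)
    n_test = max(1, round(len(artists) * test_fraction))
    test_artists = set(artists[:n_test])
    train = [s for s in songs if s[1] not in test_artists]
    test = [s for s in songs if s[1] in test_artists]
    return train, test


# Made-up example collection.
songs = [
    ("s1", "artist_a", "jazz"), ("s2", "artist_a", "jazz"),
    ("s3", "artist_b", "rock"), ("s4", "artist_b", "rock"),
    ("s5", "artist_c", "jazz"), ("s6", "artist_d", "rock"),
]
train, test = artist_filtered_split(songs)
print(len(train), len(test))
```

Because whole artists are assigned to one side, a classifier can no longer score well merely by recognising an artist's characteristic sound in the test set.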


OFAI-TR-2006-15 ( 497kB PDF file)

Social role management in human-computer interaction I

Sabine Payr

This report summarises research undertaken in the first year of sub-project 4 "Social Role Management in Human-Computer Interaction" of the FFG SELP project "Advanced Knowledge Technologies: Grounding, Fusion, Applications". The first part describes the corpus of human-human and human-machine dialogues that have been collected. The second part reviews methods of discourse analysis that are being used for dialogue analysis. The aspect of social relationships considered at this stage is inequality in power, be it institutional or interactional. This theme has influenced the review of methods and the first analyses that have been undertaken. It is already possible to present and discuss some of the results in terms of discourse strategies and devices, from literature as well as from our own analyses. It is shown that it is possible to apply these notions to human-machine dialogues. Examples show the relevance of the approach and underline the importance of parallel and comparative use of data from human-human and human-machine dialogues. Only in this way is it possible to describe not only the presence but also the absence of certain discursive signals and the effects of this presence/absence on the social or socio-technical relationship between humans and conversational machines.

Keywords: human-computer interaction, Embodied Conversational Actors (ECAs), Dialogue Simulation, Conversation Analysis, Critical Discourse Analysis

Citation: Payr S.: Social role management in human-computer interaction I. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2006-15, 2006


OFAI-TR-2006-14 ( 805kB g-zipped PostScript file,  211kB PDF file)

Acoustic Cues to Beat Induction: A Machine Learning Perspective

Fabien Gouyon, Gerhard Widmer, Xavier Serra, Arthur Flexer

This paper brings forward the question of which acoustic features are the most adequate for identifying beats computationally in acoustic music pieces. We consider many different features computed on consecutive short portions of acoustic signal, among which those currently promoted in the literature on beat induction from acoustic signals and several original features, unmentioned in this literature. Evaluation of feature sets regarding their ability to provide reliable cues to the localization of beats is based on a machine learning methodology with a large corpus of beat-annotated music pieces, in audio format, covering distinctive music categories. Confirming common knowledge, energy is shown to be a very relevant cue to beat induction (especially the temporal variation of energy in various frequency bands, with the special relevance of frequency bands below 500 Hz and above 5 kHz). Some of the new features proposed in this paper are shown to outperform features currently promoted in the literature on beat induction from acoustic signals. We finally hypothesize that modelling beat induction may involve many different, complementary, acoustic features and that the process of selecting relevant features should partly depend on acoustic properties of the very signal under consideration.

Keywords: Beat Induction, Music Information Retrieval

Citation: to appear in: Music Perception


OFAI-TR-2006-13 ( 241kB PDF file)

Detecting Harmonic Change in Musical Audio

Christopher Harte, Mark Sandler, Martin Gasser

We propose a novel method for detecting changes in the harmonic content of musical audio signals. Our method uses a new model for Equal Tempered Pitch Class Space. This model maps 12-bin chroma vectors to the interior space of a 6-D polytope; pitch classes are mapped onto the vertices of this polytope. Close harmonic relations such as fifths and thirds appear as small Euclidean distances. We calculate the Euclidean distance between analysis frames n + 1 and n - 1 to develop a harmonic change measure for frame n. A peak in the detection function denotes a transition from one harmonically stable region to another. Initial experiments show that the algorithm can successfully detect harmonic changes such as chord boundaries in polyphonic audio recordings.

Keywords: Pitch Space, Harmonic, Segmentation, Music, Audio

Citation: C. Harte, M. Sandler, and M. Gasser. Detecting Harmonic Change in Musical Audio. In Proceedings of the Audio and Music Computing for Multimedia Workshop (in conjunction with ACM Multimedia 2006), October 27, 2006, Santa Barbara, CA, USA (to appear)
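
The harmonic change measure described in the abstract of TR-2006-13 (the Euclidean distance between frames n + 1 and n - 1, followed by peak picking) can be sketched as follows. The sketch assumes the frames have already been mapped into the 6-D space; that mapping is not specified in the abstract and is not reproduced here. The function names, the zero padding at the two boundary frames and the simple local-maximum rule are illustrative choices, not the authors' implementation.

```python
import math


def harmonic_change(frames):
    """Distance between the neighbours of each frame; boundary frames get 0.0."""
    measure = [0.0] * len(frames)
    for n in range(1, len(frames) - 1):
        measure[n] = math.dist(frames[n + 1], frames[n - 1])
    return measure


def pick_peaks(values):
    """Indices of local maxima (the first index of a plateau is reported)."""
    return [
        n for n in range(1, len(values) - 1)
        if values[n - 1] < values[n] >= values[n + 1]
    ]


# Two made-up harmonically stable regions in an assumed 6-D space.
region_a = (1.0, 0.0, 0.0, 0.0, 0.0, 0.0)
region_b = (0.0, 1.0, 0.0, 0.0, 0.0, 0.0)
frames = [region_a] * 3 + [region_b] * 3

measure = harmonic_change(frames)
print(pick_peaks(measure))
```

Comparing the two neighbours of frame n, rather than frame n with its predecessor, means a change is registered at the frames adjacent to the boundary on either side.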


OFAI-TR-2006-12 ( 82kB PDF file)

Onset Detection Revisited

Simon Dixon

Various methods have been proposed for detecting the onset times of musical notes in audio signals. We examine recent work on onset detection using spectral features such as the magnitude, phase and complex domain representations, and propose improvements to these methods: a weighted phase deviation function and a half-wave rectified complex difference. These new algorithms are compared with several state-of-the-art algorithms from the literature, and these are tested using a standard data set of short excerpts from a range of instruments (1060 onsets), plus a much larger data set of piano music (106054 onsets). Some of the results contradict previously published results and suggest that a similarly high level of performance can be obtained with a magnitude-based (spectral flux), a phase-based (weighted phase deviation) or a complex domain (complex difference) onset detection function.

Keywords: Music, Audio content extraction

Citation: Proceedings of the 9th International Conference on Digital Audio Effects
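
As a rough illustration of the magnitude-based (spectral flux) detection function named in the abstract above, the sketch below sums the positive frame-to-frame changes in magnitude over all frequency bins. The definition used here is the common textbook form and is an assumption, as is the function name `spectral_flux`; the magnitude spectrogram is assumed to be given, and the paper's own normalisation and peak picking are not reproduced.

```python
def spectral_flux(magnitudes):
    """Half-wave rectified magnitude difference, summed over bins, per frame.

    `magnitudes` is a list of frames, each a list of bin magnitudes
    (assumed precomputed, e.g. from a short-time Fourier transform).
    """
    if not magnitudes:
        return []
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(magnitudes, magnitudes[1:]):
        # Only increases in energy count towards an onset.
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux

mags = [[1.0, 1.0], [3.0, 0.0], [2.0, 2.0]]
print(spectral_flux(mags))
```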


OFAI-TR-2006-11 ( 595kB PDF file)

Exploring Pianist Performance Styles with Evolutionary String Matching

S. T. Madsen, G. Widmer

We propose novel machine learning methods for exploring the domain of music performance praxis. Based on simple measurements of timing and intensity in 12 recordings of a Schubert piano piece, short performance sequences are fed into a SOM algorithm in order to calculate 'performance archetypes'. The archetypes are labeled with letters and approximate string matching done by an evolutionary algorithm is applied to find similarities in the performances represented by these letters. We present a way of measuring each pianist's habit of playing similar phrases in similar ways and propose a ranking of the performers based on that. Finally, an experiment revealing common expression patterns is briefly described.

Keywords: Self Organizing Map, Evolutionary Algorithm, Approximate String Matching, Expressive Music Performance

Citation: Madsen S.T., Widmer G.: Exploring Pianist Performance Styles with Evolutionary String Matching. International Journal on Artificial Intelligence Tools. World Scientific Publishing Company, 2006.


OFAI-TR-2006-10 ( 126kB PDF file)

Unobtrusive practice tools for pianists

W. Goebl, G. Widmer

This paper proposes novel computer-based interfaces for piano practicing. They are designed to display in real time certain well-defined sub-aspects of piano playing. They are intelligent and unobtrusive in that they adjust automatically to the needs of the practitioner so that no interaction is needed other than moving the piano keys. They include 1) a pattern display, finding recurring pitch patterns and displaying expressive timing and dynamics thereof, 2) a chord display, showing timing asynchronies and tone intensity variations of chord tones, and 3) an acoustic piano roll display that visually models the acoustic piano tone from MIDI data.

Citation: Goebl W., Widmer G.: Unobtrusive practice tools for pianists. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2006-10, 2006


OFAI-TR-2006-09 ( 63kB g-zipped PostScript file,  69kB PDF file)

Probabilistic Combination of Features for Music Classification

Arthur Flexer, Fabien Gouyon, Simon Dixon, Gerhard Widmer

We describe an approach to combining music similarity feature spaces in the context of music classification. The approach is based on taking the product of posterior probabilities obtained from separate classifiers for the different feature spaces. This allows for a different influence of the classifiers per song and an overall classification accuracy which improves on those resulting from individual feature spaces alone. This is demonstrated by combining spectral and rhythmic similarity for classification of ballroom dance music.

Keywords: Music Information Retrieval, Music Classification, Combination, Spectral Similarity, Rhythmic Similarity, Tempo

Citation: Flexer A., Gouyon F., Dixon S., Widmer G.: Probabilistic Combination of Features for Music Classification. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2006-09, 2006
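
The combination rule described in the abstract above (the product of posterior probabilities from separate classifiers) can be illustrated with a short sketch. The function name `combine_posteriors`, the class labels, and the example probabilities are all hypothetical; the renormalisation step is an assumption added so that the result is again a probability distribution.

```python
def combine_posteriors(*posteriors):
    """Multiply per-class posteriors from several classifiers and renormalise.

    Each argument is a dict mapping class label -> posterior probability,
    and all dicts are assumed to share the same labels.
    """
    classes = list(posteriors[0])
    product = {c: 1.0 for c in classes}
    for p in posteriors:
        for c in classes:
            product[c] *= p[c]
    total = sum(product.values())
    if total == 0.0:
        return product  # every class was ruled out by some classifier
    return {c: v / total for c, v in product.items()}

# Made-up posteriors for one song from two feature spaces.
spectral = {"waltz": 0.6, "tango": 0.4}
rhythmic = {"waltz": 0.2, "tango": 0.8}
combined = combine_posteriors(spectral, rhythmic)
print(max(combined, key=combined.get))
```

Because the product is taken per song, a classifier that is very confident for a given song dominates the decision for that song, which matches the "different influence of the classifiers per song" mentioned in the abstract.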


OFAI-TR-2006-08

Computational Models of Music Similarity and their Application to Music Information Retrieval

Elias Pampalk

This thesis aims at developing techniques which support users in accessing and discovering music. The main part consists of two chapters. Chapter 2 gives an introduction to computational models of music similarity. The combination of different approaches is optimized and the largest evaluation of music similarity measures published to date is presented. The best combination performs significantly better than the baseline approach in most of the evaluation categories. A particular effort is made to avoid overfitting. To cross-check the results from the evaluation based on genre classification a listening test is conducted. The test confirms that genre-based evaluations are suitable to efficiently evaluate large parameter spaces. Chapter 2 ends with recommendations on the use of similarity measures. Chapter 3 describes three applications of such similarity measures. The first application demonstrates how music collections can be organized and visualized so that users can control the aspect of similarity they are interested in. The second application demonstrates how music collections can be organized hierarchically into overlapping groups at the artist level. These groups are summarized using words from web pages associated with the respective artists. The third application demonstrates how playlists can be generated which require minimum user input.

Citation: Elias Pampalk, Computational Models of Music Similarity and their Application to Music Information Retrieval, Doctoral Thesis, Vienna University of Technology, March 2006


OFAI-TR-2006-04 ( 154kB PDF file)

Comparability is Key to Assess Affective Architectures

Stefan Rank, Paolo Petta

Current research on affective architectures for situated agents is fragmented into modelling aspects of emotion for specific purposes. To improve comparability of different architectures, we propose an approach that comprises the analysis of both the niche space of target scenarios and the design space of architectures for autonomous agents. In this paper we focus on the niche space, and illustrate how the emotional phenomena that may occur in scenarios can be used to contextualise the investigation of functional roles of emotion within and across agents.

Keywords: cognitive appraisal theories of emotion, situated agent, emotion, agent architectures, scenarios

Citation: Rank S., Petta P. (2006): Comparability is Key to Assess Affective Architectures. In Trappl R. (ed.), Cybernetics and Systems 2006, Austrian Society for Cybernetic Studies, Vienna, pp.643-648.


OFAI-TR-2006-03 ( 415kB PDF file)

An Experimental Comparison of Audio Tempo Induction Algorithms

Fabien Gouyon, Anssi Klapuri, Simon Dixon, Miguel Alonso, George Tzanetakis, Christian Uhle, Pedro Cano

We report on the tempo induction contest organised during the International Conference on Music Information Retrieval (ISMIR 2004) held at the University Pompeu Fabra in Barcelona in October 2004. The goal of this contest was to evaluate some state-of-the-art algorithms in the task of inducing the basic tempo (as a scalar, in beats per minute) from musical audio signals. To our knowledge, this is the first published large scale cross-validation of audio tempo induction algorithms. Participants were invited to submit algorithms to the contest organiser, in one of several allowed formats. No training data was provided. A total of 12 entries (representing the work of 7 research teams) were evaluated, 11 of which are reported in this document. Results on the test set of 3199 instances were returned to the participants before they were made public. Anssi Klapuri's algorithm won the contest. This evaluation shows that tempo induction algorithms can reach over 80% accuracy for music with a constant tempo, if we do not insist on finding a specific metrical level. After the competition, the algorithms and results were analysed in order to discover general lessons for the future development of tempo induction systems. One conclusion is that robust tempo induction entails the processing of frame features rather than that of onset lists. Further, we propose a new "redundant" approach to tempo induction, inspired by knowledge of human perceptual mechanisms, which combines multiple simpler methods using a voting mechanism. Machine emulation of human tempo induction is still an open issue. Many avenues for future work in audio tempo tracking are highlighted, as for instance the definition of the best rhythmic features and the most appropriate periodicity detection method. In order to stimulate further research, the contest results, annotations, evaluation software and part of the data are available at http://ismir2004.ismir.net/ISMIR_Contest.html

Keywords: Rhythm, Beat tracking, Music information retrieval

Citation: IEEE Transactions on Speech and Audio Processing, Sept 2006, to appear


OFAI-TR-2006-02 ( 250kB PDF file)

A Review of Automatic Rhythm Description Systems

Fabien Gouyon, Simon Dixon

Rhythm belongs with harmony, melody and timbre as one of the most fundamental aspects of music. Sound by its very nature is temporal, and in its most generic sense, the word rhythm is used to refer to all of the temporal aspects of a musical work, whether represented in a score, measured from a performance, or existing only in the perception of the listener. In order to build a computer system capable of intelligently processing music, it is essential to design representation formats and processing algorithms for the rhythmic content of the music. Computer systems reported in the literature offer different interpretations of the words "automatic rhythm description" as they address diverse applications such as tempo induction, beat tracking, quantisation of performed rhythms, meter induction and characterisation of intentional timing deviations. Although some rhythmic concepts are consensual, no single representation of rhythm has been devised which would be suitable for all applications. In this paper, we propose a unifying framework for automatic rhythm description systems, and review existing systems with respect to the functional units of the proposed framework.

Keywords: Rhythm, Music Information Retrieval, Tempo Induction, Beat Tracking

Citation: Computer Music Journal 29 (1), 34-54, 2005


OFAI-TR-2006-01 ( 570kB PDF file)

Perceptual Smoothness of Tempo in Expressively Performed Music

Simon Dixon, Werner Goebl, Emilios Cambouropoulos

We report three experiments examining the perception of tempo in expressively performed classical piano music. Each experiment investigates beat and tempo perception in a different way: rating the correspondence of a click track to a musical excerpt with which it was simultaneously presented; graphically marking the positions of the beats using an interactive computer program; and tapping in time with the musical excerpts. We examine the relationship between the timing of individual tones, that is, the directly measurable temporal information, and the timing of beats as perceived by listeners. Many computational models of beat tracking assume that beats correspond with the onset of musical tones. We introduce a model, supported by the experimental results, in which the beat times are given by a curve calculated from the tone onset times that is smoother (less irregular) than the tempo curve of the onsets.

Keywords: Rhythm, Beat Tracking, Music Perception

Citation: Music Perception, 23 (3), pp 195-214, 2006.


OFAI-TR-2005-27 ( 1333kB g-zipped PostScript file,  1150kB PDF file)

Digits - A Dataset for Handwritten Digit Recognition

A.K. Seewald

In this paper we describe the preprocessing steps for a contributed digit dataset, going all the way from a physical page of paper -- filled out by students -- past digital scanning to computerized segmentation, resizing, and blurring. Surprisingly, very little expertise can be transferred from other datasets to our new dataset for a state-of-the-art SVM classifier, although the performance for each separate dataset is acceptable. This may indicate that at least SVM, and possibly also other learners, are sensitive to small changes in preprocessing, emphasizing the need not only to create benchmark datasets for handwritten digit recognition, but also to document their preprocessing in as much detail as possible and aim to replicate that as well. Our work is a small step in that direction.

Keywords: Data Mining, Pattern Recognition, Handwritten Digit Recognition

Citation: Seewald A.: Digits - A Dataset for Handwritten Digit Recognition. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-27, 2005


OFAI-TR-2005-26 ( 68kB PDF file)

Automatic Classification of Musical Artists based on Web-Data

Peter Knees, Elias Pampalk, Gerhard Widmer

The organization of music is one of the central challenges in times of increasing distribution of digital music. A well-tried means is the classification in genres and/or styles. In this paper we propose the use of text categorization techniques to classify artists present on the Internet. In particular, we retrieve and analyze webpages ranked by search engines to describe artists in terms of word occurrences on related pages. To classify artists we primarily use support vector machines. Based on a previously published paper and on a master’s thesis, we present experiments comprising the evaluation of the classification process on a taxonomy of 14 genres with altogether 224 artists, as well as an estimation of the impact of daily fluctuations in the Internet on our approach, exploiting a long-term study over a period of almost one year. On the basis of these experiments we study (a) how many artists are necessary to define the concept of a genre, (b) which search engines perform best, (c) how to formulate search queries best, (d) which overall performance we can expect for classification, and finally (e) how our approach is suited as a similarity measure for artists.

Citation: ÖGAI Journal Vol. 24, No. 1, pp 16-25.


OFAI-TR-2005-25 ( 672kB PDF file)

Intelligent Structuring and Exploration of Digital Music Collections

Markus Schedl, Elias Pampalk, Gerhard Widmer

In this paper we present a general approach to the automatic content-based organization and visualization of large digital music collections. The general methodology consists in extracting musically and perceptually relevant patterns (‘features’) from the given audio recordings (e.g., mp3 files), using topology-preserving data projection methods to map the entire music collection onto two-dimensional visualization planes (possibly in a hierarchical fashion), and using a new display metaphor (the ‘Islands of Music’) to display the inherent structure of the music collection to the user. It is shown how arbitrary meta-data can be integrated into the visualization process, and how similarity according to different viewpoints can be defined and exploited. The basic methodology is briefly described, three prototype systems are presented, and a general discussion of the practical application possibilities of such technologies is offered.

Keywords: intelligent music processing, music similarity, automatic hierarchical structuring, interfaces to music

Citation: Elektrotechnik und Informationstechnik 7/8, pp 1-6, Springer Verlag


OFAI-TR-2005-24 ( 112kB PDF file)

Improvements of Audio-Based Music Similarity and Genre Classification

Elias Pampalk, Arthur Flexer, Gerhard Widmer

Audio-based music similarity measures can be applied to automatically generate playlists or recommendations. In this paper spectral similarity is combined with complementary information from fluctuation patterns including two new descriptors derived thereof. The performance is evaluated in a series of experiments on four music collections. The evaluations are based on genre classification, assuming that very similar tracks belong to the same genre. The main findings are as follows. (1) Although the improvements are substantial on two of the four collections, our extensive experiments confirm earlier findings that we are approaching the limit of how far we can get using simple audio statistics. (2) We have found that evaluating similarity through genre classification is biased by the music collection (and genre taxonomy) used. Furthermore, (3) in a cross validation no pieces from the same artist should be in both training and test set.

Keywords: SIMAC

Citation: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR'05), London, UK, September 11-15.


OFAI-TR-2005-23 ( 55kB PDF file)

Dynamic Playlist Generation Based on Skipping Behaviour

Elias Pampalk, Tim Pohle, Gerhard Widmer

Common approaches to creating playlists are to randomly shuffle a collection (e.g. iPod shuffle) or manually select songs. In this paper we present and evaluate heuristics to adapt playlists automatically given a song to start with (seed song) and immediate user feedback. Instead of rich metadata we use audio-based similarity. The user gives feedback by pressing a skip button if the user dislikes the current song. Songs similar to skipped songs are removed, while songs similar to accepted ones are added to the playlist. We evaluate the heuristics with hypothetical use cases. For each use case we assume a specific user behavior (e.g. the user always skips songs by a particular artist). Our results show that using audio similarity and simple heuristics it is possible to drastically reduce the number of necessary skips.

Keywords: SIMAC

Citation: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR'05), London, UK, September 11-15.


OFAI-TR-2005-22 ( 258kB PDF file)

Hierarchical Organization and Description of Music Collections at the Artist Level

Elias Pampalk, Arthur Flexer, Gerhard Widmer

As digital music collections grow, so does the need to organize them automatically. In this paper we present an approach to hierarchically organize music collections at the artist level. Artists are grouped according to similarity which is computed using a web search engine and standard text retrieval techniques. The groups are described by words found on the webpages using term selection techniques and domain knowledge. We compare different term selection techniques, present a simple demonstration, and discuss our findings.

Keywords: SIMAC

Citation: Proceedings of the 9th European Conference on Research and Advanced Technology for Digital Libraries (ECDL'05), Vienna, Austria, September 18-23.


OFAI-TR-2005-21 ( 42kB PDF file)

Speeding Up Music Similarity

Elias Pampalk

This paper describes (1) the submission to the ISMIR’04 genre classification contest and (2) the submission to the MIREX’05 (Music Information Retrieval eXchange) audio-based genre classification and artist identification tasks. The main difference between the submissions is the reduction of computation time by orders of magnitude. This paper concludes with a discussion of the relationship between genre classification and artist identification, the relationship between similarity and classification, and references to related MIREX’05 submissions.

Keywords: Spectral Music Similarity, Fluctuation Patterns, Genre Classification, SIMAC

Citation: MIREX 2005, 2nd Annual Music Information Retrieval Evaluation eXchange, September 11 – 14, London


OFAI-TR-2005-20 ( 241kB g-zipped PostScript file,  433kB PDF file)

An Evaluation of Naive Bayes Variants in Content-Based Learning for Spam Filtering

A.K. Seewald

We describe an in-depth analysis of spam-filtering performance of a simple Naive Bayes learner and two current variants. A set of seven mailboxes comprising about 65,000 mails from seven different users, as well as a representative snapshot of 25,000 mails which were received over 18 weeks by a single user, were used for evaluation. Our main motivation was to test whether two variants of Naive Bayes learning, SpamAssassin and CRM114, were superior to simple Naive Bayes learning, represented by SpamBayes.
Surprisingly, we found that the performance of these systems was remarkably similar and that the extended systems have significant weaknesses which are not apparent for the simpler Naive Bayes learner. The simpler Naive Bayes learner, SpamBayes, also offers the most stable performance in that it deteriorates least over time. Overall, SpamBayes should be preferred over the more complex variants.
In the course of our investigations we also propose a mail collection procedure that allows well-performing spam filters to be trained much faster than with other approaches, extensively check for noise-level susceptibility and concept drift, and investigate both performance at default thresholds and threshold-independent performance with some surprising results.

Keywords: Multi-view learning, Applications, Spam-filtering, Machine Learning

Citation: Seewald A.: An Evaluation of Naive Bayes Variants in Content-Based Learning for Spam Filtering. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-20, 2005


OFAI-TR-2005-19

Generating Similarity-Based Playlists Using Traveling Salesman Algorithms

Tim Pohle, Elias Pampalk, Gerhard Widmer

When using a mobile music player en route, usually only little attention can be paid to its handling. Nonetheless it is desirable that all music stored in the device can be accessed quickly, and that tracks played in a sequence should match up. In this paper, we present an approach to satisfy these constraints: a playlist containing all tracks stored in the music player is generated such that on average, consecutive pieces are maximally similar. This is achieved by applying a Traveling Salesman algorithm to the pieces, using timbral similarities as the distances. The generated playlist is linear and circular, thus the whole collection can easily be browsed with only one input wheel. When a chosen track finishes playing, the player advances to the consecutive tracks in the playlist, generally playing tracks similar to the chosen track. This behavior could be a favorable alternative to the well-known shuffle function that most current devices, such as the iPod shuffle, have. We evaluate the fitness of four different Traveling Salesman algorithms for this purpose. Evaluated aspects were runtime, the length of the resulting route, and the genre distribution entropy. We implemented a Java applet to demonstrate the application and its usability.

Keywords: audio similarity, Music Information Retrieval (MIR), playlist generation

Citation: Pohle T., Pampalk E., Widmer G.: Generating Similarity-Based Playlists Using Traveling Salesman Algorithms. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-19, 2005
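
To make the idea in the abstract above concrete, here is a minimal sketch that orders tracks with a greedy nearest-neighbour tour over a pairwise distance matrix. The abstract does not name the four Traveling Salesman algorithms that were evaluated, so the nearest-neighbour heuristic is purely an assumption chosen for brevity, and the function name and the distance values are hypothetical.

```python
def nearest_neighbour_playlist(distance, start=0):
    """Greedy tour: always move to the closest track not yet in the playlist.

    `distance` is a square matrix (list of lists) of pairwise track
    distances, e.g. derived from timbral similarity (assumed precomputed).
    """
    n = len(distance)
    if n == 0:
        return []
    unvisited = set(range(n)) - {start}
    order = [start]
    while unvisited:
        last = order[-1]
        # Ties are broken by track index to keep the result deterministic.
        nxt = min(unvisited, key=lambda t: (distance[last][t], t))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

# Made-up symmetric distances between four tracks.
distance = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
print(nearest_neighbour_playlist(distance))
```

Treating the returned order as circular (the last track is followed by the first) gives the wheel-style browsing the abstract describes. A greedy tour is generally not the shortest route.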


OFAI-TR-2005-18 ( 61kB g-zipped PostScript file,  91kB PDF file)

Statistical Evaluation of Music Information Retrieval Experiments

Arthur Flexer

This work concerns the necessity of statistical evaluation of Music Information Retrieval (MIR) experiments. This necessity is motivated by applying fundamental notions of statistical hypotheses testing to MIR research. Minimum requirements concerning statistical evaluation are developed and the appropriate statistical techniques are introduced and exemplified in a genre classification context. Articles from the MIR literature are examined and criticized for the lack of statistical evaluation they contain.

Keywords: Music Information Retrieval, Evaluation, Statistical Testing, Sampling Methods

Citation: Flexer A.: Statistical Evaluation of Music Information Retrieval Experiments. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-18, 2005


OFAI-TR-2005-17 ( 92kB PDF file)

MATCH: A Music Alignment Tool Chest

Simon Dixon, Gerhard Widmer

We present MATCH, a toolkit for aligning audio recordings of different renditions of the same piece of music, based on an efficient implementation of a dynamic time warping algorithm. A forward path estimation algorithm constrains the alignment path so that dynamic time warping can be performed with time and space costs that are linear in the size of the audio files. Frames of audio are represented by a positive spectral difference vector, which emphasises note onsets in the alignment process. In tests with Classical and Romantic piano music, the average alignment error was 41ms (median 20ms), with only 2 out of 683 test cases failing to align. The software is useful for content-based indexing of audio files and for the study of performance interpretation; it can also be used in real-time for tracking live performances. The toolkit also provides functions for displaying the cost matrix, the forward and backward paths, and any metadata associated with the recordings, which can be shown in real time as the alignment is computed.

Keywords: Audio alignment, Music performance analysis, Dynamic time warping

Citation: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), London, England, September 2005, pp 492-497.


OFAI-TR-2005-16 ( 138kB PDF file)

Live Tracking of Musical Performances using On-Line Time Warping

Simon Dixon

Dynamic time warping finds the optimal alignment of two time series, but it is not suitable for on-line applications because it requires complete knowledge of both series before the alignment can be computed. Further, the quadratic time and space requirements are limiting factors even for off-line systems. We present a novel on-line time warping algorithm which has linear time and space costs, and performs incremental alignment of two series as one is received in real time. This algorithm is applied to the alignment of audio signals in order to follow musical performances of arbitrary length. Each frame of audio is represented by a positive spectral difference vector, emphasising note onsets. The system was tested on various test sets, including recordings of 22 pianists playing music by Chopin, where the average alignment error was 59ms (median 20ms). We demonstrate one application of the system: the analysis and visualisation of musical expression in real time.

Keywords: Audio alignment, Music performance analysis, Dynamic time warping

Citation: Proceedings of the 8th International Conference on Digital Audio Effects (DAFx05), Madrid, Spain, September 2005, pp 92-97.
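
For reference, the classic off-line dynamic time warping recurrence that the abstract above contrasts with can be sketched as below. This is the standard quadratic-time, quadratic-space formulation, not the on-line algorithm of the report; the function name `dtw_cost` and the use of absolute difference on scalar sequences are assumptions made for illustration.

```python
def dtw_cost(a, b, dist=lambda x, y: abs(x - y)):
    """Total cost of the optimal alignment of sequences a and b (classic DTW)."""
    inf = float("inf")
    rows, cols = len(a), len(b)
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[rows][cols]

# The second sequence holds one value longer; warping absorbs the difference.
print(dtw_cost([1, 2, 3], [1, 2, 2, 3]))
```

The full cost matrix requires both series to be known in advance and grows with the product of their lengths, which is the limitation the on-line algorithm is designed to avoid.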


OFAI-TR-2005-15 ( 93kB PDF file)

The "Air Worm": An Interface for Real-Time Manipulation of Expressive Music Performance

Simon Dixon, Werner Goebl, Gerhard Widmer

Expressive performance of traditional Western music is a complex phenomenon which is mastered by few, and yet appreciated by many. In this paper we explore various ways of interacting with expressive performances using methods that are accessible to non-expert music-lovers. The key idea is that a non-expert cannot control the vast range of parameters which musicians use to create expression; they require a simple interface allowing control of a small number of parameters, where the remaining parameters are determined by expert performances. A digital theremin is used as an input device, and users can control the two most important expressive parameters, tempo and loudness, of an expressively performed audio or MIDI file. Several modes of operation are possible: the "Air Worm" builds on previous work in performance visualisation, where the tempo is displayed on the horizontal axis and loudness on the vertical axis in a two-dimensional animation; the "Air Tapper" uses a conducting metaphor where the beat is given by the minimum vertical point in a quasi-periodic trajectory; and the "Mouse Worm" allows users without a theremin to use the mouse as controller.

Keywords: Expressive performance, Interactive performance, Theremin

Citation: Proceedings of the 2005 International Computer Music Conference (ICMC2005), Barcelona, Spain, September 2005, pp 614-617.


OFAI-TR-2005-14 ( 60kB PDF file)

An On-Line Time Warping Algorithm for Tracking Musical Performances

Simon Dixon

Dynamic time warping is not suitable for on-line applications because it requires complete knowledge of both series before the alignment of the first elements can be computed. We present a novel on-line time warping algorithm which has linear time and space costs, and performs incremental alignment of two series as one is received in real time. This algorithm is applied to the alignment of audio signals in order to track musical performances.

Keywords: Audio alignment, Music performance analysis, Dynamic time warping

Citation: Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, August 2005, pp 1727-1728.


OFAI-TR-2005-13 ( 197kB g-zipped PostScript file,  336kB PDF file)

Lambda Pruning - An Approximation of the String Subsequence Kernel

Florian Kleedorfer, Alexander K. Seewald

The Support Vector Machine is a powerful learning algorithm which is known to work well for a variety of learning tasks. Kernels for complex data structures such as strings, lists, trees and general graphs have been developed. However, some of these kernels are of high computational complexity and are therefore not widely used. Even relatively efficient kernels may take a lot of patience to run on real-life data. We advance the state of the art for a specific kernel defined on a complex data structure, the Subsequence String Kernel (SSK), in two ways: Firstly, by introducing an approximation technique for the SSK called Lambda Pruning (SSK-LP), which is able to reduce memory consumption and runtime by up to several orders of magnitude with little loss in accuracy. Secondly, by creating an average case time and worst case space complexity model for both SSK and SSK-LP. Estimations of runtime and memory requirements for a specific learning task can be directly computed from average string length and desired parameter settings. This combined approach has allowed us to run learning tasks which would be infeasible with the non-approximated SSK. Memory consumption for SSK-LP is constant and does not depend on string length. Models based on SSK and SSK-LP perform similarly for a set of real-life learning tasks.

Keywords: Machine Learning, Data Mining, Support Vector Machines

Citation: Kleedorfer F., Seewald A.: Lambda Pruning - An Approximation of the String Subsequence Kernel. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-13, 2005.


OFAI-TR-2005-12 ( 252kB PDF file)

The Machine Learning and Intelligent Music Processing Group at the Austrian Research Institute for Artificial Intelligence (OFAI), Vienna

Gerhard Widmer, Simon Dixon, Arthur Flexer, Werner Goebl, Peter Knees, Søren Tjagvad Madsen, Elias Pampalk, Tim Pohle, Markus Schedl, Asmir Tobudic

The report introduces our research group in Vienna and briefly sketches the major lines of research that are currently being pursued at our lab. Extensive references to publications are given, where the interested reader will find more detail.

Citation: Proceedings of the 2005 International Computer Music Conference (ICMC2005), Barcelona


OFAI-TR-2005-11 ( 103kB PDF file)

What is In an Affective Architecture for Situated Agents?

Stefan Rank, Pablo Lucas dos Anjos, Paolo Petta, Ruth Aylett

The uses of affective architectures are varied, and although different applications share common characteristics, they are founded on different, and not always complementary, conceptual assumptions. This submission to the WP7 workshop of HUMAINE is an attempt to provide a conceptualisation of the role of emotional processes in architectures for situated agents, focussing on the role for bridging the gap between 'higher-' and 'lower-level' aspects of behaviour coordination.

Keywords: cognitive appraisal theories of emotion, situated agent, emotion, agent architectures

Citation: Rank S., Anjos P.L., Petta P., Aylett R. (2005): What is In an Affective Architecture for Situated Agents? In Cañamero L. (ed.): Humaine Deliverable D7a: WP7 Workshop Proceedings, King's College London UK EU, July 4-5.


OFAI-TR-2005-10 ( 479kB PDF file)

Appraisal for a Character-based Story-World

Stefan Rank, Paolo Petta

Generation of interesting narratives in a simulated dramatic story-world requires situated software agents with emotional competences. The operationalisation of concepts from appraisal theories of emotion can provide flexible roleplayers that reduce the required external macro-level control. Situatedness and the analysis of the social lifeworld of characters are the foundations of the presented architecture that is used to generate simple cliché plots. The subjective evaluative interpretation of changes in a character’s environment and appropriate reactions provide for the causal and emotional connections that can lead to the unfolding of a story. The architecture is a contribution towards a process-oriented model of emotional phenomena.

Keywords: drama, cognitive appraisal theories of emotion, situated agent, emotion, motivation, agent architectures

Citation: Rank S., Petta P. (2005): Appraisal for a Character-based Story-World. In Panayiotopoulos T. et al. (eds.), Intelligent Virtual Agents, 5th International Working Conference, IVA 2005, Kos, Greece, September 2005, Proceedings, Springer Berlin Heidelberg, pp.495-496.


OFAI-TR-2005-09 ( 53kB g-zipped PostScript file,  64kB PDF file)

Novelty detection based on spectral similarity of songs

Arthur Flexer, Elias Pampalk, Gerhard Widmer

We are introducing novelty detection, i.e. the automatic identification of new or unknown data not covered by the training data, to the field of music information retrieval. Two methods for novelty detection - one based solely on the similarity information and one also utilizing genre label information - are evaluated within the context of genre classification based on spectral similarity. Both are shown to perform equally well.

Keywords: music information retrieval, novelty detection, spectral similarity, genre classification

Citation: Flexer A., Pampalk E., Widmer G.: Novelty detection based on spectral similarity of songs. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-09, 2005


OFAI-TR-2005-08 ( 82kB g-zipped PostScript file,  143kB PDF file)

Hidden Markov Models for spectral similarity of songs

Arthur Flexer, Elias Pampalk, Gerhard Widmer

Hidden Markov Models (HMM) are compared to Gaussian Mixture Models (GMM) for describing spectral similarity of songs. Contrary to previous work we make a direct comparison based on the log-likelihood of songs given an HMM or GMM. Whereas the direct comparison of log-likelihoods clearly favors HMMs, this advantage in terms of modeling power does not allow for any gain in genre classification accuracy.

Keywords: Hidden Markov Models, Spectral Similarity, Music Information Retrieval

Citation: Flexer A., Pampalk E., Widmer G.: Hidden Markov Models for spectral similarity of songs. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-08, 2005


OFAI-TR-2005-07 ( 192kB g-zipped PostScript file,  351kB PDF file)

Using ICA for removal of ocular artifacts in EEG recorded from blind subjects

Arthur Flexer, Herbert Bauer, Juergen Pripfl, Georg Dorffner

One of the standard applications of Independent Component Analysis (ICA) to EEG is removal of artifacts due to movements of the eye bulbs. Short blinks as well as slower saccadic movements are removed by subtracting respective independent components (ICs). EEG recorded from blind subjects poses special problems since it shows a higher quantity of eye movements which are also more irregular and very different across subjects. It is demonstrated that ICA can still be of use by comparing results from four blind subjects with results from one subject without eye bulbs who therefore does not show eye movement artifacts at all.

Keywords: Independent Component Analysis, Electroencephalogram

Citation: Flexer A., Bauer H., Pripfl J., Dorffner G.: Using ICA for removal of ocular artifacts in EEG recorded from blind subjects, Neural Networks, Volume 18, Issue 7, pp. 998-1005, 2005.


OFAI-TR-2005-06

Evaluation of Frequently Used Audio Features for Classification of Music Into Perceptual Categories

Tim Pohle, Elias Pampalk, Gerhard Widmer

The ever-growing amount of available music induces an increasing demand for Music Information Retrieval (MIR) applications such as music recommendation applications or automatic classification algorithms. When such systems are audio-based, the audio feature extraction routines are a crucial part of them. In this paper, we evaluate how well a variety of combinations of feature extraction and machine learning algorithms are suited to classify music into perceptual categories. The examined categorizations are perceived tempo, mood (happy / neutral / sad), emotion (soft / neutral / aggressive), complexity, and vocal content. The aim is to contribute to the investigation of which aspects of music are not captured by the common audio descriptors; from our experiments we can conclude that most of the examined categorizations are not captured well. This indicates that more research is needed on alternative (possibly extra-musical) sources of information for useful music classification.

Citation: Pohle T., Pampalk E., Widmer G.: Evaluation of Frequently Used Audio Features for Classification of Music Into Perceptual Categories. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-06, 2005


OFAI-TR-2005-05 ( 852kB g-zipped PostScript file,  588kB PDF file)

Agent Encapsulation in a Cognitive Vision MAS

Bernhard Jung, Paolo Petta

This paper casts a baseline cognitive vision design into a multi-agent framework and therein addresses the questions of how and to what extent explicit consideration of coordination may affect the design and performance of such systems. In an analysis of our decomposition into task-dependent entities using both functional and physical approaches to encapsulation, we show that different kinds of algorithms with different notions of architecture and representation become possible. We describe the evolution of our implementation out of a traditional monolithic design and show how in it functionalities akin to notions of conventional tracking and reasoning emerge out of the distributed interaction between component agents, with a performance at least on par with the baseline system.

Keywords: Multi-Agent System (MAS), Agent Encapsulation, Cognitive Vision, Coordination

Citation: Jung B., Petta P.: Agent Encapsulation in a Cognitive Vision MAS. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-05, 2005, in Pechoucek M., Petta P., Varga L.Z. (eds.): Multi-Agent Systems and Applications IV, 4th International Central and Eastern European Conference on Multi-Agent Systems, CEEMAS 2005, Budapest, Hungary, September, 2005, Proceedings, Springer-Verlag Berlin Heidelberg New York, pp. 51-61, 2005.


OFAI-TR-2005-04 ( 76kB g-zipped PostScript file,  195kB PDF file)

A Close Look at Current Approaches in Spam Filtering

A.K. Seewald

In this paper we present an applied machine learning approach to spam filtering, SA-Train. We compare SA-Train, which runs repeated steps of rule score learning via a linear Support Vector Machine and Bayesian learning and is based on SpamAssassin, to a state-of-the-art Bayesian phrase learner, CRM114, as well as to a simpler Bayesian learner, SpamBayes. We also compare our approach to ready-to-use systems such as SpamAssassin's default settings without Bayesian model, Symantec BrightMail and Google Mail. Surprisingly, the relatively simple SpamBayes learner turns out to be the best learner in this setting, and its corresponding model degrades less over time than two other ready-to-use models, giving almost constant performance for four months without additional training.

Keywords: Spam Filtering, Machine Learning, Multi-view Learning

Citation: Seewald A.: A Close Look at Current Approaches in Spam Filtering. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-04, 2005


OFAI-TR-2005-03 ( 140kB g-zipped PostScript file,  117kB PDF file)

Towards Multi-Agent Coordination in Cognitive Vision

Bernhard Jung, Paolo Petta

This paper casts a simple cognitive vision design into a multi-agent framework and therein addresses the questions of how and to what extent explicit consideration of coordination may affect the design and performance of such systems. We give a description of the evolution of our implementation, show how in it tracking and reasoning emerge out of the distributed interaction between component agents, describe individual agent capabilities and the conversation policies employed, and compare its performance to that of the original design.

Keywords: Cognitive Vision, Multi-agent Systems, Coordination, Conversation Policies

Citation: Jung B., Petta P.: Towards Multi-Agent Coordination in Cognitive Vision. In Zillich M., Vincze M. (eds.): 1st Austrian Cognitive Vision Workshop, OCG (Austrian Computer Society), Vienna, Austria, EU, books@ocg.at Band 186, 2005, pp.9-18.


OFAI-TR-2005-02 ( 475kB PDF file)

Motivating Dramatic Interactions

Stefan Rank, Paolo Petta

Simulated dramatic story-worlds need to be populated with situated software agents that act in a dramatically believable way. In order to provide flexible roleplayers, agent architectures should limit the required external macro-level control. We present work on an architecture that exploits social embedding and concepts from appraisal theories of emotion to achieve the enactment of simple cliche plots. The interplay of motivational constructs and the subjective evaluative interpretation of changes in an agent's environment provide for the causal and emotional connections that can lead to the unfolding of a story.

Keywords: drama, cognitive appraisal theories of emotion, situated agent, emotion, motivation, agent architectures

Citation: Rank S., Petta P. (2005): Motivating Dramatic Interactions. In Agents that Want and Like: Motivational and Emotional Roots of Cognition and Action, Proceedings of the AISB05 Symposium, April 12-15 2005, University of Hertfordshire, Hatfield, UK, pp.102-107.


OFAI-TR-2005-01 ( 182kB g-zipped PostScript file,  466kB PDF file)

Touch and temporal behavior of grand piano actions

Werner Goebl, Roberto Bresin, Alexander Galembo

This study investigated the temporal behavior of grand piano actions from different manufacturers under different touch conditions and dynamic levels. An experimental setup consisting of accelerometers and a calibrated microphone was used to capture key and hammer movements, as well as the sound signal. Five selected keys were played by pianists with two types of touch ("pressed touch" versus "struck touch") over the entire dynamic range. Discrete measurements were extracted from the accelerometer data for each of the over 2300 recorded tones (e.g., finger-key, hammer-string, and key bottom contact times, maximum hammer velocity, MHV). Travel times of the hammer (from finger-key to hammer-string) as a function of MHV varied clearly between the two types of touch, but only slightly between pianos. A travel time versus MHV function found in earlier work [W. Goebl, J. Acoust. Soc. Am. 110(1), 563-572 (2001)] derived from a computer-controlled piano was replicated. Constant temporal behavior over type of touch and low compression properties of the parts of the action (reflected in key bottom contact times) were hypothesized to be indicators for instrumental quality.

Keywords: Piano action, Expressive Performance, Temporal behavior, Accelerometer

Citation: Goebl W., Bresin R., Galembo A.: Touch and temporal behavior of grand piano actions. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2005-01, 2005


OFAI-TR-2004-23 ( 178kB PDF file)

Features of Emotional Planning in Software Agents

Stefan Rank, Paolo Petta, Robert Trappl

It can be argued that emotions are an essential element of intelligence; they are certainly relevant for cognition and action in humans. We believe that software agents can benefit from explicit consideration of emotional processes in the design of their architecture. Our work extends the foundation laid in work on the Tabasco framework by addressing the relation of planning capabilities and emotional processes in agents that are resource-bounded and situated in complex (rich, social, dynamic, and partially observable) environments: planning is considered as a separate module of the agent that can, but need not, be consulted during execution. We first introduce our approach towards integration of planning in emotional situated agents, analysing the interfaces of continuous planners and the emotion process as conceptualised by cognitive appraisal theories. Next, we cover some implemented systems that already integrate aspects of emotion theories and planning. We thus assert that situated agents can profit from both the more abstracted and objectified perspective of planning and the subjective and grounded current evaluations in the emotion process; and that elements from both views are in fact required to achieve a whole architectural design. Successful synthesis of the two perspectives, however, necessitates a deeper consolidation and integration of their functionalities, with reconceptualisations beyond what has been realised to date.

Keywords: cognitive appraisal theories of emotion, situated agent, emotion, planning, decision theory, agent architectures

Citation: Rank S., Petta P., Trappl R. (2006): Features of Emotional Planning in Software Agents, In Della Riccia G., Dubois D., Kruse R., Lenz H.-J. (eds.): Decision Theory and Multi-Agent Planning, Springer Wien/New York.


OFAI-TR-2004-22 ( 774kB g-zipped PostScript file,  156kB PDF file)

An Open Source Tool for Semi-Automatic Rhythmic Annotation

Gouyon F., Wack N., Dixon S.

We present a plugin implementation for the multi-platform WaveSurfer sound editor. Added functionalities are the semi-automatic extraction of beats at diverse levels of the metrical hierarchy as well as uploading and downloading functionalities to a music metadata database. It is built on existing open source (GPL-licensed) audio processing tools, namely WaveSurfer, BeatRoot and CLAM, with the intent of expanding the scope of these software packages. It is therefore also provided as GPL code with the explicit goal that researchers in the audio processing community can freely use and improve it. We provide technical details of the implementation as well as practical use cases. We also motivate the use of rhythmic metadata in Music Information Retrieval scenarios.

Keywords: Beat Tracking, Rhythm, Music Information Retrieval

Citation: Proceedings of the 7th International Conference on Digital Audio Effects (DAFx-04)


OFAI-TR-2004-21 ( 141kB g-zipped PostScript file,  172kB PDF file)

Dance Music Classification: A Tempo-Based Approach

Gouyon F., Dixon S.

Recent research has studied the relevance of various features for automatic genre classification, showing the particular importance of tempo in dance music classification. We complement this work by considering a domain-specific learning methodology, where the computed tempo is used to select an expert classifier which has been specialised on its own tempo range. This enables the all-class learning task to be reduced to a set of two- and three-class learning tasks. Current results are around 70% classification accuracy (8 ballroom dance music classes, 698 instances, baseline 15.9%).

Keywords: Feature Extraction, Genre Classification, Music Information Retrieval, Rhythm

Citation: Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR'04)
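The tempo-based selection of an expert classifier described in the abstract of TR-2004-21 can be illustrated with a minimal sketch. This is not the authors' implementation: the class name TempoGatedClassifier, the BPM boundaries and the stub experts are all hypothetical, chosen only to show how a computed tempo can route a piece to a classifier specialised on one tempo range.

```python
from bisect import bisect_right


class TempoGatedClassifier:
    """Route each piece to an expert chosen by its tempo (hypothetical sketch)."""

    def __init__(self, boundaries, experts):
        # boundaries: ascending BPM cut points (illustrative values only).
        # experts: one callable per tempo range, so len(boundaries) + 1 in total.
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be in ascending order")
        if len(experts) != len(boundaries) + 1:
            raise ValueError("need exactly one expert per tempo range")
        self.boundaries = list(boundaries)
        self.experts = list(experts)

    def predict(self, tempo_bpm, features):
        # bisect_right returns the index of the tempo range containing tempo_bpm.
        expert = self.experts[bisect_right(self.boundaries, tempo_bpm)]
        return expert(features)


# Stub experts standing in for trained two- or three-class classifiers.
slow_expert = lambda features: "slow-range class"
medium_expert = lambda features: "medium-range class"
fast_expert = lambda features: "fast-range class"

gated = TempoGatedClassifier([100, 150], [slow_expert, medium_expert, fast_expert])
label = gated.predict(tempo_bpm=128, features=[0.3, 0.7])
```

Each expert only has to separate the few classes whose tempi fall into its range, which is how the all-class task reduces to smaller ones.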


OFAI-TR-2004-20 ( 58kB g-zipped PostScript file,  104kB PDF file)

Towards Characterisation of Music via Rhythmic Patterns

Simon Dixon, Fabien Gouyon, Gerhard Widmer

A central problem in music information retrieval is finding suitable representations which enable efficient and accurate computation of musical similarity and identity. Low level audio features are ideal for calculating identity, but are of limited use for similarity measures, as many aspects of music can only be captured by considering high level features. We present a new method of characterising music by typical bar-length rhythmic patterns which are automatically extracted from the audio signal, and demonstrate the usefulness of this representation by its application in a genre classification task. Recent work has shown the importance of tempo and periodicity features for genre recognition, and we extend this research by employing the extracted temporal patterns as features. Standard classification algorithms are utilised to discriminate 8 classes of Standard and Latin ballroom dance music (698 pieces). Although pattern extraction is error-prone, and patterns are not always unique to a genre, classification by rhythmic pattern alone achieves up to 50% correctness (baseline 16%), and by combining with other features, a classification rate of 96% is obtained.

Keywords: Feature Extraction, Genre Classification, Music Information Retrieval, Rhythm

Citation: Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR'04)


OFAI-TR-2004-19 ( 234kB g-zipped PostScript file,  554kB PDF file)

Correlation of Subjective Expectation and P300 Amplitude during a Game of Matching Pennies

Flexer A., Makeig S.

We report a study on two-person game playing involving simultaneous EEG recording from both subjects. Independent Component Analysis is used for identifying activities of individual cortical EEG sources. Activity of a midline fronto-central component identified in four of five subjects was correlated with a measure of subjective expectation. This component accounts for the P300 waveform whose amplitude varies depending on the context of the gaming situation.

Keywords: Independent Component Analysis, EEG, Neuroeconomics

Citation: Flexer A., Makeig S.: Correlation of Subjective Expectation and P300 Amplitude during a Game of Matching Pennies. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-19, 2004


OFAI-TR-2004-18 ( 366kB g-zipped PostScript file,  700kB PDF file)

A reliable probabilistic sleep stager based on a single EEG signal

Arthur Flexer, Georg Gruber, Georg Dorffner

Objective: We developed a probabilistic continuous sleep stager based on Hidden Markov Models using only a single EEG signal. It offers the advantage of being objective by not relying on human scorers, having much finer temporal resolution (one second instead of 30 seconds), and being based on solid probabilistic principles rather than a predefined set of rules (Rechtschaffen & Kales). Methods and Material: 68 whole night sleep recordings from two different sleep labs are analysed using Gaussian Observation Hidden Markov Models. Results: Our unsupervised approach detects the cornerstones of human sleep (wakefulness, deep and REM sleep) with around 80% accuracy based on data from a single EEG channel. There are some difficulties in generalizing results across sleep labs. Conclusion: Using data from a single electrode is sufficient for reliable continuous sleep staging. Sleep recordings from different sleep labs are not directly comparable. Training of separate models for the sleep labs is necessary.

Keywords: Time Series Processing, Sleep Analysis, Hidden Markov Models, EEG

Citation: Flexer A., Gruber G., Dorffner G.: A reliable probabilistic sleep stager based on a single EEG signal, Artificial Intelligence in Medicine, 33(3):199-207, 2005.


OFAI-TR-2004-17 ( 320kB PDF file)

Hierarchical Organization and Visualization of Drum Sample Libraries

Elias Pampalk, Peter Hlavac, Perfecto Herrera

Drum samples are an important ingredient for many styles of music. Large libraries of drum sounds are readily available. However, their value is limited by the ways in which users can explore them to retrieve sounds. The available schemes for organizing these collections are insufficient to hold a large number of samples, rely on cumbersome manual classification, or are inconsistent and error-prone. In this paper, we present a new approach for automatically structuring and visualizing large sample libraries using only audio input. We create a hierarchical user interface for efficient exploration and retrieval which is based on a similarity measure for drum sounds and self-organizing maps.

Citation: Pampalk E., Hlavac P., Herrera P.: Hierarchical Organization and Visualization of Drum Sample Libraries. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-17, 2004


OFAI-TR-2004-16 ( 112kB PDF file)

Artist Classification with Web-based Data

Peter Knees, Elias Pampalk, Gerhard Widmer

Manifold approaches exist for organization of music by genre and/or style. In this paper we propose the use of text categorization techniques to classify artists present on the Internet. In particular, we retrieve and analyze webpages ranked by search engines to describe artists in terms of word occurrences on related pages. To classify artists we primarily use support vector machines. We present 3 experiments in which we address the following issues. First, we study the performance of our approach compared to previous work. Second, we investigate how daily fluctuations in the Internet affect our approach. Third, on a set of 224 artists from 14 genres we study (a) how many artists are necessary to define the concept of a genre, (b) which search engines perform best, (c) how to formulate search queries best, (d) which overall performance we can expect for classification, and finally (e) how our approach is suited as a similarity measure for artists.

Keywords: genre classification, community metadata, cultural features

Citation: Knees P., Pampalk E., Widmer G.: Artist Classification with Web-based Data. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-16, 2004


OFAI-TR-2004-15 ( 137kB PDF file)

A Matlab Toolbox to Compute Similarity from Audio

Elias Pampalk

A Matlab toolbox implementing music similarity measures for audio is presented. The implemented measures focus on aspects related to timbre and periodicities in the signal. This paper gives an overview of the implemented functions. In particular, the basics of the similarity measures are reviewed and some visualizations are discussed.

Citation: Pampalk E.: A Matlab Toolbox to Compute Similarity from Audio. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-15, 2004


OFAI-TR-2004-14 ( 77kB g-zipped PostScript file,  240kB PDF file)

Ranking for Medical Annotation: Investigating Performance, Local Search and Homonymy Recognition

Alexander K. Seewald

In this paper we investigate several hypotheses concerning document relevance ranking for biological literature. More specifically, we focus on three topics: performance, risk of local searching, and homonymy recognition.
Surprisingly, we find that a quite simple ranker based on the occurrence of a single word performs best. Adding this word as a new search term to each query yields results comparable to elaborate state-of-the-art approaches.
The risk of our local searching approach is found to be negligible. In some cases retrieval from a large repository even yields worse results than local search on a smaller repository which only contains documents returned by the current query.
The removal of automatically determined homonyms yields almost indistinguishable results to the original query, so it is not inconceivable that the problem of homonymy in biological literature has been overstated.
In conclusion, our investigation of three hypotheses has helped to settle implementation issues within our research projects and has opened interesting avenues for further research.

Keywords: Information Retrieval, Machine Learning, Data, information and knowledge integration

Citation: Seewald A.K.: Ranking for Medical Annotation: Investigating Performance, Local Search and Homonymy Recognition. Proceedings of the Symposium on Knowledge Exploration in Life Science Informatics (KELSI 2004), Milano, Italy. Also available as Technical Report (ext.vers.), Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-14, 2004.


OFAI-TR-2004-13 ( 133kB g-zipped PostScript file,  203kB PDF file)

Automated iterative requirements analysis and evolution

Werner Gaisbauer, Brian Sallans

We present a novel technique for requirements analysis and evolution for problems involving the optimal control of a software or physical system. The requirements take the form of a function indicating when the system has reached a desired state. A solution which meets the requirements takes the form of a controller which specifies how the system should act. A human operator iteratively analyses and refines a given presentation of the requirements on a human-readable high symbolic level and evaluates the resulting solution by the means of a graphical display. Our approach forms a closed-loop system where the requirements and solutions iteratively evolve towards the desired requirements and solutions over time. In this context we use existing machine learning techniques, i.e., reinforcement learning, to automatically compute a solution for a given set of requirements.

Keywords: Markov decision processes, reinforcement learning, requirements engineering, software engineering, human-computer interfaces, visualization

Citation: Gaisbauer W., Sallans B.: Automated iterative requirements analysis and evolution. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-13, 2004


OFAI-TR-2004-12

Relational IBL in Classical Music

Asmir Tobudic, Gerhard Widmer

It is well known that many hard tasks considered in machine learning and data mining can be solved in a rather simple and robust way with an instance- and distance-based approach. In this work we present another difficult task: learning, from large numbers of complex performances by concert pianists, to play music expressively. We model the problem as a multi-level decomposition and prediction task. We show that this is a fundamentally relational learning problem, and propose a relational instance-based learning algorithm named DISTALL. Experiments with data derived from a substantial number of Mozart piano sonata recordings by a skilled concert pianist demonstrate that the approach is viable. We show that the instance-based learner operating on structured, relational data outperforms a propositional k-NN algorithm. In qualitative terms, some of the piano performances produced by DISTALL after learning from the human artist are of substantial musical quality; one even won a prize in an international 'computer music performance' contest. The experiments thus provide evidence of the capabilities of ILP in a highly complex domain such as music.

Keywords: Relational instance-based learning, music

Citation: Tobudic A., Widmer G.: Relational IBL in Classical Music. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-12, 2004


OFAI-TR-2004-11 ( 34kB g-zipped PostScript file,  57kB PDF file)

Combining Bayesian and Rule Score Learning: Automated Tuning for SpamAssassin

Alexander K. Seewald

Spam (Unsolicited Bulk Email) has become a global problem of high economic impact. In this paper, we discuss an applied approach to automatically adjust all parameters of a hybrid spam recognition system, SpamAssassin (SA). We investigate both learning of rule scores and selective training of the integrated Bayesian spam filter. We report competitive results concerning ham misclassification rate (i.e. legal mails misclassified as spam) which are comparable to human accuracy, and are able to significantly improve the spam misclassification rate (i.e. spam misclassified as legal mails) by a factor of around twenty versus SA with default scores. A variant of Total Cost Ratio also shows the same trend. The results of the training process can be transformed into a SA preference file. Also, the training process utilizes mainly non-spam mailboxes which are easily obtained.

Keywords: Multi-view learning, Applications, Spam-filtering, Machine Learning

Citation: Seewald A.: Combining Bayesian and Rule Score Learning: Automated Tuning for SpamAssassin. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-11, 2004


OFAI-TR-2004-10 ( 183kB g-zipped PostScript file,  196kB PDF file)

Evaluating Rhythmic Descriptors for Musical Genre Classification

Fabien Gouyon, Simon Dixon, Elias Pampalk, Gerhard Widmer

Organising or browsing music collections in a musically meaningful way calls for tagging the data in terms of e.g. rhythmic, melodic or harmonic aspects, among others. In some cases, such metadata can be extracted automatically from musical files; in others, a trained listener must extract it by hand. In this article, we consider a specific set of rhythmic descriptors for which we provide procedures of automatic extraction from audio signals. Evaluating the relevance of such descriptors is a difficult task that can easily become highly subjective. To avoid this pitfall, we assessed the relevance of these descriptors by measuring their rate of success in genre classification experiments. We conclude on the particular relevance of the tempo and a set of 15 MFCC-like descriptors.

Keywords: Music information retrieval, Rhythm, Genre Classification

Citation: Proceedings of the AES 25th International Conference, London, June 2004, pp 196-204.


OFAI-TR-2004-09 ( 180kB g-zipped PostScript file,  197kB PDF file)

Case-based Relational Learning of Expressive Phrasing in Classical Music

Asmir Tobudic, Gerhard Widmer

An application of relational case-based learning to the task of expressive music performance is presented. We briefly recapitulate the relational case-based learner DISTALL and empirically show that DISTALL outperforms a straightforward propositional k-NN on the music task. A set distance measure based on maximal matching, which is incorporated in DISTALL, is discussed in more detail, especially the problem associated with its 'penalty part': the distance between a large and a small set is mainly determined by their difference in cardinality. We introduce a method for systematically varying the influence of the penalty on the overall distance measure and experimentally test different variants of it. Interestingly, it turns out that the variants with high influence of penalty clearly perform better than the others on our music task.

Keywords: relational case-based learning, case-based reasoning, music

Citation: Tobudic A., Widmer G.: Case-based Relational Learning of Expressive Phrasing in Classical Music. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-09, 2004
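The 'penalty part' issue raised in the abstract of TR-2004-09 can be made concrete with a small sketch. It is an illustration under stated assumptions, not the DISTALL measure itself: the function set_distance, the parameter penalty_weight, the element distance capped at 1 and the normalisation by the larger set's size are all hypothetical choices, and the matching is found by brute force, which is feasible only for very small sets.

```python
from itertools import permutations


def set_distance(a, b, elem_dist, penalty_weight=1.0):
    """Set distance from a minimum-cost maximal matching plus a size penalty.

    Every element of the smaller set is matched to a distinct element of the
    larger set; each unmatched element of the larger set adds penalty_weight.
    The total is divided by the size of the larger set (assumed normalisation).
    """
    small, large = sorted((list(a), list(b)), key=len)
    if not large:
        return 0.0  # both sets are empty
    # Brute force over all injective assignments of small into large.
    best = min(
        sum(elem_dist(x, y) for x, y in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    unmatched = len(large) - len(small)
    return (best + penalty_weight * unmatched) / len(large)


def capped_abs(x, y):
    # Hypothetical element distance in [0, 1].
    return min(abs(x - y), 1.0)


small_set = {1, 2}
large_set = {1, 2, 3, 4, 5, 6}
full_penalty = set_distance(small_set, large_set, capped_abs, penalty_weight=1.0)
no_penalty = set_distance(small_set, large_set, capped_abs, penalty_weight=0.0)
```

Here the two elements of the small set match perfectly, so with full penalty the distance comes entirely from the four unmatched elements, while with the penalty switched off the sets look identical. Varying penalty_weight between these extremes is one simple way to change the influence of the cardinality difference.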


OFAI-TR-2004-08 ( 238kB g-zipped PostScript file,  284kB PDF file)

Relational IBL in Music with a New Structural Similarity Measure

Asmir Tobudic, Gerhard Widmer

It is well known that many hard tasks considered in machine learning and data mining can be solved in a rather simple and robust way with an instance- and distance-based approach. In this paper we present another difficult task: learning, from large numbers of performances by concert pianists, to play music expressively. We model the problem as a multi-level decomposition and prediction task. Motivated by structural characteristics of such a task, we propose a new relational distance measure that is a rather straightforward combination of two existing measures. Empirical evaluation shows that our approach is in general viable and our algorithm, named DISTALL, is indeed able to produce musically interesting results. The experiments also provide evidence of the success of ILP in a complex domain such as music performance: it is shown that our instance-based learner operating on structured, relational data outperforms a propositional k-NN algorithm.

Keywords: relational instance-based learning, music

Citation: Tobudic A., Widmer G.: Relational IBL in Music with a New Structural Similarity Measure. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-08, 2004


OFAI-TR-2004-07 ( 101kB g-zipped PostScript file,  87kB PDF file)

Learning to Play Mozart : Recent Improvements

Asmir Tobudic, Gerhard Widmer

This paper describes basic research at the crossroads between machine learning and musicology. Starting from a system which is able to automatically induce multi-level tempo and dynamics models of expressive performance from a large corpus of real performances by skilled pianists, we discuss several of its shortcomings and present improvements and their empirical evaluation. In particular, we show that in such a complex domain as concert-class musical performance, one can treat the training data as noisy. Applying a standard machine learning technique for noise handling indeed significantly improves the results. We also discuss the major drawback of the standard propositional k-nearest-neighbor algorithm when learning mutually dependent concepts on different levels of resolution and present our solution to this problem by introducing a new relational instance-based learning algorithm. It turns out that it is indeed able to overcome some of the weaknesses of its propositional counterpart.

Citation: Tobudic A., Widmer G.: Learning to Play Mozart : Recent Improvements. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-07, 2004


OFAI-TR-2004-06 ( 539kB g-zipped PostScript file,  1579kB PDF file)

Exploring expressive performance trajectories: Six famous pianists play six Chopin pieces

Werner Goebl, Elias Pampalk, Gerhard Widmer

This paper presents an exploratory approach to analyzing large amounts of expressive performance data. Tempo and loudness information was derived semi-automatically from audio recordings of six famous pianists each playing six complete pieces by Chopin. The two-dimensional data was segmented into musically relevant phrases, normalized, and smoothed in various grades. The whole data set was clustered using a novel computational technique (i.e., aligned self-organizing maps) and visualized via an interactive user interface. Detailed cluster-wise statistics across pianists, pieces, and phrases gave insights into individual expressive strategies as well as common performance principles.

Keywords: Expressive Music Performance, Artificial Intelligence, Aligned Self-Organizing Maps, Visualization, Clustering

Citation: Goebl, W., Pampalk, E., and Widmer, G. (2004). “Exploring expressive performance trajectories: Six famous pianists play six Chopin pieces.” In Proceedings of the 8th International Conference on Music Perception and Cognition (ICMPC8), Evanston, IL, CD-ROM, Causal Productions, Adelaide, pp. 505-509.


OFAI-TR-2004-05 ( 239kB PDF file)

Social Consumer Agents in an Integrated Markets Model

Michael Schoenhart, Brian Sallans, Georg Dorffner

The impact of cognitive and socially bounded consumer agents on an integrated markets model is investigated. The market scenario consists of a financial market with trading agents and a consumer market. The markets are coupled via learning production firm agents offering their products and shares for sale. The consumer agents are embedded in a social structure based on “small-world network” principles. The cognitive model of the consumer agents enables them to make their decisions according to the behavior of the adjacent social neighborhood and based on the degree of satisfaction and uncertainty they are facing. The potential and limitations of the consumer agent model are explored by applying a recently introduced Markov chain Monte Carlo method. To this end, certain empirical phenomena or “stylized facts” are selected for reproduction within the simulation and the conditions of their occurrence are analyzed. It is shown that the properties of the social network structure and the sensitivity of the agents’ cognitive decision making process (heuristics) contribute significantly to, or in fact enable, the complex phenomena of Bass curves observed in consumer market scenarios. Furthermore, the results indicate that the structural properties of the emerged social networks are stable and match real-life social networks. Moreover, we show that the network structure has a strong impact on the development of market share. Thus we suggest the use of the social network descriptive parameters, which could be discovered empirically, as predictive factors for marketing forecasts.

Keywords: agent-based economics, artificial stock market, Bass curves, small-world networks, heuristics, market share forecasts

Citation: Schoenhart M., Sallans B., Dorffner G.: Social Consumer Agents in an Integrated Markets Model. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-05, 2004


OFAI-TR-2004-04 ( 180kB g-zipped PostScript file,  280kB PDF file)

Automatic Recognition of Famous Artists by Machine

Gerhard Widmer, Patrick Zanon

The paper addresses the question whether it is possible for a machine to learn to distinguish and recognise famous musicians (concert pianists), based on their style of playing. We extract a number of low-level features related to expressive timing and dynamics from the original audio CD recordings by famous pianists, and apply various machine learning algorithms to the task of learning classifiers based on these features. Experiments show that the computer can learn to identify the performer in a new recording with a probability significantly higher than chance, despite the fact that the features only capture a very limited amount of information about a performance. An analysis of the learned classifiers reveals a number of performance features that seem particularly relevant to style differentiation, and an application of the classifiers to music of a very different style shows that the machine seems to have captured truly fundamental aspects of artistic style. One limitation of the current approach is that sequential information is totally ignored, and we briefly report on ongoing work that tries to address this problem via an interesting conversion of music performances to strings.

Keywords: machine learning, music

Citation: Widmer G., Zanon P.: Automatic Recognition of Famous Artists by Machine. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-04, 2004


OFAI-TR-2004-03 ( 444kB g-zipped PostScript file,  431kB PDF file)

Improving upon the TAEMS/DTC framework in the context of coordinated scheduling

Bernhard Jung, Paolo Petta

TAEMS, DTC and GPGP constitute an evolved framework for coordination in multi-agent systems. In this paper we focus on inconsistencies and semantic interpretation problems encountered during an implementation of a DTC scheduler. We try to disambiguate and simplify concepts and propose extensions and new features for TAEMS and DTC to better understand, use, and integrate the framework in an agent architecture. We propose how to modularise TAEMS/DTC to form a kind of construction kit for local agent coordination and control and how to extend the scope to also cover domain-dependent context. Throughout, we aim to simplify application and integration of the TAEMS/DTC framework while preserving its core ideas.

Keywords: Coordination, Multi-Agent Systems, TAEMS (Task Analysis, Environment Modeling, and Simulation), design-to-criteria scheduling, critique, implementation

Citation: Jung B., Petta P.: Improving upon the TAEMS/DTC framework in the context of coordinated scheduling. in Trappl R. (ed.): Proceedings of the Seventeenth European Meeting on Cybernetics and System Research (EMCSR 2004), 13-16 April 2004, University of Vienna, Austrian Society for Cybernetic Studies, Vienna. also available as Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2004-03, 2004.


OFAI-TR-2004-02 ( 74kB g-zipped PostScript file,  124kB PDF file)

Once again: The perception of piano touch and tone. Can touch audibly change piano sound independently of intensity?

Werner Goebl, Roberto Bresin, Alexander Galembo

This study addresses the old question of whether the timbre of isolated piano tones can be audibly varied independently of their hammer velocities – only through the type of touch. A large number of single piano tones were played with two prototypical types of touch: depressing the keys with the finger initially resting on the key surface (pressed), and hitting the keys from a certain distance above (struck). Musicians were asked to identify the type of touch of the recorded samples, in a first block with all attack noises before the tone onsets included, in a second block without them. Half of the listeners could correctly identify significantly more tones than chance in the first block (up to 86% accuracy), but none could do so in the second block. Those who heard no difference tended to give struck ratings for louder tones in both blocks.

Keywords: piano touch, piano tone, perception, touch noise

Citation: Goebl, W., Bresin, R., and Galembo, A. (2004). “Once again: The perception of piano touch and tone. Can touch audibly change piano sound independently of intensity?” In Proceedings of the 2004 ISMA, Japan, (Nara).


OFAI-TR-2004-01 ( 265kB PDF file)

Islands of Music - Analysis, Organization, and Visualization of Music Archives

Elias Pampalk

This report summarizes the master's thesis: Islands of Music: Analysis, Organization, and Visualization of Music Archives, which I submitted to the Vienna University of Technology on December 11th, 2001. I wrote it at the Department of Software Technology and Interactive Systems, supervised by Dr. Andreas Rauber, and assessed by Prof. Dr. Dieter Merkl. Islands of Music are a graphical user interface to music collections based on a metaphor of geographic maps. The thesis deals with the challenges involved in the automatic creation of such interfaces given only raw music data (e.g. MP3s) without any further information such as to which genres the pieces of music belong. The main challenge is to teach machines how to listen to music, i.e., how to calculate the perceived similarity of two pieces of music. Using a neural network algorithm, namely the self-organizing map, the music collection is organized, and using a novel visualization technique, the map of islands is created. Furthermore, methods to automatically find descriptions for the mountains and hills are demonstrated.

Citation: Journal of the Austrian Society for Artificial Intelligence (ÖGAI), Vol. 22, No. 4, pp 20-23, 2003.


OFAI-TR-2003-35 ( 232kB PDF file)

Modeling Rule Precision

Johannes Fürnkranz

This paper reports first results of an empirical study of the precision of classification rules on an independent test set. We generated a large number of rules using a general covering algorithm and recorded their coverage on training and test sets. These meta data are briefly presented and analyzed with respect to their variance among different domains and search heuristics. The main part of the paper describes experiments that aimed at modeling the precision of the learned rules on the test set in dependence of their coverage on the training set. To this end, we trained a neural network as an evaluation function for a rule learner, and present parameter settings for the $m$-heuristic and the generalized $m$-heuristic that are optimal in the sense that they minimize the squared error of predicting the test set precision with training set coverage of positive and negative examples.

Keywords: Inductive Rule Learning, Overfitting, Error Estimates, Meta-Learning

Citation: Fürnkranz J.: Modeling Rule Precision. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-35, 2003
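
The $m$-heuristic named in the abstract above is commonly defined as the m-estimate of a rule's precision. The following is a minimal sketch assuming that standard definition; the function and parameter names are illustrative and not taken from the report, and neither the generalized variant nor the fitted parameter values are reproduced here.

```python
def m_estimate(p, n, P, N, m):
    """m-estimate of the precision of a classification rule.

    p, n: positive / negative training examples covered by the rule
    P, N: total positive / negative training examples
    m:    weight given to the prior P / (P + N); m = 0 yields plain precision
    Assumes P + N > 0 and p + n + m > 0.
    """
    prior = P / (P + N)
    return (p + m * prior) / (p + n + m)


# A rule covering 8 positives and 2 negatives on a balanced training set:
print(m_estimate(8, 2, 50, 50, m=0))  # plain precision
print(m_estimate(8, 2, 50, 50, m=2))  # pulled towards the prior of 0.5
```

Larger values of m move the estimate towards the class prior, which penalizes rules that cover only a few examples.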


OFAI-TR-2003-34 ( 2435kB PDF file)

Agent-based Pedagogical Role-Play: The Case of Job Interview Training

Sabine Payr

Keywords: Agents, Pedagogical Agents, Job Interview Training

Citation: Payr S.: Agent-based Pedagogical Role-Play: The Case of Job Interview Training. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-34, 2003


OFAI-TR-2003-33 ( 133kB g-zipped PostScript file,  187kB PDF file)

A Simulation Study of Managerial Compensation

Brian Sallans, Alexander Pfister, Georg Dorffner

A computational economics model of managerial compensation is presented. Risk-averse managers are simulated, and shown to adopt more risk-taking under the influence of stock options. It is also shown that stock options can both help a new entrant compete in an established market; and can help the incumbent firm fight off competition by promoting new exploration and risk-taking. In the case of the incumbent, the stock options are shown to be most effective when introduced as a response to the arrival of a new entrant, rather than used as a standard part of the compensation package.

Keywords: stock options, agent-based economics, compensation

Citation: Sallans B., Pfister A., Dorffner G.: A Simulation Study of Managerial Compensation. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-33, 2003


OFAI-TR-2003-32

ESAW'03: Workshop Notes of the Fourth International Workshop "Engineering Societies in the Agents World", 29-31 October 2003, Imperial College London, UK

Omicini A., Petta P., Pitt J. (eds.)

Software systems are undergoing dramatic changes in scale and complexity as we are moving rapidly into the age of micro-cosmic computing: from the planetary scale in which a single application can access the computing power and data resources of the entire world, to nanotech scale computing where a single location can be wired with millions of sensors. But at both ends of the scale, the computing devices (applications, sensors) interact with each other to provide us with increasingly complex, context-aware, and content-adaptive services and functionalities. There is therefore a strong qualitative impact on the nature, substance and style of interaction between components. These interactions will occur in patterns and via mechanisms that can hardly be grasped in terms of classical models of interaction or service-oriented coordination. To some extent, future software systems will exhibit characteristics that make them resemble natural systems and societies more than mechanical systems and traditional software architectures.

This situation poses exciting challenges to computer scientists and software engineers. Already, software agents and multi-agent systems are recognised as both useful abstractions and effective technologies for the modelling and building of complex distributed applications. However, little is done with regard to effective and methodical development of complex software systems in terms of multi-agent societies. An urgent need exists for novel approaches to software modelling and software engineering that can support the successful deployment of software systems made up of a massive number of autonomous components. We need to enable designers to control and predict the behaviour of their systems, but alternatively to enable emergent global system properties and discovered functionality to be commonplace. It is very likely that such innovations will exploit lessons from a variety of different scientific disciplines, such as sociology, economics, organisation science, modern thermodynamics, and biology. Furthermore, since these systems will be ubiquitous, persistent, and pervasive, i.e. embedded in the real world, we need to know what frameworks of law will facilitate their regulation.

The sequel to successful editions in 2000, 2001 and 2002, ESAW'03 remains committed to the use of the notion of multi-agent systems as seed for animated, constructive, and highly inter-disciplinary discussions about technologies, methodologies, and tools for the engineering of complex distributed applications. While the workshop places an emphasis on practical engineering issues, it also welcomes theoretical, philosophical, and empirical contributions, provided that they clearly document their connection to the core applied issues.

We received 34 papers that underwent strict scientific peer-review for quality, with at least three independent reviews per paper. In this process, ten papers were accepted. Given the large interdisciplinary spread of the topical domain, and the intention to best exploit the workshop as a forum for discussion and dissemination of significant and promising (if perhaps not yet fully developed) ideas and approaches, another 17 papers were invited for presentation at the workshop and are also included in these working notes. Based on previous experience, we are confident that at least for a number of these contributions the feedback from the workshop will both enable and motivate authors to improve their papers to have them included in the post-proceedings. As for the earlier workshop editions (LNAI 1972, LNAI 2203, and LNAI 2577), these will be published by Springer-Verlag as a volume of the Lecture Notes in Artificial Intelligence series.

Keywords: Multi-Agent Systems, Agent Oriented Software Engineering, Agent Societies

Citation: Omicini A., Petta P., Pitt J. (eds.): ESAW'03: Workshop Notes of the Fourth International Workshop "Engineering Societies in the Agents World", 29-31 October 2003, Imperial College London, UK. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-32, 2003


OFAI-TR-2003-31 ( 186kB g-zipped PostScript file,  221kB PDF file)

An assessment of the TAEMS/DTC framework in the context of coordinated scheduling and directions for improvements

Bernhard Jung, Paolo Petta

TAEMS, DTC and GPGP constitute an evolved framework for coordination in multi-agent systems. In this paper we focus on inconsistencies and semantic interpretation problems encountered during an implementation of a DTC scheduler. We try to disambiguate and simplify concepts and propose extensions and new features for TAEMS and DTC to better understand, use, and integrate the framework in an agent architecture. We indicate how TAEMS/DTC can be modularised to form a kind of construction kit for local agent coordination and control and how they can be extended to support even domain-dependent context. With this, we aim to simplify application and integration of the TAEMS/DTC framework while preserving its core ideas.

Keywords: Coordination, Multi-Agent Systems, TAEMS (Task Analysis, Environment Modeling, and Simulation), design-to-criteria scheduling, critique, implementation

Citation: Jung B., Petta P.: An assessment of the TAEMS/DTC framework in the context of coordinated scheduling and directions for improvements. in d'Inverno M. et al. (eds.): The First European Workshop on Multi-Agent Systems (EUMAS 2003), Dec. 18-19, 2003, St. Catherine's College, Oxford University. also available as Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-31, 2003.


OFAI-TR-2003-30 ( 84kB g-zipped PostScript file,  130kB PDF file)

Variational Bayesian Autoregressive Conditional Heteroskedastic Models

Brian Sallans

A variational Bayesian autoregressive conditional heteroskedastic (VB-ARCH) model is presented. The ARCH class of models is one of the most popular for economic time series modeling. It assumes that the variance of the time series is an autoregressive process. The variational Bayesian approach results in an approximation to the full posterior distribution over ARCH model parameters, and provides a method for model selection. A novel application of Monte Carlo sampling is presented, wherein sampling is used to evaluate difficult terms in the variational free energy. A description of the variational approximation is followed by encouraging experimental results on model selection and volatility prediction on synthetic and historical financial data.

Keywords: Variational Methods, econometrics, ARCH model

Citation: Sallans B.: Variational Bayesian Autoregressive Conditional Heteroskedastic Models. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-30, 2003
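
The ARCH assumption stated in the abstract above, that the variance of the series follows an autoregressive process, can be made concrete with the textbook ARCH(q) recursion. This is only a sketch of the plain ARCH model under assumed names (`arch_conditional_variance`, `alpha0`, `alphas`); it does not implement the variational Bayesian treatment the report presents.

```python
def arch_conditional_variance(residuals, alpha0, alphas):
    """Conditional variances of a plain ARCH(q) model, with q = len(alphas).

    sigma2[t] = alpha0 + sum_{i=1..q} alphas[i-1] * residuals[t-i] ** 2

    The returned list starts at t = q (the first time step with q past
    residuals available); its last entry is the one-step-ahead forecast
    for t = len(residuals).
    """
    q = len(alphas)
    variances = []
    for t in range(q, len(residuals) + 1):
        # Past residuals ordered as residuals[t-1], ..., residuals[t-q]
        past = residuals[t - q:t][::-1]
        variances.append(alpha0 + sum(a * e * e for a, e in zip(alphas, past)))
    return variances


# ARCH(1): a large shock at one step raises the variance at the next step.
print(arch_conditional_variance([1.0, 2.0], alpha0=0.5, alphas=[0.5]))
```

For the variances to stay positive, alpha0 is normally required to be positive and all alphas non-negative.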


OFAI-TR-2003-29 ( 72kB g-zipped PostScript file,  95kB PDF file)

Variational Action Selection for Influence Diagrams

Brian Sallans

Influence diagrams provide a compact way to represent problems of decision making under uncertainty. As the number of variables in the problem increases, computing exact expectations and making optimal decisions becomes computationally intractable. A new method of action selection is presented, based on variational approximate inference. A policy is approximated where high-probability actions under the policy have high utility. Actions are then selected which have high probability under the approximating policy. The variational action selection method is shown to compare favorably to greedy and sampling-based action selection.

Keywords: Influence Diagrams, Variational Methods

Citation: Sallans B.: Variational Action Selection for Influence Diagrams. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-29, 2003


OFAI-TR-2003-28 ( 1244kB g-zipped PostScript file,  2878kB PDF file)

The Role of Timing and Intensity in the Production and Perception of Melody in Expressive Piano Performance

Werner Goebl

This thesis addresses the question of how pianists make individual voices stand out from the background in a contrapuntal musical context, how they realise this with respect to the constraints of the piano keyboard construction, and finally how much each of the expressive parameters employed by the performers contributes to the perception of particular voices. Three different empirical approaches were used to investigate these questions: a study in the area of piano acoustics investigated the temporal properties of three different grand piano actions, a performance study with a Bösendorfer computer-controlled grand piano examined intensity and onset time differences between the principal voice and the accompaniment, and a series of perception studies looked at the relative effect of asynchrony and intensity variation on the perceived salience of individual tones in musical chords and real music contexts. (Extended abstract in the thesis)

Keywords: music performance, music perception, piano acoustics, listening test, melody lead, velocity artifact, computer-controlled piano

Citation: Goebl, W. (2003). The Role of Timing and Intensity in the Production and Perception of Melody in Expressive Piano Performance. Unpublished doctoral thesis, Karl-Franzens-Universität Graz, Graz, Austria.


OFAI-TR-2003-27 ( 152kB g-zipped PostScript file,  242kB PDF file)

Symbolic Distance Measurements Based on Characteristic Subspaces

Marcus-Christopher Ludl

We introduce the subspace difference metric, a novel heterogeneous distance metric for calculating distances between points with both continuous and (unordered) categorical attributes. Our approach is based on the computation and comparison of characteristic subspaces (i.e. contexts) for each of the symbols and can be viewed as a generalization of the well-known value difference metric. Subsequently, as one possible extension, we propose a linearization of the computed symbolic distances by multidimensional scaling, thereby mapping a set of symbols onto the interval [0, 1]. Thus, even algorithms that were originally designed for continuous attributes (e.g. clustering algorithms like k-means) may be applied to datasets containing discrete attributes, without having to adapt the algorithm itself. Finally, we evaluate the proposed metric and the linearization in quantitative and qualitative settings and exemplify the applicability in clustering domains.

Keywords: categorical metric, symbolic distances, clustering

Citation: Ludl M.: Symbolic Distance Measurements Based on Characteristic Subspaces. In Proceedings of the 7th European Conference for Principles of Data Mining and Knowledge Discovery (PKDD 2003), Dubrovnik, Croatia.
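
For orientation, the value difference metric that the abstract above generalizes can be sketched in its commonly cited simplified form: two symbols are close when they induce similar class distributions. The code below illustrates only that baseline, not the subspace difference metric itself; the function name, argument layout, and exponent `q` are assumptions made for this example.

```python
from collections import Counter, defaultdict


def value_difference(values, labels, a, b, q=1):
    """Simplified value difference between symbols a and b of one attribute.

    values: observed symbol of the attribute for each training example
    labels: class label of each training example
    Returns sum over classes c of |P(c | a) - P(c | b)| ** q.
    """
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    if a not in counts or b not in counts:
        raise ValueError("both symbols must occur in the training data")
    total_a = sum(counts[a].values())
    total_b = sum(counts[b].values())
    return sum(
        abs(counts[a][c] / total_a - counts[b][c] / total_b) ** q
        for c in set(labels)
    )


colors = ["red", "red", "blue", "blue"]
classes = ["x", "y", "x", "x"]
print(value_difference(colors, classes, "red", "blue"))
```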


OFAI-TR-2003-26 ( 785kB PDF file)

Preference Learning: Models, Methods, Applications - Proceedings of the KI-2003 Workshop

Eyke Hüllermeier, Johannes Fürnkranz

The preferences of an individual, say, the participant of an electronic auction or the customer of an electronic store, can be expressed in various ways, either explicitly, e.g. in the form of preference statements or implicitly, e.g. through the way of acting in different situations. The problem of finding out about an individual's preferences, or about those of a group of individuals, is referred to as preference elicitation. This requires, among other things, formal models for representing preferences and methods for their (automatic) acquisition. Touching on various aspects of AI, both theoretical and practical, preference elicitation is one of this field's most recent and interesting research topics. This workshop, held at the German Conference for Artificial Intelligence (KI-2003), focused on learning methods for preference elicitation, that is on methods for inducing preferences from given observations. Like other types of complex learning tasks that have recently entered the stage in machine learning and related fields, preference learning deviates strongly from the standard problems of classification and regression. It is particularly challenging as it involves the prediction of complex structures, such as weak or partial order relations, rather than single values. Moreover, training input will not, as is usually the case, be offered in the form of complete examples but may comprise more general types of information, such as relative preferences or different kinds of indirect feedback.

Citation: Hüllermeier E., Fürnkranz J.: Preference Learning: Models, Methods, Applications - Proceedings of the KI-2003 Workshop. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-26, 2003


OFAI-TR-2003-25 ( 200kB PDF file)

Predicting the Subjective Similarity Between Expressive Performances of Music from Objective Measurements of Tempo and Dynamics

Renee Timmers

Measurements of expression in music performances have highlighted the subtle variations in tempo, timing and dynamics that musicians apply when performing music. Comparisons between the variations have given a first insight into the diversity of interpretations of music. Correlation is an often-used method for such comparisons. This study investigated the perceptual validity of such measurements and, more specifically, of such objective comparisons. 20 participants rated the similarity between 5 performances of a fragment from a Chopin Prelude and between 6 performances of two fragments from a Mozart Sonata. Variations in tempo and dynamics were measured from the audio recordings of the performances. These measurements were input to different models that predicted the perceived similarity between performances. Overall, the models could predict a fair amount of the similarity ratings. Tempo was especially important for the prediction, more important than loudness, followed by the interaction between tempo and loudness. Models based on absolute measures were stronger than models based on normalized measures. These results were independent of the musical background of the participants. The implications for future research include a reconsideration of correlation for the comparison of performances and a reevaluation of absolute local measures. In addition, the study suggests directions for the further investigation of the perception of music performance.

Citation: Timmers R.: Predicting the Subjective Similarity Between Expressive Performances of Music from Objective Measurements of Tempo and Dynamics. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-25, 2003


OFAI-TR-2003-24 ( 142kB g-zipped PostScript file,  301kB PDF file)

Classification of Dance Music by Periodicity Patterns

Simon Dixon, Elias Pampalk, Gerhard Widmer

This paper addresses the genre classification problem for a specific subset of music, standard and Latin ballroom dance music, using a classification method based only on timing information. We compare two methods of extracting periodicities from audio recordings in order to find the metrical hierarchy and timing patterns by which the style of the music can be recognised: the first method performs onset detection and clustering of inter-onset intervals; the second uses autocorrelation on the amplitude envelopes of band-limited versions of the signal as its method of periodicity detection. The relationships between periodicities are then used to find the metrical hierarchy and to estimate the tempo at the beat and measure levels of the hierarchy. The periodicities are then interpreted as musical note values, and the estimated tempo, meter and the distribution of periodicities are used to predict the style of music using a simple set of rules. The methods are evaluated with a test set of standard and Latin dance music, for which the style and tempo are given on the CD cover, providing a "ground truth" by which the automatic classification can be measured.

Keywords: Rhythm, Beat induction, Music information retrieval

Citation: Proceedings of the 4th International Conference on Music Information Retrieval (ISMIR 2003), Baltimore MD, October 2003, pp 159-165.
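
The second periodicity detection method in the abstract above applies autocorrelation to amplitude envelopes. A minimal sketch of that idea follows, assuming the envelope has already been extracted and sampled at a fixed rate; the function names and the lag search range are hypothetical, and the band-splitting, metrical-hierarchy, and rule-based classification steps of the paper are not covered.

```python
def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def dominant_period(envelope, min_lag, max_lag):
    """Lag (in samples) within [min_lag, max_lag] with the highest autocorrelation."""
    acf = autocorrelation(envelope, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])


# Toy envelope with an onset every 4 samples.
envelope = [1.0, 0.0, 0.0, 0.0] * 8
print(dominant_period(envelope, min_lag=1, max_lag=10))
```

Because the sum runs over fewer terms at longer lags, this unnormalized form favors shorter periods; integer multiples of the true period also produce peaks, which is one reason relationships between periodicities have to be examined.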


OFAI-TR-2003-23 ( 765kB PDF file)

Exploring Music Collections by Browsing Different Views

Elias Pampalk, Simon Dixon, Gerhard Widmer

The availability of large music collections calls for ways to efficiently access and explore them. We present a new approach which combines descriptors derived from audio analysis with meta-information to create different views of a collection. Such views can have a focus on timbre, rhythm, artist, style or other aspects of music. For each view the pieces of music are organized on a map in such a way that similar pieces are located close to each other. The maps are visualized using an Islands of Music metaphor where islands represent groups of similar pieces. The maps are linked to each other using a new technique to align self-organizing maps. The user is able to browse the collection and explore different aspects by gradually changing focus from one view to another. We demonstrate our approach on a small collection using a meta-information-based view and two views generated from audio analysis, namely, beat periodicity as an aspect of rhythm and spectral information as an aspect of timbre.

Citation: Pampalk E., Dixon S., Widmer G.: Exploring Music Collections by Browsing Different Views. Proceedings of the 4th International Conference on Music Information Retrieval (ISMIR 2003), Baltimore MD, October 2003, pp 201-208.


OFAI-TR-2003-22 ( 865kB PDF file)

On the Evaluation of Perceptual Similarity Measures for Music

Elias Pampalk, Simon Dixon, Gerhard Widmer

Several applications in the field of content-based interaction with music repositories rely on measures which estimate the perceived similarity of music. These applications include automatic genre recognition, playlist generation, and recommender systems. In this paper we study methods to evaluate the performance of such measures. We compare five measures which use only the information extracted from the audio signal and discuss how these measures can be evaluated qualitatively and quantitatively without resorting to large scale listening tests.

Citation: Pampalk E., Dixon S., Widmer G.: On the Evaluation of Perceptual Similarity Measures for Music. Proceedings of the 6th International Conference on Digital Audio Effects (DAFx-03), pp 6-12.


OFAI-TR-2003-21 ( 166kB g-zipped PostScript file,  493kB PDF file)

Evaluation of Term Utility Functions for Very Short Multi-Document Summaries

A.K. Seewald, C. Holzbaur, G. Widmer

We describe results from an application for relevance assessment in a setting related to multi-document summarization. For the task of characterizing given document collections by a short list of relevant terms, we have proposed the term utility function PxR. The measure is competitive with a variety of utility functions commonly used in text mining. Our function incorporates a user-definable parameter which allows for explicit, continuous trade-off between precision and recall, which was preferred by our users over the more opaque term utility functions from text mining. The F-beta measure is similar but not identical to our measure and will also be discussed. Despite our users' preference for a user-definable parameter, the improvements by setting different user-defined parameter values for each document collection are limited, and a static value for the parameter works almost as well. This seems to be true for the F-beta measure as well. A simple measure, SR, also performs competitively. In light of this evidence, a user-definable parameter seems to be unnecessary to achieve competitive performance.

Keywords: Information Retrieval, Term Utility Function, 3DSearch

Citation: Seewald A., Holzbaur C., Widmer G.: A Simple Term Utility Function for Information Retrieval, Applied Artificial Intelligence 20(1) pp.57-78, January 2006.
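
The abstract of TR-2003-21 refers to the F-beta measure as a parameterized trade-off between precision and recall. As a minimal sketch of that standard measure only (the report's own PxR and SR functions are not defined in this listing and are not reproduced here; the function name `f_beta` is an illustrative choice):

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    beta > 1 gives more weight to recall, beta < 1 gives more weight to precision.
    """
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta must be > 0")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# A term list with high precision but low recall scores better when beta
# favours precision (0.5) than when it favours recall (2.0).
high_precision = f_beta(0.8, 0.4, beta=0.5)
high_recall = f_beta(0.8, 0.4, beta=2.0)
```

Varying `beta` continuously moves the score between precision-dominated and recall-dominated behaviour, which is the kind of user-adjustable parameter the abstract discusses.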


OFAI-TR-2003-20 ( 483kB PDF file)

Aligned Self-Organizing Maps

Elias Pampalk

The concept of similarity is important for many data mining related applications such as content-based music retrieval. Defining similarity can be very difficult if several aspects are involved. For example, music similarity depends on the melody, rhythm, or instruments. The Self-Organizing Map is a powerful tool to visualize what the data looks like from a certain perspective of similarity. In this paper a new technique is proposed to align different SOMs to enable the user to gradually and smoothly change focus from one similarity aspect to another without losing orientation. Furthermore, two applications of these Aligned-SOMs are presented in the music domain.

Keywords: Interactive Exploration, Similarity Aspects, Applications in Music

Citation: Pampalk E.: Aligned Self-Organizing Maps. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-20, 2003


OFAI-TR-2003-19 ( 46kB g-zipped PostScript file,  129kB PDF file)

Evaluating Protein Name Recognition: An Automatic Approach

A.K. Seewald

In some domains, named entity recognition might be considered a solved problem. This does not hold for biological text mining, where protein and gene name recognition are still open research problems.

In this paper, we compare two current approaches to the problem of protein name recognition, KeX and Yapex. Unlike manual evaluation, which relies on domain experts' judgement concerning the position and extent of all relevant names and entails a high workload, our comparison methodology is fully automatic. Our results agree with previous manual evaluations of KeX and Yapex, which validates our approach.

Keywords: BioInformatics, Text Mining, Information Retrieval, Protein Name Recognition

Citation: Seewald A.K.: Evaluating Protein Name Recognition: An Automatic Approach. Workshop on Data Mining and Text Mining for BioInformatics, 14th European Conference on Machine Learning (ECML-2003), Dubrovnik-Cavtat, Croatia, 2003.


OFAI-TR-2003-18 ( 1244kB g-zipped PostScript file,  166kB PDF file)

Recognizing Domain and Species from MEDLINE Proteomics Publications

A.K. Seewald

In text mining for bioinformatics, one important bottleneck is the availability of high-quality tagged corpora. We introduce a novel approach to learn extraction patterns from pre-classified but untagged corpora, which are easier to generate automatically. We apply our approach to two datasets derived from SWISS-PROT plus associated MEDLINE references. In both experiments a Ripper-like rule learner, JRip, is competitive with all other learners; outputs a manageable number of understandable rules; and performs comparably to a human domain expert investigating the same task. Based on our results, we note weaknesses and strengths of both human and machine learning approaches, which indicate that they have distinct areas of expertise. Our approach may be used to generate initial rulesets for information extraction, to be iteratively refined by domain experts; or as a stand-alone approach with some losses in precision.

Keywords: Text Mining, Bioinformatics, Information Extraction

Citation: Seewald A.: Recognizing Domain and Species from MEDLINE Proteomics Publications. Workshop on Data Mining and Text Mining for Bioinformatics, 14th European Conference on Machine Learning (ECML-2003), Dubrovnik-Cavtat, Croatia, 2003.


OFAI-TR-2003-17 ( 591kB PDF file)

Visualizing Changes in the Structure of Data for Exploratory Feature Selection

Elias Pampalk, Werner Goebl, Gerhard Widmer

Using visualization techniques to explore and understand high-dimensional data is an efficient way to combine human intelligence with the immense brute force computation power available nowadays. Several visualization techniques have been developed to study the cluster structure of data, i.e., the existence of distinctive groups in the data and how these clusters are related to each other. However, only a few of these techniques lend themselves to studying how this structure changes if the features describing the data are changed. Understanding this relationship between the features and the cluster structure means understanding the features themselves and is thus a useful tool in the feature extraction phase. In this paper we present a novel approach to visualizing how modification of the features with respect to weighting or normalization changes the cluster structure. We demonstrate the application of our approach in two music related data mining projects.

Keywords: High-Dimensional Data, Interactive Data Mining, Music, Exploration, Visualization, Self-Organizing Maps

Citation: Pampalk E., Goebl W., Widmer G.: Visualizing Changes in the Inherent Structure of Data for Exploratory Feature Selection. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-17, 2003


OFAI-TR-2003-16 ( 102kB g-zipped PostScript file,  492kB PDF file)

Measurement and reproduction accuracy of computer-controlled grand pianos

Werner Goebl, Roberto Bresin

The recording and reproducing capabilities of a Yamaha Disklavier grand piano and a Bösendorfer SE290 computer-controlled grand piano were tested, with the goal of examining their reliability for performance research. An experimental setup consisting of accelerometers and a calibrated microphone was used to capture key and hammer movements, as well as the acoustic signal. Five selected keys were played by pianists with two types of touch (staccato and legato). Timing and dynamic differences between the original performance, the corresponding MIDI file recorded by the computer-controlled pianos, and its reproduction were analysed. The two devices performed quite differently with respect to timing and dynamic accuracy. The Disklavier's onset capturing was slightly more precise (+/–12 ms) than its reproduction (from –20 to +30 ms). The Bösendorfer performed generally better, but its timing accuracy was slightly less precise for recording (–9 to 3 ms) than for reproduction (+/–2 ms). Both devices exhibited a systematic (linear) error in recording over time. In the dynamic dimension, the Bösendorfer showed higher consistency over the whole dynamic range, while the Disklavier performed well only in a wide middle range. Neither device was able to capture or reproduce different types of touch.

Keywords: computer-controlled piano, Disklavier, Bösendorfer SE, performance research

Citation: Goebl W., Bresin R.: Measurement and reproduction accuracy of computer-controlled grand pianos. Proceedings of the Stockholm Music Acoustics Conference, August 6–9, 2003 (SMAC03), Stockholm, Sweden


OFAI-TR-2003-15 ( 147kB g-zipped PostScript file,  831kB PDF file)

The piano action as the performer's interface: Timing properties, dynamic behaviour and the performer's possibilities

Werner Goebl, Roberto Bresin, Alexander Galembo

A concert pianist is able to produce a wide range of imaginable nuances of musical expression by actuating the 88 keys on a piano, none of which travel through a distance greater than one centimeter. In this study, we investigated the temporal behaviour of grand piano actions from different manufacturers using different types of touch ('legato' versus 'staccato'). An experimental setup consisting of accelerometers and a calibrated microphone was used to capture key and hammer movements, as well as the acoustic signal. Five selected keys were played by pianists with the two types of touch. The analysis of the three-channel data was automated by computer software. Discrete measurements (e.g., finger–key, hammer–string, and key bottom contact times, hammer velocity) were extracted for each of the over 4000 recorded tones in order to study several temporal relations. Travel times of the hammer (from finger–key to hammer–string) as a function of hammer velocity varied clearly between the two types of touch, but only slightly between pianos. A travel time versus hammer velocity function found in earlier work [W. Goebl, J. Acoust. Soc. Am. 110, 563–572 (2001)] derived from a computer-controlled piano was replicated. Key bottom contact times exhibited larger variability between types of touch and pianos. However, no effect of touch type was found in the peak sound level (in dB as a function of hammer velocity).

Keywords: music, piano action, timing properties, instrumental acoustics

Citation: Goebl W., Bresin R., Galembo A.: The piano action as the performer's interface: Timing properties, dynamic behaviour and the performer's possibilities. To appear in Proceedings of the Stockholm Music Acoustics Conference, August 6–9, 2003 (SMAC03), Stockholm, Sweden.


OFAI-TR-2003-14 ( 159kB PDF file)

Pairwise Preference Learning and Ranking

Johannes Fürnkranz, Eyke Hüllermeier

We consider supervised learning of a ranking function, which is a mapping from instances to total orders over a set of labels (options). The training information consists of examples with partial (and possibly inconsistent) information about their associated rankings. From these, we induce a ranking function by reducing the original problem to a number of binary classification problems, one for each pair of labels. The main objective of this work is to investigate the trade-off between the quality of the induced ranking function and the computational complexity of the algorithm, both depending on the amount of preference information given for each example. To this end, we present theoretical results on the complexity of pairwise preference learning. We also carry out some controlled experiments investigating the predictive performance of our method for different types of preference information, such as top-ranked labels and complete rankings. The domain of this study is the prediction of a rational agent's ranking of actions in an uncertain environment.

Keywords: Ranking, Pairwise Classification, Round Robin Learning

Citation: Fürnkranz J., Hüllermeier E.: Pairwise Preference Learning and Ranking. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-14, 2003
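
The abstract of TR-2003-14 describes reducing ranking to one binary classification problem per pair of labels. The following is a minimal sketch of that general idea, not the authors' implementation: the 1-nearest-neighbour base learner, the vote-counting aggregation, and the names `train_pairwise` and `predict_ranking` are all assumptions made for illustration.

```python
from itertools import combinations


def sq_dist(u, v):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_pairwise(examples, labels):
    """Build one training set per unordered label pair.

    examples: list of (features, preferences), where preferences is a list
    of (winner, loser) label pairs; the list may be partial.
    """
    models = {pair: [] for pair in combinations(labels, 2)}
    for x, prefs in examples:
        for winner, loser in prefs:
            pair = (winner, loser) if (winner, loser) in models else (loser, winner)
            models[pair].append((x, winner))
    return models


def predict_ranking(models, labels, x):
    """Rank labels by the number of pairwise comparisons they win.

    Each pairwise model is an assumed 1-nearest-neighbour classifier.
    """
    votes = {label: 0 for label in labels}
    for data in models.values():
        if not data:
            continue  # no preference information was given for this pair
        _, winner = min(data, key=lambda item: sq_dist(item[0], x))
        votes[winner] += 1
    # sorted() is stable, so tied labels keep their original order.
    return sorted(labels, key=lambda label: -votes[label])


labels = ["a", "b", "c"]
examples = [
    ((0.0,), [("a", "b"), ("b", "c"), ("a", "c")]),
    ((1.0,), [("c", "b"), ("b", "a"), ("c", "a")]),
]
models = train_pairwise(examples, labels)
ranking_near_zero = predict_ranking(models, labels, (0.1,))
ranking_near_one = predict_ranking(models, labels, (0.9,))
```

With k labels this builds k(k-1)/2 pairwise training sets, and the amount of data in each depends on how much preference information every example carries, which is the trade-off the abstract investigates.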


OFAI-TR-2003-13 ( 632kB g-zipped PostScript file,  154kB PDF file)

Towards Automatic Analysis of Expressive Performance

Simon Dixon

We outline a system for automatic analysis of audio recordings of known musical works, by utilising the musical score to aid the signal processing algorithms. The proposed system matches the audio data to the score note by note, predicts the timing of future notes and then searches in the neighbourhood of the prediction to estimate the actual onset time. In this paper, we address possible signal processing approaches for processing solo piano music, and describe the planned architecture of the rest of the system. We present results of testing the signal processing algorithm on a performance of a Mozart piano sonata. The motivation for this work is the analysis of expressive performance, that is, measuring the subtle interpretative choices which distinguish the great masters of performance.

Keywords: audio content analysis, automatic transcription, musical performance

Citation: In Proceedings of the 5th Triennial Conference of the European Society for the Cognitive Sciences of Music (ESCOM5), Hanover, Germany, September 2003.


OFAI-TR-2003-12 ( 205kB PDF file)

VIE-DIAB: a Support Program for Telemedical Glycaemic Control

Christian Popow, Werner Horn, Birgit Rami, Edith Schober

Ambulatory care supporting long-term treatment of type I diabetes mellitus (DM) is based on the analysis of daily notes of serum glucose measurements, carbohydrate intake, and insulin dosage. In order to improve glycaemic control, telemedicine support aims at improving communication between patients and diabetologists. Patient data are collected using mobile phone services. Weekly responses from the diabetes care centre should help the patient to optimize glycaemic control. The telemedical support system VIE-DIAB integrates data collection, visualization, and recommendation handling by using mobile phone and internet services. Its core is a module which visualizes a summary of the patient's diary data of the last 4 weeks. Traditional methods for displaying these data mainly use (coloured) scatter plots, line graphs or summary histograms to give a graphical overview of the data. VIE-DIAB supports a more intuitive view by presenting 4x7 multiples. Each multiple represents the serum glucose values of one day. First results indicate that this form of data presentation is very useful to physicians.

Keywords: Diabetes, Telemedicine, Glycaemic Control

Citation: Popow C., Horn W., Rami B., Schober E.: VIE-DIAB: a Support Program for Telemedical Glycaemic Control, in: Dojat M., et al.(eds.): Artificial Intelligence in Medicine. Proceedings of the 9th Conference on Artificial Intelligence in Medicine in Europe (AIME-2003), Springer, Berlin, 2003, pp.350-354


OFAI-TR-2003-11 ( 623kB g-zipped PostScript file,  134kB PDF file)

Asynchrony versus intensity as cues for melody perception in chords and real music

Werner Goebl, Richard Parncutt

In expressive piano performance, the performer emphasises a melody by increasing its intensity and by anticipating it by some tens of milliseconds (melody lead). In this contribution, we continue previous research on the influence of asynchrony and intensity variation on the perceived salience of a particular tone or voice with three experiments. In Experiment I, three-tone piano chords are presented with each of the three tones simultaneously manipulated in timing and intensity by up to ±55 ms and +30/–22 MIDI velocity units. Loudness ratings depended mainly on relative intensity and relatively little on timing (e.g., anticipated tones were sometimes rated louder than delayed ones). The lower voice was generally rated louder than the middle voice. In Experiment II, a sequence of chords produced similar results; streaming enhanced the effect of asynchrony only marginally. In Experiment III, a short musical excerpt by Chopin was presented. Again, intensity was the dominating cue. In contrast to previous findings, a melody that was both delayed and louder in intensity was rated significantly louder than a melody that was simultaneous and louder.

Keywords: melody perception, piano, asynchrony, intensity, streaming, masking

Citation: To appear in: Proceedings of the 5th ESCOM conference, Sept 8–13, 2003 Hanover, Germany


OFAI-TR-2003-10 ( 592kB g-zipped PostScript file,  751kB PDF file)

Processing and Clustering Time Series of Mobile Robot Sensory Data

Patrick M. Pölz, Erik Hörtnagl, Erich Prem

This report describes work on clustering the time series produced by sensor readings of a mobile robot carried out within the European IST FET project SIGNAL (Systemic Intelligence for GrowiNg up Artefacts that Live, IST-2000-29255). Work reported here was driven by the necessity to distinguish different levels of processing. The first part of this report covers the experience gained and the mechanisms developed in implementing several algorithms for the different stages of processing the time series, including event detection. The second part addresses clustering the pre-processed time series using a fixed window. We compare dynamic time warping with a Euclidean distance as the measure for aligning series elements. We found the Euclidean distance technique to perform considerably faster than dynamic time warping while achieving results of comparable quality.

Keywords: time-series clustering, mobile robotics, sensing and perception, perceptually interesting points, dynamic time warping, signal processing

Citation: Pölz P., Hörtnagl E., Prem E.: Processing and Clustering Time Series of Mobile Robot Sensory Data. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-10, 2003
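
The abstract of TR-2003-10 compares dynamic time warping with a Euclidean distance on fixed windows of a time series. As a rough sketch of the two textbook measures only (the report's own implementation, window size and local cost are not given in this listing; the absolute difference used as local cost below is an assumption):

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length windows: O(n) time."""
    if len(a) != len(b):
        raise ValueError("windows must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic dynamic time warping distance: O(len(a) * len(b)) time.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m]


# The same step, shifted by one sample: DTW absorbs the shift,
# while the Euclidean distance penalizes it.
early = [0.0, 1.0, 1.0]
late = [0.0, 0.0, 1.0]
shift_euclidean = euclidean(early, late)
shift_dtw = dtw(early, late)
```

The quadratic table in `dtw` against the single pass in `euclidean` is consistent with the runtime difference the abstract reports.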


OFAI-TR-2003-09 ( 454kB PDF file)

A New Approach to Hierarchical Clustering and Structuring of Data with Self-Organizing Maps

Elias Pampalk, Gerhard Widmer, Alvin Chan

The Self-Organizing Map (SOM) is a powerful tool for exploratory data analysis which has been employed in a wide range of data mining applications. We present a novel approach to reveal the inherent hierarchical structure of data using multiple SOMs together with heuristics which optimize the stability. In particular, we address shortcomings of the Growing Hierarchical Self-Organizing Map (GHSOM) regarding the decision which areas in the hierarchical structure need to be represented by a finer granularity and which areas do not. We introduce the Tension and Mapping Ratio extension to exploit specific characteristics of the SOM based on the topology preservation. As a main result, in contrast to the GHSOM, the inherent hierarchical structure of the data is revealed without requiring the user to define a threshold parameter which controls the map sizes of the individual SOMs. We evaluate our approach using data from real-world data mining projects in the music domain.

Keywords: Exploratory Data Analysis, Growing Hierarchical Self-Organizing Maps, Tension and Mapping Ratio

Citation: Pampalk E., Widmer G., Chan A.: A New Approach to Hierarchical Clustering and Structuring of Data with Self-Organizing Maps. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-09, 2003


OFAI-TR-2003-08 ( 40kB g-zipped PostScript file,  100kB PDF file)

Towards a Theoretical Framework for Ensemble Classification

Alexander K. Seewald

Ensemble learning schemes such as AdaBoost and Bagging enhance the performance of a single classifier by combining predictions from multiple classifiers of the same type. The predictions from an ensemble of diverse classifiers can be combined in related ways, e.g. by voting or simply by selecting the best classifier via cross-validation -- a technique widely used in machine learning.

However, since no ensemble scheme is always the best choice, a deeper insight into the structure of meaningful approaches to combine predictions is needed to achieve further progress.

In this paper we offer an operational reformulation of common ensemble learning schemes -- Voting, Selection by Crossvalidation (XVal), Grading and Bagging -- as a Stacking scheme with appropriate parameter settings. Thus, from a theoretical point of view all these schemes can be reduced to Stacking with an appropriate combination method. This result is an important step towards a general theoretical framework for the field of ensemble learning.

Keywords: Machine Learning, Ensembles

Citation: Seewald A.K.: Towards a Theoretical Framework for Ensemble Classification (extended version). In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), Morgan Kaufmann, 2003.
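
The abstract of TR-2003-08 states that Voting and Selection by Crossvalidation can be expressed as Stacking with a suitable combination method. The sketch below illustrates that idea in a simplified form and is not the paper's construction: classifiers are plain functions, the meta level is a fixed function rather than a learned model, a single held-out set stands in for cross-validation, and all names are hypothetical.

```python
from collections import Counter


def stack(base_classifiers, meta):
    """Return a classifier that feeds all base predictions to `meta`."""
    def classify(x):
        return meta([clf(x) for clf in base_classifiers])
    return classify


def majority_vote(predictions):
    """Meta level for Voting: return the most frequent base prediction."""
    return Counter(predictions).most_common(1)[0][0]


def select(index):
    """Meta level for Selection: pass through one base classifier's output."""
    def meta(predictions):
        return predictions[index]
    return meta


def best_index(base_classifiers, validation):
    """Index of the base classifier with the highest held-out accuracy.

    A single validation set is an assumed simplification of cross-validation.
    """
    def accuracy(clf):
        return sum(clf(x) == y for x, y in validation) / len(validation)
    scores = [accuracy(clf) for clf in base_classifiers]
    return scores.index(max(scores))


# Three toy threshold classifiers on a single numeric feature.
base = [lambda x: x > 0, lambda x: x > 1, lambda x: x > 2]
validation = [(0.5, False), (1.5, False), (2.5, True)]

voting = stack(base, majority_vote)
selection = stack(base, select(best_index(base, validation)))
```

Both ensembles share the same `stack` wrapper and differ only in the meta-level function passed to it, which is the sense in which the abstract treats them as special cases of one scheme.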


OFAI-TR-2003-07 ( 154kB g-zipped PostScript file,  245kB PDF file)

Using Algebraic Datatypes as Uniform Representation for Structured Data

Markus Mottl

The question of how to uniformly encode structured data for the purpose of machine learning seems to have been rather neglected so far by researchers in comparison to the abundance of approaches to how knowledge could be inferred from certain representations. We therefore propose a well-understood concept from the formal semantics of programming languages as a vehicle for the uniform representation of discrete data, namely algebraic datatypes. It will be demonstrated in theory and practice that current encodings severely limit the power of widespread machine learning techniques, especially concerning the handling of structured information, that algebraic datatypes elegantly extend expressiveness to evade these limitations, and that this concept can guide the way to new learning algorithms. As an example, it will be shown how ordinary decision tree learning can be efficiently generalized to this data representation and that the latter provides for an interesting solution both to the missing value problem and to the representation of structured multi-attribute goals. Tight theoretical relations to logic yield valuable insights into complexity and expressiveness.

Keywords: Structured data, Algebraic datatypes, Decision tree learning

Citation: Mottl M.: Using Algebraic Datatypes as Uniform Representation for Structured Data. Submitted to: Machine Learning Journal, Special Issue on Inductive Logic Programming and Relational Learning.


OFAI-TR-2003-06 ( 142kB g-zipped PostScript file,  371kB PDF file)

Using ICA for removal of ocular artifacts in EEG recorded from blind subjects

Arthur Flexer, Herbert Bauer, Juergen Pripfl, Georg Dorffner

One of the standard applications of Independent Component Analysis (ICA) to EEG is removal of artifacts due to movements of the eye bulbs. Short blinks as well as slower saccadic movements are removed by subtracting respective independent components (ICs). EEG recorded from blind subjects poses special problems since it shows a higher quantity of eye movements which are also more irregular and very different across subjects. It is demonstrated that ICA can still be of use by comparing results from four blind subjects with results from one subject without eye bulbs who therefore does not show eye movement artifacts at all.

Keywords: ICA, EEG

Citation: Flexer A., Bauer H., Pripfl J., Dorffner G.: Using ICA for removal of ocular artifacts in EEG recorded from blind subjects, in R.Trappl (ed.): "Cybernetics and Systems 2004", Vienna, Austrian Society for Cybernetic Studies, pp. 491-496, 2004.


OFAI-TR-2003-05 ( 1124kB g-zipped PostScript file,  2460kB PDF file)

Towards Understanding Stacking - Studies of a General Ensemble Learning Scheme

Alexander K. Seewald

This thesis consists of complementary studies concerned with the ensemble learning scheme Stacking. We will explore various aspects of its behaviour and also clarify its relation to related ensemble learning schemes in two ways: by showing that it is usually the best choice and also by demonstrating that most ensemble learning schemes can be simulated by Stacking, making it the most general ensemble learning scheme.

We explore the parameter state space of Stacking. We systematically investigate Stacking with an exhaustive set of base classifiers, diverse meta classifiers and two related types of meta data. We propose default settings of all these parameters, grounded by empirical and theoretical arguments.

We introduce the variant StackingC, which improves Stacking's performance further, reduces computational cost and also resolves a significant weakness.

We present results from an alternative paradigm to compare classifiers. We investigate the hypothesis that StackingC is more stable than other ensemble learning schemes, i.e. that its learning curve is at the uppermost level of all learning curves. Surprisingly, we find that there are no significant differences between all considered schemes within this paradigm.

We show that most ensemble learning systems, including StackingC, Grading (Seewald & Fuernkranz, 2001) and even Bagging (Breiman, 1996) can be simulated by Stacking. For this we give functionally equivalent definitions of most schemes as meta classifiers for Stacking.

Finally, we briefly introduce the field of Information Visualization. We find that the majority of examples are misclassified because none of the base classifiers predict correctly. Although Stacking would potentially be able to learn from such a setting, this is not observed. On the contrary, Stacking even predicts incorrectly in some cases where a majority of base classifiers predicts correctly.

Keywords: Machine Learning, Ensembles

Citation: Seewald A.K.: Towards Understanding Stacking - Studies of a General Ensemble Learning Scheme, Institute for Med.Cybernetics and Artificial Intelligence, University of Vienna, PhD dissertation, 2003. Also available as Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-05, 2003


OFAI-TR-2003-04 ( 367kB g-zipped PostScript file,  1078kB PDF file)

Measurement and Reproduction Accuracy of Computer Controlled Grand Pianos

Werner Goebl, Roberto Bresin

The recording and reproducing capabilities of a Yamaha Disklavier grand piano and a Bösendorfer SE290 computer controlled grand piano were tested, with the goal of examining their usefulness for performance research. An experimental setup consisting of accelerometers and a calibrated microphone is used to capture key and hammer movements, as well as the acoustic signal. Five selected keys are played by pianists with two types of touch ('staccato – legato'). Timing and dynamic differences between the original performance, the corresponding MIDI file recorded by the computer-controlled pianos, and its reproduction are analyzed. The two devices performed quite differently with respect to timing and dynamic accuracy. The Disklavier's onset capturing is slightly more precise (+/–10 ms) than its reproduction (–20 to +30 ms). The Bösendorfer performs generally better, but its timing accuracy is slightly less precise for recording (–10 to 3 ms) than for reproduction (+/–2 ms). Both devices exhibit a systematic (linear) error in recording over time. In the dynamic dimension, the Bösendorfer shows higher consistency over the whole dynamic range, while the Disklavier performs well only in a wide middle range. Neither device is able to capture or reproduce different types of touch.

Keywords: music performance, reproducing piano, Bösendorfer, Yamaha Disklavier

Citation: Goebl, W., & Bresin, R. (2003). Measurement and reproduction accuracy of computer-controlled grand pianos, submitted.


OFAI-TR-2003-03 ( 660kB g-zipped PostScript file,  657kB PDF file)

Simulation and Validation of an Integrated Markets Model

Brian Sallans, Alexander Pfister, Alexandros Karatzoglou, Georg Dorffner

The behavior of boundedly rational agents in two interacting markets is investigated. A discrete-time model of coupled financial and consumer markets is described. The integrated model consists of heterogeneous consumers, financial traders, and production firms. The production firms operate in the consumer market, and offer their shares to be traded on the financial market. The model is validated by comparing its output to known empirical properties of real markets. In order to better explore the influence of model parameters on behavior, a novel Markov chain Monte Carlo method is introduced. This method allows for the efficient exploration of large parameter spaces, in order to find which parameter regimes lead to reproduction of empirical phenomena. It is shown that the integrated markets model can reproduce a number of empirical "stylized facts", including learning-by-doing effects, fundamental price effects, low autocorrelations, volatility clustering, high kurtosis, and volatility-volume correlations.

Keywords: agent-based economics, artificial stock market, reinforcement learning

Citation: Sallans B., Pfister A., Karatzoglou A., Dorffner G.: Simulation and Validation of an Integrated Markets Model. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-03, 2003


OFAI-TR-2003-02 ( 282kB g-zipped PostScript file,  319kB PDF file)

Playing Mozart Phrase by Phrase

Asmir Tobudic, Gerhard Widmer

The article presents an application of instance-based learning to the problem of expressive music performance. A system is described that tries to learn to shape tempo and dynamics of a musical performance by analogy to timing and dynamics patterns found in performances by a concert pianist. The learning algorithm itself is a straightforward k-nearest-neighbour algorithm. The interesting aspects of this work are application-specific: we show how complex, multi-level artifacts like the tempo/dynamics variations applied by a musician can be decomposed into well-defined training examples for a learner, and that case-based learning is indeed a sensible strategy in an artistic domain like music performance. While the results of a first quantitative experiment turn out to be rather disappointing, we will show various ways in which the results can be improved, finally resulting in a system that won a prize in a recent 'computer music performance' contest.

Keywords: instance-based learning, music

Citation: Tobudic A., Widmer G.: Playing Mozart Phrase by Phrase. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2003-02, 2003


OFAI-TR-2003-01 ( 240kB g-zipped PostScript file,  156kB PDF file)

Recognition of Famous Pianists Using Machine Learning Algorithms: First Experimental Results

Patrick Zanon, Gerhard Widmer

The paper addresses the question whether a machine can learn to identify famous performers (pianists) based on their style of playing. A preliminary study is presented where different machine learning algorithms are applied to performance data derived from Mozart sonata recordings by several famous pianists. It is shown that the algorithms learn to recognize pianists at a level better than chance, and that some pianists seem easier to recognize than others. The study identifies a number of limitations of the current approach (regarding both data and learning algorithms) and points to a variety of fruitful directions for further research.

Keywords: Artificial Intelligence, Machine recognition of music, Music analysis

Citation: Submitted to the XIV Colloquium on Musical Informatics (XIV CIM 2003), Firenze, Italy, May 8-9-10, 2003.


OFAI-TR-2002-41 ( 100kB PDF file)

Engineering Agent Systems: Best of "From Agent Theory to Agent Implementation (AT2AI)-3"

Paolo Petta, Jörg P. Müller

Keywords: Multi-Agent Systems, AOSE (Agent Oriented Software Engineering), Agent Oriented Technologies

Citation: Petta P., Müller J.: Engineering Agent Systems: Best of "From Agent Theory to Agent Implementation (AT2AI)-3". Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-41, 2002 (published in Applied Artificial Intelligence 16(9-10):671-676, 2002).


OFAI-TR-2002-40 ( 1018kB g-zipped PostScript file,  634kB PDF file)

Visualizing Expressive Performance in Tempo-Loudness Space

Jörg Langner, Werner Goebl

This paper introduces a new method for an integrated display of tempo and loudness variations as measured in expressive music performance. This visualization technique includes data acquisition from both MIDI instruments and audio recordings, data reduction by smoothing measured performance data, and animated display on computer screen in synchrony with the music: A dot moves through a two-dimensional space of tempo (x axis) and loudness (y axis), leaving behind it a trajectory that may be interpreted as the intrinsic performance path of a particular performance. Snapshots of these trajectories can be used for detailed performance analyses. Expert performances of Chopin's E major Etude (op. 10, No. 3) and an algorithmic performance of Schubert's G flat major Impromptu (D. 899, No. 3) are compared with performances by famous pianists (Maurizio Pollini, Alfred Brendel). This method allows efficient display and analysis of large amounts of performance data and elucidates interactions between timing and dynamics.

Keywords: performance visualization, tempo, loudness, motion

Citation: Langner J., Goebl W.: Visualizing Expressive Performance in Tempo-Loudness Space. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-40, 2002


OFAI-TR-2002-39 ( 616kB g-zipped PostScript file,  527kB PDF file)

Analysis of Musical Content in Digital Audio

Simon Dixon

Automatic analysis of digital audio with musical content is an intensely difficult task which is important for various applications in computer music, audio compression and music information retrieval. This paper contains a brief review of audio analysis as it relates to music, followed by three case studies of recently developed systems which analyse specific aspects of music. The first system is BeatRoot, a beat tracking system that finds the temporal location of musical beats in an audio recording, analogous to the way that people tap their feet in time to music. The second system is JTranscriber, an interactive automatic transcription system, which recognises musical notes and converts them into MIDI format, displaying the audio data as a spectrogram with the MIDI data overlaid in piano roll notation, and allowing interactive monitoring and correction of the extracted MIDI data. The third system is the Performance Worm, a real time system for visualisation of musical expression, which presents in real time a two dimensional animation of variations in tempo and loudness.

Keywords: Content analysis, Musical expression, Beat tracking, Automatic transcription, Performance Worm

Citation: Draft version of a paper appearing in "Computer Graphics and Multimedia: Applications, Problems, and Solutions" (ed. J. DiMarco), Idea Group, 2004, pp 214-235.


OFAI-TR-2002-38 ( 546kB g-zipped PostScript file,  393kB PDF file)

On the analysis of musical expression in audio signals

Simon Dixon

In western art music, composers communicate their work to performers via a standard notation which specifies the musical pitches and relative timings of notes. This notation may also include some higher level information such as variations in the dynamics, tempo and timing. Famous performers are characterised by their expressive interpretation, the ability to convey structural and emotive information within the given framework. The majority of work on audio content analysis focusses on retrieving score-level information; this paper reports on the extraction of parameters describing the performance, a task which requires a much higher degree of accuracy. Two systems are presented: BeatRoot, an off-line beat tracking system which finds the times of musical beats and tracks changes in tempo throughout a performance, and the Performance Worm, a system which provides a real-time visualisation of the two most important expressive dimensions, tempo and dynamics. Both of these systems are being used to process data for a large-scale study of musical expression in classical and romantic piano performance, which uses artificial intelligence (machine learning) techniques to discover fundamental patterns or principles governing expressive performance.

Keywords: beat tracking, musical expression, content analysis, digital audio

Citation: Proceedings of the Conference on Storage and Retrieval for Media Databases 2003, SPIE and IS&T 15th Annual Symposium on Electronic Imaging, Santa Clara CA, Jan 2003, pp 122-132.


OFAI-TR-2002-37 ( 130kB PDF file)

Round Robin Ensembles

Johannes Fürnkranz

In this paper we investigate the performance of pairwise (or round robin) classification, originally a technique for turning multi-class problems into two-class problems, as a general ensemble technique. In particular, we show that the use of round robin ensembles will also increase the classification performance of decision tree learners, even though they can directly handle multi-class problems. The performance gain is not as large as for bagging and boosting, but on the other hand round robin ensembles have a clearly defined semantics. Furthermore, we investigate whether confidence estimates can be used to improve the accuracy of the predictions of the ensemble. Finally, we show that the advantage of pairwise classification over direct multi-class classification and one-against-all binarization increases with the number of classes, and that round robin ensembles form an interesting alternative for problems with ordered class values.

Keywords: Ensemble Methods, Round Robin Learning, Rule Learning, Decision Tree Learning, Ordered Classification

Citation: Fürnkranz J.: Round Robin Ensembles. Intelligent Data Analysis 7(5), 2003.
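
The pairwise (round robin) scheme described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: it uses simple unweighted voting, and the base learner `nearest_mean_1d` is a hypothetical toy stand-in, not the rule or decision tree learners studied in the report.

```python
from collections import Counter
from itertools import combinations


def train_round_robin(examples, labels, train_binary):
    """Train one binary classifier per unordered pair of classes."""
    classes = sorted(set(labels))
    models = {}
    for a, b in combinations(classes, 2):
        # Keep only the examples that belong to one of the two classes.
        subset = [(x, y) for x, y in zip(examples, labels) if y in (a, b)]
        xs = [x for x, _ in subset]
        ys = [y for _, y in subset]
        models[(a, b)] = train_binary(xs, ys)
    return models


def predict_round_robin(models, x):
    """Each pairwise model casts one vote; the class with the most votes wins."""
    votes = Counter(model(x) for model in models.values())
    return votes.most_common(1)[0][0]


def nearest_mean_1d(xs, ys):
    """Hypothetical toy base learner: nearest class mean on a single number."""
    means = {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
             for c in set(ys)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))
```

With c classes this trains c(c-1)/2 binary models, each on only the examples of its two classes.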


OFAI-TR-2002-36 ( 154kB PDF file)

On the Use of Fast Subsampling Estimates for Algorithm Recommendation

Johannes Fürnkranz, Johann Petrak, Pavel Brazdil, Carlos Soares

The use of subsampling for scaling up the performance of learning algorithms has become fairly popular in the recent literature. In this paper, we investigate the use of performance estimates obtained on a subsample of the data for the task of recommending the best learning algorithm(s) for the problem. In particular, we examine the use of subsampling estimates as features for meta-learning, thereby generalizing previous work on landmarking and on direct algorithm recommendation via subsampling. The main goal of the paper is to investigate the influence of various parameter choices on the meta-learning performance, in particular the size of training and test sets and the number of subsamples.

Citation: Fürnkranz J., Petrak J., Brazdil P., Soares C.: On the Use of Fast Subsampling Estimates for Algorithm Recommendation. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-36, 2002


OFAI-TR-2002-35 ( 5347kB PDF file)

ESAW'02, Workshop Notes of the Third International Workshop ``Engineering Societies in the Agents World'', 16-17 September 2002, Universidad Rey Juan Carlos, Madrid, Spain

Paolo Petta, Robert Tolksdorf, Franco Zambonelli, Sascha Ossowski (eds.)

The sequel to successful editions in 2000 and 2001, ESAW'02 remains committed to the use of the notion of multi-agent systems as seed for animated, constructive, and highly inter-disciplinary discussions about technologies, methodologies, and tools for the engineering of complex distributed applications. While the workshop places an emphasis on practical engineering issues, it also welcomes theoretical, philosophical, and empirical contributions, provided that they clearly document their connection to the core applied issues. This volume collects the twenty accepted papers as the workshop notes. After the workshop, a subset of the presented papers will be included after revisions in the workshop post-proceedings to appear with Springer-Verlag in the Lecture Notes on Artificial Intelligence series.

Keywords: Agent Oriented Engineering, Multi-Agent Systems

Citation: Petta P., Tolksdorf R., Zambonelli F., Ossowski S. (eds.): ESAW'02, Workshop Notes of the Third International Workshop ``Engineering Societies in the Agents World'', 16-17 September 2002, Universidad Rey Juan Carlos, Madrid, Spain. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-35, 2002


OFAI-TR-2002-34 ( 93kB g-zipped PostScript file,  444kB PDF file)

Offline Evaluation of Term Utility Functions

Alexander K. Seewald, Christian Holzbaur, Gerhard Widmer

In this paper we investigate the problem of automatically finding terms (words) that are characteristic of documents collected in Melvil ontology nodes. We choose a variety of term utility functions, commonly used in text mining, to determine the relative importance of terms for the task of deciding if a given document is part of a certain concept or not. We evaluated each utility function both quantitatively, by considering precision and recall of the top ten terms returned, and qualitatively, by analyzing which of the original patterns and obviously related terms were recovered. This approach could be used to suggest promising terms to a human ontology editor during creation of a new node. Our results look somewhat promising but still need improvement, so we also report on probable causes of the unsatisfactory results.

Keywords: Information Retrieval, Term Utility Function, 3DSearch

Citation: Seewald A.K., Holzbaur C., Widmer G.: Offline Evaluation of Term Utility Functions. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-34, 2002


OFAI-TR-2002-33 ( 153kB PDF file)

Web Structure Mining - Exploiting the Graph Structure of the World-Wide Web

Johannes Fürnkranz

The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Web mining is a new research area that tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. In this paper, we will give a brief overview of Web mining, with a special focus on techniques that aim at exploiting the graph structure of the Web for improved retrieval performance and classification accuracy.

Keywords: Web Mining, Web Structure Mining, Survey

Citation: J. Fürnkranz: Web Structure Mining - Exploiting the Graph Structure of the World-Wide Web, ÖGAI Journal 21(2):17-26, 2002.


OFAI-TR-2002-32 ( 87kB g-zipped PostScript file,  270kB PDF file)

Exploring the Parameter State Space of Stacking

Alexander K. Seewald

Ensemble learning schemes are a new field in data mining. While current research concentrates mainly on improving the performance of single learning algorithms, an alternative is to combine learners with different biases. Stacking is the best-known such scheme which tries to combine learners' predictions or confidences via another learning algorithm. However, the adoption of Stacking into the data mining community is hampered by its large parameter space, consisting mainly of other learning algorithms: (1) the set of learning algorithms to combine, (2) the meta-learner responsible for the combining and (3) the type of meta-data to use: confidences or predictions. None of these parameters are obvious choices. Furthermore, little is known about the relation between parameter settings and performance of Stacking. By exploring all of Stacking's parameter settings and their interdependencies, we intend to make Stacking a suitable choice for mainstream data mining applications.

Keywords: Machine Learning, Classification, Data Mining, Ensembles

Citation: Seewald A.K.: Exploring the Parameter State Space of Stacking (extended version). In Proceedings of International Conference on Data Mining (ICDM-2002), Maebashi TERRSA, Maebashi City, Japan. IEEE Computer Society Press, Los Alamitos, California.


OFAI-TR-2002-31 ( 408kB g-zipped PostScript file,  551kB PDF file)

In Search of the Horowitz Factor: Interim Report on a Musical Discovery Project

Gerhard Widmer

The paper gives an overview of an inter-disciplinary research project whose goal is to elucidate the complex phenomenon of expressive music performance with the help of machine learning and automated discovery methods. The general research questions that guide the project are laid out, and some of the most important results achieved so far are briefly summarized (with an emphasis on the most recent and still very speculative work). A broad view of the discovery process is given, from data acquisition issues through data visualization to inductive model building and pattern discovery. It is shown that it is indeed possible for a machine to make novel and interesting discoveries even in a domain like music. The report closes with a few general lessons learned and with the identification of a number of open and challenging research problems.

Keywords: machine learning, data mining, knowledge discovery, music, music performance

Citation: Draft version of an invited paper to appear in Proceedings of the 5th International Conference on Discovery Science (DS'02), Lübeck, Germany. Berlin: Springer Verlag.


OFAI-TR-2002-30 ( 491kB PDF file)

Content-based Organization and Visualization of Music Archives

Elias Pampalk, Andreas Rauber, Dieter Merkl

With Islands of Music we present an approach which facilitates exploration of music libraries without requiring manual genre classification. Given pieces of music in raw audio format we calculate their perceived similarities based on psychoacoustic models. Subsequently, the pieces are organized on a 2-dimensional map so that similar pieces are located close to each other. A visualization using a metaphor of geographic maps provides an intuitive interface where islands resemble genres or styles of music. We demonstrate the approach using a collection of 359 popular pieces of music.

Keywords: Content-based Music Retrieval, Feature Extraction, Clustering, Self-Organizing Map, User Interface, Genre, Rhythm

Citation: Pampalk E., Rauber A., Merkl D.: Content-based Organization and Visualization of Music Archives. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-30, 2002


OFAI-TR-2002-29 ( 394kB PDF file)

Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps

Elias Pampalk, Andreas Rauber, Dieter Merkl

Several methods to visualize clusters in high-dimensional data sets using the Self-Organizing Map (SOM) have been proposed. However, most of these methods only focus on the information extracted from the model vectors of the SOM. This paper introduces a novel method to visualize the clusters of a SOM based on smoothed data histograms. The method is illustrated using a simple 2-dimensional data set and similarities to other SOM based visualizations and to the posterior probability distribution of the Generative Topographic Mapping are discussed. Furthermore, the method is evaluated on a real world data set consisting of pieces of music.

Citation: Pampalk E., Rauber A., Merkl D.: Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-29, 2002


OFAI-TR-2002-28 ( 218kB PDF file)

A Pathology of Bottom-Up Hill-Climbing in Inductive Rule Learning

Johannes Fürnkranz

In this paper, we close the gap between the simple and straight-forward implementations of top-down hill-climbing that can be found in the literature, and the comparably complex strategies for greedy bottom-up generalization. Our main result is that the simple bottom-up counterpart of the top-down hill-climbing algorithm is unable to learn in domains with comparably dispersed examples. In particular, we show that greedy generalization from a seed example is impossible if it differs from its nearest neighbor in more than one attribute value. We also perform an empirical study of how frequent this case is in popular benchmark datasets, and present average-case and worst-case results for binary domains.

Keywords: Rule Learning, Hill-Climbing

Citation: Fürnkranz J.: A Pathology of Bottom-Up Hill-Climbing in Inductive Rule Learning. In Proceedings of the 13th International Conference on Algorithmic Learning Theory (ALT-02), Lübeck, Germany. Springer-Verlag 2002.
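
The condition stated in the abstract above (a seed example that differs from its nearest neighbor in more than one attribute value) can be checked with a short sketch. It assumes examples are equal-length tuples of discrete attribute values and that distance is the count of differing attributes; the function names are hypothetical and do not come from the report.

```python
def hamming(a, b):
    """Number of attribute positions in which two examples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nearest_neighbor_distance(seed, others):
    """Smallest Hamming distance from the seed to any other example."""
    return min(hamming(seed, other) for other in others)


def greedy_generalization_blocked(seed, others):
    """True if the seed's nearest neighbor differs in more than one attribute."""
    return nearest_neighbor_distance(seed, others) > 1
```

A plausible reading is that dropping a single condition from the seed rule can only cover further examples that differ from the seed in exactly that attribute, so when no such example exists every one-step generalization looks equally good to a greedy search.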


OFAI-TR-2002-27 ( 105kB g-zipped PostScript file,  168kB PDF file)

Modelling Large Datasets Using Algebraic Datatypes: A Case Study of the CONFMAN Database

Markus Mottl

Being able to provide clear specifications of large datasets comprising hundreds of variables, each of which can take on many different values, while still being able to efficiently and accurately learn functional relations from such data would certainly make data mining techniques even more viable in the real world. In this report we describe a new modelling approach, which essentially generalizes discrete decision tree learning to induction of non-recursive functions over algebraic datatypes. Taking the CONFMAN mediation database as guiding example, we demonstrate how this approach allows us to give more natural data specifications that can take into account semantic aspects which are hard or even impossible to model in common attribute-value representations. We will also explain how this can have a positive impact on accuracy and efficiency.

Keywords: data mining, decision tree learning, algebraic datatypes

Citation: Mottl M.: Modelling Large Datasets Using Algebraic Datatypes: A Case Study of the CONFMAN Database. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-27, 2002


OFAI-TR-2002-26

(German) Corpus Representativity, Bigrams, and PoS-Tagging Quality

Karel Oliva, Pavel Kveton

After some theoretical discussion on the issue of (different possible meanings of the term) representativity of a corpus, this paper puts this issue into practice and shows how a representative (in one of the meanings) corpus of German can be achieved. The approach is based on the idea of applying "invalid bigrams", i.e. of eliminating pairs of adjacent tags which constitute an incorrect configuration in a text of German (e.g., the bigram [ARTICLE,FINITE VERB]). At this point, the paper puts forward a list of such bigrams for the STTS tagset (widely used for PoS-tagging German corpora). The power of the approach is illustrated by the results achieved on the NEGRA corpus. Finally, some general implications for tagging and taggers are mentioned. A reasonable theoretical knowledge of the German language (esp. of German syntax) as well as a reasonable acquaintance with the STTS tagset (Schiller et al. 1999; STTS) is essential for understanding examples and some (but not all, and not the central) argumentation within the paper.

Keywords: Corpus Representativity, Bigrams, PoS-Tagging, German

Citation: Oliva K., Kveton P.: (German) Corpus Representativity, Bigrams, and PoS-Tagging Quality. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-26, 2002
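
The "invalid bigram" idea from the abstract above can be sketched as a scan over adjacent tag pairs. The single entry in `INVALID` is illustrative only: it renders the abstract's example [ARTICLE,FINITE VERB] with the STTS tags ART and VVFIN, which is an assumption about the mapping, and it is not the list of bigrams given in the report.

```python
def find_invalid_bigrams(tags, invalid_bigrams):
    """Return (position, bigram) for every adjacent tag pair listed as invalid."""
    hits = []
    for i in range(len(tags) - 1):
        pair = (tags[i], tags[i + 1])
        if pair in invalid_bigrams:
            hits.append((i, pair))
    return hits


# Illustrative entry only (assumed mapping of the abstract's example to STTS tags).
INVALID = {("ART", "VVFIN")}
```

Each hit marks a position where the tagging of a sentence deserves a second look, since the flagged configuration should not occur in correct German text.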


OFAI-TR-2002-25 ( 89kB g-zipped PostScript file,  59kB PDF file)

Benefits of a Knowledge-based System for Parenteral Nutrition Support: a Report after 5 Years of Routine Daily Use

Werner Horn, Christian Popow, Silvia Miksch, Andreas Seyfang

Calculating the daily changing composition of parenteral nutrition for small newborn infants is troublesome and time-consuming routine work in neonatal intensive care. The task needs expertise and experience and is prone to inherent calculation errors. In 1996 we introduced a knowledge-based system called VIE-PNN at the neonatal intensive care unit (NICU). It supports the daily calculation of nutrition plans for combined parenteral and enteral nutrition of newborn infants utilizing textbook knowledge and clinical rules of expert neonatologists. VIE-PNN uses an HTML-based client-server architecture and is integrated into the intranet of the local patient data management system (PDMS). The system has now been in daily routine use for more than 5 years at 2 NICUs. Its main benefits are considerable time savings for clinicians and an increased quality of care. The main factors for success are its ease of use, its robustness, the integration into the PDMS, and the maintainability by the clinical experts. Most importantly, physicians highly value the time savings the system provides.

Keywords: VIE-PNN, Parenteral Nutrition, Neonatal ICU, Knowledge-Based System

Citation: Horn W., Popow C., Miksch S., Seyfang A.: Benefits of a Knowledge-based System for Parenteral Nutrition Support: a Report after 5 Years of Routine Daily Use, in F.van Harmelen (ed.), ECAI 2002, Proceedings of the 15th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2002.


OFAI-TR-2002-24 ( 124kB g-zipped PostScript file,  1032kB PDF file)

Improvements on continuous unsupervised sleep staging

Arthur Flexer, Georg Gruber, Georg Dorffner

We report improvements on automatic continuous sleep staging using Hidden Markov Models (HMM). Contrary to our previous efforts we trained the HMMs on data from single sleep labs instead of generalizing to data from diverse sleep labs. Our totally unsupervised approach detects the cornerstones of human sleep (wakefulness, deep and REM sleep) with around 80% accuracy based on data from a single EEG channel recorded at the sleep lab for which we already achieved the best results so far. Results on data from the worst sleep lab so far cannot be improved by training a separate model. This means that our previous problem of detecting REM sleep is not a general problem of our method but rather due to insufficient information in the data for some of the sleep labs.

Citation: Flexer A., Gruber G., Dorffner G.: Improvements on continuous unsupervised sleep staging, in Bourlard H. et al. (eds.), Neural Networks for Signal Processing XII, Proceedings of NNSP 2002, Institute of Electrical and Electronics Engineers, Inc., New York, NY, pp. 687-695, 2002.


OFAI-TR-2002-23 ( 78kB g-zipped PostScript file,  131kB PDF file)

Playing Mozart by Analogy: Learning Multi-level Timing and Dynamics Strategies

Gerhard Widmer, Asmir Tobudic

The paper describes basic research in the area of machine learning and musical expression. A first step towards automatic induction of multi-level models of expressive performance (currently only tempo and dynamics) from real performances by skilled pianists is presented. The goal is to learn to apply sensible tempo and dynamics `shapes' at various levels of the hierarchical musical phrase structure. We propose a general method for decomposing given expression curves into elementary shapes at different levels, and for separating phrase-level expression patterns from local, note-level ones. We then present a hybrid learning system that learns to predict, via two different learning algorithms, both note-level and phrase-level expressive patterns, and combines these predictions into complex composite expression curves for new pieces. Experimental results indicate that the approach is generally viable; however, we also discuss a number of severe limitations that still need to be overcome in order to arrive at truly musical machine-generated performances.

Keywords: Machine Learning, Music Performance

Citation: Widmer G., Tobudic A.: Playing Mozart by Analogy: Learning Multi-level Timing and Dynamics Strategies. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-23, 2002


OFAI-TR-2002-22

Linguistics-based PoS-tagging of Czech: disambiguation of 'se' as a test case

Karel Oliva

This paper describes a set of explicit rules for the disambiguation of the Czech wordform 'se', ambiguous between a reflexive particle and (a vocalized form of) a preposition. The rules are given in full detail, both as to their form and (more importantly) also to their linguistic background. At the end of the paper, the approach is evaluated and compared to the current state-of-the-art statistical disambiguation. Although the paper can generally serve as a pure overview and possibly also as a rough guideline for creating disambiguation modules for other Slavic languages, full understanding of the argumentation requires, due to the level of detail and complexity, a decent command of Czech and a profound knowledge of Czech grammar from the reader; for this reason (and also for reasons of space), Czech words, sentences and other material are neither glossed nor translated in the paper.

Keywords: ambiguity resolution, rule-based tagging, linguistics-based part-of-speech (PoS) tagging, reflexivity, Czech language

Citation: Oliva K.: Linguistics-based PoS-tagging of Czech: disambiguation of 'se' as a test case. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-22, 2002


OFAI-TR-2002-21 ( 35kB g-zipped PostScript file,  68kB PDF file)

Unsupervised discovery of morphologically related words based on orthographic and semantic similarity

Marco Baroni, Johannes Matiasek, Harald Trost

We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to discover morphologically related pairs by looking for pairs that are both orthographically and semantically similar, where orthographic similarity is measured in terms of minimum edit distance, and semantic similarity is measured in terms of mutual information. The procedure does not rely on a morpheme concatenation model, nor on distributional properties of word substrings (such as affix frequency). Experiments with German and English input give encouraging results, both in terms of precision (proportion of good pairs found at various cutoff points of the ranked list), and in terms of a qualitative analysis of the types of morphological patterns discovered by the algorithm.

Keywords: Morphology, Unsupervised Learning, Minimum Edit Distance, Mutual Information

Citation: Baroni M., Matiasek J., Trost H.: Unsupervised discovery of morphologically related words based on orthographic and semantic similarity. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-21, 2002
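
The two similarity measures named in the abstract above can be illustrated with a minimal sketch. The edit distance below is the standard Levenshtein distance with unit costs. For mutual information, the pointwise form computed from raw co-occurrence counts is assumed; the exact variants, weights and counting windows used in the report are not given in this listing, and both function names are hypothetical.

```python
import math


def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def pointwise_mutual_information(count_xy, count_x, count_y, total):
    """log2 of p(x, y) / (p(x) * p(y)), estimated from counts (assumed form)."""
    p_xy = count_xy / total
    return math.log2(p_xy / ((count_x / total) * (count_y / total)))
```

A word pair would count as a candidate when its edit distance is low and its mutual information is high; how the two scores are combined into one ranking is left open here.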


OFAI-TR-2002-20 ( 212kB PDF file)

Pairwise Classification as an Ensemble Technique

Johannes Fürnkranz

In this paper we investigate the performance of pairwise (or round robin) classification, originally a technique for turning multi-class problems into two-class problems, as a general ensemble technique. In particular, we show that the use of round robin ensembles will also increase the classification performance of decision tree learners, which could directly handle multi-class problems. The performance gain is not as large as for bagging and boosting, but on the other hand round robin ensembles have a clear semantics. Furthermore, we show that the advantage of pairwise classification over direct multi-class classification and one-against-all binarization increases with the number of classes, and that round robin ensembles form an interesting alternative for problems with ordered class values.

Keywords: Class Binarization, Pairwise Classification, Round Robin Learning, Inductive Rule Learning, Ensemble Methods

Citation: Fürnkranz J.: Pairwise Classification as an Ensemble Technique. Proceedings of the 13th European Conference on Machine Learning (ECML'02), Helsinki, Finland.


OFAI-TR-2002-19 ( 69kB g-zipped PostScript file,  249kB PDF file)

Density-Based Centroid Approximation for Initializing Iterative Clustering Algorithms

Marcus-Christopher Ludl, Gerhard Widmer

We present KDI (Kernel Density Initialization), a density-based procedure for approximating centroids for the initialization step of iteration-based clustering algorithms. We show empirically that a rather low number of distance calculations in conjunction with a fast algorithm for finding the highest peaks is sufficient for effectively and efficiently finding a pre-specified number of good centroids, which can subsequently be used as initial cluster centers. Finally, we evaluate our algorithm on several real-world datasets against two well-known methods from the literature and show that KDI achieves favorable results.

Keywords: Clustering, KMeans, Initialization

Citation: Ludl M., Widmer G.: Density-Based Centroid Approximation for Initializing Iterative Clustering Algorithms. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-19, 2002


OFAI-TR-2002-18 ( 55kB g-zipped PostScript file,  230kB PDF file)

Towards A Simple Clustering Criterion Based On Minimum Length Encoding

Marcus-Christopher Ludl, Gerhard Widmer

We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example's cluster membership. As a special operational case we develop the so-called rectangular uniform message length measure that can be used to evaluate clusterings described as sets of hyper-rectangles. We theoretically prove that this measure punishes cluster boundaries in regions of uniform instance distribution (i.e., unintuitive clusterings), and we experimentally compare a simple clustering algorithm using this measure with the well-known algorithms KMeans and AutoClass.

Keywords: Clustering, Minimum Description Length

Citation: Ludl M., Widmer G.: Towards A Simple Clustering Criterion Based On Minimum Length Encoding. In Proceedings of the 13th European Conference on Machine Learning (ECML 2002), Helsinki, Finland.


OFAI-TR-2002-17 ( 48kB g-zipped PostScript file,  56kB PDF file)

The influence of relative intensity on the perception of onset asynchronies

Werner Goebl, Richard Parncutt

We address the perception of small onset asynchronies as typically found in expressive piano performance (melody lead), which is associated with differences in hammer velocity and hence loudness or salience of chord tones. In three experiments, 26 musicians heard harmonic major-sixth dyads with both tones in the range B4 to Bb5. The tones in each dyad were either both pure, both sawtooth, or both recorded acoustic piano; and either synchronous or asynchronous. First, participants adjusted the relative level of the two tones until they sounded equally loud. This resulted in roughly equal SPL for pure and piano tones, but in the sawtooth tones the higher tone was typically 6 dB more intense, possibly due to simultaneous masking among the partials. In the next experiment, the relative timing and loudness of the two tones were simultaneously manipulated by up to ±54 ms and ±20 MIDI units. The relative perceptual salience of the tones was found to depend on their relative intensity, but not on their asynchrony. When, in a further experiment, listeners were asked whether the tones were simultaneous, asynchrony was harder to detect when the louder tone began earlier (melody lead). Two possible explanations: either musicians perceive familiar combinations of asynchrony and intensity difference as more synchronous than unfamiliar combinations, or sensitivity to synchrony is reduced in the melody-lead condition by forward masking.

Keywords: Music, Piano, Perception, Melody lead

Citation: Goebl, W., & Parncutt, R. (2002). The influence of loudness on the perception of onset asynchronies, 7th International Conference on Music Perception and Cognition (ICMPC'2002). Sydney: to appear.


OFAI-TR-2002-16 ( 21kB g-zipped PostScript file,  28kB PDF file)

Robust Interpretation of User Requests for Text Retrieval in a Multimodal Environment

Alexandra Klein, Estela Puig-Waldmüller, Harald Trost

We describe a parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases. Users can speak or type, and they can navigate and follow links using mouse clicks. Language queries may combine search expressions with browser commands and search space restrictions. In interpreting input queries, the system has to be fault-tolerant to account for spontaneous speech phenomena as well as typing or speech recognition errors which often distort the meaning of the utterance and are difficult to detect and correct. We present a parser integrating shallow parsing techniques with knowledge-based text retrieval to allow for robust processing and coordination of input modes. Parsing consists of two layers: typical meta-expressions like those for search, newspaper types and dates are identified and excluded from the search string to be sent to the search engine. The search terms which are left after preprocessing are then grouped according to co-occurrence statistics which have been derived from a newspaper corpus. These co-occurrence statistics consist of typical noun phrases as they appear in newspaper texts.

Keywords: Multimodality, NLI, Text Retrieval

Citation: Klein A., Puig-Waldmüller E., Trost H.: Robust Interpretation of User Requests for Text Retrieval in a Multimodal Environment. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-16, 2002


OFAI-TR-2002-15 ( 48kB g-zipped PostScript file,  715kB PDF file)

The Performance Worm: Real Time Visualisation of Expression based on Langner's Tempo-Loudness Animation

Simon Dixon, Werner Goebl, Gerhard Widmer

In an expressive performance, a skilled musician shapes the music by continuously modulating aspects like tempo and loudness to communicate high level information such as musical structure and emotion. Although automatic modelling of this phenomenon remains beyond the current state of the art, we present a system that is able to measure tempo and dynamics of a musical performance and to track their development over time. The system accepts raw audio input, tracks tempo and dynamics changes in real time, and displays the development of these expressive parameters in an intuitive and aesthetically appealing graphical format which provides insight into the expressive patterns applied by skilled artists.

Keywords: Computer Music, Beat tracking, Visualisation, Expressive performance

Citation: Dixon S., Goebl W., Widmer G.: The Performance Worm: Real Time Visualisation of Expression based on Langner's Tempo-Loudness Animation. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-15, 2002


OFAI-TR-2002-14 ( 32kB g-zipped PostScript file,  46kB PDF file)

Pinpointing the Beat: Tapping to Expressive Performances

Simon Dixon, Werner Goebl

In this study we report on an experiment in which listeners were asked to tap in time with expressively performed music, and compare the results to two other experiments using the same stimuli which investigated beat and tempo perception through other modalities. Many computational models of beat tracking assume that beats correspond with the onset of musical notes; we consider the hypothesis that the beat times are rather given by a curve that is "smoother" than the tempo curve of the note onset times, which nevertheless can be derived from the onset times. The tapping results show a tendency to underestimate the tempo changes, which supports the smoothing hypothesis, and agrees with listening experiments and other tapping studies.

Keywords: Computer music, Rhythm, Expressive performance, Beat tracking

Citation: To appear in Proceedings of the 7th International Conference on Music Perception and Cognition, Sydney, Australia, July 2002.


OFAI-TR-2002-13 ( 72kB g-zipped PostScript file,  96kB PDF file)

Quantifying the Differences between Music Performers: Score vs. Norm

Efstathios Stamatatos

In this study, a comparison of features for discriminating between different music performers playing the same piece is presented. Based on a series of statistical experiments on a data set of piano pieces played by 22 performers, it is shown that the deviation from the performance norm (average performance) is better able to reveal the performers' individualities in comparison to the deviation from the printed score. In the framework of automatic music performer recognition, the norm-based features prove to be very accurate in intra-piece tests (training and test set taken from the same piece) and very stable in inter-piece tests (training and test sets taken from different pieces). Moreover, it is empirically demonstrated that the average performance is at least as effective as the best of the constituent individual performances while 'extreme' performances have the lowest discriminatory potential when used as norm.

Keywords: Expressive music performance, Music performer recognition

Citation: Stamatatos E.: Quantifying the Differences between Music Performers: Score vs. Norm. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-13, 2002
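
The two kinds of deviation compared in the abstract above can be illustrated with a minimal sketch. It assumes each performance is a list of note-level values (for example inter-onset intervals in seconds) aligned note by note, and that the norm is the plain arithmetic mean across performers. The function names and the numbers are hypothetical and are not taken from the report.

```python
def norm_performance(performances):
    """Average the performances note by note to obtain the norm (assumed: arithmetic mean)."""
    count = len(performances)
    return [sum(values) / count for values in zip(*performances)]


def deviation(performance, reference):
    """Note-by-note difference between a performance and a reference (score or norm)."""
    return [p - r for p, r in zip(performance, reference)]


# Hypothetical inter-onset intervals in seconds; the score is played "as written".
score = [0.5, 0.5, 0.5, 0.5]
performances = {
    "pianist_a": [0.52, 0.48, 0.55, 0.61],
    "pianist_b": [0.50, 0.47, 0.56, 0.66],
    "pianist_c": [0.54, 0.49, 0.54, 0.59],
}

norm = norm_performance(list(performances.values()))
for name, values in performances.items():
    print(name, "vs. score:", deviation(values, score))
    print(name, "vs. norm: ", deviation(values, norm))
```

In this toy data all three performers slow down at the end, so the score-based deviations look alike, while the norm-based deviations remove the shared slowing and leave only what differs between performers.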


OFAI-TR-2002-12 ( 31kB g-zipped PostScript file,  73kB PDF file)

Wordform- and class-based prediction of the components of German nominal compounds in an AAC system

Marco Baroni, Johannes Matiasek, Harald Trost

In word prediction systems for augmentative and alternative communication (AAC), productive word-formation processes such as compounding pose a serious problem. We present a model that predicts German nominal compounds by splitting them into their modifier and head components, instead of trying to predict them as a whole. The model is improved further by the use of class-based modifier-head bigrams constructed using semantic classes automatically extracted from a corpus. The evaluation shows that the split compound model with class bigrams leads to an improvement in keystroke savings of more than 8% over a no split compound baseline model. We also present preliminary results obtained with a word prediction model integrating compound and simple word prediction.

Keywords: AAC, predictive typing, language modeling, morphology

Citation: Baroni M., Matiasek J., Trost H.: Wordform- and class-based prediction of the components of German nominal compounds in an AAC system. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-12, 2002
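
The evaluation measure named in the abstract above, keystroke savings, is commonly computed as the percentage of keystrokes a user avoids compared with typing every character. The sketch below uses that common definition as an assumption; the report's exact formula may differ, and the example word and keystroke counts are made up for illustration.

```python
def keystroke_savings(chars_in_text: int, keystrokes_used: int) -> float:
    """Percentage of keystrokes saved relative to typing every character."""
    if chars_in_text <= 0:
        raise ValueError("chars_in_text must be positive")
    return 100.0 * (chars_in_text - keystrokes_used) / chars_in_text


# Hypothetical example for the 9-letter compound "Apfelbaum":
# type "Ap" (2), select the modifier "Apfel" (1), type "b" (1), select the head "baum" (1).
print(round(keystroke_savings(9, 5), 1))
```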


OFAI-TR-2002-11

Robustness with Reason

Tomas Holan, Vladislav Kubon, Karel Oliva, Martin Platek

This report describes the theoretical basis of an implemented parsing framework allowing for flexible, consistent and linguistically adequate stepwise shifting of the separation line between two sets of ill-formed strings which, given a fixed degree of robustness, i.e. tolerable violation of grammatical constraints: (i) either still are assigned a syntactic structure by a robust parser, (ii) or are corrupted so heavily (wrt. the degree of robustness) that no structure can be assigned. An important feature of the framework is that it both allows for and is based on explicit measuring of the degree of robustness needed for parsing a certain input string. This provides for creating a scale of robust parsers with stepwise increasing coverage of ill-formed input (e.g., in dependence on the intended application), all based on the same initial grammar. Since our primary practical interests lie in the area of grammar-checking of a language with a rich flection and a high degree of word order freedom, the report concentrates on different degrees of robustness with respect to errors in feature cooccurrence constraints (e.g., agreement) and to errors in word order.

Keywords: parsing, robustness, constraint relaxation

Citation: Holan T., Kubon V., Oliva K., Platek M.: Robustness with Reason. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-11, 2002


OFAI-TR-2002-10

(Semi-)Automatic Detection of Errors in PoS-Tagged Corpora

Pavel Kveton, Karel Oliva

This paper presents a simple yet in practice very efficient technique serving for automatic detection of those positions in a part-of-speech tagged corpus where an error is to be suspected. The approach is based on the idea of learning and later application of "negative bigrams", i.e. on the search for pairs of adjacent tags which constitute an incorrect configuration in a text of a particular language (in English, e.g., the bigram ARTICLE - FINITE VERB). Further, the paper describes the generalization of the "negative bigrams" into "negative n-grams", for any natural n, which indeed provides a powerful tool for error detection in a corpus. The implementation is also discussed, as well as evaluation of results of the approach when used for error detection in the NEGRA® corpus of German, and the general implications for the quality of results of statistical taggers.

Keywords: Error detection, corpus, bigram

Citation: Kveton P., Oliva K.: (Semi-)Automatic Detection of Errors in PoS-Tagged Corpora. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-10, 2002
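
The "negative bigram" idea from the abstract above can be sketched as follows. This is a deliberately simplified illustration: it assumes that any pair of adjacent tags never observed in a trusted tagged sample is treated as negative, which is cruder than a learning procedure would need to be in practice. The tag names and helper functions are hypothetical.

```python
from itertools import product


def observed_bigrams(tagged_sentences):
    """Collect every pair of adjacent tags seen in the trusted sample."""
    seen = set()
    for tags in tagged_sentences:
        seen.update(zip(tags, tags[1:]))
    return seen


def negative_bigrams(tagset, tagged_sentences):
    """Assumption: a tag pair never seen in the trusted sample counts as negative."""
    return set(product(tagset, repeat=2)) - observed_bigrams(tagged_sentences)


def suspect_positions(tags, negatives):
    """Indices i where the pair (tags[i], tags[i+1]) is a negative bigram."""
    return [i for i, pair in enumerate(zip(tags, tags[1:])) if pair in negatives]


tagset = {"ART", "NOUN", "VFIN"}
trusted = [["ART", "NOUN", "VFIN"], ["NOUN", "VFIN", "ART", "NOUN"]]
negatives = negative_bigrams(tagset, trusted)

# ARTICLE followed by FINITE VERB is flagged as a likely tagging error.
print(suspect_positions(["ART", "VFIN", "ART", "NOUN"], negatives))
```

Extending the pair to a window of n adjacent tags gives the "negative n-gram" generalization mentioned in the abstract.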


OFAI-TR-2002-09 ( 29kB g-zipped PostScript file,  65kB PDF file)

Predicting the Components of German Nominal Compounds

Marco Baroni, Johannes Matiasek, Harald Trost

Word prediction systems (such as those embedded in most current augmentative and alternative communication systems) aim to predict what a user wants to type next on the basis of corpus-extracted n-gram counts. Good performance of such a system depends crucially on the size and quality of the underlying lexicon. Compounding is a common cross-linguistic means to form complex words. In German as in some other languages, compounds are commonly written as single orthographic strings. Because compounding is a very productive process, this leads to a considerable number of orthographic words that cannot, even in principle, be listed in a lexicon. We present a solution to this problem based on the idea that compounds should not be predicted as units, but as the concatenation of their components. In particular, we designed a word prediction system in which the prediction of German two-element nominal compounds (by far the most common compound type in German) is split into the prediction of the modifier (left element) and the prediction of the head (right element). Both components are predicted on the basis of uni- and bigram statistics collected treating modifiers and heads as independent units, and on the basis of the type frequency of nouns in head and modifier context in the training corpus. We show that our system brings a dramatic improvement in keystroke saving rate over a word prediction scheme in which compounds are treated as units. In particular, our results indicate that the type frequency of nouns in head/modifier context in the training corpus is a very good predictor of which nouns will occur in head/modifier context in new text.

Keywords: NLU, Text Prediction

Citation: Baroni M., Matiasek J., Trost H.: Predicting the Components of German Nominal Compounds. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-09, 2002


OFAI-TR-2002-08 ( 194kB g-zipped PostScript file,  279kB PDF file)

Transformation-Based Regression

Björn Bringmann, Stefan Kramer, Friedrich Neubarth, Hannes Pirker, Gerhard Widmer

In this paper, we introduce Transformation-Based Regression (TBR), a novel rule-based, symbolic regression technique based on Transformation-Based Learning (TBL). Although Transformation-Based Learning was introduced a couple of years ago, it has not yet been considered for regression-type tasks. The proposed method should be particularly useful for learning from examples with a given neighborhood relation, where the dependent variable of one example also depends on neighboring examples. Thus, the method should have potential for learning from sequence and spatial data. In the paper, we demonstrate the capabilities and limitations of the approach in two highly complex real-world domains, musicology and speech synthesis.

Citation: Bringmann B., Kramer S., Neubarth F., Pirker H., Widmer G.: Transformation-Based Regression. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-08, 2002


OFAI-TR-2002-07 ( 187kB PDF file,  102kB PostScript file)

Cheese: A Generic Search Framework for Data Mining

Marcus-Christopher Ludl

In this paper we present Cheese, a modular Generic Search framework, useful for implementing various tasks common in Data Mining. The main advantage of such a framework is the possibility for splitting and recombining search tasks or executing meta-search (in the space of search algorithms). Rather than going into any details regarding the implementation of single components, we explain the theoretical concepts behind the class based search process and describe some of the Data Mining algorithms we implemented using the system.

Keywords: data mining, modular framework, meta search

Citation: Ludl M.: Cheese: A Generic Search Framework for Data Mining. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-07, 2002


OFAI-TR-2002-06 ( 46kB g-zipped PostScript file,  195kB PDF file)

How to Make Stacking Better and Faster While Also Taking Care of an Unknown Weakness

Alexander K. Seewald

We investigated performance differences between multi-class and two-class datasets for ensemble learning schemes. We were surprised to find that Stacking, the best-known such scheme, performs worse on multi-class datasets. In this paper we will present results concerning this heretofore unknown weakness of Stacking. In addition we will present a new variant of Stacking which is able to compensate for this weakness, improving Stacking significantly on some multi-class datasets. The dimensionality of the meta-data set is reduced by a factor equal to the number of classes, which leads to faster learning. In comparison to other ensemble learning methods this improves Stacking's lead further, making it the most successful system by a variety of measures.

Keywords: Machine Learning, Classification, Ensembles

Citation: Seewald A.K.: How to Make Stacking Better and Faster While Also Taking Care of an Unknown Weakness. In Proceedings of the Nineteenth International Conference on Machine Learning (ICML-2002). Sydney, Australia. Morgan Kaufmann Publishers, San Francisco.
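
The dimensionality reduction mentioned in the abstract above can be illustrated with a small sketch. It assumes that the meta-level attributes are the class probabilities output by each base classifier, and that the variant keeps, for each class, only the probabilities the base classifiers assign to that class. The abstract states only the reduction factor, so the construction and the function names here are assumptions.

```python
def full_meta_features(base_probs):
    """Concatenate every base classifier's class-probability vector."""
    return [p for probs in base_probs for p in probs]


def per_class_meta_features(base_probs, class_index):
    """Keep only the probability each base classifier assigns to one class."""
    return [probs[class_index] for probs in base_probs]


# Hypothetical outputs of 3 base classifiers on one instance of a 4-class problem.
base_probs = [
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
]

full = full_meta_features(base_probs)               # 3 classifiers x 4 classes = 12 attributes
reduced = per_class_meta_features(base_probs, 0)    # 3 attributes for class 0
print(len(full), len(reduced), len(full) // len(reduced))
```

With 3 base classifiers and 4 classes the attribute count drops from 12 to 3, a factor equal to the number of classes.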


OFAI-TR-2002-05 ( 43kB g-zipped PostScript file,  109kB PDF file)

Meta-Learning for Stacked Classification

Alexander K. Seewald

In this paper we describe new experiments with the ensemble learning method stacking. The central question in these experiments was whether meta-learning methods can be used to accurately predict various aspects of stacking's behaviour. The resulting contributions of this paper are two-fold: When learning to predict the accuracy of stacked classifiers, we found that the single most important feature is the accuracy of the best base classifier. A simple linear model involving just this feature turns out to be surprisingly accurate. The associated regression line has a gradient larger than one, hinting that, in the limit, stacking is indeed better than the best included base classifier. When learning to predict significant differences between stacking and three other ensemble learning methods, we have found simple models, all but one of which are based on single features which can be efficiently computed directly from the dataset. These models can be used to decide in advance which ensemble learning method to use on a given dataset, since none of them is always the best choice.

Keywords: Machine Learning, Meta Learning, Ensembles

Citation: Seewald A.K.: Meta-Learning for Stacked Classification (extended version). In Proceedings of the Second International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM-2002), University of Helsinki, Department of Computer Science, Report B-2002-3.


OFAI-TR-2002-04 ( 110kB g-zipped PostScript file,  723kB PDF file)

Real Time Tracking and Visualisation of Musical Expression

Simon Dixon, Werner Goebl, Gerhard Widmer

Skilled musicians are able to shape a given piece of music (by continuously modulating aspects like tempo, loudness, etc.) to communicate high level information such as musical structure and emotion. This activity is commonly referred to as expressive music performance. The present paper presents another step towards the automatic high-level analysis of this elusive phenomenon with AI methods. A system is presented that is able to measure tempo and dynamics of a musical performance and to track their development over time. The system accepts raw audio input, tracks tempo and dynamics changes in real time, and displays the development of these expressive parameters in an intuitive and aesthetically appealing graphical format which provides insight into the expressive patterns applied by skilled artists. The paper describes the tempo tracking algorithm (based on a new clustering method) in detail, and then presents an application of the system to the analysis of performances by different pianists.

Keywords: Music, Expressive Music Performance, Clustering, Perception

Citation: Dixon S., Goebl W., Widmer G.: Real Time Tracking and Visualisation of Musical Expression. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-04, 2002


OFAI-TR-2002-03 ( 104kB g-zipped PostScript file,  981kB PDF file)

Continuous unsupervised sleep staging based on a single EEG signal

Arthur Flexer, Georg Gruber, Georg Dorffner

We report improvements on automatic continuous sleep staging using Hidden Markov Models (HMM). Our totally unsupervised approach detects the cornerstones of human sleep (wakefulness, deep and REM sleep) with around 80% accuracy based on data from a single EEG channel. Contrary to our previous efforts, we trained the HMM on data from a single sleep lab instead of generalizing to data from diverse sleep labs. This solved our previous problem of detecting REM sleep.

Keywords: Time series processing, Sleep Analysis, Hidden Markov Models, EEG

Citation: Flexer A., Gruber G., Dorffner G.: Continuous unsupervised sleep staging based on a single EEG signal, in Dorronsoro J.R. (ed.), Artificial Neural Networks - ICANN 2002, Lecture Notes in Computer Science, Springer, pp. 1013-1018, 2002.


OFAI-TR-2002-02 ( 142kB g-zipped PostScript file,  57kB PDF file)

Music Performer Recognition Using an Ensemble of Simple Classifiers

Efstathios Stamatatos, Gerhard Widmer

This paper addresses the problem of identifying the most likely music performer, given a set of performances of the same piece by a number of skilled candidate pianists. We propose a set of features for representing the stylistic characteristics of a music performer. A database of piano performances of 22 pianists playing two pieces by F. Chopin is used in the presented experiments. Due to the limitations of the training set size and the characteristics of the input features we propose an ensemble of simple classifiers derived by both subsampling the training set and subsampling the input features. Preliminary experiments show that the resulting ensemble is able to efficiently cope with this difficult musical task, displaying a level of accuracy unlikely to be matched by human listeners (under the same conditions).

Keywords: Expressive Music Performance, Machine Learning, Ensemble Methods

Citation: Stamatatos E., Widmer G.: Music Performer Recognition Using an Ensemble of Simple Classifiers. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-02, 2002


OFAI-TR-2002-01 ( 249kB g-zipped PostScript file,  416kB PDF file)

The Dynamics of Interacting Markets: First Results

Brian Sallans, Georg Dorffner, Alexandros Karatzoglou

The behavior of boundedly rational agents in two interacting markets is investigated. A discrete-time model of coupled financial and consumer markets is described. The integrated model is then used to investigate feedback effects between the coupled markets. In particular, the influence of the financial market on product development is demonstrated. The types of traders present in the financial market are shown to have a large effect on firm behavior and product development. In a financial market where traders favor particular products, the firms are shown to develop these favored products instead of more profitable ones. The effect is quite strong despite the only feedback being through a noisy stock price, and despite the fact that only a third of share traders are directly influenced by product position.

Keywords: OeFAI Technical Report, Publications List

Citation: Sallans B., Dorffner G., Karatzoglou A.: The Dynamics of Interacting Markets: First Results. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2002-01, 2002


OFAI-TR-2001-38

Invisible Person II: Ippys Gesicht

Monika Farukuoye, Paolo Petta

Citation: Farukuoye M., Petta P.: Invisible Person II: Ippys Gesicht. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-38, 2001


OFAI-TR-2001-37

Invisible Person II: Game Engine

Paolo Petta, Gudrun Novak

Citation: Petta P., Novak G.: Invisible Person II: Game Engine. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-37, 2001


OFAI-TR-2001-36

Invisible Person II: Lokomotion

Gudrun Novak, Paolo Petta

Citation: Novak G., Petta P.: Invisible Person II: Lokomotion. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-36, 2001


OFAI-TR-2001-35 ( 167kB g-zipped PostScript file,  383kB PDF file)

An automatic, continuous and probabilistic sleep stager based on a Hidden Markov Model

Arthur Flexer, Georg Dorffner, Peter Sykacek, Iaed Rezek

We report on an automatic continuous sleep stager which is based on probabilistic principles employing Hidden Markov Models (HMM). Our sleep stager offers the advantage of being objective by not relying on human scorers, having much finer temporal resolution (one second instead of 30 seconds), and being based on solid probabilistic principles rather than a predefined set of rules (Rechtschaffen & Kales). Results obtained for nine whole-night sleep recordings are reported.

Citation: Flexer A., Dorffner G., Sykacek P., Rezek I.: An automatic, continuous and probabilistic sleep stager based on a Hidden Markov Model, Applied Artificial Intelligence, Vol. 16, Num. 3, pp.199-207, 2002.


OFAI-TR-2001-34 ( 128kB g-zipped PostScript file,  393kB PDF file)

On the Use of Self-organizing Maps for Clustering and Visualization

Arthur Flexer

We show that the number of output units used in a self-organizing map (SOM) influences its applicability for either clustering or visualization. By reviewing the appropriate literature and theory as well as our own empirical results, we demonstrate that SOMs can be used for clustering or visualization separately, for simultaneous clustering and visualization, and even for clustering via visualization. For all these different kinds of application, SOM is compared to other statistical approaches. This will show SOM to be a flexible tool which can be used for various forms of explorative data analysis, but it will also be made obvious that this flexibility comes at a price in terms of impaired performance.

Citation: Flexer A.: On the Use of Self-organizing Maps for Clustering and Visualization, Intelligent Data Analysis, Volume 5, Number 5, pp. 373-384, 2001.


OFAI-TR-2001-33 ( 482kB g-zipped PostScript file,  345kB PDF file)

Visualisierung von Diabetesdaten

Jochen Schneider

This report presents the results of a design study on the visualization of diabetes data. The central question was how a patient's blood glucose values from the preceding weeks can be presented to the treating physician quickly, clearly and in an intuitively graspable form. A further question is how, at problem points, the additional data (insulin doses, carbohydrates and physical activity, alongside blood glucose values) can be displayed in a way that suits the problem. The study illustrates visualization options by example. One concrete form of visualization, which shows the blood glucose values of 4 weeks in abstracted form, was implemented as a prototype in Java. The study is part of the project "Telemedicine, computer assisted data analysis and knowledge-based system support for improving glycemic control in adolescent type I diabetes".

Keywords: Diabetes, Visualization

Citation: Schneider J.: Visualisierung von Diabetesdaten. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-33, 2001


OFAI-TR-2001-32 ( 161kB g-zipped PostScript file,  626kB PDF file)

Machine Discoveries: A Few Simple, Robust Local Expression Principles

Gerhard Widmer

The paper presents a new approach to discovering general rules of expressive music performance from real performance data via inductive machine learning. A new learning algorithm is briefly presented, and then an experiment with a very large data set (performances of 13 Mozart piano sonatas) is described. The new learning algorithm succeeds in discovering some extremely simple and general principles of musical performance (at the level of individual notes), in the form of categorical prediction rules. These rules turn out to be very robust and general: when tested on performances by a different pianist and even on music of a different style (Chopin), they exhibit a surprisingly high degree of predictive accuracy.

Keywords: Expressive Music Performance, Machine Learning, Knowledge Discovery

Citation: Widmer G.: Machine Discoveries: A Few Simple, Robust Local Expression Principles. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-32, 2001


OFAI-TR-2001-31 ( 168kB g-zipped PostScript file,  645kB PDF file)

Discovering Simple Rules in Complex Data: A Meta-learning Algorithm and Some Surprising Musical Discoveries

Gerhard Widmer

This article presents a new rule discovery algorithm named PLCG that can find simple, robust partial rule models (sets of classification rules) in complex data where it is difficult or impossible to find models that completely account for all the phenomena of interest. Technically speaking, PLCG is an ensemble learning method that learns multiple models via some standard rule learning algorithm, and then combines these into one final rule set via clustering, generalization, and heuristic rule selection. The algorithm was developed in the context of an interdisciplinary research project that aims at discovering fundamental principles of expressive music performance from large amounts of complex real-world data (specifically, measurements of actual performances by concert pianists). The article will show that PLCG succeeds in finding some surprisingly simple and robust performance principles, some of which represent truly novel and musically meaningful discoveries. A set of more systematic experiments shows that PLCG usually discovers significantly simpler theories than more direct approaches to rule learning (including the state-of-the-art learning algorithm RIPPER), while striking a compromise between coverage and precision. The experiments also show how easy it is to use PLCG as a meta-learning strategy to explore different parts of the space of rule models.

Keywords: machine learning, data mining, rule discovery, ensemble methods, meta-learning, partial models, expressive music performance

Citation: Widmer G.: Discovering Simple Rules in Complex Data: A Meta-learning Algorithm and Some Surprising Musical Discoveries. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-31, 2001


OFAI-TR-2001-30 ( 80kB g-zipped PostScript file,  2453kB PDF file)

Hyperlink Ensembles: A Case Study in Hypertext Classification

Johannes Fürnkranz

In this paper, we introduce hyperlink ensembles, a novel type of ensemble classifier for classifying hypertext documents. Instead of using the text on a page for deriving features that can be used for training a classifier, we suggest using portions of text from all pages that point to the target page. A hyperlink ensemble is formed by obtaining one prediction for each hyperlink that points to a page. These individual predictions for each hyperlink are subsequently combined into a final prediction for the class of the target page. We explore four different ways of combining the individual predictions and four different techniques for identifying relevant text portions. The utility of our approach is demonstrated on a set of Web pages that relate to Computer Science Departments.

Keywords: web mining, hypertext classification, ensemble techniques, inductive rule learning

Citation: Fürnkranz J.: Hyperlink Ensembles: A Case Study in Hypertext Classification. Information Fusion 3(4):299-312, December 2002, Special Issue on Fusion of Multiple Classifiers.


OFAI-TR-2001-29 ( 417kB g-zipped PostScript file,  644kB PDF file)

User Profiling for the Melvil Knowledge Retrieval System

Johannes Fürnkranz, Christian Holzbaur, Robert Temel

Melvil is an ontology-based knowledge retrieval platform that provides a three-dimensional visualization of search results. The user can tailor the presentation of the search results to her preferences by changing the settings of various parameters on the screen. In this paper, we report on a prototype implementation of a user profiling device that learns to predict appropriate settings for these parameters for the current search results based on previous experiences. In a preliminary study, we evaluated several off-the-shelf machine learning algorithms on parts of the problem. The final implementation required the flexibility of handling both regression and classification problems, being able to deal with set-valued input and output attributes, as well as incorporating Melvil's ontologies for the respective application domain. Thus, we selected a nearest-neighbor approach for the prototype implementation. An evaluation on off-line data collected from several users showed a satisfactory performance.

Citation: Fürnkranz J., Holzbaur C., Temel R.: User Profiling for the Melvil Knowledge Retrieval System. Applied Artificial Intelligence 16(4):243-281, April 2002.


OFAI-TR-2001-28 ( 359kB g-zipped PostScript file,  287kB PDF file)

Was kennzeichnet die Interpretation eines guten Musikers? Die integrierte Analyse von Tempo- und Lautstärkegestaltung und ihre musikpädagogischen Anwendungsperspektiven.

Jörg Langner, Werner Goebl

Keywords: Music, Interpretation, Dynamics, Tempo

Citation: Langner, J., & Goebl, W. (2001). Was kennzeichnet die Interpretation eines guten Musikers? Die integrierte Analyse von Tempo- und Lautstärkegestaltung und ihre musikpädagogischen Anwendungsperspektiven, Multimedia als Gegenstand musikpädagogischer Forschung, 5.-7. Okt. 2001. Regensburg, Germany.


OFAI-TR-2001-27 ( 910kB g-zipped PostScript file,  497kB PDF file)

Are computer-controlled pianos a reliable tool in music performance research? Recording and reproduction precision of a Yamaha Disklavier grand piano.

Werner Goebl, Roberto Bresin

In this study, a Yamaha Disklavier is tested on its measuring and reproducing capabilities, with the goal of examining its use in performance research. An experimental setup with accelerometers and a calibrated microphone is used to capture key and hammer movements, as well as the sound signal. Five selected keys are played by pianists with two types of touch ('staccato - legato'). Timing and dynamic differences between the original performance, the corresponding MIDI file recorded by a Disklavier, and its reproduction are analysed. The information in the MIDI file was more precise than the reproduction by the Disklavier. Timing errors are larger for soft tones, and hammer velocities higher than 3.5 m/s could not be reproduced by the solenoids.

Keywords: Music, Yamaha Disklavier, piano performance, reproducing piano

Citation: Goebl, W., & Bresin, R. (2001). Are computer-controlled pianos a reliable tool in music performance research? Recording and reproduction precision of a Yamaha Disklavier grand piano, MOSART workshop on Current Research Directions in Computer Music, 15-17 Nov 2001, Barcelona.


OFAI-TR-2001-26 ( 1784kB g-zipped PostScript file,  1545kB PDF file)

Melody lead in piano performance: Expressive device or artifact?

Werner Goebl

As reported in the recent literature on piano performance, an emphasized voice (the melody) tends to be played not only louder than the other voices, but also about 30 ms earlier (melody lead). It remains unclear whether pianists deliberately apply melody lead to separate different voices, or whether it occurs because the melody is played louder (velocity artifact). The velocity artifact explanation implies that pianists initially strike the keys simultaneously; it is only different velocities that make the hammers arrive at different points in time. The measured note onsets in these studies, mostly derived from computer-monitored pianos, represent the hammer-string impact times. In the present study, the finger-key contact times are calculated and analyzed as well. If the velocity artifact hypothesis is correct, the melody lead phenomenon should disappear at the finger-key level. Chopin's Ballade op. 38 (45 measures) and Etude op. 10/3 (21 measures) were performed on a Bösendorfer computer-monitored grand piano by 22 skilled pianists. The hammer-string asynchronies among voices closely resemble the results reported in the literature. However, the melody lead decreases almost to zero at the finger-key level, which supports the velocity artifact hypothesis. In addition to this, expected onset asynchronies are predicted from differences in hammer velocity, if finger-key asynchronies are assumed to be zero. They correlate highly with the observed melody lead.

Keywords: music, melody lead, piano performance, bösendorfer

Citation: Goebl, W. (2001). Melody lead in piano performance: Expressive device or artifact? Journal of the Acoustical Society of America, 110(1), 563-572
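
The velocity artifact explanation in the abstract above can be illustrated with a small sketch. It assumes a constant-velocity simplification of the hammer's flight and a hypothetical travel distance (`HAMMER_TRAVEL_M`); neither is taken from the report, whose actual timing correction is not described in the abstract. If both keys are struck simultaneously, the faster (louder) melody hammer reaches the string first:

```python
# Illustrative constant: assumed hammer travel distance in metres.
# This value is a placeholder, not a measurement from the report.
HAMMER_TRAVEL_M = 0.045


def travel_time_ms(hammer_velocity):
    """Finger-key to hammer-string delay, assuming constant hammer velocity (m/s)."""
    if hammer_velocity <= 0:
        raise ValueError("hammer velocity must be positive")
    return 1000.0 * HAMMER_TRAVEL_M / hammer_velocity


def predicted_melody_lead_ms(melody_velocity, accompaniment_velocity):
    """Expected hammer-string asynchrony when finger-key asynchrony is zero.

    A positive value means the melody hammer arrives earlier.
    """
    return travel_time_ms(accompaniment_velocity) - travel_time_ms(melody_velocity)


if __name__ == "__main__":
    # Melody played louder (faster hammer) than the accompaniment.
    print(predicted_melody_lead_ms(2.0, 1.0))
```

Under these assumptions, equal hammer velocities give a predicted lead of zero, and the lead grows as the melody is played louder relative to the other voices.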


OFAI-TR-2001-25

Improving biosignal processing through modeling uncertainty: Bayes vs. non-Bayes in sleep staging

Peter Sykacek, Georg Dorffner, Peter Rappelsberger, Josef Zeitlhofer

In this paper we report on an investigation of Bayesian inference applied to neural networks -- multilayer perceptrons (MLP), in particular -- in the task of automatic sleep staging based on electroencephalogram (EEG) and electrooculogram (EOG) signals. The main focus was on evaluating the use of so-called "doubt-levels" and "confidence intervals" ("error bars") in improving the results by rejecting uncertain cases and patterns not well represented by the training set. Bayesian inference is used to arrive at distributions of network weights based on training data. We compare the results of the full-blown Bayesian method with results obtained from a k-nearest neighbor classifier. The results show that the Bayesian technique significantly outperforms the k-nearest neighbor classifier. At the same time, we show that Bayesian inference, for which we have developed an extension for the calculation of error bars in the latent space of hidden units, can indeed be used for improving results by rejecting cases below a doubt-level threshold of probability, as well as for the rejection of artefacts. The performance of the Bayesian solution, however, is not significantly better than alternative techniques such as doubt levels applied to a maximum posterior approach, or the use of density estimation for outlier rejection. We conclude that Bayesian inference is a valid and valuable technique for model estimation but in the given application does not lead to improved results over simpler techniques.

Keywords: Bayesian inference, sleep analysis, uncertainty modeling, neural networks

Citation: Sykacek P., Dorffner G., Rappelsberger P., Zeitlhofer J.: Improving biosignal processing through modeling uncertainty: Bayes vs. non-Bayes in sleep staging. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-25, 2001
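
The doubt-level idea mentioned in the abstract above, rejecting cases whose class probability falls below a threshold, can be sketched as follows. The function name, the threshold value, and the stage labels are hypothetical and chosen only for illustration; the report's own classifier and settings are not reproduced here.

```python
def classify_with_doubt(posteriors, doubt_level=0.7):
    """Return the most probable class, or None if it is too uncertain.

    posteriors: dict mapping class label -> posterior probability.
    doubt_level: assumed rejection threshold (illustrative value).
    """
    best_label = max(posteriors, key=posteriors.get)
    if posteriors[best_label] < doubt_level:
        return None  # reject: the case is left unclassified
    return best_label


if __name__ == "__main__":
    confident = {"wake": 0.90, "rem": 0.05, "deep": 0.05}
    uncertain = {"wake": 0.40, "rem": 0.35, "deep": 0.25}
    print(classify_with_doubt(confident))
    print(classify_with_doubt(uncertain))
```

Raising the threshold rejects more cases and classifies only those the model is surer about, which is the trade-off the abstract evaluates.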


OFAI-TR-2001-24

Bayesian Inference for Reliable Biomedical Signal processing

Peter Sykacek

This report investigates whether Bayesian inference can improve the reliability of biomedical diagnosis. In particular we discuss time series classification as is for example needed for an analysis of all-night sleep EEG recordings. Such an attempt needs 4 steps that are further analyzed. First we must process the raw data. In this report we suggest for that step a Bayesian analysis of an autoregressive lattice filter model. We are especially interested in model selection, and in whether the model probabilities are viable means for artifact detection in EEG signals. The subsequent chapter treats feature subset selection within the Bayesian framework. We infer the probabilities of different feature subsets, which tell us to what extent each of the subsets contributes to the classification. Inference of the classifier is done with Markov chain Monte Carlo (MCMC) techniques. The next topic is the classification stage. We use variational approximations to derive a Bayesian posterior over model coefficients and different model orders. The final chapter proposes an improvement of the classical approach that was followed so far. By using one probabilistic model that unifies preprocessing and classification, we obtain a classifier that, in the sense of relying more on reliable information, performs optimal sensor fusion. The proposed classifier is similar to the well known hidden Markov model. The proposed architecture is again applied to sleep analysis.

Keywords: Bayesian inference, classification, mixture models, sleep analysis, uncertainty modeling

Citation: Sykacek P.: Bayesian Inference for Reliable Biomedical Signal processing. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-24, 2001


OFAI-TR-2001-23 ( 108kB g-zipped PostScript file,  119kB PDF file)

Analysis of tempo classes in performances of Mozart sonatas

Werner Goebl, Simon Dixon

This preliminary study investigates the relationship between tempo indications in a score and the tempo performed by musicians. As several 18th century theorists point out, the chosen tempo should depend not only on the tempo indication, but also on other factors such as the time signature and the fastest note values to be played. It is examined whether the four main tempo indications (Adagio, Andante, Allegro, and Presto) imply specific tempo classes which can be found in professional performances or whether other factors influence the choice of performed tempo. In Experiment I, 34 movements of Mozart sonatas performed by one professional pianist are analysed. The mode of the inter-beat interval distribution of a performance is considered to be a representation of the performed tempo. The tempo values depend on what is taken as the beat level; performed tempo did not group according to tempo indications. Event density (score events per second) is found to separate the data into just two clusters, namely slow and fast movements. In Experiment II, the tempo values of 12 movements of the first experiment were derived from commercial recordings (Barenboim, Pires, Schiff, Uchida) with the help of an interactive beat tracking system. The pianists' tempos are surprisingly homogeneous; they deviate from each other more in slower movements than in the faster ones.

Keywords: Music, Tempo, Expressive Performance, Mode, Beat Tracking

Citation: Goebl, W., & Dixon, S. E. (2001). Analyses of tempo classes in performances of Mozart piano sonatas. In H. Lappalainen (Ed.), Proceedings of the VII International Symposium on Systematic and Comparative Musicology, III International Conference on Cognitive Musicology, 16 - 19 August 2001 (pp. 65-76). Jyväskylä, Finland.


OFAI-TR-2001-22 ( 115kB g-zipped PostScript file,  232kB PDF file)

Beat Extraction from Expressive Musical Performances

Simon Dixon, Werner Goebl, Emilios Cambouropoulos

In order to analyse timing in musical performance, it is necessary to develop reliable and efficient methods of deriving musical timing information (e.g. tempo, beat and rhythm) from the physical timing of audio signals or MIDI data. We report the results of an experiment in which subjects were asked to mark the positions of beats in musical excerpts, using a multimedia interface which provides various forms of audio and visual feedback. Six experimental conditions were tested, which involved disabling various parts of the system's feedback to the user. Even in extreme cases such as no audio feedback or no visual feedback, subjects were often able to find the regularities corresponding to the musical beat. In many cases, the subjects' placement of markers corresponded closely to the onsets of on-beat notes (according to the score), but the beat sequences were much more regular than the corresponding note onset times. The form of feedback provided by the system had a significant effect on the chosen beat times: visual feedback encouraged a closer alignment of beats with notes, whereas audio feedback led to a smoother beat sequence.

Keywords: Music, Tempo, Beat

Citation: Presented at the 2001 Meeting of the Society for Music Perception and Cognition (SMPC2001)


OFAI-TR-2001-21 ( 36kB g-zipped PostScript file,  56kB PDF file)

An Empirical Comparison of Tempo Trackers

Simon Dixon

One of the difficulties with assessing tempo or beat tracking systems is that there is no standard corpus of data on which they can be tested. This situation is partly because the choice of data set often depends on the goals of the system, which might be, for example, automatic transcription, computer accompaniment of a human performer, or the analysis of expressive timing in musical performance. Without standard test data, there is the risk of overfitting a system to the data on which it is tested, and developing a system which is not suitable for use outside a very limited musical domain. In this paper, we use a large, publicly available set of performances of two Beatles songs recorded on a Yamaha Disklavier in order to compare two models of tempo tracking: a probabilistic model which uses a Kalman filter to estimate tempo and beat times, and a tempo tracker based on a multi-agent search strategy. Both models perform extremely well on the test data, with the multi-agent search achieving marginally better results. We propose two simple measures of tempo tracking difficulty, and argue that a broader set of test data is required for comprehensive testing of tempo tracking systems.

Keywords: Music, Rhythm, Tempo tracking, Beat tracking

Citation: 8th Brazilian Symposium on Computer Music


OFAI-TR-2001-20 ( 36kB g-zipped PostScript file,  2982kB PDF file)

An Interactive Beat Tracking and Visualisation System

Simon Dixon

This paper describes BeatRoot, a system which performs automatic beat tracking on audio or MIDI data and creates a graphical and audio representation of the data and results, as part of an interactive interface for correcting errors or selecting alternative metrical levels for beat tracking. The graphical interface displays the input data and the computed beat times, and allows the user to add, delete and adjust the beat times and then automatically re-track the remaining data based on the user input. The system also provides audio feedback consisting of the original input data accompanied by a percussion instrument sounding at the computed beat times. At the heart of the system is a beat tracking algorithm which estimates tempo based on the frequency of occurrence of the various time durations between pairs of note onset times, and then uses a multiple hypothesis search to find the sequence of note onsets that best matches one of the possible tempos. The primary application of this system is in the analysis of tempo and timing in musical performance, although the beat tracking algorithm itself has been shown to perform at least as well as other state-of-the-art systems.

Keywords: Music, Tempo, Rhythm, Beat Tracking

Citation: Proceedings of the 2001 International Computer Music Conference
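
The tempo estimation step described in the abstract above, which counts how often the various durations between pairs of note onsets occur, can be sketched with a simple histogram. This is a simplified illustration, not BeatRoot's actual algorithm: the function names, the bin width, and the maximum interval are assumptions, and the multiple hypothesis search that follows is omitted.

```python
from collections import Counter
from itertools import combinations


def ioi_histogram(onsets, bin_width=0.05, max_ioi=2.0):
    """Count inter-onset intervals between all pairs of onsets (in seconds).

    Intervals are quantised into bins of bin_width seconds; intervals longer
    than max_ioi are ignored. Both parameters are illustrative assumptions.
    """
    counts = Counter()
    for earlier, later in combinations(sorted(onsets), 2):
        ioi = later - earlier
        if ioi <= max_ioi:
            counts[round(ioi / bin_width)] += 1
    return counts


def tempo_hypotheses(onsets, bin_width=0.05, max_ioi=2.0):
    """Return (beat period in seconds, count) pairs, most frequent first."""
    histogram = ioi_histogram(onsets, bin_width, max_ioi)
    return [(bin_index * bin_width, count)
            for bin_index, count in histogram.most_common()]


if __name__ == "__main__":
    onsets = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
    period, count = tempo_hypotheses(onsets)[0]
    print(period, count, 60.0 / period)
```

The lower-ranked hypotheses typically sit at integer multiples of the top period, which corresponds to the alternative metrical levels the interface lets the user select.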


OFAI-TR-2001-19 ( 164kB g-zipped PostScript file,  176kB PDF file)

Automatic Extraction of Tempo and Beat from Expressive Performances

Simon Dixon

We describe a computer program which is able to estimate the tempo and the times of musical beats in expressively performed music. The input data may be either digital audio or a symbolic representation of music such as MIDI. The data is processed off-line to detect the salient rhythmic events and the timing of these events is analysed to generate hypotheses of the tempo at various metrical levels. Based on these tempo hypotheses, a multiple hypothesis search finds the sequence of beat times which has the best fit to the rhythmic events. We show that estimating the perceptual salience of rhythmic events significantly improves the results. No prior knowledge of the tempo, meter or musical style is assumed; all required information is derived from the data. Results are presented for a range of different musical styles, including classical, jazz, and popular works with a variety of tempi and meters. The system calculates the tempo correctly in most cases, the most common error being a doubling or halving of the tempo. The calculation of beat times is also robust. When errors are made concerning the phase of the beat, the system recovers quickly to resume correct beat tracking, despite the fact that there is no high level musical knowledge encoded in the system.

Keywords: Music, Rhythm, Tempo, Beat tracking

Citation: Journal of New Music Research, 30, 1, 2001, to appear


OFAI-TR-2001-18 ( 75kB g-zipped PostScript file,  154kB PDF file)

Round Robin Classification

Johannes Fürnkranz

In this paper, we discuss round robin classification (aka pairwise classification), a technique for handling multi-class problems with binary classifiers by learning one classifier for each pair of classes. We present an empirical evaluation of the method, implemented as a wrapper around the ripper rule learning algorithm, on 20 multi-class datasets from the UCI database repository. Our results show that it is very likely to improve ripper's classification accuracy without having a high risk of decreasing it. More importantly, we give a general theoretical analysis of the complexity of the approach and show that its training effort is below that of the commonly used one-against-all technique. These theoretical results are not restricted to rule learning but are also of interest to other communities where pairwise classification has recently received some attention. Furthermore, we investigate its properties as a general ensemble technique and show that round robin classification with C5.0 may improve C5.0's performance on multi-class problems. However, this improvement does not reach the performance increase of boosting, and a combination of boosting and round robin classification does not produce any gain over conventional boosting. Finally, we show that the performance of round robin classification can be further improved by performing multiple round comparisons, i.e., by integrating it with bagging.

Keywords: pairwise classification, inductive rule learning, multi-class problems, class binarization, ensemble techniques

Citation: Fürnkranz J.: Round Robin Classification. Journal of Machine Learning Research 2:721-747, March 2002.
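
The round robin scheme described in the abstract above, one binary classifier per pair of classes, can be sketched as follows. The nearest-centroid base learner on one-dimensional features is a stand-in chosen to keep the example self-contained; the report uses ripper and C5.0, and all function names here are hypothetical.

```python
from collections import Counter
from itertools import combinations


def train_centroid(xs, ys):
    """Toy base learner: store the mean feature value of each class."""
    sums, counts = Counter(), Counter()
    for x, y in zip(xs, ys):
        sums[y] += x
        counts[y] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def train_round_robin(xs, ys):
    """Learn one binary classifier for each unordered pair of classes.

    Each classifier sees only the examples of its two classes.
    """
    models = {}
    for a, b in combinations(sorted(set(ys)), 2):
        pair = [(x, y) for x, y in zip(xs, ys) if y in (a, b)]
        models[(a, b)] = train_centroid([x for x, _ in pair],
                                        [y for _, y in pair])
    return models


def predict_round_robin(models, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(predict_centroid(model, x) for model in models.values())
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    xs = [0.0, 1.0, 5.0, 6.0, 10.0, 11.0]
    ys = ["a", "a", "b", "b", "c", "c"]
    models = train_round_robin(xs, ys)
    print(len(models), predict_round_robin(models, 5.2))
```

With c classes this trains c(c-1)/2 classifiers, but each one is trained only on the examples of its two classes, which is the intuition behind the training-effort result stated in the abstract.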


OFAI-TR-2001-17

2nd International Workshop Engineering Societies in the Agents' World (ESAW'01), 7 July 2001, Czech Technical University, Prague, Czech Republic, Workshop Notes

Andrea Omicini, Paolo Petta, Robert Tolksdorf

Building on the success of its previous edition, ESAW'01 is devoted to discussing technologies, methodologies, and models for the engineering of complex applications based on multi-agent systems, and aims at bringing together researchers and contributions from a broad range of pertinent areas so as to promote sharing of experiences and cross-fertilisation. By focussing on the social aspects of multi-agent systems, ESAW'01 concentrates on the space of agent interaction, rather than on intra-agent issues, and on technology and methodology issues rather than on pure theoretical aspects.

Keywords: Multi-Agent Systems, Software Engineering, Coordination

Citation: Omicini A., Petta P., Tolksdorf R.: 2nd International Workshop Engineering Societies in the Agents' World (ESAW'01), 7 July 2001, Czech Technical University, Prague, Czech Republic, Workshop Notes. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-17, 2001


OFAI-TR-2001-16 ( 241kB g-zipped PostScript file,  89kB PDF file)

Human Preferences for Tempo Smoothness

Emilios Cambouropoulos, Simon Dixon, Werner Goebl, Gerhard Widmer

In this study we investigate the relationship between beat and musical performance. It is hypothesised that listeners prefer beat sequences that are smoother than beat tracks that are fully aligned with the actual onsets of performed notes. In order to examine this hypothesis, an experiment was designed whereby six different smoothed beat tracks are rated by subjects in relation to how well they correspond to a number of performed piano excerpts. It is shown that listeners prefer beat sequences that are slightly smoother than the onset times of the corresponding musical notes. This outcome was strongly supported by the results obtained from the group of trained musicians whereas it seems to have no bearing for the group of non-musicians.

Keywords: music, beat tracking

Citation: Proceedings of the VII International Symposium on Systematic and Comparative Musicology and III International Conference on Cognitive Musicology 16-19 August 2001, Jyväskylä, Finland, pp 18-26.


OFAI-TR-2001-15 ( 253kB g-zipped PostScript file,  626kB PDF file)

The Musical Expression Project: A Challenge for Machine Learning and Knowledge Discovery

Gerhard Widmer

This paper reports on a long-term inter-disciplinary research project that aims at analysing the complex phenomenon of expressive music performance with machine learning and data mining methods. The goals and general research framework of the project are briefly explained, and then a number of challenges to machine learning (and also to computational music analysis) are discussed that arise from the complexity and multi-dimensionality of the musical phenomenon being studied. We also briefly report on first experiments that address some of these issues.

Keywords: Machine Learning, Knowledge Discovery, Music

Citation: Invited talk/paper. In Proceedings of the 12th European Conference on Machine Learning (ECML'2001). Berlin: Springer Verlag.


OFAI-TR-2001-14 ( 129kB g-zipped PostScript file,  557kB PDF file)

Discovering Strong Principles of Expressive Music Performance with the PLCG Rule Learning Strategy

Gerhard Widmer

We present a new rule learning algorithm named PLCG - a kind of ensemble learning method - that can find simple, robust partial theories (sets of classification rules) in complex data where neither high coverage nor high precision can be expected. The motivating application problem comes from an interdisciplinary research project that aims at discovering fundamental principles of expressive music performance from large amounts of complex real-world data (measurements of actual performances by concert pianists). It is shown that PLCG succeeds in finding some surprisingly simple and robust performance principles, some of which represent truly novel and musically meaningful discoveries. A more systematic experiment shows that PLCG learns significantly simpler theories than more direct approaches to rule learning, while striking a compromise between coverage and precision.

Keywords: Machine Learning, Rule Learning, Knowledge Discovery, Music, Music Performance

Citation: In Proceedings of the 12th European Conference on Machine Learning (ECML'2001). Berlin: Springer Verlag.


OFAI-TR-2001-13 ( 50kB g-zipped PostScript file,  206kB PDF file)

An Evaluation of Landmarking Variants

Johannes Fürnkranz, Johann Petrak

Landmarking is a novel technique for data characterization in meta-learning. While conventional approaches typically describe a database with its statistical measurements and properties, landmarking proposes to enrich such a description with quick and easy-to-obtain performance measures of simple learning algorithms. In this paper, we will discuss two novel aspects of landmarking. First, we investigate relative landmarking, which tries to exploit the relative order of the landmark measures instead of their absolute value. Second, we propose the use of subsampling estimates as a different way of efficiently obtaining landmarks. In general, our results are mostly negative. The most interesting result is a surprisingly simple rule that predicts quite accurately when it is worth boosting decision trees.

Keywords: Meta-Learning, Landmarking, Subsampling

Citation: Fürnkranz J., Petrak J.: An Evaluation of Landmarking Variants. In C. Giraud-Carrier, N. Lavrac, S. Moyle & B. Kavsek (eds.) Proceedings of the ECML/PKDD-01 Workshop Integrating Aspects of Data Mining, Decision Support and Meta-learning, pp.57-68, Freiburg, Germany, 2001.


OFAI-TR-2001-12 ( 152kB g-zipped PostScript file,  171kB PDF file)

Automatic Pitch Spelling: From Numbers to Sharps and Flats

Emilios Cambouropoulos

In this paper a computational model is described that transcribes polyphonic MIDI pitch files into the Western traditional music notation. Input to the proposed algorithm is merely a sequence of MIDI pitch numbers in the order they appear in a MIDI file. No a priori knowledge is required such as key signature, tonal centers, time signature, voice separation and so on. Output of the algorithm is a sequence of 'correctly' spelled pitches. The algorithm was evaluated on 8 complete piano sonatas by Mozart and had a success rate that is greater than 96% (10476 pitches were spelled correctly out of 10900 notes that required accidental - overall number of pitches in 8 sonatas is 40058). The proposed algorithm was also compared to and tested against other pitch spelling algorithms. Pitch spelling algorithms are important not only for applications such as musical notation software packages but also for a multitude of tonal analytical tasks such as key-finding and harmonic analysis.

Keywords: Music, Pitch Spelling

Citation: Proceedings of the VIII Brazilian Symposium on Computer Music, 31 July - 3 August, 2001, Fortaleza, Brasil (forthcoming)


OFAI-TR-2001-11 ( 128kB g-zipped PostScript file,  81kB PDF file)

The Local Boundary Detection Model (LBDM) and its Application in the Study of Expressive Timing

Emilios Cambouropoulos

In this paper two main topics are addressed. Firstly, the Local Boundary Detection Model (LBDM) is described; this computational model enables the detection of local boundaries in a melodic surface and can be used for musical segmentation. The proposed model is tested against the punctuation rule system developed by Friberg et al. (1998) at KTH, Stockholm. Secondly, the expressive timing deviations found in a number of expert piano performances are examined in relation to the local boundaries discovered by LBDM. As a result of a set of preliminary experiments, it is suggested that the assumption of final-note lengthening of a melodic gesture is not always valid and that, in some cases, the end of a melodic group is marked by lengthening the second-to-last note (or, seeing it from a different viewpoint, by delaying the last note).

Keywords: Music, Melodic Segmentation, Expressive Timing

Citation: In Proceedings of the International Computer Music Conference (ICMC'2001) 17-22 September, Havana, Cuba


OFAI-TR-2001-10 ( 61kB g-zipped PostScript file,  195kB PDF file)

Model-based noise reduction for single trial evoked potentials

Arthur Flexer, Herbert Bauer, Claus Lamm, Georg Dorffner

Two model-based techniques, Gaussian Mixture Models with integrated noise component and Principal Component Analysis, are applied to noise reduction for single trial evoked potentials which are buried in noise up to five times stronger than the signal. An empirical study using artificial data is presented and results are compared to the standard technique of averaging.

Keywords: Gaussian mixture models, Principal Component Analysis, Denoising, EEG

Citation: Flexer A., Bauer H., Lamm C., Dorffner G.: Model-based Noise Reduction for Single Trial Evoked Potentials, in Miller D.J., et al.(eds.), Neural Networks for Signal Processing XI, Institute of Electrical and Electronics Engineers, Inc., New York, NY, pp.499-508, 2001.


OFAI-TR-2001-09 ( 88kB g-zipped PostScript file,  238kB PDF file)

Detecting Temporal Change in Event Sequences: An Application to Demographic Data

Hendrik Blockeel, Johannes Fürnkranz, Alexia Prskawetz, Francesco C. Billari

In this paper, we discuss an approach for discovering temporal changes in event sequences, and present first results from a study on demographic data. The data encode characteristic events in a person's life course, such as their birth date, the begin and end dates of their partnerships and marriages, and the birth dates of their children. The goal is to detect significant changes in the chronology of these events over people from different birth cohorts. To solve this problem, we encoded the temporal information in a first-order logic representation, and employed Warmr, an ILP system that discovers association rules in a multi-relational data set, to detect frequent patterns that show significant variance over different birth cohorts. As a case study in multi-relational association rule mining, this work illustrates the flexibility resulting from the use of first-order background knowledge, but also uncovers a number of important issues that hitherto received little attention.

Keywords: Data Mining, Association Rules, Temporal Patterns, Inductive Logic Programming, Demography, Life Course Analysis

Citation: Blockeel H., Fürnkranz J., Prskawetz A., Billari F.: Detecting Temporal Change in Event Sequences: An Application to Demographic Data. In L. De Raedt and A. Siebes (Eds.), Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD-01), Freiburg, Germany. Springer-Verlag 2001.


OFAI-TR-2001-08 ( 63kB g-zipped PostScript file,  141kB PDF file)

GARCH vs Stochastic Volatility: Option Pricing and Risk Management

Christian Schittenkopf, Alfred Lehar, Martin Scheicher

This paper examines the out-of-sample performance of two common extensions of the Black-Scholes framework, namely a GARCH and a stochastic volatility option pricing model. The models are calibrated to intraday FTSE 100 option prices. We apply two sets of performance criteria, namely out-of-sample valuation errors and Value-at-Risk oriented measures. When we analyze the fit to observed prices, GARCH clearly dominates both stochastic volatility and the benchmark Black-Scholes model. However, the predictions of the market risk from hypothetical derivative positions show sizable errors. The fit to the realized profits and losses is poor and there are no notable differences between the models. Overall, we therefore observe that the more complex option pricing models can improve on the Black-Scholes methodology only for the purpose of pricing, but not for the Value-at-Risk forecasts.

Keywords: option pricing, GARCH, stochastic process, volatility, risk-neutral density, finance

Citation: Schittenkopf C., Lehar A., Scheicher M.: GARCH vs Stochastic Volatility: Option Pricing and Risk Management. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-08, 2001


OFAI-TR-2001-07 ( 1755kB g-zipped PostScript file,  2620kB PDF file)

An Intelligent Web-based Tool For The Virtual Research Enterprise

Stefan Roiser, Georg Dorffner

This report describes an intelligent web-based interface to a database. The database contains projects in the area of neural computation and the institutes/enterprises that developed them. The users of this interface, confronted with a specific problem in the area of neural computation, may use this tool to search for institutes that carried out these or similar projects in the past. The report starts with a general description of the problem. Next, a detailed description of the classification system for the projects in the database follows. After a discussion of several possible implementations, the one which was chosen is discussed in more detail. A description of the human-computer interface for the several possible steps of the interface concludes the report.

Keywords: Neural Computing, Database Interface

Citation: Roiser S., Dorffner G.: An Intelligent Web-based Tool For The Virtual Research Enterprise. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-07, 2001


OFAI-TR-2001-06 ( 181kB g-zipped PostScript file,  652kB PDF file)

Using AI and Machine Learning to Study Expressive Music Performance: Project Survey and First Report

Gerhard Widmer

This article presents a long-term inter-disciplinary research project situated at the intersection of the scientific disciplines of Musicology and Artificial Intelligence. The goal is to develop AI, and in particular machine learning and data mining, methods to study the complex phenomenon of expressive music performance. Formulating formal, quantitative models of expressive performance is one of the big open research problems in contemporary (empirical and cognitive) musicology. Our project develops a new direction in this field: we use inductive learning techniques to discover general and valid expression principles from (large amounts of) real performance data. The project is currently starting its third year and is planned to continue for at least four more years. In the following, we explain the basic notions of expressive music performance, and why this is such a central phenomenon in music. We present the general research framework of the project, and discuss the various challenges and research opportunities that emerge in this framework. We then briefly describe the current state of the project and list the main achievements made so far. In the rest of the paper, we discuss in more detail one particular data mining approach (including a new algorithm for learning characterisation rules) that we have developed just recently. Preliminary experimental results demonstrate that this algorithm can discover very general and robust expression principles, some of which actually constitute novel discoveries from a musicological viewpoint.

Keywords: Machine Learning, Data Mining, Expressive Music Performance

Citation: Widmer G.: Using AI and Machine Learning to Study Expressive Music Performance: Project Survey and First Report. Draft version of a paper to appear in AI Communications 14, 2001.


OFAI-TR-2001-05 ( 66kB g-zipped PostScript file,  139kB PDF file)

Single Trial Estimation of Evoked Potentials using Gaussian Mixture Models with Integrated Noise Component

Arthur Flexer, Herbert Bauer, Claus Lamm, Georg Dorffner

Gaussian Mixture Models with integrated noise component are a method developed for speech analysis to estimate signals hidden in background noise. We apply this technique to estimate single trial evoked potentials which are buried in noise up to five times stronger than the signal. An empirical study using artificial data is presented and results are shown to compare favourably to other techniques for single trial estimation.

Keywords: signal processing, mixture models, EEG, noise removal

Citation: Flexer A., Bauer H., Lamm C., Dorffner G.: Single Trial Estimation of Evoked Potentials Using Gaussian Mixture Models with Integrated Noise Component, in Dorffner G., et al.(eds.), Artificial Neural Networks - ICANN 2001, International Conference, Vienna, Austria, Lecture Notes In Computer Science 2130, Springer, pp. 609-616, 2001.


OFAI-TR-2001-04 ( 2541kB PDF file)

Virtual Encounters: Avatars, Actors, Agents

Sabine Payr

Work on the research project "An Inquiry into the Cultural Context of the Design and Use of Synthetic Actors" started in spring 2000. In the first phase of the project, we collected examples of applications and tried to get an overview of the state of research and development in this field. An overview of current research issues is given in part I of this report in the form of examples. Fictitious diary entries of a computer user in the near future serve as a first approach and introduction. Part II is concerned with the issues and problems of a cultural studies approach to this domain. As a key issue, we discuss the double nature of virtual characters - both as technical products and as social actors. This question cannot and should not be decided prematurely, as we presume that this double character is essential for understanding the specific issue at hand. Part III is an attempt to open the field from the viewpoint of social interaction between human and virtual characters. We refer to sociological action theory as a theoretical framework, in particular to Habermas' "Theory of Communicative Action". The discussion of "communicative rationality" is to be seen as a preparatory work for the inquiry into the cultural aspects of social interaction with virtual characters which are the main subject of the project.

Keywords: embodied agent, virtual character, synthetic actor, avatar, cultural studies

Citation: Payr S.: Virtual Encounters: Avatars, Actors, Agents. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2001-04, 2001


OFAI-TR-2001-03 ( 95kB g-zipped PostScript file,  192kB PDF file)

Using Time-Oriented Data Abstraction Methods to Optimize Oxygen Supply for Neonates

Andreas Seyfang, Silvia Miksch, Werner Horn, Michael S. Urschitz, Christian Popow, Christian F. Poets

Therapy management needs sophisticated patient monitoring and therapy planning, especially in high-frequency domains, like Neonatal Intensive Care Units (NICUs), where complex data sets are collected every second. An elegant method to tackle this problem is the use of time-oriented, skeletal plans. Asgaard is a framework for the representation, visualization, and execution of such plans. These plans work on qualitative abstracted time-oriented data which closely resemble the concepts used by experienced clinicians. This paper presents the data abstraction unit of the Asgaard system. It provides a range of connectable data abstraction methods bridging the gap between the raw data collected by monitoring devices and the abstract concepts used in therapeutic plans. The usability of this data abstraction unit is demonstrated by the implementation of a controller for the automated optimization of the fraction of inspired oxygen (FiO2). The use of the time-oriented data abstraction methods results in safe and smooth adjustment actions of our controller in a neonatal care setting.

Keywords: Temporal Data Abstraction, Pulse Oximetry, Neonates, ICU

Citation: Seyfang A., Miksch S., Horn W., Urschitz M., Popow C., Poets C.: Using Time-Oriented Data Abstraction Methods to Optimize Oxygen Supply for Neonates. In S. Quaglini et al. (eds.), Artificial Intelligence in Medicine, Proc. AIME-01, Springer, Berlin, 2001.


OFAI-TR-2001-02 ( 57kB g-zipped PostScript file,  130kB PDF file)

Round Robin Rule Learning

Johannes Fürnkranz

In this paper, we discuss a technique for handling multi-class problems with binary classifiers. The idea - learning one classifier for each pair of classes - is known as pairwise classification but - to our knowledge - has not yet been thoroughly investigated in the context of inductive rule learning. We present an empirical evaluation of the method as a wrapper around the Ripper rule learning algorithm on 20 multi-class datasets from the UCI database repository. Our results show that the method is very likely to improve Ripper's classification performance without having a high risk of decreasing it. The size of this improvement is similar to that obtained by boosting C5. In addition, we give a theoretical analysis of the complexity of the approach and show that its training time is within a small constant bound of the training time of the sequential class learning technique that is currently used in Ripper.

Keywords: Rule Learning, Pairwise Classification, Class Binarization

Citation: Fürnkranz J.: Round Robin Rule Learning. In Proceedings of the 18th International Conference on Machine Learning (ICML-01). Williamstown, MA. Morgan Kaufmann, 2001.
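
The pairwise (round robin) scheme described in the abstract above can be illustrated with a minimal sketch. It assumes a toy one-dimensional nearest-centroid base learner instead of Ripper, and simple unweighted voting; all function names are hypothetical and not taken from the report.

```python
from collections import Counter
from itertools import combinations


def train_centroid(points, labels):
    """Toy binary base learner: store the mean of each class (1-D data)."""
    sums, counts = Counter(), Counter()
    for x, y in zip(points, labels):
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in counts}


def predict_centroid(model, x):
    """Predict the class whose mean is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def train_round_robin(points, labels):
    """Learn one binary classifier for each unordered pair of classes."""
    models = {}
    for a, b in combinations(sorted(set(labels)), 2):
        pair = [(x, y) for x, y in zip(points, labels) if y in (a, b)]
        xs, ys = zip(*pair)
        models[(a, b)] = train_centroid(xs, ys)
    return models


def predict_round_robin(models, x):
    """Each pairwise classifier casts one vote; the most voted class wins."""
    votes = Counter(predict_centroid(m, x) for m in models.values())
    return votes.most_common(1)[0][0]


points = [1.0, 1.2, 5.0, 5.3, 9.0, 9.4]
labels = ["low", "low", "mid", "mid", "high", "high"]
models = train_round_robin(points, labels)
print(len(models))                       # 3 classes give 3 pairwise classifiers
print(predict_round_robin(models, 5.1))  # mid
```

With c classes the scheme trains c(c-1)/2 classifiers, each on the examples of only two classes.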


OFAI-TR-2001-01 ( 55kB g-zipped PostScript file,  117kB PDF file)

An Evaluation of Grading Classifiers

Alexander K. Seewald, Johannes Fürnkranz

In this paper, we introduce grading, a novel meta-classification scheme. While stacking uses the predictions of the base classifiers as meta-level attributes, we use "graded" predictions (i.e., predictions that have been marked as correct or incorrect) as meta-level classes. For each base classifier, one meta classifier is learned whose task is to predict when the base classifier will err. Hence, just like stacking may be viewed as a generalization of voting, grading may be viewed as a generalization of selection by cross-validation and therefore fills a conceptual gap in the space of meta-classification schemes. Grading may also be interpreted as a technique for turning the error-characterizing technique introduced by Bay and Pazzani (2000) into a powerful learning algorithm by resorting to an ensemble of meta-classifiers. Our experimental evaluation shows that this step results in a performance gain that is quite comparable to that achieved by stacking, while both grading and stacking outperform their simpler counterparts, voting and selection by cross-validation.

Keywords: Machine Learning, Classification, Ensembles

Citation: Seewald A., Fürnkranz J.: An Evaluation of Grading Classifiers. In Advances in Intelligent Data Analysis: Proceedings of the 4th International Symposium (IDA-01). Lisbon, Portugal. Springer-Verlag 2001.


OFAI-TR-2000-35 ( 23kB g-zipped PostScript file,  39kB PDF file)

AI in Medicine on its Way from Knowledge-intensive to Data-intensive Systems

Werner Horn

The last 20 years of research and development in the field of artificial intelligence in medicine show a path from knowledge-intensive systems, which try to capture the essential knowledge of experts in a knowledge-based system, to the data-intensive systems available today. Nowadays enormous amounts of information are accessible electronically. Large data sets are collected by continuously monitoring physiological parameters of patients. Knowledge-based systems are needed to make use of all the data available and to help us cope with the information explosion. In addition, temporal data analysis and intelligent information visualization can help us to get a summarized view of the change over time of clinical parameters. Integrating AIM modules into the daily-routine software environment of our care providers gives us a great chance for maintaining and improving quality of care.

Keywords: Artificial Intelligence in Medicine

Citation: Horn W.: AI in Medicine on its Way from Knowledge-intensive to Data-intensive Systems. Artificial Intelligence in Medicine, 23(1):5-12, 2001.


OFAI-TR-2000-34 ( 143kB g-zipped PostScript file,  940kB PDF file)

Monitoring human information processing via intelligent data analysis of EEG recordings

Arthur Flexer, Herbert Bauer

Human information processing can be monitored by analysing cognitive evoked potentials (EP) measurable in the electro encephalogram (EEG) during cognitive activities. In technical terms, both visualization of high dimensional sequential data and unsupervised discovery of patterns within this multivariate set of real valued time series is needed. Our approach towards visualization is to discretize the sequences via vector quantization and to perform a Sammon mapping of the codebook. Instead of having to conduct a time-consuming search for common subsequences in the set of multivariate sequential data, a multiple sequence alignment procedure can be applied to the set of one-dimensional discrete time series. The methods are described in detail and results obtained for spatial and verbal information processing are shown to be statistically valid, to yield an improvement in terms of noise attenuation and to be well in line with psychophysiological literature.

Keywords: time series analysis, visualization, eeg, neuroscience

Citation: Flexer A., Bauer H.: Monitoring human information processing via intelligent data analysis of EEG recordings, Intelligent Data Analysis, 4: 113-128, 2000.
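
The discretization step mentioned in the abstract above (vector quantization of a multivariate sequence) can be sketched as follows. This is only an illustration under stated assumptions: the codebook is fixed by hand rather than learned, the function names are hypothetical, and the Sammon mapping and sequence alignment steps are not shown.

```python
import math


def nearest_code(vector, codebook):
    """Return the index of the codebook vector closest in Euclidean distance."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))


def discretize(sequence, codebook):
    """Map a multivariate sequence to a one-dimensional sequence of code indices."""
    return [nearest_code(v, codebook) for v in sequence]


codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]  # hypothetical, fixed by hand
sequence = [(0.1, -0.1), (0.9, 1.2), (0.2, 1.9), (0.8, 0.7)]
print(discretize(sequence, codebook))  # [0, 1, 2, 1]
```

The resulting index sequence is the kind of one-dimensional discrete time series to which an alignment procedure could then be applied.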


OFAI-TR-2000-33 ( 35kB g-zipped PostScript file,  40kB PDF file)

The Vienna Prosodic Speech Corpus: Purpose, Content and Encoding

Friedrich Neubarth, Kai Alter, Hannes Pirker, Elli Rieder, Harald Trost

This paper presents a corpus of spoken German especially designed for the investigation of prosodic properties of speech. After a short discussion of the content and set-up of the corpus, we describe in detail the additional linguistic information, introduced into the corpus by labeling and annotation. In this project, both qualitative and quantitative methods have been used for the acquisition of data. Our main concern is the development of a well-defined and transparent scheme for the structuring of this heterogeneous information. A second task is to incorporate all these data - generated by different tools with different data formats - into a single data-base.

Keywords: Speech-Corpus, Prosody

Citation: in Zühlke W. & Schukat-Talamazzini E.G.(eds.), Konvens 2000 - Sprachkommunikation, VDE Verlag, Berlin


OFAI-TR-2000-32 ( 27kB g-zipped PostScript file,  107kB PDF file)

Die Modellierung von Lautdauervariationen im Österreichischen Deutsch

Hannes Pirker, Friedrich Neubarth

This paper presents the project SpeeDurCont (Speech Duration in Context-to-Speech), which is dedicated to the investigation of sound duration variations in Austrian German. The project aims to quantify the numerous influences on sound duration. From a practical point of view, the duration models obtained in this way are intended to improve the prosodic quality of a speech synthesizer. The results should also deepen the theoretical understanding of the effect of individual factors. As its method, SpeeDurCont uses machine learning techniques applied to a specially created speech data collection. This contribution describes the structure of the corpus used and discusses the requirements for possible coding strategies for the data collection.

Keywords: Speech, Phonology

Citation: Pirker H., Neubarth F.: Die Modellierung von Lautdauervariationen im Österreichischen Deutsch. in Fortschritte der Akustik, Universität Oldenburg, March 2000


OFAI-TR-2000-31 ( 109kB g-zipped PostScript file,  255kB PDF file)

Machine Learning in Games: A Survey

Johannes Fürnkranz

This paper provides a survey of previously published work on machine learning in game playing. The material is organized around a variety of problems that typically arise in game playing and that can be solved with machine learning methods. This approach, we believe, allows both researchers in game playing to find appropriate learning techniques for helping to solve their problems and machine learning researchers to identify rewarding topics for further research in game-playing domains. The paper covers learning techniques that range from neural networks to decision tree learning in games that range from poker to chess. However, space constraints prevent us from giving detailed introductions to the learning techniques or games used. Overall, we aimed at striking a fair balance between being exhaustive and being exhausting.

Keywords: Machine Learning, Game Playing

Citation: To appear in J. Fürnkranz & M. Kubat (eds.): Machines that Learn to Play Games, Nova Scientific Publishers, Chapter 2, pp. 11-59, Huntington, NY, 2001.


OFAI-TR-2000-30 ( 118kB PDF file)

Timing, Sequencing, and Quantum of Life Course Events: a Machine Learning Approach

Francesco C. Billari, Johannes Fürnkranz, Alexia Prskawetz

In this methodological paper we discuss and apply machine learning techniques, a core research area in the artificial intelligence literature, to analyse simultaneously timing, sequencing, and quantum of life course events from a comparative perspective. We outline the need for techniques which allow the adoption of a holistic approach to the analysis of life courses, illustrating the specific case of the transition to adulthood. We briefly introduce machine learning algorithms to build decision trees and rule sets and then apply such algorithms to delineate the key features which distinguish Austrian and Italian pathways to adulthood, using Fertility and Family Survey data. The key role of sequencing and synchronisation between events emerges clearly from the methodology used.

Keywords: life course, event history, data mining, machine learning, transition to adulthood

Citation: Billari F., Fürnkranz J., Prskawetz A.: Timing, Sequencing, and Quantum of Life Course Events: a Machine Learning Approach. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-30, 2000. Also appeared as Working Paper WP 2000-10, Max-Planck Institute for Demographic Research, Rostock, Germany.


OFAI-TR-2000-29 ( 548kB g-zipped PostScript file,  977kB PDF file)

Entertainment Robots - Myth Or Reality

Alexander K. Seewald

This paper presents an overview of the current field of entertainment robotics based on experiences as spectator during the RoboCup 1999 and building an experimental entertainment robot based on the LEGO platform with digital color camera and various other sensors. RoboCat is a robot cat prototype that shows cat-like behaviour in the real-life environment of typical households. For behavioural modelling, the Hamsterdam architecture was chosen. While showing that Blumberg's Hamsterdam offers a new programming paradigm to design intelligent entertainment robots, this paper also aims to decide whether or not truly intelligent entertainment robots are as of yet a myth.

Keywords: Entertainment Robotics, Embodied Artificial Intelligence, Robot Architectures, Robot Programming

Citation: Seewald A.K.: Entertainment Robots - Myth Or Reality. In Proceedings of the 14th International FLAIRS Conference (FLAIRS-2001), AAAI Press, Menlo Park, California. Based on A Mobile Robot Toy Cat Controlled by Vision and Motivation, Institute for Medical Cybernetics and Artificial Intelligence, University of Vienna, Diploma thesis, 1999.


OFAI-TR-2000-28 ( 45kB g-zipped PostScript file,  90kB PDF file)

On Information Integration in Large Scientific Collaborations

Christoph Koch, Paolo Petta, Jean-Marie Le Goff, Richard McClatchey

We discuss the requirements for information integration in large scientific collaborations and arrive at the conclusion that an architecture is needed that follows the declarative paradigm for reasoning completeness, maintainability and reuse of previously encoded knowledge but does not take the classical approach of integrating all sources against a single common "global" information model. Instead, we propose a local-as-view infrastructure that makes integrated information from remote sources available to individual (legacy) information systems across multiple different integration models. We discuss our architecture and compare it to previous approaches in the literature.

Keywords: information integration, computer supported cooperative work, declarative model

Citation: Koch C., Petta P., Le Goff J., McClatchey R.: On Information Integration in Large Scientific Collaborations. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-28, 2000


OFAI-TR-2000-27 ( 533kB g-zipped PostScript file,  905kB PDF file)

Introducing Emotions into the Computational Study of Social Norms: A First Evaluation

Alexander Staller, Paolo Petta

It is now generally recognised that emotions play an important functional role within both individuals and societies, thereby forming an important bond between these two levels of analysis. In particular, there is a bi-directional interrelationship between social norms and emotions, with emotions playing an instrumental role for the sustenance of social norms and social norms being an essential element of regulation in the individual emotional system. This paper lays the foundations for a computational study of this interrelationship, drawing upon the functional appraisal theory of emotions. We describe a first implementation of a situated agent architecture, TABASCOJAM, that incorporates a simple appraisal mechanism and report on its evaluation in a well-known scenario for the study of aggression control as a function of a norm, that was suitably extended. The simulation results reported in the original aggression control study were successfully reproduced, and consistent performances were achieved for extended scenarios with conditional norm obeyance. In conclusion, it is argued that the present effort indicates a promising lane towards the necessary abandonment of logical models for the explanation and simulation of human social behaviour.

Keywords: norms, emotions

Citation: Staller A., Petta P.: Introducing Emotions into the Computational Study of Social Norms: A First Evaluation. In Edmonds B., Dautenhahn K. (eds.): Journal of Artificial Societies and Social Simulation, Special Issue on Starting from Society - the application of social analogies to computational systems, 2001


OFAI-TR-2000-26 ( 69kB g-zipped PostScript file,  211kB PDF file)

Risk-neutral Density Extraction from Option Prices: Improved Pricing with Mixture Density Networks

Christian Schittenkopf, Georg Dorffner

One of the central goals in finance is to find better models for pricing and hedging financial derivatives such as call and put options. We present a semi-nonparametric approach to risk-neutral density extraction from option prices which is based on an extension of the concept of mixture density networks. The central idea is to model the shape of the risk-neutral density in a flexible, non-linear way as a function of the time horizon. Thereby, stylized facts such as negative skewness and excess kurtosis are captured. The approach is applied to a very large set of intraday options data on the FTSE 100 recorded at LIFFE. It is shown to yield significantly better results in terms of out-of-sample pricing in comparison to the basic Black-Scholes model and to an extended model adjusting the skewness and kurtosis terms. From the perspective of risk management, the extracted risk-neutral densities provide valuable information about market expectations.

Keywords: Hedging, Mixture density networks, Options, Pricing, Risk-neutral densities

Citation: Schittenkopf C., Dorffner G.: Risk-neutral Density Extraction from Option Prices: Improved Pricing with Mixture Density Networks. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-26, 2000


OFAI-TR-2000-25 ( 40kB g-zipped PostScript file,  97kB PDF file)

Extraction of Musical Performance Parameters from Audio Data

Simon Dixon

We present a system for the automatic extraction of musical content from audio signals containing polyphonic music. The system works off-line, taking data from audio files and producing MIDI output, representing the pitch, timing and volume of the musical notes. The initial signal processing stage is based on a STFT enhanced by a tracking phase vocoder, which interprets stable frequency components as partials of musical notes. Heuristic methods combine these partials, using a generic instrument model, to produce note estimates. The system is tested on a large corpus of professionally performed music from the standard classical piano repertoire.

Keywords: Automatic transcription, Audio content analysis

Citation: Proceedings of the First IEEE Pacific-Rim Conference on Multimedia (PCM 2000), Sydney, Australia, December 2000


OFAI-TR-2000-24 ( 33kB g-zipped PostScript file,  94kB PDF file)

On the Computer Recognition of Solo Piano Music

Simon Dixon

We present work towards a computer system for the automatic transcription of piano performances. The system takes audio files containing polyphonic piano music as input, and produces MIDI output, representing the pitch, timing and volume of the musical notes. The aim of this work is not to reduce the performance data to common music notation, but to extract the performance parameters for a quantitative study of musical expression in piano performance. Standard signal processing techniques based on the short time Fourier transform are used to create a time-frequency representation of the signal, and adaptive peak-picking and pattern matching algorithms are employed to find the musical notes. In order to perform large scale testing, the test process is automated by synthesizing audio data from MIDI files using high quality software synthesis, and comparing results with the original MIDI data. The test data used are Mozart piano sonatas performed by a concert pianist.

Keywords: Automatic transcription, Audio content analysis

Citation: Proceedings of the Australasian Computer Music Association Conference 2000, Brisbane, Australia, pages 31-37.


OFAI-TR-2000-23 ( 172kB g-zipped PostScript file,  112kB PDF file)

Explicit Modeling of the Semantics of Large Multi-layered Object-Oriented Databases

Christoph Koch, Zsolt Kovacs, Jean-Marie Le Goff, Richard McClatchey, Paolo Petta, Tony Solomonides

Description-driven systems based on meta-objects are an increasingly popular way to handle complexity in large-scale object-oriented database applications. Such systems facilitate the management of large amounts of data and provide a means to avoid database schema evolution in many settings. Unfortunately, the description-driven approach leads to a loss of simplicity of the schema, and additional software behaviour is required for the management of dependencies, description relationships, and other Design Patterns that recur across the schema. This leads to redundant implementations of software that cannot be handled by using a framework-based approach. This paper presents an approach to address this problem which is based on the concept of an ontology of Design Patterns. Such an ontology allows the convenient separation of the structure and the semantics of database schemata. Through that, reusable software can be produced which separates application behaviour from the database schema.

Citation: Koch C., Kovacs Z., Le Goff J., McClatchey R., Petta P., Solomonides T.: Explicit Modeling of the Semantics of Large Multi-layered Object-Oriented Databases. In: Proceedings of the 19th International Conference on Conceptual Modeling, 9-12 October 2000, Salt Lake City, Utah, USA.


OFAI-TR-2000-22 ( 89kB g-zipped PostScript file,  214kB PDF file)

Temporal Pattern Recognition in Noisy Non-stationary Time Series Based on Quantization into Symbolic Streams: Lessons Learned from Financial Volatility Trading

Peter Tino, Christian Schittenkopf, Georg Dorffner

In this paper we investigate the potential of analysing noisy non-stationary time series by quantizing them into streams of discrete symbols and applying finite-memory symbolic predictors. The main argument is that careful quantization can reduce the noise in the time series to make model estimation more amenable given limited numbers of samples that can be drawn due to the non-stationarity in the time series. As a main application area we study the use of such an analysis in a realistic setting involving financial forecasting and trading. In particular, using historical data, we simulate the trading of straddles on the financial indexes DAX and FTSE 100 on a daily basis, based on predictions of the daily volatility differences in the underlying indexes. We propose a parametric, data-driven quantization scheme which transforms temporal patterns in the series of daily volatility changes into grammatical and statistical patterns in the corresponding symbolic streams. As symbolic predictors operating on the quantized streams we use the classical fixed-order Markov models, variable memory length Markov models and a novel variation of fractal-based predictors introduced in its original form in (Tino, 2000b). The fractal-based predictors are designed to efficiently use deep memory. We compare the symbolic models with continuous techniques such as time-delay neural networks with continuous and categorical outputs, and GARCH models. Our experiments strongly suggest that the robust information reduction achieved by quantizing the real-valued time series is highly beneficial. To deal with non-stationarity in financial daily time series, we propose two techniques that combine "sophisticated" models fitted on the training data with a fixed set of simple-minded symbolic predictors not using older (and potentially misleading) data in the training set.
Experimental results show that by quantizing the volatility differences and then using symbolic predictive models, market makers can generate a statistically significant excess profit. However, with respect to our prediction and trading techniques, the option market on the DAX does seem to be efficient for traders and non-members of the stock exchange. There is a potential for traders to make an excess profit on the FTSE 100. We also mention some interesting observations regarding the memory structure in the studied series of daily volatility differences.

Keywords: Markov models, prediction suffix trees, iterative function systems, fractal machines, volatility, straddles, options

Citation: Tino P., Schittenkopf C., Dorffner G.: Temporal Pattern Recognition in Noisy Non-stationary Time Series Based on Quantization into Symbolic Streams: Lessons Learned from Financial Volatility Trading. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-22, 2000
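
The general idea in the abstract above, quantizing a real-valued series into symbols and then applying a fixed-order Markov predictor, can be sketched as below. This is a minimal illustration, not the report's method: the three-symbol alphabet, the fixed threshold, and all function names are assumptions, whereas the report describes a parametric, data-driven quantization scheme.

```python
from collections import Counter, defaultdict


def quantize(series, threshold):
    """Turn successive differences into symbols: 'U' (up), 'D' (down), 'S' (small)."""
    symbols = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        if diff > threshold:
            symbols.append("U")
        elif diff < -threshold:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def fit_markov(symbols, order=1):
    """Count which symbol follows each context of length `order`."""
    counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        counts[symbols[i - order:i]][symbols[i]] += 1
    return counts


def predict_next(counts, context):
    """Most frequent successor of the context, or None for an unseen context."""
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]


series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 14.1]  # made-up values
stream = quantize(series, threshold=0.5)
print(stream)                      # UDUDUS
counts = fit_markov(stream, order=1)
print(predict_next(counts, "U"))   # D
```

Small fluctuations below the threshold all collapse into the same symbol, which is the noise-reducing effect the abstract attributes to quantization.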


OFAI-TR-2000-21 ( 80kB g-zipped PostScript file,  167kB PDF file)

The Benefit of Information Reduction for Trading Strategies

Christian Schittenkopf, Peter Tino, Georg Dorffner

Motivated by previous findings that discretization of financial time series can effectively filter the data and reduce the noise, this experimental study compares the trading performance of predictive models based on different modelling paradigms in a realistic setting. Different methods ranging from real-valued time series models to predictive models on a symbolic level are applied to predict the daily change in volatility of two major stock indices. The predicted volatility changes are interpreted as trading signals for buying or selling a straddle portfolio on the underlying stock index. Profits realized by this trading strategy are tested for statistical significance, taking into account transaction costs. The results indicate that symbolic information processing is a promising approach to financial prediction tasks, undermining the hypothesis of efficient capital markets.

Keywords: discretization, prediction, symbolic models, trading strategy, volatility

Citation: Schittenkopf C., Tino P., Dorffner G.: The Benefit of Information Reduction for Trading Strategies. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-21, 2000
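
The link between predicted volatility changes and straddle trades in the abstract above can be illustrated with a small sketch. The payoff function uses the standard definition of a long straddle (one call plus one put at the same strike). The signal rule, buying when volatility is predicted to rise and selling when it is predicted to fall, is an assumption made here for illustration, as are the function names; the report's actual strategy, which trades daily and accounts for transaction costs, is not reproduced.

```python
def straddle_payoff(spot_at_expiry, strike):
    """Payoff at expiry of a long straddle: one call plus one put, same strike."""
    call = max(spot_at_expiry - strike, 0.0)
    put = max(strike - spot_at_expiry, 0.0)
    return call + put


def trading_signal(predicted_vol_change):
    """Hypothetical rule: buy the straddle on a predicted rise in volatility, sell on a fall."""
    if predicted_vol_change > 0:
        return "buy"
    if predicted_vol_change < 0:
        return "sell"
    return "hold"


print(straddle_payoff(110.0, 100.0))  # 10.0
print(straddle_payoff(90.0, 100.0))   # 10.0
print(trading_signal(-0.02))          # sell
```

The payoff depends only on how far the index moves from the strike, not on the direction, which is why a straddle serves as a volatility instrument.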


OFAI-TR-2000-20 ( 42kB g-zipped PostScript file,  116kB PDF file)

CoIL Challenge 2000 - Submitted Solution

Alexander K. Seewald

This paper describes my solution to the CoIL Challenge 2000 (www.dsc.napier.ac.uk/coil). The challenge was to predict who would buy caravan insurance and why. There were two subtasks: to predict caravan insurance ownership and to describe caravan owners according to this prediction model. My model was trained using a MetaCost extended C4.5R8-clone and achieved a score of 109 out of a theoretical maximum of 238 while the winner achieved 121.

Keywords: Data Mining, Cost-Sensitive Classification, Decision trees

Citation: Seewald A.K.: CoIL Challenge 2000 - Submitted Solution. In P. van der Putten, M. van Someren (eds.), CoIL Challenge 2000: The Insurance Company Case, LIACS Technical Report 2000-09, Leiden Institute of Advanced Computer Science, Leiden, published by Sentient Machine Research, Amsterdam


OFAI-TR-2000-19 ( 49kB g-zipped PostScript file,  125kB PDF file)

Large-scale Induction of Expressive Performance Rules: First Quantitative Results

Gerhard Widmer

The paper presents first experimental results of a research project that aims at identifying basic principles of expressive music performance with the help of machine learning methods. Various learning algorithms were applied to a large collection of real performance data (recordings of 13 Mozart sonatas by a skilled pianist) in order to induce general categorical expression rules for tempo, dynamics, and articulation. Preliminary results show that the algorithms can indeed find some structure in the data. It also turns out that meter and global tempo have a strong influence on expression patterns. Finally, we briefly describe an experiment that demonstrates how machine learning can be used to study and possibly resolve some specialized questions.

Keywords: Machine Learning, Music, Expressive Music Performance

Citation: Widmer, G. (2000). Large-scale Induction of Expressive Performance Rules: First Quantitative Results. In Proceedings of the International Computer Music Conference (ICMC'2000). San Francisco, CA: International Computer Music Association.


OFAI-TR-2000-18 ( 16kB g-zipped PostScript file,  31kB PDF file)

The profitability of trading volatility using real-valued and symbolic models

Christian Schittenkopf, Peter Tino, Georg Dorffner

Essentially, there are two notions of volatility in the literature: historical volatility and implied volatility. While measures of the former notion are derived from historical returns by (weighted) averaging over a time window, measures of the latter are estimated from observed option prices. Whatever particular volatility measure one is willing to apply, a central question is that of predictability of volatility. In particular, predictability in a statistical sense and economically meaningful predictability must be distinguished. In this paper we concentrate on the latter by analyzing the profitability of a pure volatility trading strategy which is delta-neutral and independent of an option pricing model, for the German stock index DAX. Several very different methods ranging from linear and non-linear, real-valued models to symbolic models of volatility changes are applied to predict the change in volatility to the next trading day and to gain profits by buying or selling straddles accordingly. The trading performance is evaluated for one historical and one implied volatility measure. The results are carefully evaluated concerning transaction costs, stationarity issues, and statistical significance. The main contribution of this paper is that, for the first time, the trading performance of models based on different modelling paradigms (real-valued versus symbolic) is compared. Furthermore, it is shown that the combination of different models can lead to improved performance, i.e., higher profits.

Keywords: options, profitability, symbolic dynamics, trading strategy, volatility

Citation: Schittenkopf C., Tino P., Dorffner G.: The profitability of trading volatility using real-valued and symbolic models, Proc. of the 2000 Conference on Computational Intelligence for Financial Engineering (CIFEr), pp. 8-11, New York City, NY, USA, March 26-28, 2000.


OFAI-TR-2000-17 ( 52kB g-zipped PostScript file,  93kB PDF file)

Multi-Agent Coordination of Distributed Event Data Processing

Christoph Koch, Paolo Petta

In this paper, we present a multi-agent systems approach to distributed event-data processing, which is pervasive in, for example, scientific computing environments. The task investigated is the configuration and execution of event-data processing pipelines assembled from single computational services (agents) that perform an asynchronous mapping of streams of inputs to streams of outputs, where the specification is given in terms of a characterization of the final output of the pipeline.

Comprehensive declarative descriptions of the capabilities of single agents in such a system can be shown to be computationally intractable because of the complexity of the mapping between the inputs and outputs of individual agents. We therefore investigate the consequences of circumventing this problem by only publishing the capabilities of the outputs of agents, performing the transformation of output requirements to input requirements opaquely within individual agents, and utilizing recursive runtime contracting to configure complete data processing pipelines.

The information loss entailed by this kind of information propagation opens up the possibility of pipeline misconfigurations that in turn lead to runtime exceptions when constraints between interfaces that were not explicitly enforced by the published capability descriptions are violated. We characterize the ensuing coordination needs and related design requirements for such kinds of multi-agent systems and propose the introduction of social laws as a promising principled solution approach to be further researched.

Keywords: Multi-Agent Systems, Coordination, Capability Descriptions, Exception Handling, Social Models

Citation: Koch C., Petta P.: Multi-Agent Coordination of Distributed Event Data Processing. Engineering Societies in the Agents' World (ESAW'00), ECAI-2000 Workshop, Berlin, Germany, 21 August 2000


OFAI-TR-2000-16 ( 177kB g-zipped PostScript file,  113kB PDF file)

Melodic Cue Abstraction, Similarity and Category Formation: A Computational Approach.

Emilios Cambouropoulos

This study attempts to replicate, by means of computational modeling, two psychological experiments on cue abstraction and categorisation performed on a monophonic piece by J.S. Bach. The Unscramble clustering algorithm organises a number of melodic segments into motivic categories, determines a prototype for each cluster and uses these prototypical descriptions for membership prediction tasks. The results of the computational approach are compared to the empirical results, and convergences and deviations are reported. The clusters produced by the algorithm correspond closely to the categories provided in the empirical study. The application of the algorithm confirms most of the suggestions presented in the psychological studies regarding which cues play the most significant role in categorisation tasks.

Keywords: Music, Melodic Similarity, Clustering, Categorisation

Citation: In Proceedings of ICMPC 2000 (International Conference on Music Perception and Cognition), 5-10 August 2000, Keele, U.K.


OFAI-TR-2000-15 ( 112kB g-zipped PostScript file,  99kB PDF file)

From MIDI to Traditional Musical Notation.

Emilios Cambouropoulos

In this paper a system that is designed to extract the musical score from a MIDI performance is described. The proposed system comprises a number of modules that perform the following tasks: identification of elementary musical objects, calculation of accent (salience) of musical events, beat induction, beat tracking, onset quantisation, streaming, duration quantisation and pitch spelling. The system has been applied to 13 complete Mozart sonata performances, giving very encouraging results.

Keywords: Music, Beat Tracking, Quantisation, Pitch Spelling

Citation: In Proceedings of the AAAI Workshop on AI and Music, 30 July - 3 Aug. 2000, Austin, Texas


OFAI-TR-2000-14 ( 104kB g-zipped PostScript file,  119kB PDF file)

Extracting 'Significant' Patterns from Musical Strings: Some Interesting Problems.

Emilios Cambouropoulos

In this paper a number of issues relating to the application of string processing techniques on musical sequences are discussed. Special attention is given to musical pattern extraction. Firstly, a number of general problems are presented in terms of musical representation and pattern processing methodologies. Then a number of interesting melodic pattern matching problems are presented. Finally, issues relating to pattern extraction are discussed, with special attention being drawn to defining musical pattern 'significance'. This paper is not intended towards providing solutions to string processing problems but rather towards raising awareness of primarily music-related particularities that can cause problems in matching applications and also suggesting some interesting string processing problems that require efficient computational solutions.

Keywords: Music, Pattern Matching, String Processing

Citation: Invited paper presented at LSD 2000 (London String Days), 3-4 April 2000, King's College London, U.K.


OFAI-TR-2000-13 ( 237kB g-zipped PostScript file,  5491kB PDF file)

Melodic Clustering: Motivic Analysis of Schumann's Träumerei

Emilios Cambouropoulos, Gerhard Widmer

In this paper a formal model will be presented that attempts to organise melodic segments into 'significant' musical categories (e.g. motives). Given a segmentation of a melodic surface, the proposed model constructs an appropriate representation for each segment in terms of a number of attributes (these reflect melodic and rhythmic aspects of the segment at the surface and at various abstract levels) and then a clustering algorithm (the Unscramble algorithm) is applied for the organisation of these segments into 'meaningful' categories. The proposed clustering algorithm automatically determines an appropriate number of clusters and also the characteristic (or defining) attributes of each category. As a test case this computational model has been used for obtaining a motivic analysis of Schumann's Träumerei.

Keywords: Music, Melodic Similarity, Clustering, Categorisation

Citation: In Proceedings of JIM 2000 (Journées d'Informatique Musicale), 15-18 May 2000, Bordeaux, France.


OFAI-TR-2000-12 ( 252kB g-zipped PostScript file,  1148kB PDF file)

Data mining and EEG

Arthur Flexer

An overview of Data Mining (DM) and its application to the analysis of EEG is given by (i) presenting a working definition of DM, (ii) motivating why EEG analysis is a challenging field of application for DM technology and (iii) by reviewing exemplary work on DM applied to EEG analysis. The current status of work on DM and EEG is discussed and some general conclusions are drawn.

Keywords: Data Mining, EEG, Signal Processing, Neural Networks, Machine Learning, Statistics

Citation: Flexer A.: Data mining and electroencephalography, Statistical Methods in Medical Research, 9: 395-413, 2000.


OFAI-TR-2000-11 ( 184kB g-zipped PostScript file,  147kB PDF file)

Skilled Piano Performance: Melody Lead Caused by Dynamic Differentiation

Werner Goebl

Background: Simultaneous notes in the printed score (chords) are not played strictly simultaneously by pianists. As reported in the literature, an emphasised voice is not only played louder, but additionally precedes the other voices typically by around 30ms (melody lead). It is still unclear whether this phenomenon is a common expressive feature in music performance that aids listeners in identifying the melody in multivoiced music, or whether it is mostly due to the timing characteristics of the piano action (velocity artefact) and therefore a result of a dynamic differentiation of voices. Especially in chords played by the right hand, high correlations between velocity difference and melody lead (between melody notes and accompaniment) seem to confirm this velocity artefact assumption.
Aims: The investigated data, derived mostly from computer-monitored pianos, represents the asynchronies at the hammer-string contact points. The present study will be focused on asynchrony patterns at the finger-key contact times as well. Finger-key profiles represent what pianists initially do when striking different keys simultaneously. In this paper, we show that the melody lead phenomenon disappears at the finger-key level. That means that pianists tend to strike the keys almost simultaneously; only the different dynamics (velocities) result in the typical hammer-string asynchronies (melody lead).

Keywords: Music Expression, Piano Performance

Citation: Goebl, W. (2000). Skilled Piano Performance: Melody Lead Caused by Dynamic Differentiation. In Proceedings of the 6th International Conference on Music Perception and Cognition (ICMPC'2000), Keele, UK.


OFAI-TR-2000-10 ( 60kB g-zipped PostScript file,  160kB PDF file)

Hybrid Decision Tree Learners with Alternative Leaf Classifiers: An Empirical Study

Alexander K. Seewald, Johann Petrak, Gerhard Widmer

There has been surprisingly little research so far that systematically investigated the possibility of constructing hybrid learning algorithms by simple local modifications to decision tree learners. In this paper we analyze three variants of a C4.5-style learner, introducing alternative leaf models (Naive Bayes, IB1, and multi-response linear regression, respectively) which can replace the original C4.5 leaf nodes during reduced error post-pruning. We empirically show that these simple modifications can improve upon the performance of the original decision tree algorithm and even upon both constituent algorithms. We see this as a step towards the construction of learners that locally optimize their bias for different regions of the instance space.

Keywords: Data Mining, Learning, Machine Learning

Citation: Seewald A.K., Petrak J., Widmer G.: Hybrid Decision Tree Learners with Alternative Leaf Classifiers: An Empirical Study (extended version). In Proceedings of the 14th International FLAIRS Conference (FLAIRS-2001), AAAI Press, Menlo Park, California


OFAI-TR-2000-09 ( 39kB g-zipped PostScript file,  69kB PDF file)

Beat Tracking with Musical Knowledge

Simon Dixon, Emilios Cambouropoulos

When a person taps a foot in time with a piece of music, they are performing beat tracking. Beat tracking is fundamental to the understanding of musical structure, and therefore an essential ability for any system which purports to exhibit musical intelligence or understanding. We present an off-line multiple agent beat tracking system which estimates the locations of musical beats in MIDI performance data. This approach to beat tracking requires no prior information about the input data, such as the tempo or time signature; all required information is derived from the performance data. For constant tempo performances, previous beat tracking systems have proved successful; however, these systems fail when there are large variations in tempo. We examine the role of musical knowledge in guiding the beat tracking process, and show that a system equipped with knowledge of musical salience is able to track the beat of music even in the presence of large tempo variations. Results are presented for a large corpus of expressively performed classical piano music (13 complete sonatas), containing a full range of tempos and much variability in tempo within sections. With the musical knowledge disabled, the beats are tracked about 75% correctly; the inclusion of musical knowledge raises this figure to over 90%.

Keywords: rhythm, music, beat tracking, beat induction

Citation: In ECAI 2000: Proceedings of the 14th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2000, ed. W. Horn.


OFAI-TR-2000-08 ( 56kB g-zipped PostScript file,  44kB PDF file)

A Lightweight Multi-Agent Musical Beat Tracking System

Simon Dixon

Beat tracking is what people do when they tap their feet in time to music. We present a software system which performs this task, processing music in a standard digital audio format and estimating the locations of musical beats. A time-domain algorithm detects salient acoustic events, and then a clustering algorithm groups the time intervals between events to obtain hypotheses about the current tempo. Multiple competing agents track these hypotheses throughout the music, with further agents being created at decision points. The output for each agent is a sequence of beat locations, which is evaluated for its closeness of fit to the data. This approach to beat tracking assumes no previous knowledge of the music such as the style, time signature or approximate tempo; all required information is derived from the audio data. The system has been tested with various styles of music (popular, jazz, and classical) and performs robustly, rarely making errors in popular music, and recovering quickly from errors in more complex styles of music, despite the fact that no high level musical knowledge is encoded in the system. We describe several applications, including musical score extraction and an automatic disc jockey that performs beat mixing in real time.

Keywords: computer music, rhythm, beat induction, beat tracking, automatic transcription

Citation: In Proceedings of the AAAI Workshop on Artificial Intelligence and Music: Towards Formal Models for Composition, Performance and Analysis. Austin, Texas, July 2000.
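
The interval-clustering step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the report's implementation: the function name cluster_intervals, the 25 ms cluster width, the restriction to intervals between consecutive onsets, and the greedy one-dimensional clustering are all assumptions made for illustration.

```python
def cluster_intervals(onsets, width=0.025):
    """Group inter-onset intervals (in seconds) into tempo hypotheses.

    Hypothetical sketch: only intervals between consecutive onsets are
    used, and an interval joins the current cluster if it lies within
    `width` seconds of that cluster's running mean.
    Returns a list of (mean_interval, count), largest cluster first.
    """
    # Inter-onset intervals, sorted so that similar values are adjacent.
    iois = sorted(b - a for a, b in zip(onsets, onsets[1:]))

    clusters = []  # each cluster is a list of intervals
    for ioi in iois:
        if clusters:
            mean = sum(clusters[-1]) / len(clusters[-1])
            if abs(ioi - mean) <= width:
                clusters[-1].append(ioi)
                continue
        clusters.append([ioi])

    # Larger clusters are treated as stronger tempo hypotheses.
    ranked = sorted(clusters, key=len, reverse=True)
    return [(sum(c) / len(c), len(c)) for c in ranked]
```

Each returned mean interval can be read as a candidate beat period (60 divided by the interval gives beats per minute); in the system described above, such hypotheses would then be handed to competing tracking agents.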


OFAI-TR-2000-07 ( 41kB g-zipped PostScript file,  104kB PDF file)

Fast Subsampling Performance Estimates for Classification Algorithm Selection

Johann Petrak

The typical data mining process is characterized by the prospective and iterative application of a variety of different data mining algorithms from an algorithm toolbox. While it would be desirable to check many different algorithms and algorithm combinations for their performance on a database, it is often not feasible because of time and other resource constraints. This paper investigates the effectiveness of simple and fast subsampling strategies for algorithm selection. We show that even such simple strategies perform quite well in many cases and propose to use them as a base-line for comparison with meta-learning and other advanced algorithm selection strategies.

Citation: Petrak J.: Fast Subsampling Performance Estimates for Classification Algorithm Selection. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-07, 2000


OFAI-TR-2000-06 ( 28kB g-zipped PostScript file,  55kB PDF file)

A Beat Tracking System for Audio Signals

Simon Dixon

We present a system which processes audio signals sampled from recordings of musical performances, and estimates the tempo at each point throughout the piece. The system employs a bottom-up approach to beat tracking from acoustic signals, assuming no a priori high-level knowledge of the music such as the time signature or approximate tempo, but rather deriving this information from the timing patterns of detected note onsets. Results from the beat tracking of several popular songs are presented and discussed.

Keywords: Beat tracking, Music recognition, Rhythm

Citation: In Proceedings of the Diderot Forum on Mathematics and Music: Computational and Mathematical Methods in Music, Vienna, Austria, December 2-4, 1999, pp 101-110.


OFAI-TR-2000-05 ( 44kB g-zipped PostScript file,  70kB PDF file)

Introducing Emotions into the Computational Study of Social Norms

Alexander Staller, Paolo Petta

We argue that modelling emotions among agents in artificial societies will further the computational study of social norms. The appraisal theory of emotions is presented as theoretical underpinning of Jon Elster's view that social norms are sustained not only by material sanctions but also by emotions such as shame and contempt. Appraisal theory suggests the following twofold relationship between social norms and emotions: First, social norms play an important role in the generation of emotions; second, emotion regulation depends heavily on the influence of social norms. Based on these insights, we present an emotion-based view on the influential study by Conte and Castelfranchi (1995); without mentioning emotions, they argue that a function of social norms is aggression control. Appraisal theory offers a principled framework for the development of TABASCO, a three-layer agent architecture incorporating social norms. At the macro level, the computational study of social norms can profit by economic and sociobiological theories, which suggest that emotions play an important role in sustaining norms of cooperation and reciprocity. We show how appraisal theory can serve as a link between the macro and micro levels, and summarize the potential benefits from the development of TABASCO.

Keywords: Social Norms, Social Simulation, Emotion, Appraisal Theory (Psychology), Agent Architectures, Situated Agents

Citation: Staller A., Petta P.: Introducing Emotions into the Computational Study of Social Norms. In Proceedings of the 2000 Convention of the Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB'00), Birmingham, UK, April 17-20, 2000.


OFAI-TR-2000-04 ( 85kB g-zipped PostScript file,  197kB PDF file)

Relative Unsupervised Discretization for Association Rule Mining

Marcus-Christopher Ludl, Gerhard Widmer

The paper describes a new, context-sensitive discretization algorithm that can be used to completely discretize a numeric or mixed numeric-categorical dataset. The method combines aspects of unsupervised (class-blind) and supervised methods. The central idea in the algorithm is what might be called "mutual structure projection" between the (numeric or categorical) attributes. The goal is to discretize a numeric attribute into intervals that correlate as much as possible with patterns in the value distributions of the other attributes. This is achieved by finding points of distribution changes, mapping them onto the target attribute, and subsequently clustering these points; the result is a set of significant split points that define the interval boundaries of the attribute discretization. This process can be performed for each numeric attribute in a dataset, thereby producing discretizations that reflect potentially complex interrelationships among different attributes of the dataset. The algorithm was designed with a view to the problem of finding association rules or functional dependencies in complex, partly numerical data. The paper describes the algorithm and presents systematic experiments with a synthetic data set that contains a number of rather complex associations. Experiments with varying degrees of noise and "fuzziness" demonstrate the robustness of the method. An application to a large real-world dataset produced interesting preliminary results, which are currently the topic of specialized investigations.

Keywords: data mining, association rules, discretization

Citation: Ludl M., Widmer G.: Relative Unsupervised Discretization for Association Rule Mining. In Proceedings of the 4th European Conference for Principles of Data Mining and Knowledge Discovery (PKDD 2000), Lyon, France.


OFAI-TR-2000-03

Neural networks in peace and conflict research: An overview of and proposals for possible uses

Georg Dorffner

This paper attempts to provide an overview of existing and possible uses of neural networks in the area of peace and conflict research. Neural networks are a modelling technique in artificial intelligence with several foci. One is on flexible nonlinear statistical data analysis, another is on cognitive modelling ("connectionism"). While the former perfectly fits into other quantitative approaches in political science, the latter offers interesting perspectives for modelling dynamical and cognitive aspects involved in conflicts. While providing references to existing and similar work, this paper focuses on several important issues such as the proper coding of information and the feasibility of nonlinear analysis in terms of available data. Examples using existing conflict databases are given.

Keywords: Connectionism, Neural Networks, Conflict Research, Political Science, Cognitive Modeling

Citation: Dorffner G.: Neural networks in peace and conflict research: An overview of and proposals for possible uses. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-2000-03, 2000


OFAI-TR-2000-02 ( 60kB g-zipped PostScript file,  74kB PDF file)

Searching for Patterns in Political Event Sequences: Experiments with the KEDS Database

Klaus Kovar, Johannes Fürnkranz, Johann Petrak, Bernhard Pfahringer, Robert Trappl, Gerhard Widmer

The paper presents an empirical study on the possibility of discovering interesting event sequences and sequential rules in a large database of international political events. We have implemented and extended a data mining algorithm, first presented by Mannila and Toivonen (1996), which is able to search for generalized episodes in such event databases. Experiments conducted with this algorithm on the KEDS database, an event data set covering interactions between countries in the Persian Gulf region, are described. We report some qualitative and quantitative results and also discuss our experiences with strategies for reducing the problem complexity and focussing the search on interesting subsets of events.

Citation: Kovar, K., Fürnkranz, J., Petrak, J., Pfahringer, B., Trappl, R., and Widmer, G. (2000). Searching for Patterns in Political Event Sequences. Cybernetics and Systems 31(6).


OFAI-TR-2000-01 ( 399kB g-zipped PostScript file,  192kB PDF file)

Relative Unsupervised Discretization for Regression Problems

Marcus-C. Ludl, Gerhard Widmer

This paper describes a new, context-sensitive discretization algorithm that combines aspects of unsupervised (class-blind) and supervised methods. The algorithm is in principle applicable to a wide range of machine learning and data mining problems where continuous attributes need to be discretized. In this paper, we evaluate the utility of the method in a regression-by-classification setting. Preliminary experimental results indicate that the decision trees induced using this discretization strategy are significantly smaller and thus more comprehensible than those learned with standard discretization methods, while losing only minimally in numerical prediction accuracy. This may be a considerable advantage in machine learning and data mining applications where comprehensibility is an issue.

Citation: Ludl M., Widmer G.: Relative Unsupervised Discretization for Regression Problems. In Proceedings of the 11th European Conference on Machine Learning (ECML 2000), Barcelona, Spain.


OFAI-TR-99-25 ( 144kB g-zipped PostScript file,  196kB PDF file)

Development and evaluation of VIE-PNN, a knowledge-based system for calculating the parenteral nutrition of newborn infants

Werner Horn, Christian Popow, Silvia Miksch, Lieselotte Kirchner, Andreas Seyfang

Calculating the daily changing composition of parenteral nutrition for small newborn infants is troublesome and time-consuming routine work in neonatal intensive care. The task needs expertise and experience and is prone to inherent calculation errors. We designed VIE-PNN, a knowledge-based system, in order to reduce daily routine work and calculation errors. VIE-PNN was redesigned several times because the clinicians accepted the system only when it saved time. The most recent version of VIE-PNN uses an HTML-based client-server architecture and is integrated into the intranet of the local patient data management system. For more than three years, all parenteral nutrition plans have been calculated using VIE-PNN. To evaluate the system's performance and the users' contentedness, we compared 50 nutrition plans calculated in parallel using VIE-PNN or a hand-held calculator, retrospectively analyzed more than 5000 nutrition plans stored in VIE-PNN's database and evaluated a user questionnaire. Nutrition plans were calculated in a mean time of 2.4 vs. 7.1 minutes using VIE-PNN or the hand-held calculator. Errors and omissions in the nutrition plans were detected in 22 vs. 56%, with errors in VIE-PNN's plans occurring only with interactively changed values. Reviews of stored plans showed that a mean of 4 out of 16 parameters were interactively changed. VIE-PNN was well accepted. The most important reasons for the successful operation of VIE-PNN in the daily routine work were time savings and robustness of the system.

Keywords: Knowledge-based system, Intensive care unit, Parenteral nutrition, Neonates, VIE-PNN

Citation: W. Horn, C. Popow, S. Miksch, L. Kirchner, A. Seyfang: Development and Evaluation of VIE-PNN, a Knowledge-based System for Calculating the Parenteral Nutrition of Newborn Infants, Artificial Intelligence in Medicine, 24(3):207-218, 2002.


OFAI-TR-99-24 ( 31kB g-zipped PostScript file,  19kB PDF file)

A Clustering Algorithm for Melodic Analysis

Emilios Cambouropoulos, Alan Smaill, Gerhard Widmer

In this paper a formal model will be presented that attempts to organise melodic segments into 'significant' musical categories (e.g. motives). Given a segmentation of a melodic surface, the proposed model constructs an appropriate representation for each segment in terms of a number of attributes (these reflect melodic and rhythmic aspects of the segment at the surface and at various abstract levels) and then a clustering algorithm (the Unscramble algorithm) is applied for the organisation of these segments into 'meaningful' categories. The proposed clustering algorithm automatically determines an appropriate number of clusters and also the characteristic (or defining) attributes of each category. As a test case this computational model has been used for obtaining a motivic analysis of three melodies from diverse musical styles.

Keywords: Music, Melodic Similarity, Clustering, Categorisation

Citation: Cambouropoulos E., Smaill A., Widmer G.: A Clustering Algorithm for Melodic Analysis. In Proceedings of the Diderot'99 Forum on Mathematics and Music, 1-4 December 1999, Vienna.


OFAI-TR-99-23 ( 169kB g-zipped PostScript file,  287kB PDF file)

Algorithms for Computing Approximate Repetitions in Musical Sequences

Emilios Cambouropoulos, Maxime Crochemore, Costas Iliopoulos, Laurent Mouchard, Yoan J. Pinzon

In this paper two new notions of approximate matching with application in computer-assisted musical analysis are introduced. A melodic sequence may be represented as a string of integers (e.g. MIDI pitch). In delta-approximate matching, two patterns match if each pair of corresponding integers differs by not more than delta; in gamma-approximate matching, they match if the sum of differences between the two patterns is not greater than gamma. Efficient algorithms are herein presented that perform these two types of approximate matching.

Keywords: Music, Approximate Pattern Matching, String Processing

Citation: Cambouropoulos E., Crochemore M., Iliopoulos C., Mouchard L., Pinzon Y.: Algorithms for Computing Approximate Repetitions in Musical Sequences. In Proceedings of the AWOCA'99 Workshop (Australasian Workshop on Combinatorial Algorithms), 25-27 August 1999, Perth
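
The two matching notions defined in the abstract above can be made concrete with a short sketch. It is a naive sliding-window check, not the efficient algorithms of the report; the function names are hypothetical, and "sum of differences" is assumed here to mean the sum of absolute differences.

```python
def delta_match(pattern, window, delta):
    """True if every pair of corresponding integers differs by at most delta."""
    return len(pattern) == len(window) and all(
        abs(p - w) <= delta for p, w in zip(pattern, window)
    )


def gamma_match(pattern, window, gamma):
    """True if the summed absolute differences do not exceed gamma (assumed reading)."""
    return len(pattern) == len(window) and sum(
        abs(p - w) for p, w in zip(pattern, window)
    ) <= gamma


def find_matches(text, pattern, matcher, bound):
    """Naive O(n*m) scan: start positions in `text` where `pattern` matches."""
    m = len(pattern)
    return [
        i for i in range(len(text) - m + 1)
        if matcher(pattern, text[i:i + m], bound)
    ]


# Example input: a melody as MIDI pitch numbers and a three-note pattern.
melody = [60, 62, 64, 65, 67, 61, 63, 64]
motif = [60, 62, 64]
delta_hits = find_matches(melody, motif, delta_match, 1)
gamma_hits = find_matches(melody, motif, gamma_match, 1)
```

With these inputs the window starting at position 5 ([61, 63, 64]) is a delta-match for delta = 1 but not a gamma-match for gamma = 1, since its differences sum to 2; this shows how the two criteria can disagree.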


OFAI-TR-99-22 ( 66kB g-zipped PostScript file,  68kB PDF file)

Pattern Processing in Melodic Sequences: Challenges, Caveats & Prospects

Emilios Cambouropoulos, Tim Crawford, Costas S. Iliopoulos

In this paper a number of issues relating to the application of string processing techniques on musical sequences are discussed. A brief survey of some musical string processing algorithms is given and some issues of melodic representation, abstraction, segmentation and categorisation are presented. This paper is not intended towards providing solutions to string processing problems but rather towards highlighting possible stumbling-block areas and raising awareness of primarily music-related particularities that can cause problems in matching applications.

Keywords: Music, Pattern Matching, Pattern Induction, String processing, Melodic Representation

Citation: Cambouropoulos E., Crawford T., Iliopoulos C.: Pattern Processing in Melodic Sequences: Challenges, Caveats & Prospects. In Proceedings of the AISB'99 Convention (Artificial Intelligence and Simulation of Behaviour), 6-9 April 1999, Edinburgh, U.K.


OFAI-TR-99-21 ( 86kB g-zipped PostScript file,  335kB PDF file)

Using Hidden Markov Models to build an automatic, continuous and probabilistic sleep stager

Arthur Flexer, Peter Sykacek, Iaed Rezek, Georg Dorffner

We report on an automatic continuous sleep stager which is based on probabilistic principles employing Hidden Markov Models (HMM). Our sleep stager offers the advantage of being objective by not relying on human scorers, having much finer temporal resolution (1 second instead of 30 seconds), and being based on solid probabilistic principles rather than a predefined set of rules (Rechtschaffen & Kales). Results obtained for nine whole night sleep recordings are reported.

Keywords: Time series, Medicine, Unsupervised learning, Sleep analysis, Hidden Markov Models

Citation: Flexer A., Sykacek P., Rezek I., Dorffner G.: Using Hidden Markov Models to build an automatic, continuous and probabilistic sleep stager, in Amari S.-I., et al. (eds.), Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, IJCNN 2000, Como, Italy, IEEE Computer Society, Vol. III, 627-631, 2000.


OFAI-TR-99-20 ( 128kB g-zipped PostScript file,  307kB PDF file)

On non-linear, stochastic dynamics in economic and financial time series

Christian Schittenkopf, Georg Dorffner, Engelbert J. Dockner

The search for deterministic chaos in economic and financial time series has attracted much interest over the past decade. However, clear evidence of chaotic structures is usually prevented by large random components in the time series. In the first part of this paper we show that even if a sophisticated algorithm estimating and testing the positivity of the largest Lyapunov exponent is applied to time series generated by a stochastic dynamical system or a return series of a stock index, the results are difficult to interpret. We conclude that the notion of sensitive dependence on initial conditions, as it has been developed for deterministic dynamics, can hardly be transferred into a stochastic context. Therefore, in the second part of the paper our starting point for measuring dependencies for stochastic dynamics is a distributional characterization of the dynamics, e.g. by heteroskedastic models for economic and financial time series. We adopt a sensitivity measure proposed in the literature which is an information-theoretic measure of the distance between probability density functions. This sensitivity measure is well defined for stochastic dynamics, and it can be calculated analytically for the classes of stochastic dynamics with conditional normal distributions of constant and state-dependent variance. In particular, heteroskedastic return series models such as ARCH and GARCH models are investigated.

Keywords: Chaos, Lyapunov exponents, Stochastic dynamics, Time series, Volatility

Citation: Schittenkopf C., Dorffner G., Dockner E.J.: On non-linear, stochastic dynamics in economic and financial time series, Studies in Nonlinear Dynamics and Econometrics (to appear).


OFAI-TR-99-19 ( 139kB g-zipped PostScript file,  343kB PDF file)

Non-linear versus Non-gaussian Volatility Models

Christian Schittenkopf, Georg Dorffner, Engelbert J. Dockner

One of the most challenging topics in financial time series analysis is the modeling of conditional variances of asset returns. Although conditional variances are not directly observable there are numerous approaches in the literature to overcome this problem and to predict volatilities on the basis of historical asset returns. The most prominent approach is the class of GARCH models where conditional variances are governed by a linear autoregressive process of past squared returns and variances. Recent research in this field, however, has focused on modeling asymmetries of conditional variances by means of non-linear models. While there is evidence that such an approach improves the fit to empirical asset returns, most non-linear specifications assume conditional normal distributions and ignore the importance of alternative models. Concentrating on the distributional assumptions is, however, essential since asset returns are characterized by excess kurtosis and hence fat tails that cannot be explained by models with sufficient heteroskedasticity. In this paper we take up the issue of returns' distributions and contrast it with the specification of non-linear GARCH models. We use daily returns for the Dow Jones Industrial Average over a large period of time and evaluate the predictive power of different linear and non-linear volatility specifications under alternative distributional assumptions. Our empirical analysis suggests that while non-linearities do play a role in explaining the dynamics of conditional variances, the predictive power of the models does also depend on the distributional assumptions.

Keywords: volatility, neural networks, GARCH models, non-linearity, fat tails

Citation: Schittenkopf C., Dorffner G., Dockner E.J.: Non-linear versus Non-gaussian Volatility Models. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-99-19, 1999.
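
The linear recursion that the abstract above attributes to GARCH models can be sketched for the common GARCH(1,1) case, where the conditional variance is omega + alpha * (previous squared return) + beta * (previous variance). The function name and the choice to start from the unconditional variance are assumptions for illustration; the report's own model specifications and estimation procedure are not reproduced here.

```python
def garch11_variances(returns, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model for a return series.

    Hypothetical sketch: parameters are taken as given (no estimation),
    and the recursion starts from the unconditional variance
    omega / (1 - alpha - beta).
    Returns len(returns) + 1 values; the last one is the one-step-ahead
    variance forecast after the final observed return.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")

    var = omega / (1.0 - alpha - beta)  # unconditional variance as start value
    variances = [var]
    for r in returns:
        # Linear in the past squared return and the past variance.
        var = omega + alpha * r * r + beta * var
        variances.append(var)
    return variances
```

A large return raises the next conditional variance through the alpha term, and the beta term lets that increase decay gradually, which is the volatility clustering that such models are meant to capture.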


OFAI-TR-99-18 ( 24kB g-zipped PostScript file,  41kB PDF file)

Thus Spoke the User to the Wizard

Hannes Pirker, Georg Loderer, Harald Trost

Wizard-of-Oz (WOZ) simulations are a popular means for investigating the properties of human-computer interaction. In this paper the findings from a WOZ experiment for evaluating different design options for a spoken dialogue system are presented. In addition to the documentation of the outcomes of this evaluation in terms of standard quantitative measures we also present findings from a more qualitative analysis of the speech data collected throughout this experiment. It is argued that such a combined analysis of all aspects of the human-computer interaction allows for a correct interpretation of the results and their fruitful application in the context of system prototyping.

Keywords: Spoken Dialogue Systems, Wizard-of-Oz Simulation, Prosody

Citation: Pirker H., Loderer G., Trost H.: Thus Spoke the User to the Wizard, in Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech 99), Budapest, Hungary, Vol.3, p.1171, 1999.


OFAI-TR-99-17 ( 54kB g-zipped PostScript file,  52kB PDF file)

Listening to Lists: Studying Durational Phenomena in Enumerations

Hannes Pirker, Stefan Kramer

A study of durational phenomena in list-like enumerations in German is presented. Due to its highly structured and uniform nature, this rather specialized utterance type seems especially well suited for investigating principles of the rhythmical organization of speech. A corpus extracted from radio weather reports is used to investigate phenomena such as prefinal lengthening and effects of isochrony and prominence. In addition to studying durational phenomena with standard statistical methods, the data were also analyzed using Structural Regression Trees (SRT), a machine learning algorithm.

Keywords: Speech, Prosody, Machine Learning

Citation: Pirker H., Kramer S.: Listening to Lists: Studying Durational Phenomena in Enumerations. In Proceedings of the 14th International Congress of Phonetic Sciences (ICPhS-99), San Francisco, California, p.273, 1999.


OFAI-TR-99-16 ( 120kB g-zipped PostScript file,  70kB PDF file)

"I said TWO TI-CKETS": How to Talk to a Deaf Wizard

Hannes Pirker, Georg Loderer

So-called Wizard-of-Oz (WOZ) simulations are a popular framework for investigating the nature of human-machine interaction in general and for developing and evaluating designs for spoken dialog systems in particular. In this paper a WOZ simulation of a speech-based ticket reservation system is presented. In contrast to most studies performed in this framework, we are not concerned with the evaluation of different dialogue designs. In our experiment the WOZ frequently rejected or misrecognised user utterances. The findings on the prosodic properties of repeats and corrections triggered by these recognition errors are presented. In addition, some data on the lexical content of the user utterances are discussed.

Keywords: Spoken Dialog Systems, Wizard-of-Oz Simulation, Prosody

Citation: Pirker H., Loderer G.: "I said TWO TI-CKETS": How to Talk to a Deaf Wizard, in Proceedings of the ESCA Workshop on Dialogue and Prosody, September 1-3, Veldhoven, The Netherlands, p.181, 1999.


OFAI-TR-99-15 ( 80kB g-zipped PostScript file,  300kB PDF file)

Using Hidden Markov Models to build an automatic, continuous and probabilistic sleep stager for the SIESTA project

Arthur Flexer, Peter Sykacek, Iaed Rezek, Georg Dorffner

We report on an automatic continuous sleep stager based on probabilistic principles employing Hidden Markov Models (HMM). Our sleep stager offers the advantage of being objective by not relying on human scorers, having much finer temporal resolution (1 s instead of 30 s), and being based on solid probabilistic principles rather than a predefined set of rules (Rechtschaffen & Kales). Results obtained for nine whole-night sleep recordings are reported.

Keywords: Pattern Recognition, Hidden Markov Model, Signal Processing, EEG, Sleep

Citation: Flexer A., Sykacek P., Rezek I., Dorffner G.: Using Hidden Markov Models to build an automatic, continuous and probabilistic sleep stager for the SIESTA project (extended abstract), Medical & Biological Engineering & Computing, Supplement 2, Proceedings of EMBEC'99, p.1658-1659, 1999


OFAI-TR-99-14 ( 92kB g-zipped PostScript file,  214kB PDF file)

Forecasting Time-dependent Conditional Densities: A Seminonparametric Neural Network Approach

Christian Schittenkopf, Georg Dorffner, Engelbert J. Dockner

In financial econometrics the modeling of asset return series is closely related to the estimation of the corresponding conditional densities. One reason why one is interested in the whole conditional density and not only in the conditional mean, is that the conditional variance can be interpreted as a measure of time-dependent volatility of the return series. In fact, the modeling and the prediction of volatility is one of the central topics in asset pricing.
In this paper we propose to estimate conditional densities semi-nonparametrically in a neural network framework. Our recurrent mixture density networks realize the basic ideas of prominent GARCH approaches but they are capable of modeling any continuous conditional density also allowing for time-dependent higher-order moments. Our empirical analysis on daily DAX data shows that out-of-sample volatility predictions of the neural network model are superior to predictions of GARCH models in that they have a higher correlation with implied volatilities.

Keywords: conditional densities, forecasting, GARCH, neural networks, volatility

Citation: Schittenkopf C., Dorffner G., Dockner E.J.: Forecasting Time-dependent Conditional Densities: A Seminonparametric Neural Network Approach, Journal of Forecasting 19, 355-374, 2000.


OFAI-TR-99-13 ( 51kB g-zipped PostScript file,  399kB PDF file)

On input selection with reversible jump Markov chain Monte Carlo sampling

Peter Sykacek

In this paper we will treat input selection for a radial basis function (RBF) like classifier within a Bayesian framework. We approximate the a-posteriori distribution over both model coefficients and input subsets by samples drawn with Gibbs updates and reversible jump moves. Using some public datasets, we compare the classification accuracy of the method with a conventional ARD scheme. These datasets are also used to infer the relevance of different input subsets.

Keywords: Classification, Bayesian Inference, Model Selection, Input Relevance, MCMC sampling

Citation: Sykacek P.: On input selection with reversible jump Markov chain Monte Carlo sampling, NIPS 99, Denver, Colorado USA, 1999


OFAI-TR-99-12 ( 40kB g-zipped PostScript file,  76kB PDF file)

Learning to Make Good Use of Operational Advice

Bernhard Pfahringer, Hermann Kaindl, Stefan Kramer, Johannes Fürnkranz

We address the problem of advice-taking in a given domain, in particular for building a game-playing program. Our approach to solving it strives for the application of machine learning techniques throughout, i.e., for avoiding knowledge elicitation by any other means as much as possible. In particular, we build upon existing work on the operationalization of advice by machine and assume that advice is already available in operational form. The relative importance of this advice is, however, not yet known and can therefore not be utilized well by a program. This paper presents an approach to determine the relative importance for a given situation through reinforcement learning. We implemented this approach for the game of Hearts and gathered some empirical evidence on its usefulness through experiments. The results show that the programs built according to our approach learned to make good use of the given operational advice.

Keywords: Machine Learning, Game Playing, Hearts, Advice Taking, Reinforcement Learning

Citation: Pfahringer B., Kaindl H., Kramer S., Fürnkranz J.: Learning to Make Good Use of Operational Advice. In Proc. ICML-99 WS on Machine Learning in Game Playing, Bled, Slovenia, 1999.


OFAI-TR-99-11 ( 45kB g-zipped PostScript file,  19kB PDF file)

Towards Engaging Full-Body Interaction

Paolo Petta, Alexander Staller, Robert Trappl, Stephan Mantler, Zsolt Szalavari, Thomas Psik, Michael Gervautz

We implemented an interactive virtual environment based on the magic mirror metaphor introduced with the MIT ALIVE project. Our permanent exhibit in the Vienna Museum of Technology aims at taking advantage of the possibility opened up by this particular kind of unencumbered immersion in a virtual environment for the users to bring in their rich expertise in full-body action and communication, so as to provide a satisfying and truly interactive experience for laypersons. To this end, the system comprises a synthetic character, the Invisible Person, designed to improvise with the visitors. Evidence from the deployment of public virtual environments under comparable circumstances indicates that the use of highly specialized domains meets with limited success. In order to avoid these problems and also to mitigate the effect of the inevitable limitations of systems dependent on purely vision-based user tracking, the Invisible Person was designed to be suggestive of adequate kinds of interaction. Together with the employment of high-quality motion-captured animation, the balance between the full capabilities of the agents and the capabilities elicited from the human user contributes to making the exhibit interesting to visit. In addition, the simple scenario also opened up the opportunity to deploy and evaluate a situated implementation of the functional appraisal theory of emotions to contribute to the solution of the action expression problem. The system has been a major attraction of the museum since its re-opening in June 1999. Currently, a first evaluation is underway to determine directions of future improvements. Likely candidates include better coverage of specific situations and implementation of short- and long-term adaptive strategies. Especially for the latter, the generative approach of the employed emotion model is expected to go a long way towards ensuring preservation of consistent behaviour even with changing system parameters, obviating the need for complex re-engineering as would be required for reified shallow emotion models.

Keywords: Action Expression, Affective Computing, Emotion, Interactive, Graphics, Synthetic Actors

Citation: Petta P., Staller A., Trappl R., Mantler S., Szalavari Z., Psik T., Gervautz M.: Towards Engaging Full-Body Interaction, in Proc. of HCI International '99, Munich, Germany, August 22-27, 1999.


OFAI-TR-99-10

Representational/Efficiency Issues in Toxicological Knowledge Discovery

Bernhard Pfahringer, E. Gottmann, Stefan Kramer, Christoph Helma

Citation: Pfahringer B., Gottmann E., Kramer S., Helma C.: Representational/Efficiency Issues in Toxicological Knowledge Discovery. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-99-10, 1999


OFAI-TR-99-09

Data Quality Issues in Toxicological Knowledge Discovery

Christoph Helma, E. Gottmann, Stefan Kramer, Bernhard Pfahringer

Citation: Helma C., Gottmann E., Kramer S., Pfahringer B.: Data Quality Issues in Toxicological Knowledge Discovery. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-99-09, 1999


OFAI-TR-99-08 ( 148kB g-zipped PostScript file,  141kB PDF file)

Principled Generation of Expressive Behavior in an Interactive Exhibit

Paolo Petta

The present work was carried out in the context of the development of an immersive interactive virtual environment in which a single human user and a synthetic actor ("The Invisible Person") engage in an improvisational interaction between equally entitled peers. The limitations of this scenario with respect to the number of parties involved and the synthetic actor's perceptual and communicative capabilities allow experimentation with the modeling of expressive behavior, a noted deficiency of early synthetic actor agents that has recently been topicalized as the "action expression" problem. We relate results from research in software architectures for embodied agents and the appraisal theory of emotions and investigate their suitability as a principled basis for behavior expression generation. We describe the implementation in the agent architecture employed in the present system, and relate the results to our ongoing work in the framework of the Tabasco architecture for emotions.

Keywords: Affective Computing, Expressive Behaviour, Action Expression, Emotions, Synthetic Characters

Citation: Petta P.: Principled Generation of Expressive Behavior in an Interactive Exhibit, in Velasquez J.D.(ed.), Workshop Notes: "Emotion-Based Agent Architectures" (EBAA'99), May 1, 1999, 3rd Int'l Conf. on Autonomous Agents (Agents'99), Seattle, WA, USA, pp.94-98, 1999.


OFAI-TR-99-07 ( 34kB g-zipped PostScript file,  121kB PDF file)

Language Modeling with Prediction Fractal Machines

Shan Parfitt, Peter Tino, Georg Dorffner

We introduce a novel method of constructing language models, which avoids some of the problems associated with recurrent neural networks. The method of creating a Prediction Fractal Machine (PFM) is briefly described and some experiments are presented which demonstrate the suitability of PFMs for language modeling tasks. PFMs are able to distinguish reliably between minimal pairs, and their behavior is consistent with the hypothesis that well-formedness is 'graded' rather than absolute. These results form the basis of a discussion of the PFM's potential to offer fresh insights into the problem of language acquisition and processing.

Keywords: Connectionism, Language modeling, Grammar learning, Recurrent neural networks

Citation: Parfitt S., Tino P., Dorffner G.: Graded grammaticality in Prediction Fractal Machines, Advances in Neural Information Processing Systems 12, MIT Press, (to appear), 2000.


OFAI-TR-99-06

Representation of Temporal Structures in Recurrent Neural Networks with Iterated Function Systems Dynamics

Peter Tino, Georg Dorffner

Citation: Tino P., Dorffner G.: Representation of Temporal Structures in Recurrent Neural Networks with Iterated Function Systems Dynamics. Technical Report, Österreichisches Forschungsinstitut für Artificial Intelligence, Wien, TR-99-06, 1999


OFAI-TR-99-05 ( 77kB g-zipped PostScript file,  681kB PDF file)

Fat Tails and Non-linearity in Volatility Models: What is more important?

Christian Schittenkopf, Georg Dorffner, Engelbert Dockner

Since the seminal works of Engle and Bollerslev on heteroskedastic return series models, many extensions of their (G)ARCH models have been proposed. In particular, the functional dependence of conditional variances and the shape of the conditional distribution of returns have been varied in several ways. These two issues have been addressed by the neural network community using multi-layer perceptrons and mixture density networks (MDNs). In this paper we extend the concept of MDNs in a recurrent way to allow for "GARCH effects". These recurrent MDNs (RMDNs) offer a consistent framework to analyze the impact of non-linearity and of non-gaussian (leptokurtic) conditional distributions on the explanatory power of volatility models. We present numerical experiments on a very large return data set whose size allows detailed statistical tests to be performed to compare the obtained results.

Keywords: Mixture Density Networks, Volatility, Density Estimation, GARCH, Fat Tails

Citation: Schittenkopf C., Dorffner G., Dockner E.: Fat Tails and Non-linearity in Volatility Models: What is more important?, Proc. of the 1999 Conference on Computational Intelligence for Financial Engineering (CIFEr99), pp. 259-266, New York City, NY, USA, March 28-30, 1999.


OFAI-TR-99-04 ( 42kB g-zipped PostScript file,  263kB PDF file)

On the use of self-organizing maps for clustering and visualization

Arthur Flexer

We will show that the number of output units used in a self-organizing map (SOM) influences its applicability for either clustering or visualization. By reviewing the appropriate literature and theory as well as our own empirical results, we demonstrate that SOMs can be used for clustering or visualization separately, for simultaneous clustering and visualization, and even for clustering via visualization. For all these different kinds of application, SOM is compared to other statistical approaches. This will show SOM to be a very flexible tool which can be used for various forms of explorative data analysis but it will also be made obvious that this flexibility comes with a price in terms of impaired performance.

Keywords: Neural Networks, Self-Organizing Maps, Clustering, Visualization, Multidimensional Scaling, Statistics

Citation: Flexer A.: On the use of self-organizing maps for clustering and visualization, in Zytkow J.M. and Rauch J.(eds.), Principles of Data Mining and Knowledge Discovery, Third European Conference, PKDD'99, Prague, Czech Republic, Proceedings, Lecture Notes in Artificial Intelligence 1704, Springer, p.80-88, 1999.


OFAI-TR-99-03 ( 148kB g-zipped PostScript file,  1113kB PDF file)

Monitoring human information processing via intelligent data analysis of EEG recordings

Arthur Flexer, Herbert Bauer

Human information processing can be monitored by analysing cognitive evoked potentials (EP) measurable in the electroencephalogram (EEG) during cognitive activities. In technical terms, both visualization of high-dimensional sequential data and unsupervised discovery of patterns within this multivariate set of real-valued time series are needed. Our approach towards visualization is to discretize the sequences via vector quantization and to perform a Sammon mapping of the codebook. Instead of having to conduct a time-consuming search for common subsequences in the set of multivariate sequential data, a multiple sequence alignment procedure can be applied to the set of one-dimensional discrete time series. The methods are described in detail and results obtained for spatial and verbal information processing are shown to be statistically valid, to yield an improvement in terms of noise attenuation and to be well in line with psychophysiological literature.

Keywords: Data Analysis, EEG, Vector Quantization, Sequence Alignment, Visualization, Cognition

Citation: Flexer A., Bauer H.: Monitoring human information processing via intelligent data analysis of EEG recordings, in Hand D.J., et al.(eds.), Advances in Intelligent Data Analysis, Third International Symposium, IDA-99, Amsterdam, The Netherlands, Proceedings, Lecture Notes in Computer Science 1642, Springer, p.137-148, 1999.


OFAI-TR-99-02 ( 101kB g-zipped PostScript file,  772kB PDF file)

Multi-channel piecewise selective averaging of cognitive evoked potentials with variable latency

Arthur Flexer, Herbert Bauer

This work is about the development of an alternative way of averaging evoked potentials (EP) of cognitive activities. Since the main assumption of invariant waveforms time locked to the eliciting events does not hold for cognitive EPs, averaging results in distorted estimates. Our alternative selective averaging finds similar subsequences of fixed length with variable latency which are common to all multi-channel EPs by transforming the multivariate time series to discrete sequences via vector quantization and applying a sequence alignment algorithm. The method yields a significant improvement over common averaging in terms of noise attenuation and is shown to be valid by comparison with results for random data. Results for EP data obtained during a spatial imagination task are reported.

Keywords: Signal Processing, EEG, Vector Quantization, Sequence Alignment, Cognition

Citation: Flexer A., Bauer H.: Multi-channel piecewise selective averaging of cognitive evoked potentials with variable latency, in Hu Y.-H., et al.(eds.), Neural Networks for Signal Processing IX, Institute of Electrical and Electronics Engineers, Inc., New York, NY, p.459-467, 1999.


OFAI-TR-99-01 ( 166kB g-zipped PostScript file,  243kB PDF file)

Erzeugung emotional gefärbter Sprache mit dem VieCtoS-Synthesizer

Erhard Rank

A prototype synthesizer for emotional German speech was developed. It was implemented using the demisyllable synthesizer of the Vienna Concept-to-Speech system VieCtoS developed at ÖFAI. This makes it possible to generate arbitrary German sentences with non-neutral emotional content. Because a concatenative synthesis concept is used, the sound quality of this method is retained for emotional speech synthesis as well.

Keywords: Speech Synthesis, Emotional Speech, Concatenative Synthesis

Citation: Rank E.: Erzeugung emotional gefärbter Sprache mit dem VieCtoS-Synthesizer, Austrian Research Institute for Artificial Intelligence, Vienna, TR-99-01, 1999.


OFAI-TR-98-33

Grounding Emotions in Adaptive Systems, Workshop Notes

Canamero D., Numaoka C., Petta P.

The proceedings of this workshop are available online at http://www.ai.univie.ac.at/~paolo/conf/sab98/sab98ws.html.

Keywords: Emotions, Adaptive Systems, Grounding

Citation: Canamero D., Numaoka C., Petta P. (eds.): Grounding Emotions in Adaptive Systems, Workshop Notes, 5th International Conference of the Society for Adaptive Behaviour (SAB98), Zurich, Switzerland, August 21, 1998.


OFAI-TR-98-32

Neonatal Ventilation Tutor (VIE-NVT), a Teaching Program for the Mechanical Ventilation of Newborn Infants

Werner Horn, Christian Popow, Christoph Stocker, Silvia Miksch

We developed a computer-assisted program for training medical staff in ventilating newborn infants. The Java-based client-server program consists of two modules: the instructor module, which enables the domain expert to create courses of virtual patients, and the tutor module, which runs consultations with virtual patients. The tutor module displays the course of the transcutaneous blood gases and a table of the ventilator settings, which the trainee can adjust interactively to provide adequate gas exchange to the virtual patient. VIE-NVT is currently being tested at our neonatal intensive care unit.

Keywords: Neonatal Intensive Care, Tutoring System, Ventilation

Citation: Horn W., Popow C., Stocker C., Miksch S.: Neonatal Ventilation Tutor (VIE-NVT), a Teaching Program for the Mechanical Ventilation of Newborn Infants, in Horn W., et al.(eds.), Artificial Intelligence in Medicine, Springer, Berlin, LNAI 1620, pp.148-152, 1999.


OFAI-TR-98-31 ( 91kB g-zipped PostScript file,  21kB PDF file)

Clinical experience with VIE-PNN, a knowledge-based system for planning the parenteral nutrition of newborn infants

Lieselotte Kirchner, Christian Popow, Werner Horn, Maria Dobner, Andreas Seyfang, Silvia Miksch

Background: Knowledge-based systems are rarely used in the clinical routine. VIE-PNN, an interactive knowledge-based system, has been integrated in the local network of our patient data management system and used at the bedside for more than two years.

Objective: To evaluate the performance and acceptance of a routinely used knowledge-based system.

Methodology: Based on a few input data and the expert defined prescription rules, VIE-PNN calculates and displays suggestions for the components of parenteral nutrition solutions (PNS). These suggestions may interactively be changed by the prescribing physicians if considered necessary. For patients with partial enteral nutrition, the PNS components are reduced according to the ratio of parenteral/enteral fluid supply.

We prospectively analyzed 50 PNS calculated in parallel by VIE-PNN and manually (MAN), i.e. by using a handheld calculator. We retrospectively analyzed 5539 PNS stored in the system's database and evaluated a questionnaire asking physicians about their experience with VIE-PNN.

Results: The mean time needed for calculating a PNS was 2.4 (VIE-PNN) vs. 7.1 minutes (MAN) corresponding to daily time savings of about 3/4 hour for 10 PNS calculations. Expert review detected errors or omissions in 22% (VIE-PNN) vs. 56% (MAN) of the PNS prescriptions. All errors in the VIE-PNN based PNS were related to interactively changed values. Analyzing the 5539 stored PNS, 4 of 16 parameters were interactively changed by the prescribing physician. The questionnaires showed a good overall acceptance of VIE-PNN. Time savings and improvement of precision were rated as equally important benefits.

Conclusion: We conclude that the use of our knowledge-based system for PNS prescription led to important time savings and improvement of precision.

Keywords: Knowledge-Based System, Clinical Evaluation

Citation: Kirchner L., Popow C., Horn W., Dobner M., Seyfang A., Miksch S.: Clinical experience with VIE-PNN, a knowledge-based system for planning the parenteral nutrition of newborn infants, Submitted to: J. pediatrics.


OFAI-TR-98-30 ( 28kB g-zipped PostScript file,  295kB PDF file)

A Study Using n-gram Features for Text Categorization

Johannes Fürnkranz

In this paper, we study the effect of using n-grams (sequences of words of length n) for text categorization. We use an efficient algorithm for generating such n-gram features in two benchmark domains, the 20 newsgroups data set and 21,578 REUTERS newswire articles. Our results with the rule learning algorithm RIPPER indicate that, after the removal of stop words, word sequences of length 2 or 3 are most useful. Using longer sequences reduces classification performance.
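The kind of feature described above can be sketched as follows: stop words are removed first, and then all word sequences up to a maximum length are emitted. This is a naive illustration only, not the efficient generation algorithm the report refers to; the function name `word_ngrams`, whitespace tokenization, lowercasing, and the tiny stop-word set are assumptions.

```python
def word_ngrams(text, max_n, stop_words=frozenset()):
    """List all word n-grams of length 1..max_n after stop-word removal."""
    # Assumed preprocessing: lowercase and split on whitespace.
    tokens = [w for w in text.lower().split() if w not in stop_words]
    features = []
    for n in range(1, max_n + 1):
        # Slide a window of n consecutive remaining tokens over the text.
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features


features = word_ngrams("the cat sat on the mat", 2, stop_words={"the", "on"})
```

Because stop words are dropped before the window is applied, "cat sat" and "sat mat" become bigrams even though the words were not adjacent in the original sentence.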

Keywords: Machine Learning, Text Categorization

Citation: Fürnkranz J.: A Study Using n-gram Features for Text Categorization, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-30, 1998.


OFAI-TR-98-29 ( 29kB g-zipped PostScript file,  300kB PDF file)

Using Links for Classifying Web-pages

Johannes Fürnkranz

In this paper, we report on a systematic set of experiments that explore the utility of the structural information provided by hyperlinks. Our working hypothesis is that (at least in some domains) it is easier to classify hypertext pages using information provided on pages that point to a page instead of using information that is provided on the page itself. We present a set of experiments that confirm this hypothesis on a set of Web-pages that relate to Computer Science Departments.

Keywords: Text Categorization, Rule Learning, WWW, Hyperlinks

Citation: Fürnkranz J.: Using Links for Classifying Web-pages, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-29, 1998.


OFAI-TR-98-28 ( 198kB g-zipped PostScript file,  627kB PDF file)

A Symbolic Dynamics Approach to Volatility Prediction

Peter Tino, Christian Schittenkopf, Georg Dorffner, Engelbert Dockner

We consider the problem of predicting the direction of daily volatility changes in the Dow Jones Industrial Average (DJIA). This is accomplished by quantizing a series of historic volatility changes into a symbolic stream over 2 or 4 symbols. We compare predictive performance of the classical fixed-order Markov models with that of a novel approach to variable memory length prediction (called prediction fractal machine, or PFM) which is able to select very specific deep prediction contexts (whenever there is a sufficient support for such contexts in the training data). We learn that daily volatility changes of the DJIA only exhibit rather shallow finite memory structure. On the other hand, a careful selection of quantization cut values can strongly enhance predictive power of symbolic schemes. Results on 12 non-overlapping epochs of the DJIA strongly suggest that PFMs can outperform both traditional Markov models and (continuous-valued) GARCH models in the task of predicting volatility one time-step ahead.
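The baseline setting described above, quantizing changes into a symbol stream and predicting with a fixed-order Markov model, can be sketched as below. The sketch covers only the two-symbol case and the classical Markov baseline, not the prediction fractal machine; the symbol names "U" and "D", the cut value, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def quantize_changes(series, cut=0.0):
    """Map each change between consecutive values to 'U' (above cut) or 'D'."""
    return ["U" if b - a > cut else "D" for a, b in zip(series, series[1:])]


def fit_markov(symbols, order):
    """Count which symbol follows each context of `order` preceding symbols."""
    counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        context = tuple(symbols[i - order:i])
        counts[context][symbols[i]] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent successor of `context`, or None if unseen."""
    seen = counts.get(tuple(context))
    if not seen:
        return None
    return seen.most_common(1)[0][0]


symbols = quantize_changes([1, 2, 1, 2, 1, 2, 1])
model = fit_markov(symbols, order=1)
```

Moving the `cut` value changes which changes count as "up", which is one simple way to see why the choice of quantization cut values matters for the predictive power of such symbolic schemes.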

Keywords: Variable Memory Length Markov Models, Iterative Function Systems, Volatility prediction

Citation: Tino P., Schittenkopf C., Dorffner G., Dockner E.J.: A Symbolic Dynamics Approach to Volatility Prediction, in Y.S. Abu-Mostafa, B. LeBaron, A.W. Lo, A.S. Weigend. (eds): Computational Finance 99, MIT Press, Cambridge, MA, 2000, pp. 137-151.


OFAI-TR-98-27 ( 38kB g-zipped PostScript file)

Perspective Effects in Non-Deontic Versions of the Wason Selection Task

Alexander Staller, Steven Sloman, Talia Ben-Zeev

Perspective effects in the Wason four-card selection task occur when people choose mutually exclusive sets of cards depending on the perspective they adopt when making their choice. Previous demonstrations of perspective effects have been limited to deontic contexts; i.e., problem contexts that involve social duty, like permissions and obligations. In three experiments, we demonstrate perspective effects in non-deontic contexts, including a context much like the original one employed by Wason (1966, 1968). We suggest that perspective effects arise whenever the task uses a rule that can be interpreted biconditionally and different perspectives elicit different counterexamples that match the predicted choice sets. This view is consistent with domain-general theories but not with domain-specific theories of deontic reasoning, e.g., pragmatic reasoning schemas and social contract theory, that cannot explain perspective effects in non-deontic contexts.

Keywords: Wason Selection Task, Reasoning, Cognitive Psychology

Citation: Staller A., Sloman S.A., Ben-Zeev T.: Perspective Effects in Non-Deontic Versions of the Wason Selection Task, Memory & Cognition, 28(3), 396-405, 2000.


OFAI-TR-98-26 ( 21kB g-zipped PostScript file,  35kB PDF file)

Generating Emotional Speech with a Concatenative Synthesizer

Erhard Rank, Hannes Pirker

We describe an attempt to synthesize emotional speech with a concatenative speech synthesizer using a parameter space covering not only f0, duration and amplitude, but also voice quality parameters, spectral energy distribution, harmonics-to-noise ratio, and articulatory precision. The application of this extended parameter set offers the possibility to combine the high segmental quality of concatenative synthesis with a wider range of control settings needed for the synthesis of natural affected speech.

Keywords: Speech Synthesis, Emotional Speech

Citation: Rank E., Pirker H.: Generating Emotional Speech with a Concatenative Synthesizer, Proc. of ICSLP'98, Sydney, Australia, Nov. 30 - Dec. 4, 1998.


OFAI-TR-98-25 ( 44kB g-zipped PostScript file,  39kB PDF file)

Generating Intonation Contours Using Tonal Specifications

Hannes Pirker, Erhard Rank, Harald Trost

We present a novel approach to intonation modelling for speech synthesis based on a two-layer technique. The generator component of a concept-to-speech system produces an abstract phonological representation of intonation based on GToBI interpreting the linguistic and discourse information available. This abstract representation must be translated into concrete acoustic parameters. The paper describes how this mapping is achieved with the use of stylized F0 contours.

Keywords: Speech Synthesis, Intonation

Citation: Pirker H., Rank E., Trost H.: Generating Intonation Contours Using Tonal Specifications, Proc. of TSD'98, Brno, Czech Republic, Sept. 23-26, 1998.


OFAI-TR-98-24 ( 39kB g-zipped PostScript file,  71kB PDF file)

Towards a Tractable Appraisal-Based Architecture for Situated Cognizers

Alexander Staller, Paolo Petta

This paper introduces Tabasco (the name is derived from the project's name, "A Tractable Appraisal-Based Architecture for Situated Cognizers"), an architecture for software agents aimed at integrating results from functional theories in emotion research and insights on the impact of the capacities and limitations of perception in a framework oriented along the situated "New AI"/ALife approach. This expository paper first briefly summarizes current views on the nature and function of emotion and then discusses related current appraisal theories in more detail. A survey of existing approaches to emotion synthesis is followed by a first outline of the Tabasco architecture, relating it to the areas of research in psychology, ALife and agent architectures.

Keywords: Situated Agents, Appraisal Theory (Psychology), Emotions, Affective Computing, Action Selection, Agent Architectures, Adaptive Systems

Citation: Staller A., Petta P.: Towards a Tractable Appraisal-Based Architecture for Situated Cognizers, Canamero D., et al.(eds.), Grounding Emotions in Adaptive Systems, Workshop Notes, Fifth International Conference of the Society for Adaptive Behaviour (SAB98), Zurich, Switzerland, August 21, 1998.


OFAI-TR-98-23 ( 39kB g-zipped PostScript file)

Semiosis in Embodied Autonomous Systems

Erich Prem

This paper discusses processes of semiosis in embodied autonomous systems such as behavior-based robots or animals. The starting point for this investigation are the peculiarities of embodied autonomous systems, i.e. the fact that they are physical systems with a body that is to be moved around in the real world without the help of a human supervisor. We revisit previous results about the nature of representation in such systems. We draw parallels with the philosophical work of Martin Heidegger and show the relevance of these accounts for the study of autonomous sign users. It will be argued that signs are a type of equipment for such systems that reveal a specific interaction context and serve to orient autonomous systems at specific action circuits. These considerations shed new light on a considerable amount of previous work about the usage or "communication" of signs in the field.

Keywords: Embodied AI, Autonomous Robots, Semiotics

Citation: Prem E.: Semiosis in Embodied Autonomous Systems, Proc. of the ISIC/CIRA/ISAS98, OmniPress, Madison WI, 1998.


OFAI-TR-98-22

Proceedings of KRIMS II The Second International Workshop on Knowledge Representation for Interactive Multimedia Systems

Paolo Petta, Marcus Herzog (eds.)

Keywords: Interactive Multimedia Systems, Knowledge Representation

Citation: Petta P., Herzog M. (eds.): Proceedings of KRIMS II The Second International Workshop on Knowledge Representation for Interactive Multimedia Systems, June 1, 1998, ITC-IRST, Povo (Trento) Italy, Co-located with KR'98, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-22, 1998.


OFAI-TR-98-21 ( 49kB g-zipped PostScript file)

Outliers and Bayesian Inference

Peter Sykacek

In this paper we report about an investigation in which we studied the properties of Bayes' inferred neural network classifiers in the context of outlier detection. The problem of misclassification due to outliers in the test data is seen as a serious problem in safety critical environments. We compare the usual way to deal with uncertainty in the Bayesian framework with a new approach based on the variance of the output layer activations and investigate the utility of both methods for outlier detection. The properties of both methods are visualized on a simple two dimensional classification problem. An investigation comparing both methods on some public data-sets with artificially constructed outlier patterns showed that a combination of the conventional method and the method proposed here should be used. These results were confirmed in a final experiment on real data, where a combination of both methods showed significantly better performance in rejecting outlying observations.

Keywords: Bayesian Inference, Outliers, Classification

Citation: Sykacek P.: Outliers and Bayesian Inference, Proc. of the International ICSC/IFAC Symposium on Neural Computation (NC 1998), pp. 973-978.


OFAI-TR-98-20 ( 51kB g-zipped PostScript file,  368kB PDF file)

Recurrent Neural Networks with Iterated Function Systems Dynamics

Peter Tino, Georg Dorffner

We suggest a recurrent neural network (RNN) model with a recurrent part corresponding to iterative function systems (IFS) introduced by Barnsley (1988) as a fractal image compression mechanism. The key idea is that 1) in our model we avoid learning the RNN state part by having non-trainable connections between the context and recurrent layers (this makes the training process less problematic and faster), 2) the RNN state part codes the information processing states in the symbolic input stream in a well-organized and intuitively appealing way. We show that there is a direct correspondence between the Rényi entropy spectra characterizing the input stream and the spectra of Rényi generalized dimensions of activations inside the RNN state space. We test both the new RNN model with IFS dynamics and its conventional counterpart with trainable recurrent part on two chaotic symbolic sequences. In our experiments, RNNs with IFS dynamics outperform the conventional RNNs with respect to information theoretic measures computed on the training and model generated sequences.

Keywords: Recurrent Neural Networks, Variable Memory Length Markov Models, Iterative Function Systems, Multifractal Theory

Citation: Tino P., Dorffner G.: Recurrent Neural Networks with Iterated Function Systems Dynamics, in NC'98, Proceedings of the ICSC/IFAC Symposium on Neural Computation, Vienna, Austria., pp.526-532, 1998.


OFAI-TR-98-18 ( 125kB g-zipped PostScript file,  1096kB PDF file)

Constructing finite-context sources from fractal representations of symbolic sequences

Peter Tino, Georg Dorffner

We propose a novel approach to constructing predictive models on long complex symbolic sequences. The models are constructed by first transforming the training sequence n-block structure into a spatial structure of points in a unit hypercube. The transformation between the symbolic and Euclidean spaces embodies a natural smoothness assumption (n-blocks with long common suffixes are likely to produce similar continuations) in that the longer is the common suffix shared by any two n-blocks, the closer lie their point representations. Finding a set of prediction contexts is then formulated as a resource allocation problem solved by vector quantizing the spatial representation of the training sequence n-block structure. Our predictive models are similar in spirit to variable memory length Markov models (VLMMs). We compare the proposed models with both the classical and variable memory length Markov models on two chaotic symbolic sequences with different levels of subsequence distribution structure. Our models have equal or better modeling performance, yet, their construction is more intuitive (unlike in VLMMs, we have a clear idea about the size of the model under construction) and easier to automate (construction of our models can be done in a completely self-organized manner, which is shown to be problematic in the case of VLMMs).

Keywords: Markov Models, Variable Memory Length Markov Models, Iterative Function Systems, Multifractal Theory, Chaotic sequences

Citation: Tino P., Dorffner G.: Constructing finite-context sources from fractal representations of symbolic sequences, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-18, 1998. A revised version will appear as: Tino P., Dorffner G.: Predicting the future of discrete sequences from fractal representations of the past, Machine Learning, (to appear), 2000.
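
Illustrative note: the suffix-preserving mapping described in the abstract above can be sketched roughly as follows. This is a minimal sketch, not the authors' construction: the assignment of symbols to hypercube corners, the contraction ratio k = 0.5, the centre starting point, and all function names are assumptions made for illustration.

```python
import math
from itertools import product


def symbol_corners(alphabet):
    """Assign each symbol a distinct corner of a unit hypercube (assumed layout)."""
    dim = max(1, math.ceil(math.log2(len(alphabet))))
    corners = list(product((0.0, 1.0), repeat=dim))
    return {s: corners[i] for i, s in enumerate(alphabet)}


def embed_block(block, corners, k=0.5):
    """Map an n-block to a point by iterating x <- k*x + (1-k)*corner(symbol)."""
    dim = len(next(iter(corners.values())))
    x = [0.5] * dim  # start at the centre of the hypercube
    for s in block:
        c = corners[s]
        x = [k * xi + (1 - k) * ci for xi, ci in zip(x, c)]
    return tuple(x)


corners = symbol_corners("abcd")
p = embed_block("abcd", corners)
q = embed_block("bbcd", corners)  # shares the suffix "bcd" with "abcd"
r = embed_block("abca", corners)  # differs from "abcd" in the last symbol

# Blocks with a longer common suffix end up closer together.
print(math.dist(p, q), math.dist(p, r))
```

Because the most recent symbol contributes the largest displacement, each additional shared suffix symbol shrinks the distance between two blocks by the factor k. Vector quantizing such points, as the abstract describes, then groups blocks with similar recent histories.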


OFAI-TR-98-17 ( 339kB g-zipped PostScript file,  1257kB PDF file)

Spatial Representation of Symbolic Sequences through Iterative Function Systems

Peter Tino

Jeffrey (1990) proposed a graphic representation of DNA sequences using Barnsley's iterative function systems. In spite of further developments in this direction (Oliver (1993), Roman (1994), Li (1997)), the proposed graphic representation of DNA sequences has been lacking a rigorous connection between its spatial scaling characteristics and the statistical characteristics of the DNA sequences themselves. We 1) generalize Jeffrey's graphic representation to accommodate (possibly infinite) sequences over an arbitrary finite number of symbols, 2) establish a direct correspondence between the statistical characterization of symbolic sequences via Rényi entropy spectra and the multifractal characteristics (Rényi generalized dimensions) of the sequences' spatial representations, 3) show that for general symbolic dynamical systems, the multifractal fH-spectra in the sequence space coincide with the fH-spectra on spatial sequence representations.

Keywords: Iterative Function Systems, Multifractal Theory, Chaotic sequences, DNA sequences

Citation: A revised version appeared as: Tino P.: Spatial Representation of Symbolic Sequences through Iterative Function Systems, IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, 29(4), 1999., pp.386-392, 1998.


OFAI-TR-98-16 ( 65kB g-zipped PostScript file)

Identifying Stochastic Processes with Mixture Density Networks

Christian Schittenkopf, Georg Dorffner, Engelbert Dockner

In this paper we investigate the use of mixture density networks (MDNs) for identifying complex stochastic processes. Regular multilayer perceptrons (MLPs), widely used in time series processing, assume a gaussian conditional noise distribution with constant variance, which is unrealistic in many applications, such as financial time series (which are known to be heteroskedastic). MDNs extend this concept to the modeling of time-varying probability density functions (pdfs) describing the noise as a mixture of gaussians, the parameters of which depend on the input. We apply this method to identifying the process underlying daily ATX (Austrian stock exchange index) data. The results indicate that MDNs modeling a non-gaussian conditional pdf tend to be significantly better than traditional linear methods of estimating variance (ARCH) and also better than merely assuming a conditional gaussian distribution.

Keywords: Stochastic Processes, Mixture Density Networks, Time Series, Heteroscedasticity, Econometrics

Citation: Schittenkopf C., Dorffner G., Dockner E.J.: Identifying Stochastic Processes with Mixture Density Networks, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-16, 1998.
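
Illustrative note: the conditional density that a mixture density network outputs, as described in the abstract above, can be sketched as follows. This is a minimal sketch under assumptions: the network itself is omitted, and the parameterization (softmax over logits for the mixing weights, exponentiated log-widths) and all names are illustrative choices, not taken from the report.

```python
import math


def softmax(logits):
    """Turn unconstrained outputs into mixing weights that sum to one."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_pdf(y, mean, sigma):
    """Density of a univariate Gaussian at y."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(y, logits, means, log_sigmas):
    """Conditional pdf p(y | x); the three parameter lists would be network outputs for input x."""
    weights = softmax(logits)
    sigmas = [math.exp(s) for s in log_sigmas]
    return sum(w * gaussian_pdf(y, m, s) for w, m, s in zip(weights, means, sigmas))


def negative_log_likelihood(y, logits, means, log_sigmas):
    """Per-sample training loss for density estimation."""
    return -math.log(mixture_density(y, logits, means, log_sigmas))


# A calm and a volatile component: the heavy tail makes a large return less surprising.
print(negative_log_likelihood(3.0, [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]))
```

With a single component of fixed width this reduces to the constant-variance Gaussian noise model that the abstract attributes to regular MLPs. Letting the widths and weights depend on the input is what allows time-varying variance.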


OFAI-TR-98-15 ( 42kB g-zipped PostScript file)

Volatility Prediction with Mixture Density Networks

Christian Schittenkopf, Georg Dorffner, Engelbert Dockner

Despite the lack of a precise definition of volatility in finance, the estimation of volatility and its prediction is an important problem. In this paper we compare the performance of standard volatility models and the performance of a class of neural models, i.e. mixture density networks (MDNs). First experimental results indicate the importance of long-term memory of the models as well as the benefit of using non-gaussian probability densities for practical applications.

Keywords: Mixture Density Networks, Volatility, Density Estimation, Econometrics

Citation: Schittenkopf C., Dorffner G., Dockner E.J.: Volatility Prediction with Mixture Density Networks, Proc. of the International Conference on Artificial Neural Networks (ICANN-98), pp. 929-934, Skövde, Sweden, September 2-4, 1998.


OFAI-TR-98-14 ( 99kB g-zipped PostScript file)

Selective averaging of cognitive evoked potentials

Arthur Flexer, Herbert Bauer

This work is about the development of an alternative way of averaging evoked potentials (EP) of cognitive activities. Since the main assumption of invariant waveforms time locked to the eliciting events does not hold for cognitive EPs, averaging results in distorted estimates. Our alternative selective averaging finds similar subsequences of fixed length with variable latency which are common to all EPs by transforming the multivariate time series to discrete sequences via vector quantization and applying a sequence alignment algorithm. The method yields a significant improvement over common averaging in terms of noise attenuation and is shown to be valid by comparison with results for random data. Results for EP data obtained during a spatial imagination task are reported.

Keywords: Psychophysiology, Vector Quantization, Sequence Alignment, Evoked Potentials, EEG, Cognition

Citation: Flexer A., Bauer H.: Selective averaging of cognitive evoked potentials, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-14, 1998.
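
Illustrative note: the two steps named in the abstract above, vector quantization followed by sequence alignment, can be sketched as follows. This is a minimal sketch under assumptions: the codebook is given rather than learned, and a generic Smith-Waterman-style local alignment score stands in for whichever alignment algorithm the report actually uses. All names and scoring values are hypothetical.

```python
def quantize(vectors, codebook):
    """Replace each vector by the index of its nearest codebook vector."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(range(len(codebook)), key=lambda j: sq_dist(v, codebook[j]))
            for v in vectors]


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between two symbol sequences (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


codebook = [[0.0, 0.0], [1.0, 1.0]]
trial_a = quantize([[0.1, 0.0], [0.9, 1.0], [0.0, 0.1], [1.0, 0.8]], codebook)
trial_b = quantize([[1.0, 1.1], [0.2, 0.0], [0.9, 0.9]], codebook)

# The shared pattern sits at different positions (latencies) in the two trials.
print(trial_a, trial_b, local_alignment_score(trial_a, trial_b))
```

Working on discrete symbols lets the alignment find a shared pattern regardless of where it occurs in each trial, which is the variable-latency property the abstract relies on.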


OFAI-TR-98-13 ( 181kB g-zipped PostScript file)

VieCtoS-Speech Synthesizer, an Overview

Erhard Rank, Hannes Pirker

This report describes the overall architecture of the speech synthesis module developed for VieCtoS, the Vienna concept-to-speech system for Austrian German. A technical description of the methods used for the representation of inventory elements, their concatenation and the facilities for interpreting and superimposing prosody is presented. An overview of the implementation and the user environment as well as some details concerning program and test design are included in order to facilitate usage and further development.

Keywords: Speech, Speech Synthesis, Prosody

Citation: Rank E., Pirker H.: VieCtoS-Speech Synthesizer, an Overview, An extract from this report titled "Realization of Prosody in a Speech Synthesizer for German" has been submitted to 4. Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 98), University Bonn, Germany, October 5-7, 1998.


OFAI-TR-98-12 ( 204kB g-zipped PostScript file,  91kB PDF file)

Metaphor graphics to visualize ICU data over time

Werner Horn, Christian Popow, Lukas Unterasinger

The time-oriented analysis of electronic patient records at a (neonatal) intensive care unit is a tedious and time-consuming task. The vast amount of data available makes it hard for the physician to recognize the essential changes over time. VIE-VISU is a data visualization system which uses multiples to present the change in the patient's status over time in graphic form. Metaphor graphics is used to sketch the parameters most relevant in characterizing the situation of a patient.

Keywords: Visualization, Metaphor Graphics, ICU

Citation: Horn W., Popow C., Unterasinger L.: Metaphor graphics to visualize ICU data over time, in: "Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-98)", Workshop Notes of the ECAI-98 Workshop, Brighton, UK, 25 August 1998.


OFAI-TR-98-11 ( 71kB g-zipped PostScript file)

Bayesian Classifiers are Large Margin Hyperplanes in a Hilbert Space

Nello Cristianini, John Shawe-Taylor, Peter Sykacek

Bayesian algorithms for Neural Networks are known to produce classifiers which are very resistant to overfitting. It is often claimed that one of the main distinctive features of Bayesian Learning Algorithms is that they don't simply output one hypothesis, but rather an entire distribution of probability over an hypothesis set: the Bayes posterior. An alternative perspective is that they output a linear combination of classifiers, whose coefficients are given by Bayes' theorem. One of the concepts used to deal with thresholded convex combinations is the 'margin' of the hyperplane with respect to the training sample, which is correlated to the predictive power of the hypothesis itself.

We provide a novel theoretical analysis of such classifiers, based on Data-Dependent VC theory, proving that they can be expected to be large margin hyperplanes in a Hilbert space. We then present experimental evidence that the predictions of our model are correct, i.e. that Bayesian classifiers really find hypotheses which have large margin on the training examples.

This not only explains the remarkable resistance to overfitting exhibited by such classifiers, but also co-locates them in the same class of other systems, like Support Vector machines and Adaboost, which have a similar performance.

Keywords: Bayesian Inference, Large Margin Hyperplanes, Statistical Learning Theory

Citation: Cristianini N., Shawe-Taylor J., Sykacek P.: Bayesian Classifiers are Large Margin Hyperplanes in a Hilbert Space, in: Proceedings of the Fifteenth International Conference on Machine Learning, University of Wisconsin, Madison, USA, July 24-26, 1998.


OFAI-TR-98-10 ( 48kB g-zipped PostScript file)

Stochastic Propositionalization of Non-Determinate Background Knowledge

Stefan Kramer, Bernhard Pfahringer, Christoph Helma

Both propositional and relational learning algorithms require a good representation to perform well in practice. Usually such a representation is either engineered manually by domain experts or derived automatically by means of so-called constructive induction. Inductive Logic Programming (ILP) algorithms put somewhat less of a burden on the data engineering effort as they allow for a structured, relational representation of background knowledge. In chemical and engineering domains, a common representational device for graph-like structures are so-called non-determinate relations. Manually engineered features in such domains typically test for or count occurrences of specific substructures having specific properties. However, representations containing non-determinate relations pose a serious efficiency problem for most standard ILP algorithms. Therefore, we have devised a stochastic algorithm to automatically derive features from non-determinate background knowledge. The algorithm conducts a top-down search for first-order clauses, where each clause represents a binary feature. These features are used instead of the non-determinate relations in a subsequent induction step. In contrast to comparable algorithms, search is not class-blind and there are no arbitrary size restrictions imposed on candidate clauses. An empirical investigation in three chemical domains supports the validity and usefulness of the proposed algorithm.

Keywords: Machine Learning, ILP, Feature Construction

Citation: Kramer S., Pfahringer B., Helma C.: Stochastic Propositionalization of Non-Determinate Background Knowledge, Proceedings of the Eighth International Conference on Inductive Logic Programming (ILP98).


OFAI-TR-98-09 ( 48kB g-zipped PostScript file)

Inducing Small and Accurate Decision Trees

Bernhard Pfahringer

Recently, the quality improvement of decision trees and classifiers in general achievable by extended search efforts has received quite some attention in the literature. Contrary to the construction of ensembles of classifiers, which aims at improving overall predictive accuracy, our approach aims at improving the intelligibility of a single classifier. Our goal is the induction of a single, small, yet accurate decision tree. We describe a simple prepruning method (PreC4) that uses cross-validation to determine an appropriate stopping point for tree construction in a reliable manner. In addition to comparison with C4.5, PreC4 is also evaluated against both Robust-C4 and the combination of the two methods (Robust-PreC4). Evaluation domains comprise two artificial problems as well as a selection of small- and medium-sized UCI databases. Experimental results confirm that trees generated by both PreC4 and Robust-C4 are reasonably accurate but at the same time consistently smaller than trees generated by C4.5. PreC4 usually achieves a much larger tree-size reduction than Robust-C4 does. Interestingly, the combined procedure Robust-PreC4 does not perform as well. Trees generated by Robust-PreC4 are the smallest ones overall, but unfortunately they are also less accurate in some domains where they seemingly underfit the respective target concepts. In summary, PreC4 induces much smaller trees of comparable predictive accuracy.

Keywords: Machine Learning, Decision Tree, Pruning, Noise

Citation: Pfahringer B.: Inducing Small and Accurate Decision Trees, Submitted to International Conference on Machine Learning (ICML-98)


OFAI-TR-98-08 ( 42kB g-zipped PostScript file)

Predicting Ordinal Classes in ILP

Gerhard Widmer, Stefan Kramer, Bernhard Pfahringer, Michael De Groeve

This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes in an ILP setting. We start with a relational regression algorithm named SRT (Structural Regression Trees) and study various ways of transforming it into a first-order learner for ordinal classification tasks. Combinations of these algorithm variants with several data preprocessing methods are compared on two ILP benchmark data sets to verify the relative strengths and weaknesses of the strategies and to study the trade-off between optimal categorical classification accuracy (hit rate) and minimum distance-based error. Preliminary results indicate that this is a promising avenue towards algorithms that combine aspects of classification and regression in relational learning.

Keywords: Machine Learning, Inductive Logic Programming

Citation: Widmer G., Kramer S., Pfahringer B., De Groeve M.: Predicting Ordinal Classes in ILP, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-08, 1998.
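
Illustrative note: the trade-off between categorical accuracy (hit rate) and distance-based error mentioned in the abstract above can be made concrete with a small example. This is a minimal sketch: the two predictors and their outputs are made up for illustration, and mean absolute rank distance is an assumed stand-in for the distance-based error used in the report.

```python
def hit_rate(true_ranks, predicted_ranks):
    """Fraction of examples whose ordinal class is predicted exactly."""
    hits = sum(1 for t, p in zip(true_ranks, predicted_ranks) if t == p)
    return hits / len(true_ranks)


def mean_absolute_distance(true_ranks, predicted_ranks):
    """Average distance between predicted and true class on the ordinal scale."""
    total = sum(abs(t - p) for t, p in zip(true_ranks, predicted_ranks))
    return total / len(true_ranks)


truth = [1, 2, 3, 4, 5]
classifier_like = [1, 2, 3, 4, 1]  # usually exact, but one miss is far off
regressor_like = [1, 2, 4, 5, 4]   # misses more often, but never by more than one class

for name, pred in (("classifier-like", classifier_like),
                   ("regressor-like", regressor_like)):
    print(name, hit_rate(truth, pred), mean_absolute_distance(truth, pred))
```

The first predictor wins on hit rate and the second on distance-based error, so neither measure alone settles which is preferable for an ordinal task.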


OFAI-TR-98-07 ( 86kB g-zipped PostScript file)

Discovery of common subsequences in cognitive evoked potentials

Arthur Flexer, Herbert Bauer

This work is about developing a new method for the analysis of evoked potentials (EP) of cognitive activities that combines methods from statistics and sequence alignment to tackle the following two problems: the visualization of high dimensional sequential data and the unsupervised discovery of patterns within this multivariate set of real valued time series data. The sequence of the original high dimensional vectors is replaced by a sequence of prototypical codebook vectors obtained from a clustering procedure. A dimensionality reduction technique is applied to obtain an ordered one-dimensional representation of codebook vectors which allows for the depiction of the original sequences as one-dimensional time series. As a result, instead of having to search for common subsequences in the set of multivariate sequential data a multiple sequence alignment procedure can be applied to the set of one-dimensional discrete symbolic time series. The methods are described in detail and the results are shown to be significantly better than those obtained for two sets of randomized artificial data. This result is further corroborated by a one-way analysis of variance.

Keywords: Data Mining, Sequence Analysis, Clustering, Visualization, Application

Citation: Flexer A., Bauer H.: Discovery of common subsequences in cognitive evoked potentials, in Zytkow J.M. & Quafafou M.(eds.), Principles of Data Mining and Knowledge Discovery, Second European Symposium, PKDD '98, Proceedings, Lecture Notes in Artificial Intelligence 1510, p.309-317, 1998.


OFAI-TR-98-06

A comparative study on feedforward and recurrent neural networks in time series prediction using gradient descent learning

Manfred Hallas, Georg Dorffner

This paper reports about a comparative study on several linear and nonlinear feedforward and recurrent neural networks trained on artificially created time series. This has led to interesting empirical results about the capabilities of these network models trained with a gradient descent learning procedure. Several of the time series were generated by some of the neural network models, in order to test whether they could learn to predict a time series which they could theoretically perfectly model. The results show that recurrent networks do not seem to be able to do so under the given conditions. They also show that a simple feedforward network (a nonlinear autoregressive model) significantly performs best for most of the nonlinear time series. These empirical results can be taken as valuable hints with respect to the practical application of neural networks in prediction tasks.

Keywords: Neural Networks, Time Series Analysis, Recurrent Networks, Prediction

Citation: Hallas M., Dorffner G.: A comparative study on feedforward and recurrent neural networks in time series prediction using gradient descent learning, Trappl R. (ed.), Cybernetics and Systems '98, ÖSGK, Vienna, 1998.


OFAI-TR-98-05 ( 59kB g-zipped PostScript file)

Generating Declarative Language Bias for Top-Down ILP Algorithms

Stefan Kramer

Many of today's algorithms for Inductive Logic Programming (ILP) put a heavy burden and responsibility on the user, because their declarative bias has to be defined in a rather low-level fashion. To address this issue, we developed a method for generating declarative language bias for top-down ILP systems from high-level declarations. The key feature of our approach is the distinction between a user level and an expert level of language bias declarations. The expert provides abstract meta-declarations, and the user declares the relationship between the meta-level and the given database to obtain a low-level declarative language bias. The suggested languages allow for compact and abstract specifications of the declarative language bias for top-down ILP systems using schemata. We verified several properties of the translation algorithm that generates schemata, and applied it successfully to a few chemical domains. As a consequence, we propose to use a two-level approach to generate declarative language bias.

Citation: Kramer S.: Generating Declarative Language Bias for Top-Down ILP Algorithms, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-05, 1998.


OFAI-TR-98-04 ( 46kB g-zipped PostScript file)

From Information Structure to Intonation: A Phonological Interface for Concept-to-Speech

Hannes Pirker, Georg Niklfeld, Johannes Matiasek, Harald Trost

The paper describes a component that interfaces between the generator and the synthesizer of a German language concept-to-speech system. It discusses phenomena in German intonation that depend on the interaction between grammatical dependencies (projection of information structure into syntax) and prosodic context (performance-related modifications to intonation patterns). The grammatical factors are covered by the unification-based generation grammar, whereas genuinely prosodic factors are implemented in the interface module, where influences like phonological distance between tonal accents are encoded more directly.

An extended two-level phonology component represents the core interface where the modules for grammar processing and speech synthesis meet and communicate. In a concept-to-speech system with its various modules built on diverse technological foundations, there is a strong case for having such a robust and flexible component that nevertheless offers a large degree of conceptual transparency. As the overall objective of the project was to investigate whether and how conditions in concept-to-speech favour a more elaborate treatment of prosodic parameters in speech generation, a fairly complex model of phonology was required. Phonological processing in the system comprises segmental as well as suprasegmental dimensions such as syllabification, phenomena resulting in the modification of word stress positions, and a symbolic encoding of intonation contour. Phonological phenomena often touch upon more than one of these dimensions, so that mutual accessibility of the data structures on each dimension had to be ensured. We present a linear representation of the multidimensional phonological data based on a straightforward linearization convention, which suffices to bring this conceptually multilinear data set under the scope of the well-known processing techniques for two-level morphology.

Keywords: Concept-to-Speech, Finite State Phonology, Intonation

Citation: Pirker H., Niklfeld G., Matiasek J., Trost H.: From Information Structure to Intonation: A Phonological Interface for Concept-to-Speech, extended version of a paper in Proceedings of COLING/ACL-98, Montréal, Canada, 1998.


OFAI-TR-98-03 ( 15kB g-zipped PostScript file)

A System of Stylized Intonation Contours for German

Hannes Pirker, Kai Alter, Erhard Rank, Johannes Matiasek, Harald Trost, Gernot Kubin

Modeling intonation, i.e., specifying adequate fundamental frequency (F0) contours, remains a challenging task for speech synthesis systems. This paper discusses the development of a system for phonetically specifying intonation contours for German. It deals with the problem of translating an abstract phonological representation of intonation - namely the tone-sequence model - into a concrete phonetic model. Design options and evaluation methods are discussed.

Keywords: Phonology, Phonetics, Intonation

Citation: Pirker H., Alter K., Rank E., Matiasek J., Trost H., Kubin G.: A System of Stylized Intonation Contours for German, In Proc. of the 5th European Conference on Speech Communication and Technology (EUROSPEECH-97), Vol.1, pp.307-310 University of Patras, Greece, 1998


OFAI-TR-98-02 ( 27kB g-zipped PostScript file)

On the Specification of Sentence Initial F0-Patterns in German

Kai Alter, Hannes Pirker

It is widely accepted that linguistic high level information like information structure (focus-background-division, topicalization) influences accent placement and accent type (rising, falling, hat pattern etc.) in German. Previous research was concentrated on tonal patterns associated with focus and topic. We will demonstrate that 1) sentence initial tonal variation and 2) syllable duration depend on the location and the type of focus in the sentence. We assume that focussed material forms its own prosodic domain. Both prosodic parameters show a significant variation depending on whether they are associated with prosodic categories inside or outside of focussed domains.

Keywords: Phonology, Intonation, Information Structure

Citation: Alter K., Pirker H.: On the Specification of Sentence Initial F0-Patterns in German, In Botinis A., et al.(eds.), Intonation: Theory, Models and Applications, University of Athens, Greece, 1998


OFAI-TR-98-01 ( 51kB g-zipped PostScript file)

Constraint Handling Rules Reference Manual, Release 2.0

Christian Holzbaur, Thom Frühwirth

This manual documents the SICStus library implementation of Constraint Handling Rules (CHR). The high-level CHR are an excellent tool for rapid prototyping and implementation of constraint handlers. The usual abstract formalism to describe a constraint system, i.e. inference rules, rewrite rules, sequents, formulas expressing axioms and theorems, can be written as CHR in a straightforward way. The CHR library includes a compiler, which translates CHR programs into Prolog programs on the fly, and a runtime system, which includes a stepper for debugging.

Keywords: Constraints, Constraint Logic Programming, Prolog, Rewriting

Citation: Holzbaur C., Frühwirth T.: Constraint Handling Rules Reference Manual, Release 2.0, Austrian Research Institute for Artificial Intelligence, Vienna, TR-98-01, 1998.


OFAI-TR-97-33 ( 55kB g-zipped PostScript file)

Knowledge Discovery in Chess Databases: A Research Proposal

Johannes Fürnkranz

In this paper we argue that chess databases have a significant potential as a test-bed for techniques in the area of Knowledge Discovery in Databases (KDD). Conversely, we think that research in Artificial Intelligence has not yet come up with reasonable solutions for the knowledge representation and reasoning problems that are posed by knowledge-based computer chess programs, and consequently argue that KDD techniques could be useful for the advancement of various types of knowledge-based computer chess systems. Although we cannot present any concrete results, we hope to outline some fruitful directions for further research and exchange of ideas between the KDD and computer chess communities.

Keywords: KDD, Data Mining, Chess, Game Playing

Citation: Fürnkranz J.: Knowledge Discovery in Chess Databases: A Research Proposal, Austrian Research Institute for Artificial Intelligence, Vienna, TR-97-33, 1997.


OFAI-TR-97-32 ( 63kB g-zipped PostScript file)

Discovering Compressive Partial Determinations in Mixed Numerical and Symbolic Domains

Bernhard Pfahringer, Stefan Kramer

Partial determinations are an interesting form of dependency between attributes in a relation. They generalize functional dependencies by allowing exceptions. We modify a known MDL formula for evaluating such partial determinations to allow for its use in an admissible heuristic in exhaustive search. Furthermore we describe an efficient preprocessing-based approach for handling numerical attributes. An empirical investigation tries to evaluate the viability of the presented ideas.

Keywords: Machine Learning, Functional Dependencies, Noise

Citation: Pfahringer B., Kramer S.: Discovering Compressive Partial Determinations in Mixed Numerical and Symbolic Domains, Proc. of the European Meeting on Cybernetics and Systems, (EMCSR98), Vienna, Austria, 1998.


OFAI-TR-97-31 ( 223kB g-zipped PostScript file)

Improving Bagging Performance by Increasing Decision Tree Diversity

Bernhard Pfahringer, Ian Witten

Ensembles of decision trees often exhibit greater predictive accuracy than single trees alone. Bagging and boosting are two standard ways of generating and combining multiple trees. Boosting has been empirically determined to be the more effective of the two, and it has recently been proposed that this may be because it produces more diverse trees than bagging. This paper reports empirical findings that strongly support this hypothesis. We enforce greater decision tree diversity in bagging by a simple modification of the underlying decision tree learner that utilizes randomly-generated decision stumps of predefined depth as the starting point for tree induction. The modified procedure yields very competitive results while still retaining one of the attractive properties of bagging: all iterations are independent. Additionally, we also investigate a possible integration of bagging and boosting. All these ensemble-generating procedures are compared empirically on various domains.

Keywords: Machine Learning, Bagging, Boosting, Decision Trees

Citation: Pfahringer B., Witten I.: Improving Bagging Performance by Increasing Decision Tree Diversity, Submitted to Machine Learning Journal, Special Issue on the Integration of Multiple Learned Models.


OFAI-TR-97-30 ( 44kB g-zipped PostScript file)

On the Induction of Intelligible Ensembles

Bernhard Pfahringer

Ensembles of classifiers, e.g. decision trees, often exhibit greater predictive accuracy than single classifiers alone. Bagging and boosting are two standard ways of generating and combining multiple classifiers. Unfortunately, the increase in predictive performance is usually linked to a dramatic decrease in intelligibility: ensembles are more or less black boxes comparable to neural networks. So far, attempts at pruning ensembles have not been very successful, reducing them to roughly half their size. This paper describes a different approach, which both tries to keep ensemble sizes small already during induction and rigorously limits the complexity of single classifiers. Single classifiers are decision stumps of a prespecified maximal depth. They are combined by majority voting. Ensembles are induced and pruned by a simple hill-climbing procedure. These ensembles can reasonably be transformed into equivalent decision trees. We conduct an empirical evaluation to investigate both predictive accuracies and classifier complexities.
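The combination step mentioned in the abstract, majority voting over decision stumps, can be sketched as follows. This is a minimal sketch under stated assumptions: stumps are fixed by hand and limited to depth 1, the tuple layout is hypothetical, and the report's hill-climbing induction and pruning are not shown.

```python
from collections import Counter

def stump_predict(stump, x):
    """Classify x with a depth-1 stump: one threshold test on one feature."""
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label

def ensemble_predict(stumps, x):
    """Combine the stump predictions by simple (unweighted) majority vote."""
    votes = Counter(stump_predict(s, x) for s in stumps)
    return votes.most_common(1)[0][0]

# Hand-written stumps (hypothetical): (feature index, threshold, label if <=, label if >).
stumps = [
    (0, 0.5, "neg", "pos"),
    (1, 2.0, "neg", "pos"),
    (0, 1.5, "neg", "pos"),
]
low = ensemble_predict(stumps, (1.0, 1.0))   # votes: pos, neg, neg
high = ensemble_predict(stumps, (2.0, 3.0))  # votes: pos, pos, pos
```

An odd number of stumps is used here so that a two-class vote cannot tie.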

Keywords: Machine Learning, Ensembles, Bagging

Citation: Pfahringer B.: On the Induction of Intelligible Ensembles, Poster submission for ECML98


OFAI-TR-97-28 ( 59kB g-zipped PostScript file)

On the Potential of Machine Learning for Music Research

Gerhard Widmer

This chapter argues that the branch of AI known as Machine Learning (ML) can make useful contributions to music research, if employed in a thoughtful way. After giving a brief introduction to machine learning and discussing some general methodological questions, the article presents an ongoing project by the author as an example of a substantial and highly non-trivial application of machine learning to a musical problem. The basic music-theoretic assumptions of the project are discussed, the general method is briefly described, and some exemplary results are presented to give the reader an appreciation of the kinds of benefits musicology may draw from such research.

Keywords: Machine Learning, Music Research, Musical Expression, Expressive Performance

Citation: Widmer G.: On the Potential of Machine Learning for Music Research, To appear in: Contemporary Music Review, Special Issue on Artificial Intelligence and Music, 1998


OFAI-TR-97-27 ( 27kB g-zipped PostScript file)

Learning in Dynamically Changing Domains: Recent Contributions of Machine Learning

Gerhard Widmer

This document gives a brief summary of, and numerous bibliographic references to, recent contributions that the field of (mostly symbolic) machine learning has made to the problem of learning in dynamically changing domains. The paper accompanied an invited talk at the MLNet Workshop on Learning in Dynamically Changing Domains, held in the context of the Ninth European Conference on Machine Learning (ECML-97), Prague, Czech Republic, April 1997.

Citation: Widmer G.: Learning in Dynamically Changing Domains: Recent Contributions of Machine Learning, Invited Talk, appeared in Proceedings of the MLNet Workshop on Learning in Dynamically Changing Domains: Theory Revision and Context Dependence Issues, Prague, Czech Republic, 1997


OFAI-TR-97-26 ( 29kB g-zipped PostScript file)

Dimensionality Reduction in ILP: A Call to Arms

Johannes Fürnkranz

The recent rise of Knowledge Discovery in Databases (KDD) has underlined the need for machine learning algorithms to be able to tackle large-scale applications that are currently beyond their scope. One way to address this problem is to use techniques for reducing the dimensionality of the learning problem by reducing the hypothesis space and/or reducing the example space. While research in machine learning has devoted considerable attention to such techniques, they have so far been neglected in ILP research. The purpose of this paper is to motivate research in this area and to present some results on windowing techniques.

Citation: Fürnkranz J.: Dimensionality Reduction in ILP: A Call to Arms, Proceedings of the IJCAI-97 Workshop on Frontiers of Inductive Logic Programming, Nagoya, Japan, August 1997


OFAI-TR-97-25 ( 56kB g-zipped PostScript file)

On Effort in AI Research: A Description along Two Dimensions

Franz-Günter Winkler, Johannes Fürnkranz

In this paper we describe Artificial Intelligence as research that moves along two different axes: human-compatible knowledge and machine-compatible processing. An analysis of computer chess research along these dimensions shows that AI more and more diverges into an engineering branch and a cognitive branch. As an explanation, we offer a hypothesis about the dependency of research effort on these dimensions. It becomes obvious that the most rewarding projects are the hardest.

Citation: Winkler F., Fürnkranz J.: On Effort in AI Research: A Description along Two Dimensions, Proceedings of the AAAI-97 Workshop on Deep Blue vs. Kasparov: The Significance for Artificial Intelligence, Providence, R.I., July 1997


OFAI-TR-97-24 ( 44kB g-zipped PostScript file)

When Pseudowords Become Words - Effects of Learning on Orthographic Similarity Priming

Georg Dorffner, Catherine Harris

This paper investigates empirical predictions of a connectionist model of word learning. The model predicts that, although the mapping between word form and meaning is arbitrary (thus rendering words as being symbols in the semiotic sense), novel pseudowords will be able to prime the concepts corresponding to word forms that are orthographically similar. If, however, pseudowords acquire meaning through an arbitrary mapping, this priming should be reduced. Two experiments support this hypothesis. Pseudowords, derived from and thus orthographically similar to English words, primed a categorization task involving those similar words. After a subsequent learning phase, in which subjects are asked to learn meanings for the pseudowords, this priming disappears. This interplay between iconic and symbolic use of words is proposed to emerge from connectionist learning procedures.

Keywords: Connectionism, Cognitive Modeling, Psycholinguistics

Citation: Dorffner G., Harris C.: When Pseudowords Become Words - Effects of Learning on Orthographic Similarity Priming, Shafto M.G. & Langley P.(eds.), Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society, Lawrence Erlbaum, Mahwah, NJ, pp.185-190.


OFAI-TR-97-23 ( 34kB g-zipped PostScript file)

Can neural networks improve signal processing? A critical assessment from the ANNDEE project

Georg Dorffner

This paper reports on a critical assessment of neural network applications in the domain of electroencephalography (EEG) processing, which can be considered an outcome of the EU project ANNDEE. It is argued that neural networks must be viewed as an integral part of statistical data analysis. As such, only some types of neural network can provide truly novel contributions to data analysis, especially with respect to non-linearity. Furthermore, only proper use of these neural networks can assure robust results, while inappropriate use in the past has often led to over-optimistic conclusions. These major arguments about neural networks are highlighted with concrete results from partners in the ANNDEE project. The conclusion is that neural networks can indeed improve signal processing applications, but one must always retain a critical attitude and consider alternative pattern recognition methods.

Keywords: Neural Networks, Signal Processing, EEG, EU-Project

Citation: Dorffner G.: Can neural networks improve signal processing? A critical assessment from the ANNDEE project, Proceedings of Measurement '97, Smolenice, 1997.


OFAI-TR-97-21

Evaluating confidence measures in a neural network based sleep stager

Peter Sykacek, Georg Dorffner, Peter Rappelsberger, Josef Zeitlhofer

In this paper we report our studies of automatic sleep staging based on electroencephalogram (EEG) and electrooculogram (EOG) signals. We report all steps which were performed to build an automatic sleep stager. These include preprocessing, feature selection and classification by means of artificial neural networks. The main focus of the paper is on classification by means of neural networks. We compare three different approaches to neural network training, which include Bayesian inference. From using more sophisticated learning algorithms, we expected better results in terms of classification accuracy and reliability of the results. In our case reliability means robustness against outliers in the test data. Using Bayesian inference resulted in higher classification accuracy and needed less fine tuning than simpler methods. On the other hand, we observed some difficulty in refusing classification of artefacts when using the current practice for Bayes' inferred classifiers. We therefore investigated an "error-bar" like measure for outlier detection, which is also derived from the Bayesian solution. Experiments showed that by combining both measures we can detect outliers with better reliability without seriously reducing the number of classifications.

Keywords: Neural Networks, Bayesian Inference, Outliers Detection

Citation: Sykacek P., Dorffner G., Rappelsberger P., Zeitlhofer J.: Evaluating confidence measures in a neural network based sleep stager, To be submitted to: IEEE Trans. Biomed. Engineering, extended abstract submitted to: NIPS-97


OFAI-TR-97-20

Outlier Detection with Bayes' Inferred Classifiers

Peter Sykacek

In this paper we report on an investigation in which we studied the properties of Bayes' inferred neural network classifiers in the context of outlier detection. Misclassification due to outliers in the test data is seen as a serious problem in safety-critical environments. In this paper we compare D.J. MacKay's approach with a method based on an "error-bar" like measure and investigate the utility of both methods for outlier detection. The properties of both methods are visualized on a simple two-dimensional classification problem; a final investigation compares both methods on some public data sets with artificially constructed outlier patterns. We therefore suggest a combination of both methods, which showed significantly better performance on the selected test data.

Citation: Sykacek P.: Outlier Detection with Bayes' Inferred Classifiers, Submitted to: NIPS-97


OFAI-TR-97-19 ( 43kB g-zipped PostScript file)

The world according to a humanoid robot

Erich Prem

The new field of embodied Artificial Intelligence deals with the construction of robotic systems that exhibit high interaction dynamics with the world and intelligent behavior. The construction of humanoid robots is particularly challenging and poses many foundational and methodological questions. Two such questions are "What does the world look like for a human-like artificial system?" and "Why should we care?". It is argued that the study of autonomous intelligent systems must be based on a new understanding of the relation between the cognizer and its environment. We describe building blocks of such an approach to the construction of humanoid systems.

Keywords: Embodied AI, Humanoid Robot, Epistemology, Ontology

Citation: Prem E.: The world according to a humanoid robot, Proc. of the Workshop on Teleoperation and Robotics, June 1997, Linz, Austria, Schriftenreihe der OCG, Austrian Computer Society, 1997.


OFAI-TR-97-18 ( 54kB g-zipped PostScript file)

Neural Networks for Recognizing Patterns in Cardiotocograms

Claudia Ulbricht, Georg Dorffner, Andreas Lee

The cardiotocogram (CTG) is commonly used for routine fetal monitoring in the delivery room. A major problem is that the interpretation of the CTG trace requires experienced specialists. In order to avoid long gaps between the detection of a suspicious pattern and the intervention, the CTG has to be checked in short intervals. An automated monitoring system at the obstetric site can reduce such delays. Therefore, an alarm system immediately reporting suspicious events has been built. The focus of our study was put on the question whether AI techniques such as neural networks are suited to the task of recognizing patterns in the CTG trace. In a comparative study, their performance was evaluated against that of conventional methods. The neural networks turned out to provide significantly better results than the tested conventional methods.

Keywords: Connectionism, Medicine, Monitoring

Citation: Ulbricht C., Dorffner G., Lee A.: Neural Networks for Recognizing Patterns in Cardiotocograms, Submitted to: Artificial Intelligence in Medicine


OFAI-TR-97-17 ( 38kB g-zipped PostScript file)

Stochastic Propositionalization of Non-Determinate Background Knowledge

Stefan Kramer

It is a well-known fact that propositional learning algorithms require "good" features to perform well in practice. So a major step in data engineering for inductive learning is the construction of good features by domain experts. These features often represent properties of structured objects, where a property typically is the occurrence of a certain substructure having certain properties. To partly automate the process of "feature engineering", we devised an algorithm that searches for features which are defined by such substructures. The algorithm stochastically conducts a top-down search for first-order clauses, where each clause represents a binary feature. It differs from existing algorithms in that its search is not class-blind, and that it is capable of considering clauses ("context") of almost arbitrary length (size). Preliminary experiments are favorable, and support the view that this approach is promising.

Citation: Kramer S.: Stochastic Propositionalization of Non-Determinate Background Knowledge, Austrian Research Institute for Artificial Intelligence, Vienna, TR-97-17, 1997.


OFAI-TR-97-16 ( 44kB g-zipped PostScript file)

Mining for Causes of Cancer: Machine Learning Experiments at Various Levels of Detail

Stefan Kramer, Bernhard Pfahringer, Christoph Helma

This paper presents, from a methodological point of view, first results of an interdisciplinary project in scientific data mining. We analyze data about the carcinogenicity of chemicals derived from the carcinogenesis bioassay program, a long-term research study performed by the US National Institute of Environmental Health Sciences. The database contains detailed descriptions of 6823 tests performed with more than 330 compounds and animals of different species, strains and sexes. The chemical structures are described at the atom and bond level, and in terms of various relevant structural properties. The goal of this paper is to investigate the effects that various levels of detail and amounts of information have on the resulting hypotheses, both quantitatively and qualitatively. We apply relational and propositional machine learning algorithms to learning problems formulated as regression or as classification tasks. In addition, these experiments have been conducted with two learning problems which are at different levels of detail. Quantitatively, our experiments indicate that additional information does not necessarily improve accuracy. Qualitatively, a number of potential discoveries have been made by the algorithm for Relational Regression, because it is not forced to abstract from the details contained in the relations of the database.

Citation: Kramer S., Pfahringer B., Helma C.: Mining for Causes of Cancer: Machine Learning Experiments at Various Levels of Detail, Austrian Research Institute for Artificial Intelligence, Vienna, TR-97-16, 1997.


OFAI-TR-97-15 ( 76kB g-zipped PostScript file)

Data Mining - Methoden und Anwendungen

Johann Petrak

Data mining, or Knowledge Discovery in Databases, is a research area that has attracted great interest in recent years because of its large application potential. This report introduces the field, gives an overview of motivation, methods, and current research directions, and presents some typical applications of data mining.

Keywords: Data Mining, Machine Learning

Citation: Petrak J.: Data Mining - Methoden und Anwendungen, Austrian Research Institute for Artificial Intelligence, Vienna, TR-97-15, joint publication with the Christian Doppler Labor für Expertensysteme, 1997.


OFAI-TR-97-14 ( 52kB g-zipped PostScript file)

Epistemic Autonomy in Models of Living Systems

Erich Prem

This paper discusses epistemological consequences of embodied AI for Artificial Life models. The importance of robotic systems for ALife lies in the fact that they are not purely formal models and thus have to address issues of semantic adaptation and epistemic autonomy, which means the system's own ability to decide upon the validity of measurements. Epistemic autonomy in artificial systems is a difficult problem that poses foundational questions. The proposal is to concentrate on biological transformations of epistemological questions that have led to the development of modern ethology. Such an approach has proven to be useful in the design of control systems for behavior-based robots. It leads to a better understanding of modern ontological conceptions as well as a reacknowledgement of finality in the description and design of autonomous systems.

Keywords: Epistemic Autonomy, Embodied AI, Epistemology, Robotics, Theoretical Biology, Finality, Teleology, Ontology

Citation: Prem E.: Epistemic Autonomy in Models of Living Systems, Proc. of the Fourth Europ.Conf. on Artif.Life, MIT Press/Bradford Books, 1997.


OFAI-TR-97-13

Erfahrungen mit einem selbst entwickelten, WWW-basierten Lernprogramm für das Medizinisch-Chemische Praktikum

Aurel Botz, Paolo Petta, Dieter Haider, Klaus Rapf, Karl Kremser, Richard März

Web-based learning software supporting self-study in the field of Medical Chemistry was developed and used for the first time in the fall/winter term 96/97. The program is easy to modify since it is structured in two parts: a data base and a shell generating system. It supports preparation for both the entrance examination and the laboratory part of the course. User activities were logged by the server. Additional information was collected with on-line questionnaires and direct observation of student use in the computer clusters.

Our experience shows that self-developed software products can gain much higher acceptance than externally generated products if they:

  • are written in the standard language of instruction
  • are integrated into the standard curriculum
  • provide formative assessment by giving immediate feedback
  • prepare students in an efficient manner for their examinations

The learning software was used by at least 18% of all students, which is impressive for first-time use. The computer clusters of the university hospital were the preferred site of use (56%), but modem dial-up from home computers was also used to a significant extent (28%). Our data show that the software was readily accepted as a learning tool. Further development plans for the software are described in the paper.

Keywords: CBT (computer based teaching), Education, World-Wide Web, Evaluation, Knowledge-Intensive Learning

Citation: Botz A., Petta P., Haider D., Rapf K., Kremser K., März R.: Erfahrungen mit einem selbst entwickelten, WWW-basierten Lernprogramm für das Medizinisch-Chemische Praktikum, Zeitschrift für Hochschuldidaktik, 20(4), Wien, 1996.


OFAI-TR-97-12 ( 22kB g-zipped PostScript file)

Epistemological Aspects of Embodied Artificial Intelligence

Erich Prem

Keywords: Embodied AI, Introduction, Epistemology

Citation: Prem E.: Epistemological Aspects of Embodied Artificial Intelligence, Introduction to the Special Issue on Epistemological Aspects of Embodied Artificial Intelligence, Cybernetics and Systems, 28(5), iii-ix, 1997.


OFAI-TR-97-11 ( 40kB g-zipped PostScript file)

The implications of embodiment for cognitive theories

Erich Prem

This paper discusses epistemological aspects of the new field of embodied Artificial Intelligence and its consequences for the study of cognition. It is argued that the new emphasis on bodily phenomena fundamentally changes the nature of cognitive theories and the way these theories are formed. This result is based on an analysis of conventional Cognitive Science methodology as it has been succinctly described by Herbert Simon. It is argued that the inherently dynamic and physical nature of embodied AI can serve to correct oversimplifying assumptions of Cognitive Science methodology.

Keywords: Epistemology, Embodied AI, Methodology, Critique

Citation: Prem E.: The implications of embodiment for cognitive theories, TR-97-11, Oesterreichisches Forschungsinstitut fuer Artificial Intelligence, Wien, Austria, 1997.


OFAI-TR-97-10 ( 64kB g-zipped PostScript file)

Machine Learning and Case-based Reasoning: Their Potential Role in Preventing the Outbreak of Wars or in Ending Them

Robert Trappl, Johannes Fürnkranz, Johann Petrak, Jacob Bercovitch

In a current project we investigate the potential contribution of Artificial Intelligence for the avoidance and termination of crises and wars. This paper reports some results obtained by analyzing international conflict databases using machine learning and case-based reasoning techniques.

Keywords: Machine Learning, Case-Based Reasoning, Data Mining, Knowledge Discovery in Databases, International Relations, Peace

Citation: Trappl R., Fürnkranz J., Petrak J., Bercovitch J.: Machine Learning and Case-based Reasoning: Their Potential Role in Preventing the Outbreak of Wars or in Ending Them, in G. della Riccia, H.-J. Lenz, and R. Kruse (eds.), Learning, Networks and Statistics: Proceedings of the ISSEK-96 Workshop, pp. 209-225, 1997.


OFAI-TR-97-09 ( 68kB g-zipped PostScript file)

Integrating a Knowledge-Based System for Parenteral Nutrition of Neonates into a Clinical Intranet

Werner Horn, Christian Popow, Silvia Miksch, Andreas Seyfang

Composing parenteral nutrition anew each day for premature and full-term newborn infants in intensive care is tedious routine work that requires expert knowledge and experience. It is a time-consuming task and prone to calculation errors. We have built several versions of the knowledge-based system VIE-PNN for prescribing parenteral nutrition supply. None of these systems succeeded in becoming a tool used in the clinical routine of the neonatal intensive care unit.

The most recent version of VIE-PNN is a redesign using an HTML-based client-server architecture. It is integrated into the intranet of workstations which run the clinic's patient data management system. This integrated version is fully accepted by the clinical staff, and all nutrition sheets are calculated by VIE-PNN. Reasons for the successful operation of the knowledge-based system in the daily routine work are ease of use, minimal required input, robustness of the system, explanation facilities, and, most importantly, time savings for the physician compared to calculation by hand.

Keywords: Knowledge-based system, intensive care unit, parenteral nutrition, neonates, intranet application, integration with patient data management system

Citation: Horn W., Popow C., Miksch S., Seyfang A.: Integrating a Knowledge-Based System for Parenteral Nutrition of Neonates into a Clinical Intranet, Austrian Research Institute for Artificial Intelligence, Vienna, TR-97-09, 1997.


OFAI-TR-97-08 ( 614kB g-zipped PostScript file)

Is Skeletal Planning in Real-World, High-Frequency Domains Possible?

Silvia Miksch, Werner Horn, Yuval Shahar, Christian Popow, Franz Paky, Peter Johnson

Skeletal plans are a powerful way to reuse existing domain-specific procedural knowledge. In the Asgaard project, a set of tasks was created that support the design and the execution of skeletal plans by a human executing agent other than the original plan designer. The underlying requirement for developing task-specific problem-solving methods is a modeling language. Therefore, within the Asgaard project, a time-oriented, intention-based language, called Asbru, was developed. During the design phase of plans, Asbru allows the expression of durative actions and plans caused by durative states of an observed agent. The intentions underlying these plans are represented explicitly as temporal patterns to be maintained, achieved or avoided. The Asgaard project and the Asbru language were designed for low-frequency domains. We proved the applicability of the Asbru language in the real-world, high-frequency environment of neonatal intensive care units (NICUs). The knowledge base of VIE-VENT, an open-loop monitoring and therapy planning system for artificially ventilated newborn infants, is enhanced and formulated in the Asbru syntax. We show the benefits and limitations of applying the time-oriented, skeletal plan representation in real-world, high-frequency domains.

Keywords: Planning, temporal reasoning, skeletal plans, high-frequency domains

Citation: Miksch S., Horn W., Shahar Y., Popow C., Paky F., Johnson P.: Is Skeletal Planning in Real-World, High-Frequency Domains Possible?, 15th International Joint Conference on Artificial Intelligence, Nagoya, Japan, (Poster Session), 1997.


OFAI-TR-97-07 ( 55kB g-zipped PostScript file)

Noise-tolerant Windowing

Johannes Fürnkranz

Windowing has been proposed as a procedure for efficient memory use in the ID3 decision tree learning algorithm. However, previous work has shown that it may often lead to a decrease in performance, in particular in noisy domains. Following up on previous work, where we have shown that the ability of separate-and-conquer rule learning algorithms to learn rules independently can be exploited for more efficient windowing procedures, we demonstrate in this paper how this property can be exploited to achieve noise-tolerance in windowing.

Citation: Fürnkranz J.: Noise-tolerant Windowing, in Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97), pp. 852-857, Nagoya, Japan, 1997.


OFAI-TR-97-06 ( 65kB g-zipped PostScript file)

Handling Time-Warped Sequences with Neural Networks

Claudia Ulbricht

Being able to deal with time-warped sequences is crucial for a large number of tasks autonomous agents can be faced with in real-world environments, where robustness concerning natural temporal variability is required, and similar sequences of events should automatically be treated in a similar way. Such tasks can easily be dealt with by natural animals, but equipping an animat with this capability is rather difficult. The presented experiments show how this problem can be solved with a neural network by ensuring slow state changes. An animat equipped with such a network not only adapts to the environment by learning from a number of examples, but also generalizes to yet unseen time-warped sequences.

Citation: Ulbricht C.: Handling Time-Warped Sequences with Neural Networks, Revised version of: C. Ulbricht, Handling Time-Warped Sequences with Neural Networks, From Animals to Animats 4, Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, September 9th-13th, 1996, Cape Cod, Massachusetts, P. Maes, M.J. Mataric, J.-A. Meyer, J. Pollack and S.W. Wilson, Bradford, 1996.


OFAI-TR-97-05 ( 38kB g-zipped PostScript file)

Forecasting Fetal Heartbeats with Neural Networks

Claudia Ulbricht, Georg Dorffner, Andreas Lee

The given task is to forecast the intervals between the heartbeats recorded from a fetus. The six tested neural network models combine input windows, hidden layer feedback, and self-recurrent unit feedback in different ways. The two networks combining an input window and hidden layer feedback performed best. One of them has additional self-recurrent feedback loops around the units in the state layer, which enable the system to deal with time-warped patterns. It turns out to be reasonable to combine several techniques for processing the temporal aspects inherent to the input sequence.

Citation: Ulbricht C., Dorffner G., Lee A.: Forecasting Fetal Heartbeats with Neural Networks, Extended version of: Ulbricht C., Dorffner G., Lee A.: Forecasting Fetal Heartbeats with Neural Networks, in Bulsari A.B., et al. (eds.), Solving Engineering Problems with Neural Networks, Systeemitekniikan seura ry, Turku, pp.403-406, 1996.


OFAI-TR-97-04

Autonomous Agents in User Interfaces (Work in Progress)

Mario Veitl, Paolo Petta, Robert Spour, Klaus Obermaier

In this paper we present novel concepts for user interfaces, based on agents actively exploring their environment including the user's behaviour. These autonomous agents are active entities in a hypermedia art-work on the development and artistic challenge of three dimensional sculpturing on video screens and with laser installations.

Such an approach challenges current paradigms of user interfaces in a fundamental way. Each individual, users and agents alike, is forced to construct their/its own concepts and strategies to operate successfully, based on their/its experiences. The strategies of the agent cannot be conveyed explicitly: instead, they emerge from concepts of the domain, concepts of the users, the agent's own drives, and the interactions between users and the agent; all together they form the main source for the generation of new concepts and strategies. The history of behaviour-based robotics provides us with similar evidence that active exploration - in particular as opposed to mere deliberation - constitutes a necessary step in improving a system's performance.

While this approach paves the way for novel ways of interacting with computers, there is currently still a lack of theories and methodologies to help in the development of such interfaces. The constructivist point of view (Papert 93, Kafai and Resnick 96) can contribute some theoretical background, while paying attention to the novel abilities of computer-based multimedia systems. We regard concepts from second-order cybernetics and new system science metaphors to be a valuable and successful methodology for this work.

Keywords: User Interfaces, Human Computer Interaction, User Modeling, Autonomous Agent, Interface Agent, Multimedia, Hypermedia, 3D effects

Citation: Veitl M., Petta P., Spour R., Obermaier K.: Autonomous Agents in User Interfaces (Work in Progress), Proc. Annual Conference of the American Society of Cybernetics (ASC'97), Urbana, IL.


OFAI-TR-97-03

Personalities for Synthetic Actors: Current Issues and Some Perspectives

Paolo Petta, Robert Trappl

In this paper we analyze the current limitations and shortcomings of the various approaches to create personalities for synthetic actors and point out possible directions to extend the covered functionalities or amend existing problems. Adding to that, we give a brief description of some further lines of research which we expect to become of practical relevance in this area, thereby suggesting starting points for future directions of research, in which interdisciplinary efforts will continue to play an essential role.

Keywords: Synthetic Actors, Personality Modeling

Citation: Petta P., Trappl R.: Personalities for Synthetic Actors: Current Issues and Some Perspectives, Trappl R., Petta P. (eds.): Creating Personalities for Synthetic Actors, Springer LNAI Series, 1997.


OFAI-TR-97-02 ( 31kB g-zipped PostScript file)

Why to Create Personalities for Synthetic Actors

Paolo Petta, Robert Trappl

Synthetic Actors provide a novel channel of communication that is rapidly gaining in relevance for both human-computer interaction and computer mediated communication between human users. As illustrated in this paper, the personalities of these agents, engendered by and resting on top of their physical and cognitive capabilities, contribute decisively to their successful application and their acceptance by the public.

Keywords: Virtual Actors, Personality Modeling

Citation: Petta P., Trappl R.: Why to Create Personalities for Synthetic Actors, Trappl R., Petta P. (eds.): Creating Personalities for Synthetic Actors, Springer LNAI Series, 1997.


OFAI-TR-97-01 ( 54kB g-zipped PostScript file)

More Efficient Windowing

Johannes Fürnkranz

Windowing has been proposed as a procedure for efficient memory use in the ID3 decision tree learning algorithm. However, previous work has shown that windowing may often lead to a decrease in performance. In this work, we argue that separate-and-conquer rule learning algorithms are more appropriate for windowing than divide-and-conquer algorithms, because they learn rules independently and are less susceptible to changes in class distributions. In particular, we present a new windowing algorithm that achieves additional gains in efficiency by exploiting this property of separate-and-conquer algorithms. While the presented algorithm is only suitable for redundant, noise-free data sets, we also briefly discuss the problem of noisy data in windowing and present some preliminary ideas on how it might be solved with an extension of the algorithm introduced in this paper.

Keywords: Machine Learning, Rule Learning, Efficiency

Citation: Fürnkranz J.: More Efficient Windowing, in Proceedings of the 14th National Conference on Artificial Intelligence (AAAI-97), pp. 509-514, Providence, RI, 1997.
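
As an illustration of the basic windowing loop that the abstract of TR-97-01 starts from, here is a minimal sketch. It is not the new algorithm presented in the report: the toy threshold learner, the function names, and the parameters `initial_size` and `max_new` are assumptions made for this example, and the data are assumed to be noise-free.

```python
import random

def learn_threshold(window):
    """Toy learner (an assumption for this sketch): fit 'x >= t' on 1-D labelled points."""
    positives = [x for x, label in window if label]
    negatives = [x for x, label in window if not label]
    if not positives:
        return lambda x: False
    if not negatives:
        return lambda x: True
    t = (max(negatives) + min(positives)) / 2
    return lambda x: x >= t

def windowing(examples, learn, initial_size=4, max_new=2, seed=0):
    """Learn from a small window, then grow it with misclassified examples."""
    rng = random.Random(seed)
    pool = list(examples)
    rng.shuffle(pool)
    window, rest = pool[:initial_size], pool[initial_size:]
    while True:
        classify = learn(window)
        # Test the current theory on the examples outside the window.
        wrong = [(x, y) for x, y in rest if classify(x) != y]
        if not wrong:
            return classify, window
        # Move a few misclassified examples into the window and relearn.
        added = wrong[:max_new]
        window.extend(added)
        rest = [e for e in rest if e not in added]

# Noise-free toy data: the label is True exactly when x >= 10.
examples = [(x, x >= 10) for x in range(20)]
classify, window = windowing(examples, learn_threshold)
```

The loop stops as soon as the theory learned from the window is consistent with all remaining examples, so the learner typically never has to process the complete data set at once.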


OFAI-TR-96-26 ( 87kB g-zipped PostScript file)

Equivalent Error Bars for Neural Network Classifiers Trained by Bayesian Inference

Peter Sykacek

The topic of this paper is the problem of outlier detection for neural networks trained by Bayesian inference. I will show that marginalization is not a good method to get moderated probabilities for classes in outlying regions. The reason why marginalization fails to indicate outliers is analysed and an alternative measure, that is a more reliable indicator for outliers, is proposed. A simple artificial classification problem is used to visualize the differences. Finally both methods are used to classify a real world problem, where outlier detection is mandatory.

Keywords: Bayesian Inference, Confidence, Classification, Error Bars

Citation: Sykacek P.: Equivalent Error Bars for Neural Network Classifiers Trained by Bayesian Inference, Submitted to the European Symposium on Artificial Neural Networks, Bruges, Belgium, 16-18 April 1997.


OFAI-TR-96-25 ( 127kB g-zipped PostScript file)

Separate-and-Conquer Rule Learning

Johannes Fürnkranz

This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in Inductive Logic Programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.

Keywords: Machine Learning, Inductive Logic Programming, Rule Learning

Citation: Fürnkranz J.: Separate-and-Conquer Rule Learning, Artificial Intelligence Review 13(1), 1999.
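
The separate-and-conquer strategy surveyed in TR-96-25 can be sketched generically as follows. This is a minimal illustration, not any particular system from the survey: the rule representation (a dictionary of attribute-value tests), the greedy purity heuristic, and all function names are assumptions chosen for this example, and the data are assumed to be noise-free.

```python
def covers(rule, example):
    """A rule is a conjunction of attribute == value tests."""
    return all(example[attr] == value for attr, value in rule.items())

def learn_one_rule(positives, negatives):
    """Greedily add conditions until no negative example is covered."""
    rule = {}
    pos, neg = list(positives), list(negatives)
    while neg:
        # Candidate conditions are taken from the still-covered positives.
        candidates = sorted({(a, v) for ex in pos for a, v in ex.items()
                             if a not in rule})
        if not candidates:
            break  # inconsistent data: the rule cannot be refined further

        def score(cond):
            a, v = cond
            p = sum(ex[a] == v for ex in pos)
            n = sum(ex[a] == v for ex in neg)
            return (p / (p + n), p)  # purity first, then coverage

        a, v = max(candidates, key=score)
        rule[a] = v
        pos = [ex for ex in pos if ex[a] == v]
        neg = [ex for ex in neg if ex[a] == v]
    return rule

def separate_and_conquer(positives, negatives):
    """Learn a rule, remove the positives it covers, repeat on the rest."""
    rules, remaining = [], list(positives)
    while remaining:
        rule = learn_one_rule(remaining, negatives)
        rules.append(rule)                     # conquer
        remaining = [ex for ex in remaining    # separate
                     if not covers(rule, ex)]
    return rules

positives = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
    {"outlook": "overcast", "windy": "no"},
]
negatives = [
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
]
rules = separate_and_conquer(positives, negatives)
```

In the survey's terms, the choice of candidate conditions corresponds to the language bias, the greedy heuristic to the search bias, and the stopping conditions to the overfitting avoidance bias; this sketch uses the simplest possible option for each.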


OFAI-TR-96-24 ( 114kB g-zipped PostScript file)

Knowledge Discovery in International Conflict Databases

Johannes Fürnkranz, Johann Petrak, Robert Trappl

Artificial Intelligence is heavily supported by military institutions, while practically no effort goes into the investigation of possible contributions of AI to the avoidance and termination of crises and wars. This paper makes a first step in this direction by investigating the use of machine learning techniques for discovering knowledge in international conflict and conflict management databases. We have applied similarity-based case retrieval to the KOSIMO database of international conflicts. Furthermore, we present results of analyzing the CONFMAN database of successful and unsuccessful conflict management attempts with an inductive decision tree learning algorithm. The latter approach seems to be particularly promising, as conflict management events apparently are more repetitive and thus better suited for machine-aided analysis.

Keywords: Machine Learning, Data Mining, Knowledge Discovery, Peace

Citation: Fürnkranz J., Petrak J., Trappl R.: Knowledge Discovery in International Conflict Databases, Applied Artificial Intelligence 11(2):91-118, March 1997.


OFAI-TR-96-23 ( 34kB g-zipped PostScript file)

Limitations of self-organizing maps for vector quantization and multi dimensional scaling

Arthur Flexer

The limitations of using self-organizing maps (SOM) for either clustering/vector quantization (VQ) or multidimensional scaling (MDS) are discussed by reviewing recent empirical findings and the relevant theory. SOM's remaining ability to do both VQ and MDS at the same time is challenged by a new combined technique of adaptive K-means clustering plus Sammon mapping of the cluster centroids. In a comprehensive empirical study using a series of multivariate normal clustering problems, SOM are shown to perform significantly worse in terms of quantization error, in recovering the structure of the clusters, and in preserving the topology.

Citation: Flexer A.: Limitations of self-organizing maps for vector quantization and multidimensional scaling, in Mozer M.C. et al. (eds.), Advances in Neural Information Processing Systems 9, MIT Press/Bradford Books, pp. 445-451, 1997.


OFAI-TR-96-22 ( 97kB g-zipped PostScript file)

Tracking Context Changes through Meta-Learning

Gerhard Widmer

The article deals with the problem of learning incrementally ('on-line') in domains where the target concepts are context-dependent, so that changes in context can produce more or less radical changes in the associated concepts. In particular, we concentrate on a class of learning tasks where the domain provides explicit clues as to the current context (e.g., attributes with characteristic values). A general two-level learning model is presented that effectively adjusts to changing contexts by trying to detect (via 'meta-learning') contextual clues and using this information to focus the learning process. Context learning and detection occur during regular on-line learning, without separate training phases for context recognition. Two operational systems based on this model are presented that differ in the underlying learning algorithm and in the way they use contextual information: MetaL(B) combines meta-learning with a Bayesian classifier, while MetaL(IB) is based on an instance-based learning algorithm. Experiments with synthetic domains as well as a number of 'real-world' problems show that the algorithms are robust in a variety of dimensions, and that meta-learning can produce substantial improvement over simple object-level learning in situations with changing contexts.

Citation: Widmer G.: Tracking Context Changes through Meta-Learning, Submitted to Machine Learning


OFAI-TR-96-21 ( 53kB g-zipped PostScript file)

Defeasibility in CLP(Q) through Generalized Slack Variables

Christian Holzbaur, Francisco Menezes, Pedro Barahona

This paper presents a defeasible constraint solver for the domain of linear equations, disequations and inequalities over the body of rational/real numbers. As extra requirements resulting from the incorporation of the solver into an Incremental Hierarchical Constraint Solver (IHCS) scenario we identified: a) the ability to refer to individual constraints by a label, b) the ability to report the (minimal) cause for the unsatisfiability of a set of constraints, and c) the ability to undo the effects of a formerly activated constraint.

We develop the new functionalities after starting the presentation with a general architecture for defeasible constraint solving, through a solved form algorithm that utilizes a generalized, incremental variant of the Simplex algorithm, where the domain of a variable can be restricted to an arbitrary interval. We demonstrate how generalized slacks form the basis for the computation of explanations regarding the cause of unsatisfiability and/or entailment in terms of the constraints told, and the possible deactivation of constraints as demanded by the hierarchy handler.

Keywords: Constraint Logic Programming, Linear Programming, Defeasible Constraint Solving

Citation: Holzbaur C., Menezes F., Barahona P.: Defeasibility in CLP(Q) through Generalized Slack Variables, Proc. of the Second International Conference on Principles and Practice of Constraint Programming (CP96), Cambridge, Massachusetts, USA, August 19-22, 1996.


OFAI-TR-96-20 ( 143kB g-zipped PostScript file)

Knowledge Representation in WERKL, an Architecture for Intelligent Multimedia Information Systems

Marcus Herzog, Paolo Petta

Multimedia information systems handle vast quantities of media resources. As a consequence, it is difficult to keep track of the semantic content of these items, especially if they were produced by different users of the system. We are interested in developing a formalism and corresponding tools that will be capable of abstracting concepts, ideas, and lines of thought expressed in the media by inferring relationships between the content of different resources. The tools will emphasize the role of the system as a partner augmenting the capabilities of the human user. As such they will tackle the problems of collaboration (human-to-human, human-to-machine), knowledge-sharing, and knowledge-retrieval in multimedia information systems. Users are not (indeed, cannot be) assumed to have complete knowledge of the dynamically changing content of the system; instead, they engage in an exploration of the available resources, during which they are offered opportunities to analyse their information need from different points of view.

Citation: Herzog M., Petta P.: Knowledge Representation in WERKL, an Architecture for Intelligent Multimedia Information Systems, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-20, 1996.


OFAI-TR-96-19 ( 44kB g-zipped PostScript file)

The Referential Basis of Abduction and Logic

Erich Prem

AI has often mistreated logic and abduction. A closer look at the historical roots of logic and at the system theoretic structure of logical modeling reveals that the foundations of logic lie in a science of argumentation rather than reasoning. The basis of such a science is a clarification of the term "argument" that also leads to a reduction of the semantics of natural language. In logic, the reductions are tacit semantic restrictions to specialized meanings of terms and forms of conceptual relations as they are used in arguments. These relations are the conditions of the possibility of conceptual reference and at the same time of logical reasoning, where abduction and the identification of something as something form the basis of both.

Citation: Prem E.: The Referential Basis of Abduction and Logic, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-19, 1996.


OFAI-TR-96-18 ( 36kB g-zipped PostScript file)

Elements of a Theory of Embodied Artificial Intelligence

Erich Prem

This paper discusses epistemological aspects of embodied AI as an engineering and as a cognitive science. The paper is about a warning and a proposal. The warning concerns potentially misleading assumptions about how spatial characteristics can be used to generate an alternative approach to grounding cognition. The proposal is to concentrate on biological transformations of epistemological questions that have led to the development of modern ethology. These have proven to be useful in the design of control systems for behavior-based robots. This also leads to a reacknowledgement of finality in the description and design of autonomous systems.

Keywords: embodied AI, epistemology, robotics, Kant, theoretical biology, finality, teleology

Citation: Prem E.: Elements of a Theory of Embodied Artificial Intelligence, Proc. of the AAAI Fall Symposium, AAAI Press, Menlo Park, CA, 1996.


OFAI-TR-96-17 ( 155kB g-zipped PostScript file)

Neural networks for time series processing

Georg Dorffner

This paper provides an overview of the most common neural network types for time series processing, i.e. pattern recognition and forecasting in spatio-temporal patterns. Emphasis is put on the relationships between neural network models and more classical approaches to time series processing, in particular, forecasting. The paper begins with an introduction to the basics of time series processing, and discusses feedforward as well as recurrent neural networks, with respect to their ability to model non-linear dependencies in spatio-temporal patterns.

Citation: Dorffner G.: Neural networks for time series processing, Neural Network World, 4(6) 447-468.


OFAI-TR-96-16 ( 199kB g-zipped PostScript file)

Categorization in early language acquisition - accounts from a connectionist model

Georg Dorffner

In this paper we introduce a connectionist model for early word learning as an important part of language acquisition. In the spirit of previous connectionist models (e.g. (Plunkett)) we evaluate the performance of the model with respect to important phenomena such as over- and underextensions, the comprehension over production asymmetry, and the naming insight leading to vocabulary spurts. The main motivation behind the model, which is based on extensions of the well-known competitive learning paradigm, is the focus on categorization as a major foundation of language and language learning. We give a detailed motivation of this approach and demonstrate how this model can help in identifying the roots of some of the phenomena in word learning. We also discuss the relationship between this modeling approach and more "conventional" connectionist models based on multilayer perceptrons and backpropagation, and how our model can overcome some of the apparent shortcomings of those models.

Citation: Dorffner G.: Categorization in early language acquisition - accounts from a connectionist model, Submitted to: Language and Cognitive Processes


OFAI-TR-96-15 ( 71kB g-zipped PostScript file)

A Connectionist Model of Categorization and Grounded Word Learning

Georg Dorffner, Michael Hentze, Georg Thurner

This paper reports on ongoing research on a connectionist model of the learning of single words and their meaning, grounded in perception. Similar to (Plunkett) it consists of a minimum of two sensory-based components with an adaptable link between them. Both components perform categorization of sensory stimuli plus current internal states using a version of "soft" competitive learning employing mechanisms known from adaptive resonance theory (ART). Through repeated learning, categorization gradually leads to distinct attractors (compressed activation states). The temporal co-occurrence of such categories in both system components can elicit the building of strongly weighted connections between them via a specialized component designed to approximate the arbitrariness and discreteness of the function of words (their properties as symbols in the semiotic sense), to be robust against noise, and to reflect important psycholinguistic phenomena.

Citation: Dorffner G., Hentze M., Thurner G.: A Connectionist Model of Categorization and Grounded Word Learning, in Koster C., Wijnen F. (eds.): Proceedings of the Groningen Assembly on Language Acquisition, 1996.


OFAI-TR-96-14 ( 49kB g-zipped PostScript file)

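Neue adaptive Lösungen zur Kontrolle dynamischer Prozesse durch intelligente Technologien: Eine Einführung [New Adaptive Solutions for the Control of Dynamic Processes through Intelligent Technologies: An Introduction]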

Erich Prem

By way of introduction, this article gives an overview of typical industrial modeling problems and the use of intelligent technologies to solve them. It covers and compares the problem areas of open-loop control, closed-loop control, monitoring, and forecasting. Novel adaptive approaches for these areas, developed mainly in the field of Artificial Intelligence, make it possible to achieve qualitative improvements or simplifications in system design. These approaches include expert systems, machine learning, fuzzy logic, and neural networks.

Keywords: Introduction, Process Automation, Adaptive Control

Citation: Prem E.: Neue adaptive Lösungen zur Kontrolle dynamischer Prozesse durch intelligente Technologien: Eine Einführung, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-14, 1996.


OFAI-TR-96-13 ( 71kB g-zipped PostScript file)

The Behavior-Based Firm: An Application of Recent AI Concepts to Company Management

Erich Prem

Artificial Intelligence (AI) has a long tradition in developing technological means for the control of complex systems. This paper reviews recent developments in the area of embodied AI and behavior-based robotics and formulates principles as they appear to be applicable in managerial problem domains. We compare these principles to new management concepts such as the horizontal organization and lean production, which exhibit definite similarities to proposals recently made by roboticists. An analysis of these similarities identifies the importance of a tight system-environment coupling. This connection is achieved by a rapid and precise evaluation of external observables from many internal processes. Another important factor is the process orientation of control that marks a clear departure from traditional approaches based on functional decomposition.

Keywords: Embodied AI, Management, Theory of the Firm

Citation: Prem E.: The Behavior-Based Firm: An Application of Recent AI Concepts to Company Management, Applied Artificial Intelligence, 13(3), 1997.


OFAI-TR-96-12 ( 60kB g-zipped PostScript file)

Efficient Search for Strong Partial Determinations

Stefan Kramer, Bernhard Pfahringer

Our work offers both a solution to the problem of finding functional dependencies that are distorted by noise and to the open problem of efficiently finding strong (i.e., highly compressive) partial determinations per se. Briefly, we introduce a restricted form of search for partial determinations which is based on functional dependencies. Focusing attention on solely partial determinations derivable from overfitting functional dependencies enables efficient search for strong partial determinations. Furthermore, we generalize the compression-based measure for evaluating partial determinations to n-valued attributes.

Applications to real-world data suggest that the restricted search indeed retrieves a subset of strong partial determinations in much shorter runtimes, thus showing the feasibility and usefulness of our approach.

Citation: Kramer S., Pfahringer B.: Efficient Search for Strong Partial Determinations, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-12, 1996.


OFAI-TR-96-11 ( 57kB g-zipped PostScript file)

Machine Learning in Computer Chess: The Next Generation

Johannes Fürnkranz

Ten years ago the ICCA Journal published an overview of machine learning approaches to computer chess (Skiena, 1986). The author's results were rather pessimistic. In particular he concludes that "with the exception of rote learning in the opening book few results have trickled into competitive programs" and that "there appear no research projects on the horizon which offer reason for optimism". In this paper we will update Skiena's work with research that has been conducted in this area since the publication of his paper. By doing so we hope to show that at least Skiena's second conclusion is no longer valid.

Citation: Fürnkranz J.: Machine Learning in Computer Chess: The Next Generation, International Computer Chess Association Journal 19(3):147-161, September 1996.


OFAI-TR-96-10 ( 54kB g-zipped PostScript file)

Digging for Peace: Using Machine Learning Methods for Assessing International Conflict Databases

Robert Trappl, Johannes Fürnkranz, Johann Petrak

In the last decade research in Machine Learning has developed a variety of powerful tools for inductive learning and data analysis. On the other hand, research in International Relations has developed a variety of different conflict databases that are mostly analyzed with classical statistical methods. As these databases are in general of a symbolic nature, they provide an interesting domain for application of Machine Learning algorithms. This paper gives a short overview of available conflict databases and subsequently concentrates on the application of machine learning methods for the analysis and interpretation of such databases.

Citation: Trappl R., Fürnkranz J., Petrak J.: Digging for Peace: Using Machine Learning Methods for Assessing International Conflict Databases, Proc. of the 12th European Conference on Artificial Intelligence (ECAI-96), 1996.


OFAI-TR-96-09 ( 1544kB PDF file)

A Multi-Agent Approach to Open Shop Scheduling: Adapting the Ant-Q Formalism

Bernhard Pfahringer

We adapt the Ant-Q formalism, a multi-agent search method previously developed for solving traveling salesman problems, to open shop scheduling. We also introduce a new policy for delayed reinforcement, global-best-proportional, which at least in this domain seems to outperform known policies. The effect of different heuristics is also investigated. Experimental results are excellent: for one of the benchmark problems, a new, better schedule was even discovered.

Citation: Pfahringer B.: A Multi-Agent Approach to Open Shop Scheduling: Adapting the Ant-Q Formalism, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-09, 1996.


OFAI-TR-96-08

OFAI Publishing Rules: A Case Study in Information Agents

Bernhard Pfahringer

In this paper we investigate the applicability of agent technology to a selected corporation administration task: the handling of publishing matters at our own institution OFAI. With the outline of a proof-of-principle specification using the language Agent-Clips we show that such tedious tasks are amenable to - at least partial - automation. Currently we aim at automating the process as it is. The insights gained from this exercise should also be helpful in improving the process itself in the future. We also discuss other language options and immediate further targets for automated administration in university environments.

Citation: Pfahringer B.: OFAI Publishing Rules: A Case Study in Information Agents, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-08, 1996.


OFAI-TR-96-07 ( 111kB g-zipped PostScript file)

Pruning Algorithms for Rule Learning

Johannes Fürnkranz

Pre-Pruning and Post-Pruning are two standard methods of dealing with noise in decision tree learning. Pre-Pruning methods deal with noise during learning, while post-pruning methods try to address this problem after an overfitting theory has been learned. This paper shows how pre- and post-pruning algorithms can be used for separate-and-conquer rule learning algorithms. We discuss some fundamental problems and show how to solve them with two new algorithms that combine and integrate pre- and post-pruning.

Citation: Fürnkranz J.: Pruning Algorithms for Rule Learning, Machine Learning 27(2):139-171, May 1997.


OFAI-TR-96-06

On the Cognition of Synthetic Characters

Paolo Petta, Robert Trappl

We set out giving a brief overview of the present range of applications of animated synthetic characters. Building on this assessment, we next review the techniques used to endow these actors with cognitive capabilities. In an analysis of the architectures of these representative animation systems we identify essential limitations of today's implementations as well as promising opportunities for future developments.

Citation: Petta P., Trappl R.: On the Cognition of Synthetic Characters, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-06, 1996.


OFAI-TR-96-05 ( 108kB g-zipped PostScript file)

Effective Data Validation of High-Frequency Data: Time-Point-, Time-Interval-, and Trend-Based Methods

Werner Horn, Silvia Miksch, Gerhilde Egghart, Christian Popow, Franz Paky

Real-time systems for monitoring and therapy planning, which receive their data from on-line monitoring equipment and computer-based patient records, require reliable data. Data validation has to utilize and combine a set of fast methods to detect, eliminate, and repair faulty data, which may lead to life-threatening conclusions. The strength of data validation results from the combination of numerical and knowledge-based methods applied to both continuously-assessed high-frequency data and discontinuously-assessed data.
When dealing with high-frequency data, examining single measurements is not sufficient. It is essential to take into account the behavior of parameters over time. We present time-point-, time-interval-, and trend-based methods for validation and repair. These are complemented by time-independent methods for determining an overall reliability of measurements. The data validation benefits from the temporal data-abstraction process, which provides automatically derived qualitative values and patterns. The temporal abstraction is oriented on a context-sensitive and expectation-guided principle. Additional knowledge derived from domain experts forms an essential part of all these methods.
The methods are applied in the field of artificial ventilation of newborn infants. Examples from the real-time monitoring and therapy-planning system VIE-VENT illustrate the usefulness and effectiveness of the methods.

Citation: Horn W., Miksch S., Egghart G., Popow C., Paky F.: Effective Data Validation of High-Frequency Data: Time-Point-, Time-Interval-, and Trend-Based Methods, Computers in Biology and Medicine, 27(5):389-409, 1997.
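
To make the distinction between validating a single measurement and validating it against recent history concrete, here is a minimal sketch in the spirit of the abstract of TR-96-05. It is not the VIE-VENT implementation: the plausibility-range check, the maximum-step check, the function names, and all threshold values are assumptions chosen for this illustration.

```python
def validate_time_point(value, low, high):
    """Time-point check: is a single measurement inside a plausible range?"""
    return low <= value <= high

def validate_step(previous, current, max_step):
    """History check: is the change since the last valid value plausible?"""
    return abs(current - previous) <= max_step

def validate_series(values, low, high, max_step):
    """Flag each measurement as valid (True) or faulty (False)."""
    flags = []
    last_valid = None
    for value in values:
        ok = validate_time_point(value, low, high)
        if ok and last_valid is not None:
            ok = validate_step(last_valid, value, max_step)
        flags.append(ok)
        if ok:
            last_valid = value  # faulty values never become the reference
    return flags

# Hypothetical measurements and limits, for illustration only.
measurements = [40, 42, 250, 43, 70, 44]
flags = validate_series(measurements, low=10, high=100, max_step=10)
# flags == [True, True, False, True, False, True]
```

The value 250 fails the range check on its own, while 70 lies inside the range and is rejected only because of its jump from the last valid value; this is the kind of fault that examining single measurements cannot detect.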


OFAI-TR-96-04 ( 72kB g-zipped PostScript file)

Motivation, Emotion and the Role of Functional Circuits in Autonomous Agent Design Methodology

Erich Prem

This paper discusses a new methodological approach to designing software for autonomous agents. For real autonomy such systems must be equipped with a motivational subsystem that drives the agent and selects among its possible behaviors. We present a methodology that supports the design of such a system and discuss its relation to theoretical biology, particularly the work of Jakob von Uexkuell. Another issue, treated more briefly here, is the role of emotion in such an agent, particularly the communicative function of showing emotions.
These issues are discussed in the context of behavior-based control circuits.

Citation: Prem E.: Motivation, Emotion and the Role of Functional Circuits in Autonomous Agent Design Methodology, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-04, 1996.


OFAI-TR-96-03 ( 42kB g-zipped PostScript file)

Monitoring and Therapy Planning without Effective Data Validation are Ineffective

Silvia Miksch, Werner Horn, Gerhilde Egghart, Christian Popow, Franz Paky

Systems for monitoring and therapy planning, which receive their data from computer-based patient records and on-line monitoring equipment, require reliable data. Reasoning on faulty data can cause unexplainable and life-threatening conclusions. Effective and efficient data validation methods are needed to arrive at reliable conclusions.
We distinguish four categories of data validation and repair based on their underlying temporal ontologies: time-point-, time-interval-, trend-based, and time-independent validation and repair. Observing single measurements is not sufficient to arrive at trustworthy data. Therefore we take into account the behavior of parameters in the past as well as knowledge derived from domain experts. Examples from VIE-VENT, a knowledge-based monitoring and therapy-planning system for artificially-ventilated newborns, demonstrate the applicability of these methods.

Citation: Miksch S., Horn W., Egghart G., Popow C., Paky F.: Monitoring and Therapy Planning without Effective Data Validation are Ineffective, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-03, 1996.


OFAI-TR-96-02 ( 267kB g-zipped PostScript file)

Context-Sensitive and Expectation-Guided Temporal Abstraction of High-Frequency Data

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

Therapy planning benefits from derived qualitative values or patterns which can be used for recommending therapeutic actions as well as for assessing the effectiveness of these actions within a certain period. Dealing with high-frequency data, shifting contexts, and different expectations of the development of parameters requires particular temporal abstraction methods to arrive at unified qualitative values or patterns.
This paper addresses context-sensitive and expectation-guided temporal abstraction methods. They incorporate knowledge about data points, data intervals, and expected qualitative trend patterns to arrive at unified qualitative descriptions of parameters (temporal data abstraction). Our methods are based on context-sensitive schemata for data-point transformation and curve fitting which express the dynamics of and the reactions to different degrees of parameters' abnormalities, as well as on smoothing and adjustment mechanisms to keep the qualitative descriptions stable in case of shifting contexts or data oscillating near thresholds.
The temporal abstraction methods are integrated and implemented in VIE-VENT, an open-loop knowledge-based monitoring and therapy planning system for artificially ventilated newborn infants. The applicability and usefulness of our approach are illustrated by examples of VIE-VENT.

Citation: Miksch S., Horn W., Popow C., Paky F.: Context-Sensitive and Expectation-Guided Temporal Abstraction of High-Frequency Data, Proc. of the Tenth International Workshop on Qualitative Reasoning, Stanford Sierra Camp, Fallen Leaf Lake, CA, May 21-24, 1996.


OFAI-TR-96-01 ( 95kB g-zipped PostScript file)

Recognition and Exploitation of Contextual Clues via Incremental Meta-Learning

Gerhard Widmer

Daily experience shows that in the real world, the meaning of many concepts heavily depends on some implicit context, and changes in that context can cause more or less radical changes in the concepts. Incremental concept learning in such domains requires the ability to recognize and adapt to such changes.
This paper presents a solution for incremental learning tasks where the domain provides explicit clues as to the current context (e.g., attributes with characteristic values). We present a general two-level learning model, and its realization in a system named MetaL(B), that can learn to detect certain types of contextual clues, and can react accordingly when a context change is suspected. The model consists of a base level learner that performs the regular on-line learning and classification task, and a meta-learner that identifies potential contextual clues. Context learning and detection occur during regular on-line learning, without separate training phases for context recognition. Experiments with synthetic domains as well as a `real-world' problem show that MetaL(B) is robust in a variety of dimensions and produces substantial improvement over simple object-level learning in situations with changing contexts.
The meta-learning framework is very general, and a number of instantiations and extensions of the model are conceivable. Some of these are briefly discussed.

Citation: Widmer G.: Recognition and Exploitation of Contextual Clues via Incremental Meta-Learning, Austrian Research Institute for Artificial Intelligence, Vienna, TR-96-01, 1996.


OFAI-TR-95-38

The Interpretation of Prosodic Meaning using Discourse-level Features for the Purposes of Text Generation

Elizabeth J. Garner

Appropriate intonational contours are an important feature of all synthesized speech. Speakers convey a great deal of information about the salience of discourse entities, and about the word's given/new status, itself dependent on discourse-level phenomena.
This paper attempts to provide a number of features in the semantic representation which reflect the meaning carried by intonation in German weather forecasts. I begin by taking a look at the meaning carried by intonation and how it is expressed. The findings of other linguists, in particular Pierrehumbert and Hirschberg (1990), are compared to features of prosodic meaning found in our corpus of weather forecasts. The results are interpreted as a combination of three separate types of stress marking: emphatic, thematic and contrastive stress. A method of generating these is suggested which makes use of discourse plans and the knowledge contained in a classification-based hierarchy.

Citation: Garner E.J.: The Interpretation of Prosodic Meaning using Discourse-level Features for the Purposes of Text Generation, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-38, 1995.


OFAI-TR-95-37

Phonologische Datenstrukturen in einem Concept-to-Speech System: Segmentale und Prosodische Phonologie

Georg Niklfeld

This report from a project on the development of a concept-to-speech system for German presents the component that is responsible for generating the phonological representations. Building on linguistic analyses of the three domains of segmental phonology, syllabification and stress, an algorithmic procedure is developed for projecting the phonological specifications of lexical units onto those representations of larger stretches of utterances that serve as the input for speech synthesis. All phonological rules are rendered in a single rule set for the extended two-level morphology system X2MorF. Here we provide detailed documentation of the developed implementation. The declarative approach pursued makes it possible to account for the mutual dependencies between the three phonological dimensions. (Report in German)

Citation: Niklfeld G.: Phonologische Datenstrukturen in einem Concept-to-Speech System: Segmentale und Prosodische Phonologie, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-37, 1995.


OFAI-TR-95-35 ( 52kB g-zipped PostScript file)

Structural Regression Trees

Stefan Kramer

In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with non-determinate background knowledge. (The only exception is a covering algorithm called FORS.)
In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.

Citation: Kramer S.: Structural Regression Trees, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-35, 1995.


OFAI-TR-95-34 ( 136kB g-zipped PostScript file)

Retrieval as Exploration of Large Multimedia Document Bases

Marcus Herzog, Paolo Petta, Christian Kühn

We propose an approach to information retrieval from multimedia databases based on the interpretation of the retrieval task as query by refinement using an explorative design process as guiding metaphor. The combined use of an intensional Index Layer and an extensional Information Layer allows for a flexible incremental structuring of the semantic space according to the requirements of the task at hand. In addition, the system provides a number of operations to assist the user in the exploration of the solution space. Possible contributions of various techniques from the areas of description logics and inductive logic programming are discussed.

Citation: Herzog M., Petta P., Kühn C.: Retrieval as Exploration of Large Multimedia Document Bases, Presented at the IJCAI-95 Workshop on Intelligent Multimedia Information Retrieval


OFAI-TR-95-33

Evolving Good TSP Tours by Means of Heuristic Repair and Strong Crowding

Bernhard Pfahringer

We introduce a new cross-over operator applicable to an adjacency-based representation of the traveling salesman problem (TSP). This operator is employed in a modification of deterministic crowding called strong crowding which aims at preserving diversity more rigorously. This is a strong concern in multi-agent search approaches. Experimental results are most impressive for asymmetric TSPs: for 5 out of 6 testbed problems globally optimal solutions are found in at least one out of 20 test runs. Application to larger symmetric TSPs, however, appears to be problematic. Possible remedies are discussed.

Citation: Pfahringer B.: Evolving Good TSP Tours by Means of Heuristic Repair and Strong Crowding, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-33, 1995.
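
For readers unfamiliar with the baseline that strong crowding modifies, the following Python sketch shows the replacement step of plain deterministic crowding as it is commonly described: each child competes against the more similar parent and replaces it only if it is fitter. This is neither the strong crowding variant nor the cross-over operator of the report. The function name and the bit-string example with Hamming distance are assumptions made for illustration, whereas the report works on an adjacency-based TSP representation.

```python
def crowding_step(p1, p2, c1, c2, fitness, distance):
    """Replacement step of standard deterministic crowding (sketch).

    Children are paired with parents so that the total parent-child
    distance is minimal; a child replaces its paired parent only if
    it has strictly higher fitness.
    """
    if distance(p1, c1) + distance(p2, c2) <= distance(p1, c2) + distance(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    return [child if fitness(child) > fitness(parent) else parent
            for parent, child in pairs]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def count_ones(bits):
    """Toy fitness function: number of '1' characters."""
    return bits.count("1")


survivors = crowding_step("0000", "1111", "1110", "0001", count_ones, hamming)
```

In this toy run the child "0001" is paired with the parent "0000" and replaces it, while the parent "1111" survives against the child "1110".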


OFAI-TR-95-32 ( 75kB g-zipped PostScript file)

Predicate Invention: A Comprehensive View

Stefan Kramer

This paper discusses predicate invention (PI) from various, previously neglected viewpoints. First of all, we argue that predicate invention should build on existing work on constructive induction in propositional learning. We recall the major reasons for constructive induction in propositional languages, and give a brief overview of the frameworks for constructive induction by Matheus, Wnek and Michalski. We then apply these frameworks to predicate invention in order to categorize systems and to identify relevant aspects of PI. The discussion demonstrates that some relevant aspects are treated only implicitly, and some are largely neglected in many systems. Secondly, we review current criticism of constructive induction that also concerns predicate invention. In particular, we agree with Sutton's demand that constructive induction should be based on continuing learning, i.e. it should reuse representational ``tricks'' in a series of learning tasks. Thirdly, we discuss the advantages and disadvantages of fully-automatic vs. interactive predicate invention. The question is how to create meaningful new predicates. Since comprehensibility and syntactical complexity are not necessarily the same, human intervention may be required if humans are to make sense of the resulting theory.
We try to attract more attention to important aspects that have not yet been recognized clearly, and still are present in the work of many authors. These aspects are illustrated by existing PI systems.

Citation: Kramer S.: Predicate Invention: A Comprehensive View, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-32, 1995.


OFAI-TR-95-31 ( 46kB g-zipped PostScript file)

Compression-Based Evaluation of Partial Determinations

Bernhard Pfahringer, Stefan Kramer

Our work tackles the problem of finding partial determinations in databases and proposes a compression-based measure to evaluate them. Partial determinations can be viewed as generalizations of both functional dependencies and association rules, in that they are relational in nature and may have exceptions. Extending the measures used for evaluating association rules, namely support and confidence, to partial determinations leads to a few problems. We therefore propose a measure based on the minimum description length (MDL) principle to remedy these problems. We assume the hypothetical task of transmitting a given database as efficiently as possible. The new measure estimates the compression achievable by transmitting partial determinations instead of the original data. It takes into account both the complexity and the correctness of a given partial determination, thus avoiding overfitting especially in the presence of noise. We also describe three different kinds of search using the new measure. Preliminary empirical results in a few boolean domains are favorable.

Citation: Pfahringer B., Kramer S.: Compression-Based Evaluation of Partial Determinations, Proc. of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Montreal, Canada, 1995.
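
The two association-rule measures mentioned in the abstract, support and confidence, can be sketched in Python as follows. The sketch uses their usual textbook definitions and does not implement the MDL-based measure proposed in the report. The function name and the toy transactions are hypothetical.

```python
def support_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    support    = fraction of transactions containing both sides
    confidence = fraction of transactions containing the antecedent
                 that also contain the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions
                 if (antecedent | consequent) <= set(t))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Toy data, invented for the example.
transactions = [{"a", "b"}, {"a"}, {"a", "b", "c"}, {"c"}]
support, confidence = support_confidence(transactions, {"a"}, {"b"})
```

Here the rule a -> b holds in two of four transactions (support 0.5) and in two of the three transactions that contain a (confidence 2/3).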


OFAI-TR-95-30 ( 184kB g-zipped PostScript file)

Multimedia Information Systems in Open-World Domains

Marcus Herzog, Paolo Petta

The use of different media types in information systems makes it possible to communicate semantically expressive data of problems encountered in open-world domains. As a consequence, new methodologies have to be developed to cope with the rich semantics of data in such multimedia information systems. We present WERKL, a system architecture that takes into account the dependency between the semantic and the architectonic spaces as found in real-world problems. Within WERKL, we give a description of the exploration process that supersedes the search paradigm of traditional information systems. We further discuss implementation issues and demonstrate the applicability of this architecture by reviewing on-going projects based on WERKL.

Citation: Herzog M., Petta P.: Multimedia Information Systems in Open-World Domains, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-30, 1995.


OFAI-TR-95-29 ( 43kB g-zipped PostScript file)

The Generation of Idiomatic and Collocational Expressions

Johannes Matiasek

Collocations whose semantic content is not or only partially composed from the semantic content of their parts are often viewed as problematic for generation. In this paper a tactical generator combining FUF as the generation engine and HPSG as the grammar framework is presented. It is shown that the lexicon-driven approach to syntactic and semantic processing is well-suited for the generation of idioms exhibiting various degrees of noncompositionality and syntactic restrictions.

Citation: Matiasek J.: The Generation of Idiomatic and Collocational Expressions, Proc. of 13th European Meeting on Cybernetics and Systems Research (EMCSR'96), Vienna, Austria, April 9-12, 1996.


OFAI-TR-95-28

Opportunities and Risks of Innovation: The Case of Automatizing Cognitive Processes

Franz Josef Radermacher

Citation: Radermacher F.J.: Opportunities and Risks of Innovation: The Case of Automatizing Cognitive Processes, Translation of an edited transcript of a presentation given at the anniversary celebration ``10 Years of Austrian Research Institute for Artificial Intelligence/25 Years of Austrian Society for Cybernetic Studies'' at the Austrian Federal Ministry for Science and Research on November 17, 1994.


OFAI-TR-95-27 ( 173kB g-zipped PostScript file)

Applications of Machine Learning to Music Research: Empirical Investigations into the Phenomenon of Musical Expression

Gerhard Widmer

This chapter describes an application of machine learning techniques to the study of a fundamental phenomenon in tonal music. Learning algorithms are described that induce general rules of expressive music performance from examples of real performances by musicians. Motivated by the insight that general knowledge about music plays an essential role in the way humans learn this task, we present two alternative approaches to knowledge-based learning. In both cases, the domain knowledge provided to the learner is based on established theories of tonal music. Experimental results show that both approaches lead to a significant improvement of the learning results, compared to purely inductive learning.
However, this project is more than basic machine learning research. Due to its thorough grounding in music theory, the project can also be viewed as a contribution to the scientific field of music research or musicology; it has produced results that have found their way also into the literature of that scientific discipline. These will also be touched on in this chapter.

Citation: Widmer G.: Applications of Machine Learning to Music Research: Empirical Investigations into the Phenomenon of Musical Expression, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-27, 1995.


OFAI-TR-95-26 ( 13kB g-zipped PostScript file)

Artificial Intelligence for Decision Support: Needs, Possibilities, and Limitations in ICU

Silvia Miksch

This paper discusses how Artificial Intelligence (AI) could be used for decision support in modern intensive care units (ICUs), namely using knowledge-based techniques. First, the specific needs for decision support in ICU will be analyzed, which results in the most urgent need with regard to the different tasks of monitoring and therapy planning. Second, a definition for AI will be presented. Third, methods to solve the two essential parts of monitoring and therapy planning, namely data validation and abstraction in real-world environments, will be exemplified. Finally, basic requirements and limitations for knowledge-based decision support will be summarized.

Citation: Miksch S.: Artificial Intelligence for Decision Support: Needs, Possibilities, and Limitations in ICU, 10th Postgraduate Course in Critical Care Medicine A.P.I.C.E.'95, Springer, 1995.


OFAI-TR-95-25 ( 112kB g-zipped PostScript file)

Constraint Logic Programming for Structure-Based Reasoning about Dynamic Physical Systems

Yousri El Fattah

The paper describes a constraint logic programming approach for reasoning about dynamic physical systems based on structure. The approach takes a bond graph model of a system and computes a causal graph representation of its causal structure. Causal graphs can be used for causal explanations and for explicating the effect of modeling abstractions on causal structure. The paper shows how causal graphs can be used to compute a simulation model of a system in the form of a set of differential algebraic equations. The topological properties of the causal graph determine whether a simulation model is regular, i.e., conforming to the criterion of ``real-time representation'' or causality. In case the simulation model is not regular, a method is given for identifying all causal problems of a bond graph model.

Citation: El Fattah Y.: Constraint Logic Programming for Structure-Based Reasoning about Dynamic Physical Systems, Submitted to: Artificial Intelligence in Engineering.


OFAI-TR-95-24

Constraint Logic Programming for Qualitative Reasoning about Dynamic Behavior

Yousri El Fattah

The paper shows how constraint logic programming over the real domain, CLP(R), can be used for qualitative reasoning (QR) about dynamic behavior. The main result is a procedure for computing qualitative solutions of ordinary differential equations (ODEs). The procedure is described for linear systems and is based on partitioning the quantity space into disjoint sub-spaces where each sub-space corresponds to a qualitatively distinct relation between variables and their derivatives. Examples are given to illustrate how the procedure can be utilized for qualitative simulation and analysis of dynamic behavior.

Citation: El Fattah Y.: Constraint Logic Programming for Qualitative Reasoning about Dynamic Behavior, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-24, 1995.
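
To make the idea of qualitatively distinct relations between variables and their derivatives concrete, the following Python sketch evaluates, for a linear system dx/dt = A x, the signs of the variables and of their derivatives at a given state. It is a plain numeric evaluation, not the CLP(R) procedure of the report. The function names, the tolerance `eps` and the example matrix are assumptions for illustration.

```python
def sign(v, eps=1e-12):
    """Return -1, 0 or +1, treating |v| <= eps as zero."""
    return (v > eps) - (v < -eps)


def qualitative_state(A, x):
    """Signs of the variables and of their derivatives for dx/dt = A x.

    A is a list of rows, x a list of variable values. States sharing
    the same pair of sign vectors lie in the same qualitative region.
    """
    dx = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return tuple(sign(v) for v in x), tuple(sign(v) for v in dx)


# Undamped oscillator: dx1/dt = x2, dx2/dt = -x1 (example system).
A = [[0.0, 1.0], [-1.0, 0.0]]
state_signs, derivative_signs = qualitative_state(A, [1.0, 0.0])
```

At the state (1, 0) the first variable is positive and momentarily steady, while the second is zero and decreasing.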


OFAI-TR-95-23 ( 354kB g-zipped PostScript file)

A Machine Learning Analysis of Expressive Timing in Pianists' Performances of Schumann's ``Träumerei''

Gerhard Widmer

The paper describes a recent attempt at reconstructing, by means of machine learning techniques, expressive performance skills from examples of real musical performances. An inductive machine learning algorithm is used to analyze the expressive timing (rubato) patterns in some actual performances by various famous pianists of Robert Schumann's ``Träumerei'' (from ``Kinderszenen'', op.15). Two approaches, based on the same learning algorithm but using different vocabularies for describing example performances and formulating the rules, are described. The experimental results are quite interesting and instructive, but they also point to some rather serious limitations of the available data collection.

Citation: Widmer G.: A Machine Learning Analysis of Expressive Timing in Pianists' Performances of Schumann's ``Träumerei'', Proc. of the Stockholm Symposium on Generative Grammars for Music Performance, Royal Institute of Technology (KTH), Stockholm, May 1995.


OFAI-TR-95-22 ( 30kB g-zipped PostScript file)

Using Two-Level Morphology as a Generator-Synthesizer Interface in Concept-to-Speech Generation

Georg Niklfeld, Hannes Pirker, Harald Trost

In a project for the development of a concept-to-speech system for German, we apply extended two-level-morphology (Trost 1991) to provide a unified solution to the tasks of morphotactics, segmental (morpho)phonology, syllabification and assignment of stress. Starting from a lexeme-based lexicon, we show that a declarative two-level-implementation of a single rule-corpus complemented with feature filters is sufficient for a comprehensive account of the various mutual influences holding between separate phonological dimensions in the phonology of German.
Information from higher levels of linguistic structure, up to textual representation, can be exploited in our system by performing a look-up of relevant feature-values through the filter conditions.

Citation: Niklfeld G., Pirker H., Trost H.: Using Two-Level Morphology as a Generator-Synthesizer Interface in Concept-to-Speech Generation, Proc. of the HCM-Workshop on Spoken Dialogue and Discourse, Dublin, April 1, 1995.


OFAI-TR-95-21 ( 550kB g-zipped PostScript file)

Fact-finding Committee Work: Data Analysis beyond Single Sources

Silvia Miksch, Johannes Gaertner

Data analysis in real-world environments is not well defined. Data are usually more faulty than expected and the knowledge available is fuzzy and incomplete. Additionally, data analysis has to deal with different observation frequencies, different regularities, and different data types. We propose a fact-finding process that takes the characteristics of real-world problems into account. All kinds of information available should be used to interpret the context. Therefore continuously and discontinuously assessed numerical data as well as qualitative data are used. Our approach consists of three main components based on temporal ontologies: data validation, data interpretation, and task-adequate visualization. The data validation process classifies the input data according to their reliability. The data interpretation process leads to unified qualitative descriptions of point and interval data. Finally, this gathered and derived information is visualized in a task-oriented way, with more detailed descriptions on the user's request to aid further refinements regarding validation and interpretation. Our cyclic fact-finding process is applied in two different domains, namely artificial ventilation of newborn infants and shift-scheduling. The most crucial precondition of our approach is the task-oriented structuring and interweaving of these three steps.

Citation: Miksch S., Gaertner J.: Fact-finding Committee Work: Data Analysis beyond Single Sources, Proc. of the International Symposium on Intelligent Data Analysis (IDA-95), Baden-Baden, Germany, August 17-19, 1995.


OFAI-TR-95-20 ( 116kB g-zipped PostScript file)

A generalized view on learning in feedforward neural networks

Georg Dorffner

In this paper we introduce a generalized view on feedforward neural networks. In this view, well-known network types like multilayer perceptrons and radial basis function networks are just a few of many possibilities in a virtual space of neural network types, spanned by the three dimensions propagation rule, transfer function, and learning rule. We list several examples of other combinations of values along these dimensions and discuss the advantages of such a view. The goal of depicting neural networks this way is to arrive at strategies to find optimal neural network solutions for given data sets, aided by statistical data analysis to identify the best method.

Citation: Dorffner G.: A generalized view on learning in feedforward neural networks, In: Cromme L., Wille J., Kolb T.(eds.): CoWAN'94, Technische Universitaet Cottbus, Reihe Mathematik M-01/1995, 1995.


OFAI-TR-95-19 ( 54kB g-zipped PostScript file)

Classification through Hyperplane Fitting with Feedforward Neural Networks

Georg Dorffner

This paper introduces and demonstrates a novel (or at least unusual) way of using feedforward neural networks for classification, inspired by a technique known from statistics. Usually, hidden units of a network are considered to define hyperplanes which separate clusters or sub-classes. The network defined in this paper uses the hyperplanes to represent clusters or sub-classes of points by fitting them as in regression. An update and gradient descent learning rule is defined, and results from XOR and the benchmark sonar data are used for illustration. The resulting network is reminiscent of some kind of ``inverse'' competitive learning or LVQ, and has different qualitative properties than multilayer perceptrons (e.g. linear separability is replaced by ``linear regressability'').

Citation: Dorffner G.: Classification through Hyperplane Fitting with Feedforward Neural Networks, Submitted to: International Conference on Artificial Neural Networks (ICANN-95), Paris.


OFAI-TR-95-18

Integrating Knowledge-Based Components Into Neural Network Software: A Case Study for ECANSE

Erich Prem

The aim of this paper is to propose an integration of knowledge transfer methods with the neural network software engineering tool ECANSE. A group of techniques is selected from the wide range of existing approaches so as to allow easy integration of these techniques into ECANSE. Constraints for selections include (i) ease of programming, (ii) availability of knowledge, (iii) use of standard network architectures. Changes that are necessary in the conceptual frame of ECANSE are described. Several modules which implement the knowledge transfer techniques are specified. One general guideline in developing this specification has been to allow a quick implementation of a prototype. At several points throughout this paper, however, extensions to this first approach are mentioned which can easily be added in future versions of ECANSE.

Citation: Prem E.: Integrating Knowledge-Based Components Into Neural Network Software: A Case Study for ECANSE, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-18, 1995.


OFAI-TR-95-17

Connectionist Knowledge Transfer Methods: An Overview

Erich Prem

Ever since the introduction of neural networks there have been lively discussions centered on the duality of knowledge-based and connectionist methods. Whereas the representations in conventional rule-bases are easily accessible to humans, a neural network's weights are practically always difficult to interpret. On the other hand, networks can usually only be put to work through extensive training on sample data, whereas rule-based systems are explicitly designed through the work of a knowledge engineer. Therefore, any a priori knowledge one may have about a task apart from training examples can usually not be used to speed up or improve network training. The behavior of trained networks, on the other hand, can often only be poorly understood. This is why many authors have been searching for methods to combine the virtues of both knowledge-based and connectionist approaches. This paper provides an overview of existing approaches which extract knowledge from neural networks or inject knowledge about the task into the net.

Citation: Prem E.: Connectionist Knowledge Transfer Methods: An Overview, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-17, 1995.


OFAI-TR-95-16 ( 33kB g-zipped PostScript file)

Statistical Evaluation of Neural Network Experiments: Minimum Requirements and Current Practice

Arthur Flexer

This work concerns the necessity of statistical evaluation of neural network experiments. This necessity is motivated by applying fundamental notions of statistical hypothesis testing to neural network research. Minimum requirements concerning statistical evaluation are developed and the appropriate statistical techniques are introduced. Articles from two leading neural network journals are examined and criticized for the lack of statistical evaluation they contain.

Citation: Flexer A.: Statistical Evaluation of Neural Network Experiments: Minimum Requirements and Current Practice, in Trappl R., Cybernetics and Systems '96, Proceedings of the 13th European Meeting on Cybernetics and Systems Research. Austrian Society for Cybernetic Studies, Vienna, 2 vols., pp.1005-1008, 1996. Also available as: TR-95-16.


OFAI-TR-95-15 ( 33kB g-zipped PostScript file)

Requirements on Linguistic Knowledge Sources for Multilingual Generation

Johannes Matiasek, Harald Trost

Multilingual generation is often regarded as a possible alternative to machine translation in a number of application scenarios. The expectation is that for these applications multilingual generation will prove to be an inherently easier solution. In this paper we investigate whether this claim is substantiated. In particular, we consider the linguistic knowledge sources needed for multilingual generation and compare them to those needed for machine translation. By studying some examples in detail we conclude, not surprisingly, that this does not seem to be the case in general. Only by shifting the emphasis from producing equivalent texts to producing texts that convey the same message can this goal be achieved. On the other hand, such an approach places additional demands on other components of the system.

Citation: Matiasek J., Trost H.: Requirements on Linguistic Knowledge Sources for Multilingual Generation, Proc. of the IJCAI-95 Workshop on Multilingual Generation.


OFAI-TR-95-14 ( 58kB g-zipped PostScript file)

Implementing HPSG in FUF -- An Experiment in the Reusability of Linguistic Resources

Johannes Matiasek, Harald Trost

In practical systems it is often required to reuse existing resources. Such an approach clearly has advantages: it speeds up the development process considerably if one doesn't have to start from scratch. However, combining resources not designed to work together is not a trivial task. An HPSG grammar of German has been implemented in FUF, a unification-based text generator. Although FUF is largely theory-neutral, some of its characteristics diverge from the processing requirements imposed by HPSG in its strict sense. The most prominent discrepancy is that HPSG, being a lexically driven formalism, lends itself best to a head-driven bottom-up processing strategy, whereas FUF, at least by default, uses a top-down, category-driven approach. FUF also lacks a morphological component able to deal with the rich German inflectional system. Therefore a two-level morphology component, X2MorF, has been added. We describe the problems arising when integrating these three resources and the transformations and adaptations made to them, leading to a wide-coverage tactical generator for German.

Citation: Matiasek J., Trost H.: Implementing HPSG in FUF -- An Experiment in the Reusability of Linguistic Resources, in: Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), Copenhagen, Denmark, 1996


OFAI-TR-95-13 ( 78kB g-zipped PostScript file)

Knowledge Resources for the Text Planner: The Domain Model and Plans for Discourse

Elizabeth Garner

The task of the text planner is to convert an input specification into an output suitable for manipulation by the tactical generator; a task which involves both content selection and content organisation. This paper takes a look at some of the resources needed for this. Using the results of a corpus analysis I first show how both the underlying domain and communication knowledge may be modelled in the Knowledge Representation Language LOOM. Having looked at possible methods of specifying input, I go on to discuss how certain types of variation may be expressed by discourse plans. Then, taking examples from the corpus, I demonstrate how these may be implemented by the use of LOOM production rules. Finally, I look at the form of the output of the production rules and suggest what further resources are necessary in order to arrive at the desired output for the tactical generator.

Citation: Garner E.: Knowledge Resources for the Text Planner: The Domain Model and Plans for Discourse, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-13, 1995.


OFAI-TR-95-12

Utilizing Temporal Data Abstraction for Data Validation and Therapy Planning

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

Medical diagnosis and therapy planning at modern intensive care units (ICUs) have been refined by the technical improvement of their equipment. However, the bulk of continuous data arising from complex monitoring systems in combination with discontinuously assessed numerical and qualitative data creates a rising information management problem at neonatal ICUs (NICUs). We developed methods for data validation and therapy planning which incorporate knowledge about point and interval data, as well as expected qualitative trend descriptions to arrive at unified qualitative descriptions of parameters (temporal data abstraction). Our methods are based on data-point-transformation and curve-fitting schemata which express the dynamics of and the reactions to different degrees of parameters' abnormalities as well as on smoothing and adjustment mechanisms to keep the qualitative descriptions stable. We show their applicability to detect anomalous system behavior early, to recommend therapeutic actions, and to assess the effectiveness of these actions within a certain period. We implemented our methods in VIE-VENT, an open-loop knowledge-based monitoring and therapy planning system for artificially ventilated newborn infants. The applicability and usefulness of our approach are illustrated by examples of VIE-VENT.

Citation: Miksch S., Horn W., Popow C., Paky F.: Utilizing Temporal Data Abstraction for Data Validation and Therapy Planning, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-12, 1995.


OFAI-TR-95-11

What Governs Autonomous Agents

Robert Trappl, Paolo Petta

In recent years, proven artificial intelligence techniques have been rapidly adopted in the expanding field of computer animation. With the increasing complexity and interactivity of the environments in which computer animations involving autonomous actors are set, the interest in further contributions is extended to include computational models of behavioural and cognitive phenomena. In this paper we review some representative examples of current artificial intelligence research efforts that could point towards solutions satisfying these new challenging needs.

Citation: Trappl R., Petta P.: What Governs Autonomous Agents, Proc. Computer Animation '95, Geneva, IEEE Press, Los Alamitos, CA, 1995.


OFAI-TR-95-10 ( 52kB g-zipped PostScript file)

Compression-Based Discretization of Continuous Attributes

Bernhard Pfahringer

Discretization of continuous attributes into ordered discrete attributes can be beneficial even for propositional induction algorithms that are capable of handling continuous attributes directly. Benefits include possibly large improvements in induction time, smaller sizes of induced trees or rule sets, and even improved predictive accuracy. We define a global evaluation measure for discretizations based on the so-called Minimum Description Length (MDL) principle from information theory. Furthermore we describe the efficient algorithmic usage of this measure in the MDL-Disc algorithm. The new method solves some problems of alternative local measures used for discretization. Empirical results in a few natural domains and extensive experiments in an artificial domain show that MDL-Disc scales up well to large learning problems involving noise.
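To illustrate the general idea of scoring a whole discretization by its description length, here is a minimal Python sketch. It is not the MDL-Disc measure itself, which the abstract does not spell out; it uses a generic two-part code chosen for illustration. The function names, the toy data, and the assumption that cut points are chosen among the n-1 boundaries between n distinct sorted values are all hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter, defaultdict


def entropy_bits(labels):
    """Shannon entropy (bits per example) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def description_length(values, labels, cuts):
    """Two-part code length in bits for one candidate discretization.

    Model part (assumption): bits to say which of the n-1 possible
    boundaries are used as cut points.
    Data part (assumption): bits to encode the class labels inside each
    interval, using the interval's own class entropy.
    """
    cuts = sorted(cuts)
    n = len(values)
    model_bits = math.log2(math.comb(n - 1, len(cuts)))
    buckets = defaultdict(list)
    for value, label in zip(values, labels):
        # bisect_right gives the index of the interval the value falls into.
        buckets[bisect_right(cuts, value)].append(label)
    data_bits = sum(len(b) * entropy_bits(b) for b in buckets.values())
    return model_bits + data_bits


# Hypothetical toy attribute: the class changes between 4 and 5.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a"] * 4 + ["b"] * 4

for cuts in ([], [4.5], [2.5, 4.5, 6.5]):
    print(cuts, round(description_length(values, labels, cuts), 2))
```

On this toy data the single cut at 4.5 gets the shortest code: no cut leaves the labels expensive to encode, while extra cuts pay model bits without saving any data bits. Because the score is computed over the whole discretization at once, it acts as a global measure rather than a local, per-split one.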

Citation: Pfahringer B.: Compression-Based Discretization of Continuous Attributes, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-10, 1995.


OFAI-TR-95-09 ( 46kB g-zipped PostScript file,  160kB PDF file)

OEFAI clp(q,r) Manual Rev. 1.3.2

Christian Holzbaur

This Manual documents a Prolog implementation of clp(q,r), based on SICStus featuring extensible unification via attributed variables. The system described in this document is an instance of the general Constraint Logic Programming scheme introduced by [Jaffar and Michaylov 87]. The implementation is at least as complete as other existing clp(r) implementations: It solves linear equations over rational or real valued variables, covers the lazy treatment of nonlinear equations, features a decision algorithm for linear inequalities that detects implied equations, removes redundancies, performs projections (quantifier elimination), allows for linear dis-equations, and provides for linear optimization.

Citation: Holzbaur C.: OEFAI clp(q,r) Manual Rev. 1.3.2, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-09, 1995.


OFAI-TR-95-08 ( 45kB g-zipped PostScript file)

Grounding and the Entailment Structure in Robots and Artificial Life

Erich Prem

This paper is concerned with foundations of ALife and its methodology. A brief look into the research program of ALife serves to clarify its goals, methods and subfields. It is argued that the field of animat research within ALife follows a program which is considerably different from the rest of ALife endeavours. The simulation versus non-simulation debate in behavior-based robotics is revisited in the light of ALife criticism and Simon's characterization of the sciences of the artificial. It reveals severe methodological problems, or dangers at least, which can only be overcome by reconsidering naturalness in the study of ALife. Reconsidering Simon's arguments, I suggest that ALife is not a science of the artificial. Furthermore, it is argued that life-as-it-could-be is an ill-defined term if it is supposed to rescue the research program of ALife into a domain where naturalness is not important. This is so because it is and must be based on life-as-we-know-it as long as there is no better (necessary and sufficient) characterization of life. A comparison of ALife with other such sciences like Artificial Intelligence and Cognitive Science shows similar problems in these areas, and a similar solution: grounding. The practical upshot of all this is that Artificial Life must indeed be more concerned with natural systems than it may itself consider necessary. Building robots, building models (instead of simulations), and grounding systems may be ways forward for ALife in the future.

Citation: Prem E.: Grounding and the Entailment Structure in Robots and Artificial Life, Proc. Third International Conference on Artificial Life, Granada, Spain.


OFAI-TR-95-07 ( 45kB g-zipped PostScript file)

Dynamic Symbol Grounding, State Construction and the Problem of Teleology

Erich Prem

Symbol grounding originated within the connectionist-symbolic debate as a way to bridge the gap between the two approaches. This paper provides an overview of recent results concerning symbol grounding, which are critically reviewed here. A thorough analysis reveals that symbol grounding parallels transcendental logic and is best viewed as automated model construction. If this diagnosis is true, the necessary next question must be which sort of models are generated in symbol grounding systems. The answer to this question very much depends on the kind of network architecture employed for grounding. An illustrative neural network architecture is used to explain how a dynamic symbol grounding system generates a formal model. It is argued that, depending on the architecture, very different kinds of signs ranging from input to goal descriptions can be grounded.

Citation: Prem E.: Dynamic Symbol Grounding, State Construction and the Problem of Teleology, Proc. International Workshop on Artificial Neural Networks 1995, Torremolinos, Spain.


OFAI-TR-95-06 ( 144kB g-zipped PostScript file,  123kB PDF file)

Connectionists and Statisticians, Friends or Foes?

Arthur Flexer

This investigation of relationships between the field of artificial neural networks (connectionism) and statistics starts with a look at relevant work based on a classification of possible points of contact. Then follows a distinction between connectionism seen as a tool for data analysis (engineering connectionism) and seen as a model for human thinking or, as one might say, a tool for cognitive or biological modeling (explanatory connectionism). It will be argued that statistics will have a major impact on the former but a rather minor one on the latter. As a consequence, the gap between applied neural network research and research concerning cognitive modeling with artificial neural networks will become even bigger than it already is. Statistics will be adopted as the theory of engineering connectionism and will therefore entail its development from a purely empirical to a full-grown theoretical science. Explanatory connectionism has its own problems and will have to undergo its own changes. Consequently, it will finally be seen as a science of its own, independent of mere data analysis purposes.

Citation: Flexer A.: Connectionists and Statisticians, Friends or Foes?, in Mira J. & Sandoval F. (eds.), From Natural to Artificial Neural Computation, Proc. International Workshop on Artificial Neural Networks, Malaga-Torremolinos, Spain, June, Springer, LNCS 930, pp. 454-461, 1995. Also available as: TR-95-06.


OFAI-TR-95-05 ( 1181kB g-zipped PostScript file)

Shift-Scheduling with the ``Projections First Strategy''

Johannes Gaertner, Silvia Miksch

Shift-scheduling for employees is highly complex due to the size of each solution and the size of the solution space. We model this real-world problem as a CSP and simplify this NP-hard problem dramatically with an algorithm called "projections first strategy". The projections first strategy benefits from the interaction with the user (e.g., variable ordering) and the extensive usage of aggregated features of possible solutions (projections) to prune the search space and to find building blocks to build up solutions quickly. Our approach captures a human problem solving technique algorithmically, namely modularisation. To give an idea: The merchant asked: How to lay my bricks to get 48 offices, an inviting entrance hall, .... The craftsman answered: First we have to make up our mind on the number of floors, then on the number of rooms. Several floors will be equal. .... Don't bother with the bricks at the beginning. Projections also make recomputation easier when constraints change. Projections allow chronological backtracking to move through levels of abstraction and to reuse results. Applying a realistic example, we compare the complexity of our approach with backtracking. In good and bad cases our approach reduces the complexity dramatically.

Citation: Gaertner J., Miksch S.: Shift-Scheduling with the ``Projections First Strategy'', Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-05, 1995.


OFAI-TR-95-04 ( 1188kB g-zipped PostScript file)

Automated Data Validation and Repair Based on Temporal Ontologies

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

Most knowledge-based monitoring and therapy planning systems neglect the importance of data validation. Real data are more faulty than expected. Moreover, only reliable data may be used for effective and efficient therapy planning. Additionally, most systems do not take into account the various kinds of data and the various frequencies at which they are usually available. We propose automated data validation methods which consider the various kinds of real data and which are based on temporal ontologies (time points, time intervals, and trends) in order to arrive at reliable data. Furthermore, our approach includes repair and adjustment methods for correcting wrong or ambiguous data. Our approach benefits from dynamically derived qualitative data-point- and trend-categories which result in unified qualitative descriptions of parameters and overcome the limitations of comparison with predefined static thresholds. Our methods are applicable to domains where different kinds of data are available and where no reliable structure-function model exists because the underlying mechanism is only poorly understood. We applied them in VIE-VENT, an open-loop knowledge-based system for artificially ventilated newborn infants.

Citation: Miksch S., Horn W., Popow C., Paky F.: Automated Data Validation and Repair Based on Temporal Ontologies, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-04, 1995.


OFAI-TR-95-03 ( 68kB g-zipped PostScript file)

A Tight Integration of Pruning and Learning

Johannes Fürnkranz

This paper outlines some problems that may occur with Reduced Error Pruning in rule learning algorithms. In particular we show that pruning complete theories is incompatible with the separate-and-conquer learning strategy that is commonly used in propositional and relational rule learning systems. As a solution we propose to integrate pruning into learning and examine two algorithms, one that prunes at the clause level and one that prunes at the literal level. Experiments show that these methods are not only much more efficient, but also able to achieve small gains in accuracy by solving the outlined problem.

Citation: Fürnkranz J.: A Tight Integration of Pruning and Learning, in N. Lavrac and S. Wrobel (eds.), Proceedings of the 8th European Conference on Machine Learning (ECML-95), pp. 291-294, Crete, Greece, 1995.


OFAI-TR-95-02 ( 51kB g-zipped PostScript file)

Compression-Based Feature Subset Selection

Bernhard Pfahringer

Irrelevant and redundant features may reduce both predictive accuracy and comprehensibility of induced concepts. Most common Machine Learning approaches for selecting a good subset of relevant features rely on cross-validation. As an alternative, we present the application of a particular Minimum Description Length (MDL) measure to the task of feature subset selection. Using the MDL principle allows taking into account all of the available data at once. The new measure is information-theoretically plausible and yet still simple and therefore efficiently computable. We show empirically that this new method for judging the value of feature subsets is more efficient than and performs at least as well as methods based on cross-validation. Domains with both a large number of training examples and a large number of possible features yield the biggest gains in efficiency. Thus our new approach seems to scale up better to large learning problems than previous methods.

Citation: Pfahringer B.: Compression-Based Feature Subset Selection, Proc. IJCAI'95 Workshop on Data Engineering for Inductive Learning


OFAI-TR-95-01

Artificial Intelligence-Forschung in Österreich 1994: 152 Projekte, 256 Personen, 43 Institutionen, 1329 Publikationen

Robert Trappl, Johannes Matiasek, Gerda Helscher

As a follow-up to the first study from 1990 and the second from 1992, a questionnaire survey was again used to obtain an overview of the rapidly changing Austrian artificial intelligence research landscape: 1. what is being researched in the field of AI in Austria, 2. by whom, 3. where, 4. for how long, 5. funded by whom, and 6. with what results. The response rate was 98%. In total, 152 projects (1992: 133, 1990: 78) were recorded, 52 of which have already been completed, an increase of 95% within 4 years. The spectrum of projects ranges from bidirectional heuristic search, the automatic generation of quotations, the localisation of faults in memory chips with the help of neural networks, an AI system for detecting coronary heart disease, an expert system for energy management, and intelligent autonomous software agents, to AI models of the perception of music and a project supporting efforts to avoid or end wars by means of AI methods. Research focuses on knowledge representation, expert systems, and the area of neural networks/connectionism. 40% of the projects are in basic research, 41% in applied research, and 19% in development. 256 persons (1992: 237, 1990: 142) have worked on these research projects. There are currently around 100 full-time to part-time AI researchers in Austria. As before, Vienna stands out in the Austrian AI research landscape as a disproportionately large AI centre (an "AI-Wasserkopf"): three times as many projects are carried out in Vienna as in all other federal provinces combined. AI research is currently carried out at 43 institutions (1992: 37, 1990: 26), an increase of about 65% compared with 1990. About 40% of all projects are carried out by only five institutes which, because several are headed by the same person, are led by a total of only three scientists.
The mean project duration is 29 months (1992: 28, 1990: 24), a tendency, albeit a weak one, towards longer-running projects. The percentage of funded projects has declined from 65% (1992) to 58%. 67% of project funding comes from government funding agencies and 33% from companies; compared with 1992, this is a shift of 10% of funding from the state to companies and thus a disproportionate decline in government funding. The list of all works published so far by Austrian AI researchers now comprises 1329 publications (1992: 963, 1990: 563), an increase of 136% since 1990. Many of them have appeared in renowned international journals. 167 cooperating research institutes and companies were named (1992: 132, 1990: 66), an increase of 153% since 1990. 62 of the partners were abroad, an increase of 169% compared with 1990. 80% of the project partners are located in the EU; the USA has dropped from 2nd to 5th place in the ranking, while Germany has moved up from 4th to 1st place. Austrian AI research is of international standard, as is also evident from the participation of Austrian research institutions in numerous EU projects and their admission to several EU Networks of Excellence. To keep pace, the following are required: further intensification of international contacts, and, comparable to the western neighbouring countries, an increased willingness both of industry to cooperate and of the public sector to fund basic and applied research.
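The growth percentages quoted in this abstract can be checked from the 1990 and 1994 counts it gives. The short Python sketch below does that arithmetic. The function and dictionary names are illustrative, and only figures for which the abstract states both counts are included.

```python
def growth_percent(old, new):
    """Relative increase from old to new, in percent."""
    return 100.0 * (new - old) / old


# (1990 count, 1994 count) as stated in the abstract.
survey_counts = {
    "projects": (78, 152),
    "institutions": (26, 43),
    "publications": (563, 1329),
    "cooperation partners": (66, 167),
}

for name, (count_1990, count_1994) in survey_counts.items():
    print(f"{name}: {growth_percent(count_1990, count_1994):.0f}%")
```

Rounded to whole percent, this gives 95%, 65%, 136%, and 153%, matching the figures in the abstract.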

Citation: Trappl R., Matiasek J., Helscher G.: Artificial Intelligence-Forschung in Österreich 1994: 152 Projekte, 256 Personen, 43 Institutionen, 1329 Publikationen, Austrian Research Institute for Artificial Intelligence, Vienna, TR-95-01, 1995.


OFAI-TR-94-34 ( 65kB g-zipped PostScript file)

VIE-CBR: Vienna Case-Based Reasoning Tool, Version 1.0 - Programmer's and Installation Manual

Johann Petrak

VIE-CBR is a collection of Common-LISP programs for storing, manipulating and comparing cases within one or more case libraries. It allows the definition of similarity functions for case attributes and cases and can be used for similarity based case retrieval and classification. This document describes the functions and data structures of the program as well as the necessary steps for installation.
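The retrieval and classification idea described above can be sketched in a few lines. VIE-CBR itself is written in Common LISP and its actual functions are documented in the report. The Python below is only an illustrative sketch, and every name in it (`numeric_sim`, `exact_sim`, `case_similarity`, `retrieve`, `classify`), the weighting scheme, and the toy case library are assumptions rather than the tool's API.

```python
from collections import Counter


def numeric_sim(value_range):
    """Build a similarity function for numeric attributes (1 = identical)."""
    def sim(a, b):
        return max(0.0, 1.0 - abs(a - b) / value_range)
    return sim


def exact_sim(a, b):
    """Similarity for symbolic attributes: 1 if equal, else 0."""
    return 1.0 if a == b else 0.0


def case_similarity(query, case, attr_sims, weights):
    """Weighted average of the per-attribute similarities."""
    total = sum(weights[name] for name in attr_sims)
    score = sum(weights[name] * sim(query[name], case[name])
                for name, sim in attr_sims.items())
    return score / total


def retrieve(query, library, attr_sims, weights, k=1):
    """Return the k cases in the library most similar to the query."""
    ranked = sorted(library,
                    key=lambda c: case_similarity(query, c, attr_sims, weights),
                    reverse=True)
    return ranked[:k]


def classify(query, library, attr_sims, weights, label_key, k=3):
    """Predict a label by majority vote among the k most similar cases."""
    neighbours = retrieve(query, library, attr_sims, weights, k)
    return Counter(c[label_key] for c in neighbours).most_common(1)[0][0]


# Hypothetical toy case library.
library = [
    {"duration": 2, "region": "A", "outcome": "settled"},
    {"duration": 10, "region": "B", "outcome": "ongoing"},
    {"duration": 3, "region": "A", "outcome": "settled"},
]
attr_sims = {"duration": numeric_sim(10.0), "region": exact_sim}
weights = {"duration": 1.0, "region": 1.0}
query = {"duration": 4, "region": "A"}

print(retrieve(query, library, attr_sims, weights, k=1))
print(classify(query, library, attr_sims, weights, "outcome", k=3))
```

Keeping the attribute-level similarity functions separate from the case-level aggregation mirrors the abstract's distinction between similarity functions for case attributes and for cases, and lets each attribute type get its own notion of closeness.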

Citation: Petrak J.: VIE-CBR: Vienna Case-Based Reasoning Tool, Version 1.0 - Programmer's and Installation Manual, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-34, 1994.


OFAI-TR-94-33 ( 91kB g-zipped PostScript file)

Machine Learning Methods for International Conflict Databases: A Case Study in Predicting Mediation Outcome

Johannes Fürnkranz, Johann Petrak, Robert Trappl, Jacob Bercovitch

This paper tries to identify rules and factors that are predictive of the outcome of international conflict management attempts. We use C4.5, an advanced Machine Learning algorithm, for generating decision trees and prediction rules from cases in the CONFMAN database. The results show that simple patterns and rules are often not only more understandable, but also more reliable than complex rules. Simple decision trees are able to improve the chances of correctly predicting the outcome of a conflict management attempt. This suggests that mediation is more repetitive than conflicts per se, where such results have not been achieved so far.

Citation: Fürnkranz J., Petrak J., Trappl R., Bercovitch J.: Machine Learning Methods for International Conflict Databases: A Case Study in Predicting Mediation Outcome, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-33, 1994.


OFAI-TR-94-32 ( 104kB g-zipped PostScript file)

The Possible Contribution of AI to the Avoidance of Crises and Wars: Using CBR Methods with the KOSIMO Data Base of Conflicts

Johann Petrak, Robert Trappl, Johannes Fürnkranz

This paper presents the application of Case-Based Reasoning methods to the KOSIMO data base of international conflicts. A Case-Based Reasoning tool, VIE-CBR, has been developed and used for the classification of various outcome variables, like political, military, and territorial outcome, solution modalities, and conflict intensity. In addition, the case retrieval algorithms are presented as an interactive, user-modifiable tool for intelligently searching the conflict data base for precedent cases.

Citation: Petrak J., Trappl R., Fürnkranz J.: The Possible Contribution of AI to the Avoidance of Crises and Wars: Using CBR Methods with the KOSIMO Data Base of Conflicts, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-32, 1994.


OFAI-TR-94-31 ( 1115kB g-zipped PostScript file)

Therapy Planning Using Qualitative Trend Descriptions

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

This paper addresses a method of therapy planning applicable in the absence of an appropriate curve-fitting model. It incorporates knowledge about data points, data intervals, and expected qualitative trend descriptions to arrive at unified qualitative descriptions of parameters (temporal data abstraction). Our approach benefits from derived qualitative values which can be used for recommending therapeutic actions as well as for assessing the effectiveness of these actions within a certain period. It results in an easily comprehensible and transparent concept of therapy planning. Furthermore, we improved the system model of data interpretation and therapy planning by using importance ranking of parameters, priority lists of attainable goals, and pruning of contradictory therapy recommendations. Our methods are applicable in domains where an appropriate curve-fitting model is not available in advance. We have applied them in the field of artificial ventilation of newborn infants. The utility of our approach is illustrated by VIE-VENT, an open-loop knowledge-based system for artificially ventilated newborn infants.

Citation: Miksch S., Horn W., Popow C., Paky F.: Therapy Planning Using Qualitative Trend Descriptions, Proc. 5th Conference on Artificial Intelligence in Medicine Europe, AIME-95, Pavia, Italy.


OFAI-TR-94-30

Auswertung der Datenbank ``International Conflict Episodes, 1945 -- 1979'' mittels des Programmpaketes C4.5

Robert Kopecny

The inductive learning program C4.5 is used to analyse R. Alker's and L. Sherman's database ``International Conflict Episodes''. This database contains 370 conflict cases, ranging from non-violent disputes to wars, in the period 1945 - 1979. In several experiments, the influence of the various case attributes on the development and outcome of the conflict is examined.

Citation: Kopecny R.: Auswertung der Datenbank ``International Conflict Episodes, 1945 -- 1979'' mittels des Programmpaketes C4.5, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-30, 1994.


OFAI-TR-94-29 ( 54kB g-zipped PostScript file)

A New MDL Measure for Robust Rule Induction

Bernhard Pfahringer

We present a generalization of a particular Minimum Description Length (MDL) measure that so far has been used for pruning decision trees only. The generalized measure is applicable to (propositional) rule sets directly. Furthermore, the new measure does not suffer from problems reported for various MDL measures in the ML literature. The new measure is information-theoretically plausible and yet still simple and therefore efficiently computable. It is incorporated in a propositional FOIL-like learner called KNOPF. We report on favorable results in various purely symbolic propositional domains. Both rule quality in terms of simplicity (and syntactic closeness to the respective underlying theory where known) and predictive accuracy of induced theories are convincing.

Citation: Pfahringer B.: A New MDL Measure for Robust Rule Induction, Extended version of a paper presented at the 8th European Conference on Machine Learning, ECML-95.


OFAI-TR-94-28 ( 261kB g-zipped PostScript file)

Efficient Pruning Methods for Relational Learning

Johannes Fürnkranz

This thesis is concerned with efficient methods for achieving noise-tolerance in Machine Learning algorithms that are capable of using relational background knowledge. While classical algorithms are restricted to learning propositional concepts in the form of decision trees or decision lists, relational learning algorithms are able to include in the learning process not only knowledge about data attributes and values, but also about relations between the attributes. As these algorithms use a more powerful representation language (they learn PROLOG programs for classification), they are part of the recent field of Inductive Logic Programming, a new research area at the intersection of Machine Learning and Logic Programming. In this work we first review several known methods for achieving noise-tolerance and put them into a unified framework and then introduce three new and improved algorithms. The two basic approaches to pruning are either to try to recognize noise in the data during the learning process (pre-pruning) or to first learn a theory from the data as they are and subsequently try to detect and correct the resulting mistakes in this theory (post-pruning). Since both approaches have their advantages, the major part of this thesis is devoted to trying to combine and integrate them into new powerful algorithms. A series of experiments with artificial and natural data sets demonstrates the usefulness of these approaches.

Citation: Fürnkranz J.: Efficient Pruning Methods for Relational Learning, Ph.D. Thesis, Technical University of Vienna, Austria, November 1994.


OFAI-TR-94-27 ( 60kB g-zipped PostScript file)

Adapting to Drift in Continuous Domains

Miroslav Kubat, Gerhard Widmer

The paper presents the system FRANN, which exploits the idea of radial-basis functions for the needs of learning in numeric domains under concept drift. The classification accuracy of the program compares favourably to that of older algorithms that are based on symbol manipulation. The system tolerates noise and is able to learn symbolic, numeric, and mixed concepts with nonlinear boundaries in environments with abrupt as well as gradual concept drift.

Citation: Kubat M., Widmer G.: Adapting to Drift in Continuous Domains, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-27, 1994.


OFAI-TR-94-26 ( 58kB g-zipped PostScript file)

Pruning Methods for Rule Learning Algorithms

Johannes Fürnkranz

In this paper we briefly review several pruning methods for relational learning algorithms and show how they are related to each other. We then report some experiments in several natural domains and try to analyse the performance of the algorithms in these domains in terms of run-time and accuracy. While some algorithms are clearly faster than others, no safe recommendation for achieving high accuracy can be given.

Citation: Fürnkranz J.: Pruning Methods for Rule Learning Algorithms, Proc. 4th Intl. Inductive Logic Programming Workshop, ILP-94, pp. 321-336, Bad Honnef/Bonn, 1994.


OFAI-TR-94-25 ( 49kB g-zipped PostScript file)

A CLP approach to Detection and Identification of Control System Component Failures

Yousri El Fattah, Christian Holzbaur

One approach to fault detection and identification (FDI) in control systems is based on computing the so-called parity relations for sensors and actuators. We state the problem of generating those parity relations in the language of constraints, and describe an implementation of a parity solver in the Constraint Programming Language CLP(Re). The solver adopts a discrete linear time-invariant (LTI) model of control systems, and outputs a set of single-component parity relations. We describe an FDI procedure, also in CLP(Re), that monitors system behavior and computes single-fault diagnoses. The CLP approach enhances the naturalness of representation of system relations, and makes use of the resolution capability of CLP both for deriving parity relations and for making diagnostic decisions. An example is given to illustrate the viability of the CLP approach.

Citation: El Fattah Y., Holzbaur C.: A CLP approach to Detection and Identification of Control System Component Failures, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-25, 1994.


OFAI-TR-94-24 ( 207kB g-zipped PostScript file)

Wissensbasierte Systeme in der Intensivmedizin: Was können sie, was sollten sie können?

Silvia Miksch

Modern intensive care units (ICUs) are equipped with complex technical devices that deliver a wealth of on-line data. This abundance of information also creates problems: the data must be trustworthy, and faulty data must be recognised as such and eliminated from the decision-making process. The course of the most important parameters must also be transparent. To assess a patient's state of health, the most important information must be filtered out and interpreted in the shortest possible time. The data must be stored and be quickly retrievable at any time. Information technology, especially the methods of artificial intelligence, can support and facilitate some of these necessary processes. Starting from an analysis of the need for, as well as the advantages and disadvantages of, information technology support, some applications from the area of knowledge-based monitoring, diagnosis, and therapy planning systems are presented. Such systems range from simple ``intelligent alarming systems'' to highly complex decision-support systems. Finally, requirements for the effective and successful planning and implementation of a monitoring and therapy planning system are formulated from a medical and a technical point of view. This contribution does not deal with ``Patient Data Management Systems'' (PDMS) but with ``Patient Management Systems'' (PMS), which build on PDMS and support physicians in the medical treatment of patients. PDMS are limited to the management of patient data.

Citation: Miksch S.: Wissensbasierte Systeme in der Intensivmedizin: Was können sie, was sollten sie können?, In EDV im Krankenhaus, Wissenschaftliche Schriftenreihe der Wissenschaftlichen Landesakademie für NÖ, Krems, 1994.


OFAI-TR-94-23 ( 166kB g-zipped PostScript file)

Toward Improving Exercise ECG for Detecting Ischemic Heart Disease with Recurrent and Feedforward Neural Nets

Georg Dorffner, Ernst Leitgeb, Heinz Koller

This paper reports on a study evaluating the usefulness of neural networks for the early detection of heart disease based on ECG and other measurements during exercise testing. Data from 350 persons who underwent stress tests consisted of patient demographic data and fifteen time frames of measurements during stress and rest. Three different neural networks, two recurrent and one feedforward using background knowledge for preprocessing, were trained and compared to the performance of skilled cardiologists. It could be shown that the best neural networks can compete with experts in classifying tests as CAD (coronary artery disease) or normal. As regards an index value expressing the likelihood of disease, to be used for monitoring the success of treatments, the neural networks outperformed classical statistical techniques published previously. This study has thus provided strong evidence in favor of using neural nets to improve the exercise ECG as a non-invasive technique for detecting heart diseases.

Keywords: Exercise ECG, Coronary Heart Disease, Medical Diagnosis, Recurrent Neural Networks, Backpropagation

Citation: Dorffner G., Leitgeb E., Koller H.: Toward Improving Exercise ECG for Detecting Ischemic Heart Disease with Recurrent and Feedforward Neural Nets, Proc. IEEE Workshop on Neural Networks in Signal Processing, 1994.


OFAI-TR-94-22 ( 57kB g-zipped PostScript file)

Synergies between Statistical Data Analysis and Neural Networks in the Control of Rotary Blood Pumps

Georg Dorffner, Christian Stöcklmayer, Christian Schmidt, Heinrich Schima

In this paper we report on the application of multilayer perceptrons to three important tasks in the control of rotary blood pumps, namely the estimation of left atrium pressure, and the indication of suction, as well as danger of suction, in the left atrium. Special focus is placed on the value of traditional techniques for statistical data analysis such as principal component analysis (PCA) as tools to guide the design of the neural network solution. Eleven parameters derived from actual measurements during the performance of the pump were available as input for the three tasks. With the help of PCA, they could be reduced to three major components and thus visualized. This visualisation served as guidance for the proper choice of network, and as a tool for initialization to improve learning.

Keywords: Principal Component Analysis, Multilayer Perceptrons, Initialization, Medical Application, Rotary Blood Pumps, Control

Citation: Dorffner G., Stöcklmayer C., Schmidt C., Schima H.: Synergies between Statistical Data Analysis and Neural Networks in the Control of Rotary Blood Pumps, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-22, 1994.


OFAI-TR-94-21 ( 48kB g-zipped PostScript file)

New AI: Naturalness Revealed in the Study of Artificial Intelligence

Erich Prem

This paper seeks to describe the science of a ``New AI''. (I tried to use ``New AI'' (with a capital N) when regarding a certain set of disciplines as one coherent field of science. But since I believe that ``New AI'' is not a new AI, but only the right AI, I may not have succeeded in a consistent use of this notation---for which I apologize.) It explains why this new development belongs to Artificial Intelligence and why it should be called new. A list of criteria is established which a new AI must satisfy. Artificial life, autonomous agents and the field of symbol grounding are revisited in the light of this explanation. One bottom-up approach to symbol grounding is described. It can be seen as a step towards new production systems or heuristics. The central thesis of this paper is that in the course of ever advancing computer simulations of aspects of intelligence the central goal of AI has been lost from sight. The different approaches to New AI share commonalities in that they (i) question the applicability of a reductionist approach and (ii) try to open the closed formal AI programs to environmental interaction. The third failure of old AI---being a theory of only aspects of intelligence---still pertains to new AI. However, first steps are being taken in the right direction, towards a more natural way of studying intelligence.

Citation: Prem E.: New AI: Naturalness Revealed in the Study of Artificial Intelligence, in: Neural Networks and a New AI (G. Dorffner Ed.), Chapman and Hall, London, 1995.


OFAI-TR-94-20 ( 55kB g-zipped PostScript file)

Symbol Grounding and Transcendental Logic

Erich Prem

Symbol Grounding tries to answer the question as to how it is possible for a computer program to use symbols which are not arbitrarily interpretable. Whereas the signs in conventional programs are just ``parasitic on the meaning in our heads'', grounded symbols should possess at least some ``intrinsic meaning.'' This paper gives a brief overview of what Symbol Grounding is and summarizes some of today's connectionist Symbol Grounding models. Instead of concentrating on cognitive linguistics, we try to present an alternative view of Symbol Grounding. Our analysis reveals that Symbol Grounding is in fact the endeavour of automated model construction. Although it originated in a somewhat anti-formal spirit it is (necessarily) full of parallels to classical symbolic logic. We present our view that Symbol Grounding is in fact a connectionist version of transcendental logic, which is the basis for generating formal models of non-formal domains. Such formalizations are inherently logical, though not only based on formal but also on material truth conditions.

Citation: Prem E.: Symbol Grounding and Transcendental Logic, in Niklasson L. & Boden M. (eds.), Current Trends in Connectionism, Lawrence Erlbaum, Hillsdale, NJ, pp. 271-282, 1995.


OFAI-TR-94-19 ( 50kB g-zipped PostScript file)

Symbol Grounding Revisited

Erich Prem

Symbol Grounding has originated in the domain of cognitive connectionism as an approach to a model of language acquisition. It has, however, transcended this restricted domain and its most prominent proponents now regard it as a technique which is of general importance to the connectionist-symbolic debate, especially for their integration into a common framework. This paper revisits the claims made by symbol grounders and summarizes different models which have been presented to date. We try to answer the question what Symbol Grounding really is all about and point to its parallels with classical logic and reasoning which have not received sufficient attention so far.

Citation: Prem E.: Symbol Grounding Revisited, Presented at the ``Workshop on Connectionist-Symbolic Integration'' at the European Conf. on Artificial Intelligence, ECAI-94, Amsterdam, 1994.


OFAI-TR-94-18

Neural Networks Exhibit Fundamental Differences in Modeling Natural Systems Compared to Rule-Based Systems

Erich Prem

Whereas it is generally appreciated that in a certain sense there are some differences between the neural network and the rule-based approach to modeling systems, exact characterizations of them are rare. This paper applies a terminological system to the study of this problem which has been developed in a completely different context, namely in theoretical biology. With this background it is easy to see that in clarifying differences between the two approaches it is necessary to keep in mind that formal models are usually about reality. It is then possible to show that the two ways of modeling differ in how such models of natural systems are constructed. This is done here by comparing representational structures (network units or predicates of rules) with measurement devices. It can be shown that the allowed state set of rule-based models is generally much larger than that of networks. The consequences of this observation for the treatment of learning, missing information and similarity are also discussed here.

Citation: Prem E.: Neural Networks Exhibit Fundamental Differences in Modeling Natural Systems Compared to Rule-Based Systems, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-18, 1994.


OFAI-TR-94-17 ( 56kB g-zipped PostScript file)

Constraint Logic Programming for Structure-Based Reasoning about Dynamic Physical System

Yousri El Fattah

The paper describes a framework for reasoning about dynamic physical systems based on structure. The framework integrates the language of bond graphs (BG) with the language of constraint logic programming (CLP). The advantage of such integration is twofold: first, it exploits the wealth of reasoning methods developed in the BG area within system dynamics; second, it enhances the naturalness of representation of system relations and possibly increases solution efficiency via CLP. The paper describes methods for causal modeling of dynamic physical systems, including the generation of causal explanations.

Citation: El Fattah Y.: Constraint Logic Programming for Structure-Based Reasoning about Dynamic Physical System, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-17, 1994.


OFAI-TR-94-16 ( 50kB g-zipped PostScript file)

A Comparison of Pruning Methods for Relational Concept Learning

Johannes Fürnkranz

Pre-Pruning and Post-Pruning are two standard methods of dealing with noise in concept learning. Pre-Pruning methods are very efficient, while Post-Pruning methods typically are more accurate, but much slower, because they have to generate an overly specific concept description first. We have experimented with a variety of pruning methods, including two new methods that try to combine and integrate pre- and post-pruning in order to achieve both accuracy and efficiency. This is verified with test series in a chess position classification task.

Citation: Fürnkranz J.: A Comparison of Pruning Methods for Relational Concept Learning, Proc. AAAI-94 Workshop on Knowledge Discovery in Databases, Seattle, WA, 1994.


OFAI-TR-94-15

The Value of Background Knowledge for ILP: A Case Study in the Interpretation of Tl-201 Myocardial SPECT Polar Maps

Paolo Petta

In this case study we applied an inductive learner (FOIL) to the tasks of detecting the presence of coronary artery disease (CAD) at increasing levels of detail, relating input data derived from Tl-201 myocardial scintigraphic polar maps to diagnostic statements obtained by visual evaluation of angiographic films. Several differently structured representations of the domain knowledge were used for each of three binary classification problems: overall detection of CAD, detection of CAD in the combined circumflex artery/right coronary arteries (LCX/RCA), and detection of CAD in single vessels (LAD, LCX, and RCA). We review both diagnostic performance and characteristics of the definitions learned. The present analysis underscores the importance of coherent modeling of the domain knowledge covering appropriate levels of abstraction: the inclusion of `islands' of poorly interrelated representation structures has a noticeable negative impact on the performance of the learner. In a large majority of investigated cases the domain descriptions meeting this requirement achieve better accuracy and higher information scores than an immediate, flat representation. It also becomes apparent that for well-structured domain representations the learning process converges more readily towards a consistent set of more expressive and concise theories. Drawing on these findings we identify what appear to be promising directions for further experimentation.

Citation: Petta P.: The Value of Background Knowledge for ILP: A Case Study in the Interpretation of Tl-201 Myocardial SPECT Polar Maps, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-15, 1994.


OFAI-TR-94-14 ( 103kB g-zipped PostScript file)

Learning in the Presence of Concept Drift and Hidden Contexts

Gerhard Widmer, Miroslav Kubat

On-line learning in domains where the target concept depends on some hidden context poses serious problems. Context shifts can induce changes in the target concepts, producing what is known as concept drift. We describe a family of learning algorithms that flexibly react to concept drift and can take advantage of situations where contexts reappear. The general approach underlying all these algorithms consists of (1) keeping only a window of currently trusted examples and hypotheses; (2) storing concept descriptions and re-using them if a previous context re-appears; and (3) controlling both of these functions by a heuristic that constantly monitors the system's behavior. The paper reports on experiments that test the systems' performance under various levels of noise and different extents and speeds of concept drift.

Citation: Widmer G., Kubat M.: Learning in the Presence of Concept Drift and Hidden Contexts, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-14, 1994.
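
The windowing idea in points (1) and (3) of the abstract above can be illustrated with a small sketch. This is not the algorithm from the report: the class name WindowedLearner, the majority-vote predictor, and the parameters max_size, min_size, check and threshold are all assumptions chosen for illustration, and the re-use of stored concept descriptions (point 2) is left out.

```python
from collections import Counter, deque


class WindowedLearner:
    """Toy learner that trusts only a window of recent examples (hypothetical sketch)."""

    def __init__(self, max_size=50, min_size=10, check=10, threshold=0.6):
        self.window = deque()                   # currently trusted (x, label) pairs
        self.recent_hits = deque(maxlen=check)  # outcomes of the latest predictions
        self.max_size = max_size
        self.min_size = min_size
        self.threshold = threshold

    def predict(self, x):
        # Majority label among window examples with the same attributes,
        # falling back to the overall majority label in the window.
        votes = Counter(label for xi, label in self.window if xi == x)
        if not votes:
            votes = Counter(label for _, label in self.window)
        return votes.most_common(1)[0][0] if votes else None

    def learn(self, x, label):
        # Monitor behavior: record whether the current window predicted correctly.
        self.recent_hits.append(self.predict(x) == label)
        self.window.append((x, label))
        full = len(self.recent_hits) == self.recent_hits.maxlen
        accuracy = sum(self.recent_hits) / len(self.recent_hits)
        if full and accuracy < self.threshold:
            # Suspected drift: forget all but the most recent examples.
            while len(self.window) > self.min_size:
                self.window.popleft()
            self.recent_hits.clear()
        while len(self.window) > self.max_size:
            self.window.popleft()


# Synthetic stream: the label equals the attribute for 100 steps, then flips.
learner = WindowedLearner()
for i in range(200):
    x = (i % 2 == 0,)
    label = x[0] if i < 100 else (not x[0])
    learner.learn(x, label)

after_drift_true = learner.predict((True,))
after_drift_false = learner.predict((False,))
```

In this toy stream the window is cut back shortly after the flip, so the predictions at the end follow the second concept rather than the first.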


OFAI-TR-94-13 ( 91kB g-zipped PostScript file)

CN2-MCI: A Two-Step Method for Constructive Induction

Stefan Kramer

Methods for constructive induction perform an automatic transformation of description spaces if representational shortcomings deteriorate the quality of learning. In the context of concept learning and propositional representation languages, feature construction algorithms have been developed in order to improve the accuracy and to decrease the complexity of hypotheses. Particularly, so-called hypothesis-driven constructive induction (HCI) algorithms construct new attributes based upon the analysis of induced hypotheses. Well-known HCI-systems analyze decision trees, or employ a coarse-grained analysis of decision rules. This paper introduces a new constructive operator Ok and documents its applicability in the usual HCI-framework. Ok uses a cluster algorithm to map selected features into a new binary feature. A new method for constructive induction, CN2-MCI, is described that applies Ok as its only constructive operator to achieve a fine-grained analysis of decision rules. The output of CN2-MCI is an inductive hypothesis expressed in terms of the transformed representation, given training examples as input. CN2-MCI is theoretically and empirically compared with existing methods for constructive induction.

Citation: Kramer S.: CN2-MCI: A Two-Step Method for Constructive Induction, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-13, 1994.


OFAI-TR-94-12

VieNet2 V2.0 --- Vienna Neural Network Toolkit 2

Günter Linhart, Georg Dorffner

VieNet2 is a toolkit for implementing simulations of neural networks without the need to take care of memory allocation, pointer handling and graphical representation. It comprises a set of functions and datatypes, written in ANSI C, which are linked with your own main program. The package was developed some years ago at the Austrian Research Institute for Artificial Intelligence (ARIAI) and is currently being extended. Version 2.0 now supports dynamic layers (interactively resizeable), integrated learning functions for standard paradigms, extended pattern management and DDE (Dynamic Data Exchange, for MS Windows only). VieNet2 is distributed as Shareware. For non-commercial usage and for a testing period of 30 days you are allowed to use VieNet2 and give copies of it to your friends. Each copy of VieNet2 must be unchanged and complete. VieNet2 is available for DOS, MS-Windows (Borland C++ 3.x), X-Window (GNU-C-Compiler V2.x) and HP Apollo.

Citation: Linhart G., Dorffner G.: VieNet2 V2.0 --- Vienna Neural Network Toolkit 2, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-12, 1994.


OFAI-TR-94-11 ( 57kB g-zipped PostScript file)

Robust Constructive Induction

Bernhard Pfahringer

We describe how CiPF 2.0, a propositional constructive learner, can cope with both noise and representation mismatch in training examples simultaneously. CiPF 2.0's abilities stem from coupling the robust selective learner C4.5 (and its production rule generator) with a sophisticated constructive induction component. An important new general constructive operator incorporated into CiPF 2.0 is the "simplified Kramer operator", which abstracts binary attribute combinations into a single new boolean attribute. The so-called "Minimum Description Length" (MDL) principle acts as a powerful control heuristic guiding search in the representation space through the abundance of opportunities for constructively adding new attributes. Claims are confirmed empirically by experiments in two artificial domains.

Keywords: Minimum Description Length, Constructive Induction, Noise

Citation: Pfahringer B.: Robust Constructive Induction, Proc. 18.Deutsche Jahrestagung für Künstliche Intelligenz (KI-94), Saarbrücken, Germany.


OFAI-TR-94-10 ( 56kB g-zipped PostScript file)

Multi-recurrent Networks for Traffic Forecasting

Claudia Ulbricht

Recurrent neural networks solving the task of short-term traffic forecasting are presented in this report. They turned out to be very well suited to this task and even outperformed the best results obtained with conventional statistical methods. The outcome of a comparative study shows that multiple combinations of feedback can greatly enhance the network performance. The best results were obtained with the newly developed Multi-recurrent Network, which combines output, hidden, and input layer memories having self-recurrent feedback loops of different strengths. The outcome of this research will be used for installing an actual tool at a highway check point. The investigated methods provide short-term memories of different lengths, which are needed not only for the given application but are also of importance for numerous other real-world tasks.

Citation: Ulbricht C.: Multi-recurrent Networks for Traffic Forecasting, Revised version of a paper in Proc. 12th Natl. Conf. on Artificial Intelligence, AAAI-94, Seattle, WA, 1994.


OFAI-TR-94-09 ( 61kB g-zipped PostScript file)

Incremental Reduced Error Pruning

Johannes Fürnkranz, Gerhard Widmer

This paper outlines some problems that may occur with Reduced Error Pruning in Inductive Logic Programming, most notably efficiency. Thereafter a new method, Incremental Reduced Error Pruning, is proposed that attempts to address all of these problems. Experiments show that in many noisy domains this method is much more efficient than alternative algorithms, along with a slight gain in accuracy. However, the experiments show as well that the use of this algorithm cannot be recommended for domains with a very specific concept description.

Citation: Fürnkranz J., Widmer G.: Incremental Reduced Error Pruning, Proc. 11th Intl. Conf. on Machine Learning, ML-94, pp. 70-77, Rutgers University, NJ, 1994.


OFAI-TR-94-08 ( 46kB g-zipped PostScript file)

Intensive Care Monitoring and Therapy Planning for Newborns

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

We developed a knowledge-based system, VIE-VENT, for monitoring and therapy planning of the artificial ventilation of newborn infants. Clinical and textbook knowledge were implemented in VIE-VENT's knowledge base. Therapy planning was based on transcutaneously and invasively determined blood gas measurements and on clinical observations. After the selection of appropriate input parameters, measured data were validated and transformed into qualitative values. If these values differed from target values, therapeutic actions were proposed according to heuristic clinical rules of artificial ventilation. VIE-VENT was specifically designed for practical use under real-time constraints in Neonatal Intensive Care Units (NICUs). VIE-VENT was applied to the ICU data set provided by the organizers of the AAAI-AIM-94 symposium and to a neonatal data set, which covered a neonatal case of similar severity. The neonatal data set included all transcutaneous measurements and allowed us to explore the full potential of VIE-VENT.

Citation: Miksch S., Horn W., Popow C., Paky F.: Intensive Care Monitoring and Therapy Planning for Newborns, Proc. AAAI-94 Spring Symposium: Artificial Intelligence in Medicine: Interpreting Clinical Data, AAAI Press, Technical Reports Series, Menlo Park, CA, 1994.


OFAI-TR-94-07 ( 54kB g-zipped PostScript file)

A Specialized, Incremental Solved Form Algorithm for Systems of Linear Inequalities

Christian Holzbaur

We present a computationally improved incremental solved form algorithm for systems of linear equations and inequalities. The algorithm is of the pivotal algebra type. It benefits (computationally) from a specialization of the classical Simplex algorithm that treats inequalities of dimension one, i.e. of the shape kx + d <= 0, specially. In particular, the introduction of slack variables is avoided in this case, which results in a basis that consists of higher dimensional inequality constraints only. Although the classical complexity results for the Simplex algorithm apply, in particular in the worst case, the specialization is justified on the basis that even in the unlikely case that the special cases should not occur in practical programs, the average complexity is not higher than that of the classical algorithm. The proposed algorithm matches and advances current activities in the CLP area that try to restrict the use of general and expensive decision methods to the cases where they are unavoidable.

Citation: Holzbaur C.: A Specialized, Incremental Solved Form Algorithm for Systems of Linear Inequalities, Austrian Research Institute for Artificial Intelligence, Vienna, TR-94-07, 1994.
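
To see why an inequality of the shape kx + d <= 0 needs no slack variable, note that it is simply a bound on x: x <= -d/k if k > 0, and x >= -d/k if k < 0. The sketch below only illustrates that observation and is not the report's algorithm; the names bound_from_inequality and Interval are hypothetical, and exact rational arithmetic via Fraction is an assumption made to keep the example free of rounding issues.

```python
from fractions import Fraction


def bound_from_inequality(k, d):
    """Rewrite k*x + d <= 0 (k nonzero) as an upper or lower bound on x."""
    k, d = Fraction(k), Fraction(d)
    if k == 0:
        raise ValueError("k must be nonzero for a one-dimensional inequality")
    limit = -d / k
    # Dividing by a negative k flips the direction of the inequality.
    return ("upper", limit) if k > 0 else ("lower", limit)


class Interval:
    """Bounds on a single variable, tightened incrementally (hypothetical helper)."""

    def __init__(self):
        self.lower = None  # None means unbounded below
        self.upper = None  # None means unbounded above

    def add(self, k, d):
        """Add k*x + d <= 0 and report whether the bounds are still satisfiable."""
        kind, limit = bound_from_inequality(k, d)
        if kind == "upper":
            if self.upper is None or limit < self.upper:
                self.upper = limit
        else:
            if self.lower is None or limit > self.lower:
                self.lower = limit
        return self.is_consistent()

    def is_consistent(self):
        return self.lower is None or self.upper is None or self.lower <= self.upper


x = Interval()
ok1 = x.add(2, -6)  # 2x - 6 <= 0, i.e. x <= 3
ok2 = x.add(-1, 1)  # -x + 1 <= 0, i.e. x >= 1
ok3 = x.add(1, 0)   # x <= 0, which contradicts x >= 1
```

Each such constraint is handled by a comparison against the stored bound, so only inequalities over two or more variables would need to enter a Simplex-style basis.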


OFAI-TR-94-06 ( 169kB g-zipped PostScript file)

The Synergy of Music Theory and AI: Learning Multi-Level Expressive Interpretation

Gerhard Widmer

The paper presents genuinely interdisciplinary research in the intersection of AI (machine learning) and Art (music). We describe an implemented system that learns expressive interpretation of music pieces from performances by human musicians. This problem, shown to be very difficult in the introduction, is solved by combining insights from music theory with a new machine learning algorithm. Theoretically founded knowledge about music perception is used to transform the original learning problem to a more abstract level where relevant regularities become apparent. Experiments with performances of Chopin waltzes are presented; the results indicate musical understanding and the ability to learn a complex task from very little training data. As the system's domain knowledge is based on two established theories of tonal music, the results also have interesting implications for music theory.

Citation: Widmer G.: The Synergy of Music Theory and AI: Learning Multi-Level Expressive Interpretation, Proc. 12th Natl. Conf. on Artificial Intelligence, AAAI-94, Seattle, WA, 1994.


OFAI-TR-94-05 ( 55kB g-zipped PostScript file)

Computing Minimal Diagnoses with Critical Set Algorithms

Igor Mozetic

The paper is concerned with the time complexity of model-based diagnosis. Our experiments indicate that the time to compute minimal diagnoses is dominated by the calls to the model of the device being diagnosed. In the paper we describe an attempt to reduce the number of model calls by incorporating two critical set algorithms (Loveland 1987) into IDA (Mozetic 1992). A critical set algorithm computes a minimal diagnosis with O(log n) model calls as opposed to O(n) model calls made by a straightforward algorithm. We performed experiments on two non-trivial domains: (a) a 1000-bit adder, which has simple structure and behaviour but a large number of components (5000) and minimal diagnoses, and (b) the KARDIO model of the heart, with complicated structure and behaviour but a relatively small search space and few minimal diagnoses. The reported results are negative: the straightforward algorithm outperforms the more sophisticated critical set algorithms. We analyse the results and show that both critical set algorithms are suboptimal in the number of failed model calls, which dominate the total number of model calls and consequently the overall diagnostic time.

Citation: Mozetic I.: Computing Minimal Diagnoses with Critical Set Algorithms, Proc. 11th European Conf. on Artificial Intelligence, ECAI-94, pp. 657-661, Amsterdam, John Wiley and Sons, 1994.


OFAI-TR-94-04 ( 61kB g-zipped PostScript file)

Context-Sensitive Data Validation and Data Abstraction for Knowledge-Based Monitoring

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

This paper addresses two important components of our knowledge-based system, VIE-VENT, a monitoring and therapy planning system for artificially ventilated newborn infants: data validation and data abstraction. VIE-VENT is specifically designed for practical use under real-time constraints in Neonatal Intensive Care Units (NICUs). Monitoring includes observing and guiding the behavior of a system. We concentrate on the initial steps in the monitoring and therapy planning processes --- detecting anomalous system behavior quickly and arriving at reliable measurements for the therapy planning steps. The monitoring task must take into account that not all sensor data can be checked in available time and there are more faulty data than may be expected. Our approach is a context-sensitive focussing on relevant continuous and discontinuous data, a validating and an abstracting of these data. Important issues are the detection of artifacts, the adjustment of thresholds and the transformation of numerical data into qualitative values. These methods were applied in a context-sensitive and dynamic way by recognizing the interaction between different measurements in the context of the current clinical situation of the neonate. Additionally, we used a combination of statistical analysis with knowledge-based system technology. This paper presents a summary of the methods used for data validation and data abstraction. The methods are illustrated by examples from our VIE-VENT application.

Citation: Miksch S., Horn W., Popow C., Paky F.: Context-Sensitive Data Validation and Data Abstraction for Knowledge-Based Monitoring, Proc. 11th European Conf. on Artificial Intelligence, ECAI-94, pp. 48-52, Amsterdam, John Wiley and Sons, 1994.


OFAI-TR-94-03 ( 49kB g-zipped PostScript file)

Top-Down Pruning in Relational Learning

Johannes Fürnkranz

Pruning is an effective method for dealing with noise in Machine Learning. Recently pruning algorithms, in particular Reduced Error Pruning, have also attracted interest in the field of Inductive Logic Programming. However, it has been shown that these methods can be very inefficient, because most of the time is wasted on generating clauses that explain noisy examples and subsequently pruning these clauses. We introduce a new method which searches for good theories in a top-down fashion to get a better starting point for the pruning algorithm. Experiments show that this approach can significantly lower the complexity of the task as well as increase predictive accuracy.

Citation: Fürnkranz J.: Top-Down Pruning in Relational Learning, Proc. 11th European Conf. on Artificial Intelligence, ECAI-94, pp. 453-457, Amsterdam, John Wiley and Sons, 1994.


OFAI-TR-94-02 ( 53kB g-zipped PostScript file)

Morphology with a Null-Interface

Harald Trost, Johannes Matiasek

We present an integrated architecture for word-level and sentence-level processing in a unification-based paradigm. The system is built upon a unification engine implemented in a CLP language supporting types and definite relations as part of feature descriptions. In this framework an HPSG-style grammar is implemented. Word-level processing uses X2MorF, a morphological component based on an extended version of two-level morphology. This component is tightly integrated with the grammar as a specialized relation. This architecture has computational as well as linguistic advantages. Integrating morphology and morphophonology directly into the grammar is in the spirit of HPSG, which views grammar as a relation between the phonological (or graphemic) form of an utterance and its syntactic/semantic representation. This also makes it possible to treat phenomena that transcend the boundary between morphology and syntax. On the implementation side, the practical problems of interfacing two inherently different modules are eliminated. For applications this means that using a morphological component is made easy. Nevertheless, this tight integration still leaves morphology and syntax/semantics as autonomous components, enabling direct use of existing data sets describing morphophonology in terms of the two-level paradigm.

Citation: Trost H., Matiasek J.: Morphology with a Null-Interface, Proc. COLING-94


OFAI-TR-94-01 ( 65kB g-zipped PostScript file)

Combining Robustness and Flexibility in Learning Drifting Concepts

Gerhard Widmer

The paper deals with incremental concept learning from classified examples. In many real-world applications, the target concepts of interest may change over time, and incremental learners should be able to track such changes and adapt to them. The problem is known in the literature as concept drift. The paper presents a new method for learning in such changing environments. In particular, it addresses the problem of learning drifting concepts from noisy data. We present an algorithm that is both robust against noise and quick at recognizing and adapting to changes in the target concepts. The method has been implemented in a system named FLORA4, the latest member of a whole family of learning algorithms. Experiments demonstrate significant improvement over previous results, both in noise-free and noisy situations.

Citation: Widmer G.: Combining Robustness and Flexibility in Learning Drifting Concepts, Proc. 11th European Conf. on Artificial Intelligence, ECAI-94, pp. 468-472, Amsterdam, John Wiley and Sons, 1994.


OFAI-TR-93-32

Soziale Aspekte im Cyberspace

Robert Trappl

There is extensive speculation about future behavior in cyberspaces (multi-user virtual environments) and about their possible social effects. However, several hundred text-based cyberspaces (MUDs) with more than one million users already exist and can provide indications of future effects. Based on the experience gained with them, this report discusses aspects such as forms of self-presentation, social behavior (in particular sexual harassment), social and legal structure (in particular recourse to pre-democratic forms of law), and demands for the ``grounding of participants'', among others.

Citation: Trappl R.: Soziale Aspekte im Cyberspace, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-32, 1993.


OFAI-TR-93-31

CONVTIF: Ein Tool zur Konvertirung von TIFF- in ASCII-Dateien, Benutzerhandbuch

Günter Linhart

No abstract.

Citation: Linhart G.: CONVTIF: Ein Tool zur Konvertirung von TIFF- in ASCII-Dateien, Benutzerhandbuch, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-31, 1993.


OFAI-TR-93-30 ( 61kB g-zipped PostScript file)

Words, Symbols, and Symbol Grounding

Georg Dorffner, Erich Prem, Harald Trost

In this paper we present a definition of `symbol' in cognitive science which is designed to clear up some obvious misunderstandings in discussions around ``symbolic'' vs. ``sub-symbolic'' approaches. We discuss this definition in the light of three different frames of reference (i.e. three different views, namely the intelligent agent's, an observer's, and a meta-observer's). Then we show the implications of these views for cognitive science and artificial intelligence (AI) and discuss whether the most conspicuous ``symbols'' in cognition --- words in a language --- can fulfill the ideals behind their definition.

Citation: Dorffner G., Prem E., Trost H.: Words, Symbols, and Symbol Grounding, in: Proc. 16th Wittgenstein Symposium


OFAI-TR-93-29 ( 58kB g-zipped PostScript file)

Controlling Constructive Induction in CiPF: An MDL Approach

Bernhard Pfahringer

We describe the learning system CiPF, which tightly couples a simple concept learner with a sophisticated constructive induction component. It is described in terms of a generic architecture for constructive induction. We focus on the problem of controlling the abundance of opportunities for constructively adding new attributes. In CiPF the so-called Minimum Description Length (MDL) principle acts as a powerful control heuristic. This is also confirmed in the experiments reported.

Citation: Pfahringer B.: Controlling Constructive Induction in CiPF: An MDL Approach, Proc. European Conf. on Machine Learning, ECML-94, pp. 242-256, Catania, Italy, Springer-Verlag, 1994.


OFAI-TR-93-28 ( 88kB g-zipped PostScript file)

FOSSIL: A robust relational learner

Johannes Fürnkranz

This paper describes FOSSIL, an ILP system that uses a search heuristic based on statistical correlation. Several interesting properties of this heuristic are discussed, and it is shown how it can naturally be extended with a simple but powerful stopping criterion that is independent of the number of training examples. Instead, FOSSIL's stopping criterion depends on a search heuristic that estimates the utility of literals on a uniform scale. After a comparison with FOIL and mFOIL in the KRK domain and on the mesh data, we outline some ideas on how FOSSIL can be adapted for top-down pruning and present some preliminary results.

Citation: Fürnkranz J.: FOSSIL: A robust relational learner, An extended version of the paper in Proc. European Conf. on Machine Learning, ECML-94, pp. 122-137, Catania, Italy, Springer-Verlag, 1994.


OFAI-TR-93-27

Hypermedia and Multimedia

Mario Veitl, Paolo Petta

This report is intended to introduce interested readers to the world of hypermedia and multimedia. Hypermedia and multimedia constitute one of the most interesting areas of computer science and are currently at the start of widespread use. To this end, the individual media and their possible applications are discussed. Hardware and software platforms for building multimedia systems are presented. The exaggerated expectations that are often placed on new technologies are problematic. A warning against creating excessive expectations is in order, since with other technologies these have already made it difficult to assess possible applications and results realistically. Customer and user wishes inflated in this way often cannot be satisfied. One thing is certain: multimedia is not the panacea that the computer industry currently advertises it to be. At best it is one for the balance sheets of the IT manufacturers, who understandably try to use it to provide their customers with new and faster computers. Without doubt, multimedia applications require many times the storage space and computing speed needed for established applications such as word processing, databases, etc.

Citation: Veitl M., Petta P.: Hypermedia and Multimedia, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-27, 1993.


OFAI-TR-93-26 ( 79kB g-zipped PostScript file)

A CLP Based Approach to HPSG

Johannes Matiasek, Wolfgang Heinz

In this paper we present a CLP based method for the direct implementation of HPSG, a grammar formalism employing strongly typed feature structures and principles to constrain them. We interpret unification of typed feature structures under the restrictions of principled constraints as constraint solving in the CLP paradigm. The aim of our implementation method is to operationalize the declarative grammar specification without having to account for processing aspects, i.e. to clearly separate grammar and processing model. To achieve this goal we employ a lazy instantiation technique which maintains well-typedness of feature structures at every instantiation stage. This method is complemented with a delay mechanism enabling the constraints arising from grammar principles to cope with uninstantiated structures. Applicability conditions of grammar principles may be specified conditionally, viewing them as licensing conditions on every node of a feature structure. This also allows for the reformulation of disjunctive constraints into a conjunction of conditional constraints, thereby reducing the search space.

Citation: Matiasek J., Heinz W.: A CLP Based Approach to HPSG, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-26, 1993.


OFAI-TR-93-25 ( 190kB g-zipped PostScript file)

A Unified Framework for MLPs and RBFNs: Introducing Conic Section Function Networks

Georg Dorffner

Multilayer Perceptrons (MLP, Werbos 1974, Rumelhart et al. 1986) and Radial Basis Function Networks (RBFN, Broomhead and Lowe 1988, Moody and Darken 1989) are probably the most widely used neural network models for practical applications. While the former belong to a group of ``classical'' neural networks (whose weighted sums are loosely inspired by biology), the latter have risen only recently from an analogy to regression theory (Broomhead and Lowe 1988). At first sight, the two models --- except for being multilayer feedforward networks --- do not seem to have much in common. On second thought, however, MLPs and RBFNs share a variety of features, which makes it worthwhile to view them in the same context and compare them to each other with respect to their properties. Consequently, a few attempts at arriving at a unified picture of a class of feedforward networks --- with MLPs and RBFNs as members --- have been undertaken (Robinson et al. 1988, Maruyama et al. 1992, Dorffner 1992, 1993). Most of these attempts have centered around the observation that the function of a neural network unit can be divided into a propagation rule (``net input'') and an activation or transfer function. The dot product (``weighted sum'') and the Euclidean distance are special cases of propagation rules, whereas the sigmoid and Gaussian functions are examples of activation functions. This paper introduces a novel neural network model based on a more general conic section function as propagation rule, containing hyperplane (straight line) and hypersphere (circle) as special cases, thus unifying the net inputs of MLPs and RBFNs with an easy-to-handle continuum in between. A new learning rule --- complementing the existing methods of gradient descent in weight space and initialization --- is introduced which enables the network to make a continuous decision between bounded and unbounded (infinite half-space) decision regions.
The capabilities of conic section function networks (CSFNs) are illustrated with several examples and compared with existing approaches. CSFNs are viewed as a further step toward more efficient and optimal neural network solutions in practical applications.
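
To make the split into propagation rule and activation function concrete, the following Python sketch computes an MLP-style unit (dot product followed by a sigmoid) and an RBFN-style unit (Euclidean distance followed by a Gaussian). The function names, the `width` parameter and the sample values are illustrative assumptions; the conic section propagation rule itself is defined in the report and is not reproduced here.

```python
import math

def dot_product(weights, inputs):
    """Propagation rule of an MLP unit: the weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def euclidean_distance(center, inputs):
    """Propagation rule of an RBFN unit: distance of the input from a center."""
    return math.sqrt(sum((x - c) ** 2 for c, x in zip(center, inputs)))

def sigmoid(net):
    """Activation function typically paired with the dot product."""
    return 1.0 / (1.0 + math.exp(-net))

def gaussian(net, width=1.0):
    """Activation function typically paired with the distance (width is assumed)."""
    return math.exp(-(net ** 2) / (2.0 * width ** 2))

def mlp_unit(weights, inputs):
    # Unbounded decision region: a half-space on one side of a hyperplane.
    return sigmoid(dot_product(weights, inputs))

def rbf_unit(center, inputs, width=1.0):
    # Bounded decision region: a hypersphere around the center.
    return gaussian(euclidean_distance(center, inputs), width)
```

An input lying exactly on the center gives the RBFN unit its maximum response, whereas the MLP unit responds with 0.5 on its hyperplane and keeps growing on one side of it.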

Keywords: Feedforward neural networks, multilayer perceptrons, radial basis function networks, decision regions, classification, non-linear separation

Citation: Dorffner G.: A Unified Framework for MLPs and RBFNs: Introducing Conic Section Function Networks, in: Cybernetics and Systems 4, 1994.


OFAI-TR-93-24 ( 80kB g-zipped PostScript file)

Formal Neural Network Specification and its Implications on Standardization

Georg Dorffner, Herbert Wiklicky, Erich Prem

This paper introduces a formal framework for describing and specifying neural networks and discusses several important issues with implications for neural network standardization. In particular, a neural network definition and two tools for graphical description and formal specification are introduced. Issues such as the theoretical impossibility of canonical description, or the need for complete specification (including global algorithms) are discussed. Several examples, making use of the developed tools, illustrate these discussions. In summary, this paper aims at contributing to the important endeavour of neural network standardization both practically and theoretically.

Citation: Dorffner G., Wiklicky H., Prem E.: Formal Neural Network Specification and its Implications on Standardization, in: Computer Standards and Interfaces, special issue on Artificial Neural Network Standards, 1994.


OFAI-TR-93-23 ( 128kB g-zipped PostScript file)

On Using Feedforward Neural Networks for Clinical Diagnostic Tasks

Georg Dorffner, Gerold Porenta

In this paper we present an extensive comparison between several feedforward neural network types in the context of a clinical diagnostic task, namely the detection of coronary artery disease (CAD) using planar thallium-201 dipyridamole stress-redistribution scintigrams. We introduce results from well-known (e.g. multilayer perceptrons or MLPs, and radial basis function networks or RBFNs) as well as novel neural network techniques (e.g. conic section function networks) which demonstrate promising new routes for future applications of neural networks in medicine, and elsewhere. In particular we show that initializations of MLPs and conic section function networks --- which can learn to behave more like an MLP or more like an RBFN --- can lead to much improved results in rather difficult diagnostic tasks.

Keywords: Feedforward neural networks, neural network initialization, multilayer perceptrons, radial basis function networks, conic section function networks, thallium scintigraphy, angiography, clinical diagnosis, decision making

Citation: Dorffner G., Porenta G.: On Using Feedforward Neural Networks for Clinical Diagnostic Tasks, in: Artificial Intelligence in Medicine, special issue on Neurocomputing in Biomedicine, fall 1994.


OFAI-TR-93-22

Die Anwendung neuronaler Netzwerke in der betrieblichen Unternehmung

Erich Prem

This article gives an overview of possible applications of artificial neural networks (NNs) in business enterprises. Besides the purely technical aspects, it focuses on the business-relevant characteristics of their use. It turns out that, in addition to a technology-driven gain in quality, neural networks are often of interest to a company because they open up newly created potential for automation. This brings new opportunities to support the ``learning enterprise''. The procedure for implementing a solution with neural networks is also discussed. Criteria are presented for assessing whether a problem is suitable for an NN solution. The concluding list of existing NN applications gives a further overview of potential areas of application of neural networks in business enterprises.

Citation: Prem E.: Die Anwendung neuronaler Netzwerke in der betrieblichen Unternehmung, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-22, 1993.


OFAI-TR-93-21 ( 79kB g-zipped PostScript file)

Diagnosing Analog Circuits Designed-for-Testability by Using CLPR

Igor Mozetic, Franc Novak, Marina Santo-Zarnik, Anton Biasizzo

Recently, a design-for-test (DFT) methodology for active analog filters was proposed with the primary goal of increasing controllability and observability. We operationalize and extend the DFT methodology by using CLPR to model and diagnose analog circuits. CLPR is a logic programming language with the capability to solve systems of linear equations and inequalities. It is well suited to modelling parameter tolerances and to diagnosing soft faults, i.e., deviations from nominal values. The diagnostic algorithm uses different DFT test modes and the results of voltage measurements at different frequencies to compute a set of suspected components. Ranking of suspected components is based on a measure of (normalized) standard deviations from predicted mean values of component parameters. The case studies presented, on a real circuit, show encouraging results in the isolation of soft faults for a given low-pass biquad filter.
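
The ranking step can be pictured with a small Python sketch. It assumes, purely for illustration, that each suspected component comes with an estimated parameter value and a predicted mean and standard deviation, and it ranks components by how many standard deviations the estimate lies from the mean. The function name, the component names and the numbers are hypothetical, and the exact measure used in the report may differ.

```python
def rank_suspects(estimates, predicted):
    """Rank components by normalized deviation from their predicted mean.

    estimates: component name -> estimated parameter value
    predicted: component name -> (predicted mean, standard deviation)
    Returns (name, score) pairs, most suspicious component first.
    """
    scores = {}
    for name, value in estimates.items():
        mean, std = predicted[name]
        # Deviation expressed in units of the predicted standard deviation.
        scores[name] = abs(value - mean) / std
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical components and values, for illustration only.
estimates = {"R1": 10.5, "C1": 1.3}
predicted = {"R1": (10.0, 0.5), "C1": (1.0, 0.1)}
ranking = rank_suspects(estimates, predicted)
```

Here `C1` is ranked ahead of `R1` because its estimate lies about three standard deviations from the predicted mean, compared with one for `R1`.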

Citation: Mozetic I., Novak F., Santo-Zarnik M., Biasizzo A.: Diagnosing Analog Circuits Designed-for-Testability by Using CLPR, Proc. 4th Intl. Workshop on Principles of Diagnosis, DX-93, pp. 105-120, University of Wales, Aberystwyth, September 6-8, 1993.


OFAI-TR-93-20 ( 409kB g-zipped PostScript file)

Modelling the Rational Basis of Musical Expression

Gerhard Widmer

The article deals with the phenomenon of expressive music interpretation, that is, the variations in tempo, dynamics, etc. that are applied to a piece of written music by a skilled performer. The guiding hypothesis is that musical expression is not an inexplicable, "artistic" phenomenon, but that there is a rational component to it that can be traced back to the performer's (and the listener's) perception of structure in the music. This hypothesis is tested empirically, with the help of Artificial Intelligence methods, via a three-step research methodology: (1) various types of general musical knowledge are identified which might be relevant to perceiving structure in music and to understanding expressive interpretations; (2) a formal computational model of this knowledge is presented; and (3) the model is empirically tested by using it as the basis of a computer program that learns general expression rules from examples of actual performances. The experimental results indicate that certain aspects of musical expression are indeed rationally learnable and that the musical knowledge formulated in the model is necessary to learn expression rules in a sensible way. And finally, as parts of the model are based on two well-known theories of tonal music, the results also provide empirical support for the relevance of these theories.

Keywords: Cognitive musicology, expression, interpretation, perception, artificial intelligence, machine learning

Citation: Widmer G.: Modelling the Rational Basis of Musical Expression, in: Computer Music Journal 18, MIT Press, 1994.


OFAI-TR-93-19 ( 36kB g-zipped PostScript file)

Interval Arithmetic with CLPR

Igor Mozetic, Christian Holzbaur

We describe two extensions of CLPR, motivated by an application to model-based diagnosis of active analog filters. The first extension addresses the problem of rounding errors in CLPR. We represent reals by floating-point intervals which are computed by outward rounding. The second extension increases the expressiveness of linear CLPR. Constants in linear expressions can now be intervals, which enables reasoning with imprecise model parameters (tolerances). Bounds (sup and inf) for individual variables are computed by linear optimization via a modified Simplex algorithm. Both extensions are implemented in a CLP shell --- an adaptation of SICStus Prolog, which allows for easy and fast development and modification of CLP languages.
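
The idea of outward rounding can be sketched in Python, independently of CLPR. The sketch below is an assumption-laden illustration, not the report's implementation: it widens every computed bound by one floating-point step using `math.nextafter` (Python 3.9 or later), which is a simple conservative substitute for switching the hardware rounding mode. The `Interval` class and the `tolerance` helper are hypothetical names.

```python
import math
from dataclasses import dataclass

def _down(x):
    """Next representable float below x (rounds a lower bound outward)."""
    return math.nextafter(x, -math.inf)

def _up(x):
    """Next representable float above x (rounds an upper bound outward)."""
    return math.nextafter(x, math.inf)

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(_down(self.lo + other.lo), _up(self.hi + other.hi))

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(_down(min(products)), _up(max(products)))

    def contains(self, x):
        return self.lo <= x <= self.hi

def tolerance(nominal, relative):
    """Interval for a positive nominal value with a relative tolerance."""
    return Interval(_down(nominal * (1 - relative)), _up(nominal * (1 + relative)))

# 0.1 + 0.2 is not exactly 0.3 in floating point, but the interval encloses it.
total = Interval(0.1, 0.1) + Interval(0.2, 0.2)
```

Because every bound moves outward, the true real-valued result stays inside the computed interval, at the cost of slightly wider bounds than strictly necessary.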

Citation: Mozetic I., Holzbaur C.: Interval Arithmetic with CLPR, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-19, 1993.


OFAI-TR-93-18 ( 69kB g-zipped PostScript file)

VIE-VENT: Knowledge-Based Monitoring and Therapy Planning of the Artificial Ventilation of Newborn Infants

Silvia Miksch, Werner Horn, Christian Popow, Franz Paky

We developed a knowledge-based system, VIE-VENT, for monitoring and therapy planning of artificially ventilated newborn infants. One of our aims was to implement clinical and textbook knowledge in VIE-VENT's knowledge base. Therapy planning is based on a combination of transcutaneous and invasively determined blood gas measurements and clinical observations. After the selection and validation of appropriate parameters, these data are transformed into qualitative values. These are used in our model of neonatal respiration for recommending therapeutic actions according to heuristic clinical rules of artificial ventilation. For the weaning process we implemented three possible strategies, which allow adapting different dynamics of weaning (conservative, normal, aggressive). VIE-VENT is specifically designed for practical use under real-time constraints in the Neonatal Intensive Care Units (NICUs).

Citation: Miksch S., Horn W., Popow C., Paky F.: VIE-VENT: Knowledge-Based Monitoring and Therapy Planning of the Artificial Ventilation of Newborn Infants, An extended version of the paper in Artificial Intelligence in Medicine: Proc. 4th Conf. on Artificial Intelligence in Medicine Europe, AIME-93 (S. Andreassen et al., Eds.), pp. 218-229, 1993.


OFAI-TR-93-17 ( 30kB g-zipped PostScript file)

Connectionism, Symbol Grounding, and Autonomous Agents

Georg Dorffner, Erich Prem

In this position paper we would like to lay out our view on the importance of grounding and situatedness for cognitive science. Furthermore we would like to suggest that both aspects become relevant almost automatically if one consistently pursues the original ideas from connectionism. Finally we discuss the relevance of grounding for theories of meaning and the possible contribution of symbol grounding for autonomous agents.

Citation: Dorffner G., Prem E.: Connectionism, Symbol Grounding, and Autonomous Agents, Proc. 15th Annual Meeting of the Cognitive Science Society, Lawrence Erlbaum, pp. 144-148, 1993.


OFAI-TR-93-16 ( 50kB g-zipped PostScript file)

Avoiding Noise Fitting in a FOIL-like Learning Algorithm

Johannes Fürnkranz

The research reported in this paper describes FOSSIL, an ILP system that uses a search heuristic based on statistical correlation. This algorithm implements a new method for learning useful concepts in the presence of noise. In contrast to FOIL's stopping criterion, which allows theories to grow in complexity as the size of the training set increases, we propose a new stopping criterion that is independent of the number of training examples. Instead, FOSSIL's stopping criterion depends on a search heuristic that estimates the utility of literals on a uniform scale.
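
A rough Python sketch of such a criterion follows. It assumes that the utility of a candidate literal is measured by the phi coefficient between "the literal covers the example" and "the example is positive", and that search stops when no candidate reaches a fixed cutoff. The function names and the cutoff value of 0.3 are illustrative assumptions, and FOSSIL's actual formulation may differ in detail.

```python
import math

def correlation(covered, positive):
    """Phi coefficient between two equally long lists of booleans."""
    n11 = sum(c and p for c, p in zip(covered, positive))
    n10 = sum(c and not p for c, p in zip(covered, positive))
    n01 = sum((not c) and p for c, p in zip(covered, positive))
    n00 = sum((not c) and (not p) for c, p in zip(covered, positive))
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # one of the variables is constant: no measurable correlation
    return (n11 * n00 - n10 * n01) / denom

def select_literal(candidates, positive, cutoff=0.3):
    """Return the best-correlated literal, or None if none reaches the cutoff.

    candidates: literal name -> list of booleans (does it cover each example?)
    """
    best_name, best_score = None, 0.0
    for name, covered in candidates.items():
        score = abs(correlation(covered, positive))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= cutoff else None
```

Since the correlation always lies between -1 and 1, the same cutoff can be applied whatever the size of the training set, which is the sense in which the scale is uniform.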

Citation: Fürnkranz J.: Avoiding Noise Fitting in a FOIL-like Learning Algorithm, Proc. IJCAI-93 workshop on Inductive Logic Programming, Chambery, France, August 1993.


OFAI-TR-93-15

VieNet2 --- Ein Simulationstool für neuronale Netzwerke, Anwenderhandbuch

Georg Dorffner, Günter Linhart, Martin Rotter

At its core, VieNet2 (Vienna Network Toolkit 2) is a powerful function library that enables C programmers to build a simulation of an arbitrarily structured network in a few lines of source code using the functions provided. With the integrated graphical user interface, which is generated automatically from the given network specification, the user can very easily visualize the unit activations and the weights between units and control them interactively with a mouse. Functions for creating simple dialog elements are also available, with which parameters can be changed and user-defined functions called. VieNet2 is currently available on four different computer platforms (PC, workstation), which considerably eases its use in practice.

Citation: Dorffner G., Linhart G., Rotter M.: VieNet2 --- Ein Simulationstool für neuronale Netzwerke, Anwenderhandbuch, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-15, 1993.


OFAI-TR-93-14 ( 50kB g-zipped PostScript file)

Understanding and Self-Organization --- What can the Speaking Lion Tell Us?

Erich Prem

The current rebirth of self-organizing systems in several distinct domains of research poses new epistemic questions. Self-organizing systems not only tend to behave in unpredictable ways, they are also extremely difficult to analyse. In this paper we discuss three problems with neural networks that are important for self-organization in general. They are related to the proper design of a self-organizing system, to the role of the system engineer, and to the proper explanation of system behaviour. We shall try to present a generally applicable solution, which is based on a ``symbol grounding'' neural network architecture. We then discuss the relation of this approach to the measurement problem in physics and point out similarities to existing positions in philosophy. However, it should be noted that our ``solution'' of the explanation problem may be judged as being a very sceptical one.

Citation: Prem E.: Understanding and Self-Organization --- What can the Speaking Lion Tell Us?, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-14, 1993.


OFAI-TR-93-13

Coping With Derivation in a Morphological Component

Harald Trost

In this paper a morphological component with a limited capability to automatically interpret (and generate) derived words is presented. The system combines an extended two-level morphology (Trost 1991a, b) with a feature-based word grammar building on a hierarchical lexicon. Polymorphemic stems not explicitly stored in the lexicon are given a compositional interpretation. In this way the system minimizes redundancy in the lexicon, because derived words that are transparent need not be stored explicitly. Also, words formed ad hoc can be recognized correctly. The system is implemented in CommonLisp and has been tested on examples from German derivation.

Keywords: NLU, computational morphology, lexicon

Citation: Trost H.: Coping With Derivation in a Morphological Component, Proc. 6th Conf. of the European Chapter of the Association for Computational Linguistics, Utrecht, April 1993.


OFAI-TR-93-12 ( 26kB g-zipped PostScript file)

Experiences with Neural Networks as a Diagnostic Tool in Medical Image Processing

Georg Dorffner, Erich Prem, M. Mackinger, S. Kundrat, Paolo Petta, Gerold Porenta, H. Sochor

Through recent years artificial neural networks have proven to be a useful technique in the interpretation of high-dimensional data such as images. However, an adequate application of neural networks is often plagued by a lack of systematic methodology. It is one of the goals of the EC funded ESPRIT-II project NEUFODI (Neural Networks for Forecasting and Diagnosis Applications) to study and develop techniques for using artificial neural networks as tools for diagnosis.

In this paper we report on part of this larger project, namely on experiences with applying neural networks to the interpretation of planar thallium-201 scintigrams for the assessment of coronary artery disease. This application should serve as an example of how neural networks can be successfully applied in the area of medical image processing. In this context we discuss several aspects of the practical use of this widely used technique.

Citation: Dorffner G., Prem E., Mackinger M., Kundrat S., Petta P., Porenta G., Sochor H.: Experiences with Neural Networks as a Diagnostic Tool in Medical Image Processing, In Europäische Perspektiven der Medizinischen Informatik, Biometrie und Epidemologie (Michaelis et al., Eds.), MMV München, pp. 407-411, 1993.


OFAI-TR-93-11 ( 43kB g-zipped PostScript file)

Unsupervised Learning of Simple Speech Production Based on Soft Competitive Learning

Georg Dorffner, T. Schönauer

In this paper we present a simple connectionist model for the adaptive sensory-motor loop involved in perceiving and producing speech. At the heart of the production part lies an articulatory model which approximates the human vocal tract through polygons and splines. Output of this model is the envelope of the acoustic filter function, realized by this vocal tract, which is comparable to the spectrum of real speech segments. The goal of this research was to find a learning method to train a multi-layer neural network to produce the correct set of twelve articulatory parameters when given the spectrum of recorded real speech (stationary vowels). The method introduced in this paper explicitly makes use of a neural network categorization component. Through so-called soft competitive learning it learns to gradually compress the responses to more and more unitized categorical patterns. After a precategorization phase, during which presented real speech patterns are classified, the model starts to randomly produce output signals. A goodness-of-fit measure, which can be computed easily, is taken as the criterion whether the self-produced signal is close enough to any of the known categories, and as the learning rate to adapt the weights between the categorization layer and the output units.

Citation: Dorffner G., Schönauer T.: Unsupervised Learning of Simple Speech Production Based on Soft Competitive Learning, In Computation and Neural Systems (F. Eeckman, J. Bower, Eds.), Kluwer Academic Publishers, Boston/Dordrecht/London, pp. 363-368, 1993.


OFAI-TR-93-10 ( 53kB g-zipped PostScript file)

A Comparison of Three Different Methods for Acquiring Knowledge about Virological Hepatitis Tests

Petr Berka

We present a comparison of Knowledge Seeker, CN2 and Knowledge EXplorer based on a real problem domain, the interpretation of virological hepatitis tests. The problem domain can be divided into 6 subdomains, where the knowledge can be acquired separately. Unlike in classical machine learning problems, the goal classes are not mutually exclusive. Information on how to classify contradictory examples was not available to the systems during learning, so the key question was how the systems handle ambiguity in the data. Although each system uses a different approach, there was no significant difference in the results of testing the acquired rules for each subdomain separately. Testing in the whole hepatitis domain was done only for Knowledge EXplorer because only this system can predict multiple classes. The results of this testing are poorer than the results obtained in the separate subdomains but can be improved by using some additional expert knowledge.

Citation: Berka P.: A Comparison of Three Different Methods for Acquiring Knowledge about Virological Hepatitis Tests, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-10, 1993.


OFAI-TR-93-09 ( 128kB g-zipped PostScript file)

The Role of Qualitative Knowledge in Machine Learning

Johannes Fürnkranz

This paper analyzes the important role qualitative knowledge plays in Machine Learning. For this purpose, important results from research in the fields of approximate theory formation, automated qualitative modeling, learning in plausible domain theories and learning with abstractions are reviewed. The analysis of these approaches shows several of the benefits the use of qualitative knowledge can bring to Machine Learning and also points out important problems that have to be dealt with. The need for qualitative knowledge to keep learning tractable is illustrated with examples from the domain of chess. Finally we make some suggestions for further research based on the shortcomings of previous approaches.

Citation: Fürnkranz J.: The Role of Qualitative Knowledge in Machine Learning, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-09, 1993.


OFAI-TR-93-08 ( 78kB g-zipped PostScript file)

Plausible Explanations and Instance-Based Learning in Mixed Symbolic/Numeric Domains

Gerhard Widmer

The paper is concerned with supervised learning of numeric target concepts. The task is to learn to predict or determine the exact values of some numeric target variables. Training examples may be described by both symbolic and numeric predicates. General domain knowledge may be available in qualitative form. The paper presents a general learning model for such domains. The model integrates a symbolic learning component, which is based on a multi-instance plausible explanation algorithm, and an instance-based learning component, which stores instances with precise values and predicts new values by interpolation. The symbolic component can use available qualitative background knowledge; it learns sub-concepts that partition the space for the underlying instance-based method. A realization of the model in a system named IBL-Smart is then described. The system has been applied to a complex symbolic/numeric task from the domain of tonal music, and some experimental results are reported that demonstrate the effectiveness of the method.
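
The division of labour between the two components can be illustrated with a small Python sketch. It assumes that the symbolic component has already assigned each stored instance to a sub-concept, represented here by a plain label, and that a new value is predicted by distance-weighted interpolation among the nearest stored instances of the same sub-concept. The function name, the data layout and the weighting scheme are illustrative assumptions and are not taken from IBL-Smart.

```python
def predict(instances, sub_concept, x, k=2):
    """Predict a numeric target by interpolation within one sub-concept.

    instances: list of (sub_concept, numeric_feature, target_value) triples
    sub_concept: label of the partition the query falls into
    x: numeric feature of the query
    k: number of nearest stored instances to interpolate between
    """
    # Only instances from the same partition take part in the prediction.
    same = [(abs(feature - x), target)
            for label, feature, target in instances if label == sub_concept]
    if not same:
        return None
    nearest = sorted(same)[:k]
    for distance, target in nearest:
        if distance == 0:
            return target  # exact match: return the stored precise value
    weights = [1.0 / distance for distance, _ in nearest]
    return sum(w * t for w, (_, t) in zip(weights, nearest)) / sum(weights)

# Hypothetical stored instances in two partitions "A" and "B".
stored = [("A", 0.0, 1.0), ("A", 2.0, 3.0), ("B", 1.0, 100.0)]
```

A query in partition "A" at `x = 1.0` lies midway between the two stored "A" instances and is unaffected by the "B" instance, even though that instance is numerically closer.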

Citation: Widmer G.: Plausible Explanations and Instance-Based Learning in Mixed Symbolic/Numeric Domains, Draft of a paper in Proc. 2nd Intl. Workshop on Multistrategy Learning, MSL-93, (R.S. Michalski, G. Tecuci, Eds.), Harper's Ferry, WV.


OFAI-TR-93-07

Learning under Changing Conditions by Controlled Forgetting and Recalling

Gerhard Widmer, Miroslav Kubat

The paper deals with the problem of on-line learning of context-dependent concepts. Changes in the hidden context can induce changes in the target concepts, thus producing concept drift. The paper presents a general learning model for this type of scenario. The main idea of our approach consists in (1) keeping only a window of currently trusted instances and hypotheses and discarding old, outdated information (`forgetting'), (2) storing general descriptions of contexts that can later be used as newly constructed descriptors (`recalling'), and (3) controlling both of these functions with the help of a heuristic that constantly monitors the system's behaviour. The methodology is illustrated by one particular instantiation of the model --- the system FLORA3. Experiments with synthetic data sets are reported. In particular, it is shown that the model is very good at distinguishing between concept drift and noise.
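
The forgetting part of this scheme can be sketched in a few lines of Python. The sketch keeps a window of recent class labels, predicts the majority label, and monitors its own recent accuracy: when accuracy drops below a threshold it suspects concept drift and discards all but the newest instances. This is a toy illustration under stated assumptions, not FLORA3's heuristic; the class name, the thresholds and the majority-vote predictor are hypothetical, and the recalling of stored context descriptions is omitted.

```python
from collections import deque

class ForgettingWindow:
    def __init__(self, max_size=20, min_size=5, drop_threshold=0.5):
        self.window = deque()          # currently trusted instances
        self.max_size = max_size
        self.min_size = min_size
        self.drop_threshold = drop_threshold
        self.recent_hits = deque(maxlen=min_size)  # recent prediction outcomes

    def predict(self):
        """Majority label among the trusted instances (None if empty)."""
        if not self.window:
            return None
        labels = list(self.window)
        return max(set(labels), key=labels.count)

    def update(self, label):
        """Test the current hypothesis on the new instance, then learn it."""
        self.recent_hits.append(self.predict() == label)
        self.window.append(label)
        accuracy = sum(self.recent_hits) / len(self.recent_hits)
        monitored = len(self.recent_hits) == self.recent_hits.maxlen
        if monitored and accuracy < self.drop_threshold:
            # Suspected drift: forget everything but the newest instances.
            while len(self.window) > self.min_size:
                self.window.popleft()
            self.recent_hits.clear()
        elif len(self.window) > self.max_size:
            # Stable phase: forget only the single oldest instance.
            self.window.popleft()

learner = ForgettingWindow()
for label in ["a"] * 20 + ["b"] * 10:
    learner.update(label)
```

After the switch from "a" to "b", the drop in accuracy triggers a shrink of the window, so the learner follows the new concept sooner than a fixed window of 20 instances would.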

Citation: Widmer G., Kubat M.: Learning under Changing Conditions by Controlled Forgetting and Recalling, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-07, 1993.


OFAI-TR-93-06 ( 14kB g-zipped PostScript file)

Automatische Textgenerierung --- Anwendungsgebiete und Zukunftschancen

Ernst Buchberger

Several areas of Artificial Intelligence research, such as expert systems, have found their way into business, and the products developed now yield high profits. The field of natural language understanding systems is also becoming increasingly interesting commercially. One area of automatic language processing has so far received little attention: automatic text generation, i.e. the production of (natural-)language sentences or texts by computer. This contribution gives an overview of the state of the art in automatic text generation, with particular emphasis on possible areas of application. The techniques and methods used are presented briefly, followed by a wide variety of applications.

Keywords: Natural language generation, text generation, overview, impacts

Citation: Buchberger E.: Automatische Textgenerierung --- Anwendungsgebiete und Zukunftschancen, Proc. Informations- und Kommunikationstechnologie für das neue Europa (Wiener IT-Kongreß), pp. 491-500, ADV, Vienna, 1993.


OFAI-TR-93-05

Effektive Nutzung von a-priori Wissen für Neuronale Netze --- Europäische Forschung im Rahmen von ESPRIT II

Erich Prem, Georg Dorffner

Neural networks often have to be trained by examples alone, forgoing expert knowledge that is sometimes available. This talk gives an overview of recently completed research on techniques that make it possible to take any available a priori knowledge into account alongside the usual training data. These techniques were developed as part of the NEUFODI project carried out within ESPRIT-II. The problem is presented together with important criteria for assessing solutions. The possible kinds of a priori knowledge are also discussed. The multinational character of the project organization, which requires an efficient decomposition of the problem, plays a special role in this project. In addition to an overview of the solutions developed, experience with a real application problem and a pre-programming technique (Concept Support) is reported. A neural network is used to diagnose and localize coronary artery disease (CAD) from thallium-201 scintigrams. With this method, vague medical expert knowledge can be used to improve the generalization performance of a neural network.

Citation: Prem E., Dorffner G.: Effektive Nutzung von a-priori Wissen für Neuronale Netze --- Europäische Forschung im Rahmen von ESPRIT II, Proc. Informations- und Kommunikationstechnologie für das neue Europa (Wiener IT-Kongreß), pp. 501-510, ADV, Vienna, 1993.


OFAI-TR-93-04

Entwicklung multimedialer CBT-Systeme

Mario Veitl, Paolo Petta

This paper describes various design approaches and deployment concepts for computer-based tutoring systems (CBT systems) and derives from them a guideline for developers that is as generally applicable as possible. It begins with a selective analysis of the problems of early systems for computer-assisted instruction, in order to show which circumstances were responsible for the relative lack of success of their use. Before the design approaches themselves are described, considerations on the choice of working platforms are presented. The structure of CBT systems, various development approaches and individual construction steps are then discussed. Particular attention is paid to user guidance, interaction possibilities and the development of the user interface. Procedures for evaluating both the design and the user interface are also presented. An outlook relates the statements made to foreseeable developments.

Keywords: Multimedia, hypermedia, computer-based tutoring, software engineering

Citation: Veitl M., Petta P.: Entwicklung multimedialer CBT-Systeme, Proc. Informations- und Kommunikationstechnologie für das neue Europa (Wiener IT-Kongreß), pp. 1215-1225, ADV, Vienna, 1993.


OFAI-TR-93-03 ( 56kB g-zipped PostScript file)

Knowledge EXplorer: A Tool for Automated Knowledge Acquisition from Data

Petr Berka

Knowledge EXplorer is a system for exploratory analysis and knowledge acquisition in categorical data. The knowledge acquisition component induces knowledge in the form of weighted rules. These rules can be used directly in a PROSPECTOR-like expert system.

Citation: Berka P.: Knowledge EXplorer: A Tool for Automated Knowledge Acquisition from Data, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-03, 1993.


OFAI-TR-93-02 ( 46kB g-zipped PostScript file)

CLP based HPSG Parsing

Johannes Matiasek

We describe a system for principle-based parsing of HPSG employing constraint logic programming techniques. Typed feature structures are implemented as constraints on PROLOG variables and are instantiated in a lazy fashion. Grammar principles as well as relational constraints are stated in a declarative way by means of conditional constraints on feature structures. The procedural interpretation given to these conditional constraints, together with the data-driven delay mechanism implemented, yields efficient parsing behavior.

Citation: Matiasek J.: CLP based HPSG Parsing, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-02, 1993.


OFAI-TR-93-01

Artificial Intelligence-Forschung in Österreich 1992: 133 Projekte, 237 Personen, 37 Institutionen, 963 Publikationen

Robert Trappl, Johannes Matiasek, Gerda Helscher

By means of a questionnaire survey, a follow-up to the first study from 1990, an overview of the Austrian Artificial Intelligence research landscape was obtained: (1) what is being researched in the field of AI in Austria, (2) by whom, (3) where, (4) for how long, (5) funded by whom, and (6) with what results. The response rate was 98%. In total, 133 (1990: 78) projects were recorded, of which 39 (15) have already been completed, an increase of 70% within 2 years. The spectrum of projects ranges from theoretical work on the consistency and completeness of knowledge bases, through case-based scheduling, a blast furnace diagnosis system, and expert systems for pest control and plant protection or for monitoring and optimizing the artificial ventilation of newborns, to considerations of using AI to avoid crises and wars. Research continues to focus on methods of knowledge representation, expert systems, and human-machine communication. 32% (40%) of the projects are in basic research, 44% (41%) in applied research, and 24% (19%) in development. Compared with 1990, this represents disproportionately high growth in applied research and development, so the path into practice is already being taken successfully. 237 (142) persons have worked on these research projects. There are currently already around 100 full-time to part-time AI researchers in Austria. As before, Vienna stands out in the Austrian AI research landscape as an oversized AI center ("Wasserkopf"): more than three times as many projects are carried out in Vienna as in all other federal provinces combined. AI research is currently conducted at 37 (26) institutions, an increase of around 42% over 1990. Around 40% of all projects are carried out by only five institutes which, because some directorships are held by the same person, are headed by only three scientists in total.
The mean project duration is 28 (24) months. 77% of project funding comes from public funding agencies and 23% from companies, practically unchanged compared with 1990. The bibliography of all works published so far by Austrian AI researchers now comprises 963 (563) publications, an increase of 71%. Many of them have appeared in renowned international journals. Twice as many cooperating research institutes and companies were reported in 1992 as in 1990 (132 : 66). 49 of the partners were abroad (an increase of 113%). The top positions are held by France, the USA, Great Britain, Germany, and Italy. Austrian AI research has already reached an international level, as is also evident from the participation of Austrian research institutions in numerous ESPRIT projects and their admission to several Networks of Excellence of the EC. To keep pace, the following are required: further intensification of international contacts; and an increase in willingness, comparable to that in the western neighboring countries, both of industry to cooperate and of the public sector to fund basic and applied research.

Citation: Trappl R., Matiasek J., Helscher G.: Artificial Intelligence-Forschung in Österreich 1992: 133 Projekte, 237 Personen, 37 Institutionen, 963 Publikationen, Austrian Research Institute for Artificial Intelligence, Vienna, TR-93-01, 1993.
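
The growth percentages quoted in the abstract of TR-93-01 can be rechecked from the absolute counts it gives. The following is a small illustrative sketch, not part of the report; the dictionary layout and the function name `growth_percent` are assumptions made for this example.

```python
# (1990 value, 1992 value, growth in percent as stated in the abstract)
# None marks a figure for which the abstract states no percentage.
FIGURES = {
    "projects": (78, 133, 70),
    "persons": (142, 237, None),
    "institutions": (26, 37, 42),
    "publications": (563, 963, 71),
    "cooperation partners": (66, 132, 100),  # "twice as many"
}


def growth_percent(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100


if __name__ == "__main__":
    for name, (old, new, stated) in FIGURES.items():
        computed = growth_percent(old, new)
        note = "not stated" if stated is None else f"stated {stated}%"
        print(f"{name}: {old} -> {new}, computed {computed:.1f}% ({note})")
```

Each computed value lies within one percentage point of the figure stated in the abstract.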


OFAI-TR-92-38 ( 168kB g-zipped PostScript file)

Taxonomies and Part-Whole Hierarchies in the Acquisition of Word Meaning --- A Connectionist Model

Georg Dorffner

This paper introduces a simple connectionist model for the acquisition of word meaning, and demonstrates how this model can be enhanced based on empirical observations about language learning in children. The main sources are observations by Markman about constraints children place on word meaning, and Nelson, as well as Benelli, about the role of language in the acquisition of concept taxonomies. The model enhancements based on these observations, and those authors' conclusions, are mainly built on well-known neural mechanisms such as resonance, reset and recruitment, as first introduced in the adaptive resonance theory (ART) models by Carpenter and Grossberg. This way the strength of connectionist models in plausibly modeling detailed aspects of natural language is underlined.

Citation: Dorffner G.: Taxonomies and Part-Whole Hierarchies in the Acquisition of Word Meaning --- A Connectionist Model, Proc. 14th Annual Conf. of the Cognitive Science Society, Lawrence Erlbaum, Hillsdale, NJ, pp. 803-808, 1992.


OFAI-TR-92-37 ( 48kB g-zipped PostScript file)

A CLP Schema to Integrate Specialized Solvers and its Application to Natural Language Processing

Bernhard Pfahringer, Johannes Matiasek

The problem of combining different constraint solvers has been mentioned (among others by Schroedl 1991, Holzbaur 1990, Lim and Stuckey 1990, Nelson and Oppen 1980), but no satisfactory solutions have been given. We propose a general framework for implementing specialized reasoners/constraint solvers in a logic programming environment using semantic unification. It allows for a modular and declarative definition of the interactions of such reasoners. This is achieved by using `attributed variables' (Huitouze 1990) as a data structure relating a variable to the `set of all' constraints for this variable. `Conditional rewrite rules' specify simplification and possible interactions of these constraints. A few examples will demonstrate constraints relating to single variables and interactions thereof. We will demonstrate how this framework leads to a very natural and concise formulation of principles, grammar and lexicon in an HPSG-like formalism. Furthermore, the necessity of extending the framework to handle constraints relating two or more variables will be discussed.

Keywords: Constraints, Logic Programming, HPSG

Citation: Pfahringer B., Matiasek J.: A CLP Schema to Integrate Specialized Solvers and its Application to Natural Language Processing, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-37, 1992.


OFAI-TR-92-36 ( 40kB g-zipped PostScript file)

The Science of a ``New AI''

Erich Prem

This paper seeks to describe the science of a ``New AI''. It explains why this new development belongs to Artificial Intelligence and why it should be called ``new''. We establish a list of criteria that a new AI must satisfy. Several existing mainstreams of AI are revisited in the light of this explanation. Finally, we describe one bottom-up approach to symbol grounding in more detail and point out implications of such a model, which can be seen as a step towards ``new'' production systems or heuristics.

Citation: Prem E.: The Science of a ``New AI'', Proc. of the ECAI-92 workshop `Neural Networks and a New AI' (G. Dorffner, Ed.), Springer-Verlag.


OFAI-TR-92-35 ( 60kB g-zipped PostScript file)

Effective Learning in Dynamic Environments by Explicit Context Tracking

Gerhard Widmer, Miroslav Kubat

Daily experience shows that in the real world, the meaning of many concepts heavily depends on some implicit context, and changes in that context can cause radical changes in the concepts. This paper introduces a method for incremental concept learning in dynamic environments where the target concepts may be context-dependent and may change drastically over time. The method has been implemented in a system called Flora3. Flora3 is very flexible in adapting to changes in the target concepts and tracking concept drift. Moreover, by explicitly storing old hypotheses and re-using them to bias learning in new contexts, it possesses the ability to utilize experience from previous learning. This greatly increases the system's effectiveness in environments where contexts can reoccur periodically. The paper describes the various algorithms that constitute the method and reports on several experiments that demonstrate the flexibility of Flora3 in dynamic environments.

Keywords: Incremental Learning, Context-Dependent Concepts, Concept Drift

Citation: Widmer G., Kubat M.: Effective Learning in Dynamic Environments by Explicit Context Tracking, Proc. European Conf. on Machine Learning, ECML-93, (P. Brazdil, Ed.), pp. 227-243, Vienna, Springer-Verlag (LNAI 667), 1993.


OFAI-TR-92-34 ( 14kB g-zipped PostScript file)

Kriterien in der natürlichsprachigen Generierung: Mindestkriterien und damit verbundene Probleme

Ernst Buchberger

Given the enormous increase of research in the field of natural language generation, the need for a systematic account of criteria for assessing natural language generators is stressed. This paper tries to give an answer to a number of recently addressed topics regarding criteria, laying special emphasis on which criteria are necessary and relevant and how they influence design decisions for natural language generators. A minimal number of four closely interrelated criteria is presented. It is argued that values for these criteria are usually available before design decisions are made, and the direct impact of these criteria on selecting alternatives in various dimensions of natural language generation is shown together with selected examples. Finally, problems associated with the classification proposed are addressed and directions for future research in this relatively new field of impact assessment are pointed out.

Keywords: Natural Language Generation, Systematics, Classification, Evaluation

Citation: Buchberger E.: Kriterien in der natürlichsprachigen Generierung: Mindestkriterien und damit verbundene Probleme, A revised and extended English version appears as ``Criteria in Natural Language Generation: Minimal Criteria and Their Impacts'' in Proc. 16th German AI Conference, GWAI-92 (H.J. Ohlbach, Ed.), pp. 347-356, Springer-Verlag (LNAI 671), 1993.


OFAI-TR-92-33 ( 154kB g-zipped PostScript file,  438kB PDF file)

Das Internet als ``Docuverse für alle BenutzerInnen''

Paolo Petta

Since it went into operation just under ten years ago, the Internet, a worldwide network of computers with scientific, commercial and, increasingly, private participants, has recorded continuous exponential growth. Among the many services offered, the possibility of interactive access to information archives of all kinds (from the natural sciences and humanities as well as current politics, news, sports, etc.) is increasingly a driving force of this massive expansion. Because the traditional access paths alone can no longer enable average users to find information efficiently in these large data collections, complementary methods have been developed in the very recent past that are already proving themselves daily in broad practical use. This world of information is presented in an introductory overview.

Keywords: Internet, Information Retrieval, Hypertext, Overview

Citation: Petta P.: Das Internet als ``Docuverse für alle BenutzerInnen'', Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-33, 1992.


OFAI-TR-92-32

Hypertext und AI

Paolo Petta

The integration of hypertext with elements from AI continually opens up new applications, for example through combining the complementary properties of hypertext and expert systems into so-called expertext. The range of systems realized so far extends from local expertext implementations built on homogeneous and richly structured data to (world-)wide distributed heterogeneous information systems. Using examples (Museum Room Editor, Goesta's Book, Open Architecture for Reasoning, Wide Area Information Servers, Gopher, World-Wide Web...), the possibilities of concrete present and possible future use are shown.

Keywords: Hypertext, Expertext, Overview

Citation: Petta P.: Hypertext und AI, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-32, 1992.


OFAI-TR-92-31

How to Integrate Specialized Solvers: A CLP Approach

Bernhard Pfahringer

The problem of combining different constraint solvers has been mentioned (among others by Schroedl 1991, Holzbaur 1990, Lim and Stuckey 1990, Nelson and Oppen 1980), but no satisfactory solutions have been given. We outline a general framework which, if followed when implementing specialized reasoners/constraint solvers in a logic programming environment using semantic unification, allows for a modular and declarative definition of the interactions of such reasoners. This is achieved by using `attributed variables' (Huitouze 1990) as a data structure relating a variable to the `set of all' constraints for this variable. `Conditional rewrite rules' specify interaction and simplification of these constraints. A few very simple examples will demonstrate constraints relating to single variables and interactions thereof. We will further discuss the necessity of extending the framework to handle constraints relating two or more variables. A proof-of-concept implementation has been done and is currently being optimized.

Keywords: Semantic Unification, Constraints, Logic Programming

Citation: Pfahringer B.: How to Integrate Specialized Solvers: A CLP Approach, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-31, 1992.
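
The idea of an attributed variable, a logic variable that carries the constraints currently known about it, can be illustrated with a toy example. The sketch below is an assumption-laden illustration in Python, not the report's implementation (which is built in a logic programming environment): the class `Var`, the functions `deref` and `unify`, and the choice of a finite set of allowed values as the only attribute are all hypothetical. There is no backtracking and no rewrite-rule engine; the only "interaction" shown is that unifying two variables intersects their attributes.

```python
class Var:
    """Toy attributed variable: the attribute is the set of values still allowed."""

    def __init__(self, name, domain):
        self.name = name
        self.domain = set(domain)
        self.ref = None  # set to another Var or to a concrete value once bound


def deref(term):
    """Follow bindings until an unbound Var or a concrete value is reached."""
    while isinstance(term, Var) and term.ref is not None:
        term = term.ref
    return term


def unify(a, b):
    """Unify two terms; the attributes decide whether unification succeeds."""
    a, b = deref(a), deref(b)
    if a is b:
        return True
    if isinstance(a, Var) and isinstance(b, Var):
        common = a.domain & b.domain  # combine the constraints of both variables
        if not common:
            return False
        b.domain = common
        a.ref = b
        return True
    if isinstance(a, Var):
        a, b = b, a  # make sure b is the variable, a the concrete value
    if isinstance(b, Var):
        if a not in b.domain:  # the attribute vetoes the binding
            return False
        b.ref = a
        return True
    return a == b
```

For example, unifying `Var("x", {1, 2, 3})` with `Var("y", {2, 3, 4})` succeeds and leaves the shared domain `{2, 3}`; a later attempt to bind either variable to `1` fails, while binding to `3` succeeds for both.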


OFAI-TR-92-30

Unifikationserweiterungen: Vergleich und mögliche Nutzung in der Wissensrepräsentation anhand ausgewählter Beispiele

Bernhard Pfahringer

This report examines four different ways of making unification extensible by the user. It further shows how this extended, semantic unification can serve both to represent various forms of knowledge elegantly and to make them efficiently processable.

Keywords: Semantic Unification, Knowledge Representation, Constraints, Logic Programming

Citation: Pfahringer B.: Unifikationserweiterungen: Vergleich und mögliche Nutzung in der Wissensrepräsentation anhand ausgewählter Beispiele, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-30, 1992.


OFAI-TR-92-29 ( 86kB g-zipped PostScript file)

Mechanisms for Handling Sequences with Neural Networks

Claudia Ulbricht, Georg Dorffner, Stephane Canu, Didier Guillemyn, Gurutze Marijuan, Javier Olarte, Clemente Rodriguez, Ignacio Martin

This paper is intended to give an overview of methods for handling sequences with neural networks. Since many typical neural networks cannot be used for processing temporal information, they have to be extended by some mechanism to be able to do so. Two types of mechanisms can be distinguished: non-recurrent mechanisms, such as windows and time delays, and recurrent mechanisms based on feedback. However, the properties of networks handling sequences depend not only on the employed mechanism, but also on the underlying network paradigm (e.g. multi-layer perceptron, Hopfield network, Kohonen network, etc.). Each combination of a network paradigm with a mechanism results in a different temporal network having its own features. The outcome of a comparative study within the ESPRIT-II project NEUFODI (``Neural Networks for Forecasting and Diagnosis Applications'') suggested a division of these networks according to their overall characteristics into the following four categories: extended feedforward networks, single-step recurrent networks, extended stabilizing networks, and extended competitive networks. This paper provides a description of the advantages and disadvantages of different approaches. It also presents a few selected results from the comparative study. As such, this paper can be used as a basis for deciding which network to use for a given task.

Keywords: Neural Networks, Connectionism, Sequences, Time Series, Forecasting, Time Series Analysis

Citation: Ulbricht C., Dorffner G., Canu S., Guillemyn D., Marijuan G., Olarte J., Rodriguez C., Martin I.: Mechanisms for Handling Sequences with Neural Networks, Extended version of a paper in Proc. of the ANNIE'92 Conf., St. Louis, Missouri, USA.
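
The simplest non-recurrent mechanism named in the abstract of TR-92-29, the input window, amounts to turning a sequence into fixed-size input/target pairs that an ordinary feedforward network can be trained on. The following is a minimal sketch under that reading; the function name `sliding_windows` and the one-step-ahead target are assumptions for illustration and are not taken from the report.

```python
def sliding_windows(series, width):
    """Split a sequence into (window, next value) pairs.

    Each window holds `width` consecutive values; the target is the value
    that immediately follows the window (one-step-ahead forecasting).
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    return [
        (series[i:i + width], series[i + width])
        for i in range(len(series) - width)
    ]


if __name__ == "__main__":
    for window, target in sliding_windows([1, 2, 3, 4, 5], 3):
        print(window, "->", target)
```

With a window of width 3, the series 1, 2, 3, 4, 5 yields two training pairs: (1, 2, 3) with target 4 and (2, 3, 4) with target 5. A series no longer than the window yields no pairs.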


OFAI-TR-92-28 ( 56kB g-zipped PostScript file)

An Expert System for Parenteral Nutrition of Neonates

Silvia Miksch, Maria Dobner, Werner Horn, Christian Popow

Planning adequate nutritional support to meet the metabolic requirements of sick neonates is a tedious, time-consuming calculation that requires practical expert knowledge and carries the risk of introducing possibly fatal errors. We therefore developed the interactive support system VIE-PNN for calculating the composition of parenteral nutrition solutions (PNS) for neonates at intensive care units. The aims were to avoid errors within certain limits, to save time, and to keep data for further statistical analysis. We combined textbook rules of nutrition planning with the knowledge of experienced physicians. The dynamic and static knowledge is represented in frame structures and backward-chaining rules. The daily requirements are determined in units per kilogram body weight and have to be adjusted according to the patient's age, body weight, and clinical condition (e.g. specific diseases, past and present-day blood analysis). The system uses default values and strategies of estimation in the absence of real values. The physician can accept or adjust proposed values on the screen. VIE-PNN offers the possibility to adjust the composition of the PNS to the total fluid intake, which is often difficult when total fluid allowance is restricted. This task is time-consuming if done systematically by hand. Finally, the PNS may be reduced according to the proportion of oral feedings. The final output is a PNS schedule form, which can be used directly in the case history of neonates. A knowledge acquisition module supports the input of new bypasses and new oral feeding products.

A technical, empirical and subjective evaluation of the system was performed. It proved VIE-PNN's soundness, its ability to provide a standard for the composition of parenteral nutrition, and its clinical applicability.

Citation: Miksch S., Dobner M., Horn W., Popow C.: An Expert System for Parenteral Nutrition of Neonates, Proc. Ninth IEEE Conference on Artificial Intelligence for Applications, CAIA-93, Orlando, Florida, March 1-5, 1993.


OFAI-TR-92-27 ( 47kB g-zipped PostScript file)

Attributed Equivalence Classes in Prolog Implementations of CLP Languages

Christian Holzbaur

We introduce attributed equivalence classes as an explicit abstract data type for the representation and manipulation of general equation systems where the decision algorithm is based on quantifier elimination. The partition of quantifiers into equivalence classes results very naturally from a simple abstraction that separates the equation solving process in the object domain from global manipulation of equation systems. We propose and report on an implementation of a linear complexity equivalence relation maintenance algorithm within the framework of logic programming, based on extensible unification. The explicit representation of global aspects of equation systems leads to an object oriented approach to equation solving.

Citation: Holzbaur C.: Attributed Equivalence Classes in Prolog Implementations of CLP Languages, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-27, 1992.


OFAI-TR-92-26 ( 76kB g-zipped PostScript file)

A Polynomial-Time Algorithm for Model-Based Diagnosis

Igor Mozetic

We present IDA --- an Incremental Diagnostic Algorithm which computes minimal diagnoses from diagnoses, and not from conflicts. As a consequence, by using a `weak' fault model, the worst-case complexity of the algorithm to compute the k+1-st minimal diagnosis is O(n^(2k)), where n is the number of components. On the practical side, an experimental evaluation indicates that the algorithm can efficiently diagnose devices consisting of a few thousand components. IDA separates model interpretation from the search for minimal diagnoses in the sense that the model interpreter is replaceable. This fits well into the Constraint Logic Programming modeling paradigm where, for example, combinatorial circuits are modeled by CLPB, analog circuits by CLPR, and physiological models in medicine by constraints over finite domains.

Citation: Mozetic I.: A Polynomial-Time Algorithm for Model-Based Diagnosis, Extended version of a paper in Proc. 10th European Conf. on Artificial Intelligence, ECAI-92, pp. 729-733, Vienna, Austria, John Wiley and Sons, 1992.
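
To make the notion of a minimal diagnosis under a `weak' fault model concrete, the sketch below enumerates candidate fault sets for a tiny Boolean circuit. It is deliberately NOT the IDA algorithm of the report: it is a brute-force enumeration by increasing set size, exponential in the number of components, and serves only to illustrate the definitions. The circuit, the observations, and all names (`COMPONENTS`, `OBSERVED`, `consistent`, `minimal_diagnoses`) are hypothetical. Under the weak fault model assumed here, a faulty component places no constraint on its output.

```python
from itertools import combinations, product

# Hypothetical circuit: component name -> (behavior, input wires, output wire)
COMPONENTS = {
    "A1": (lambda p, q: p & q, ("a", "b"), "x"),
    "A2": (lambda p, q: p & q, ("c", "d"), "y"),
    "O1": (lambda p, q: p | q, ("x", "y"), "out"),
}
# Hypothetical observation: a correct circuit would output 1 here, not 0.
OBSERVED = {"a": 1, "b": 1, "c": 0, "d": 0, "out": 0}


def consistent(faulty):
    """True if some assignment to the unobserved wires satisfies every
    component that is NOT assumed faulty (faulty ones are unconstrained)."""
    wires = sorted({w for _, ins, out in COMPONENTS.values() for w in (*ins, out)})
    free = [w for w in wires if w not in OBSERVED]
    for values in product((0, 1), repeat=len(free)):
        state = {**OBSERVED, **dict(zip(free, values))}
        if all(
            state[out] == fn(*(state[i] for i in ins))
            for name, (fn, ins, out) in COMPONENTS.items()
            if name not in faulty
        ):
            return True
    return False


def minimal_diagnoses():
    """Enumerate fault sets by increasing size, keeping only minimal ones."""
    found = []
    names = sorted(COMPONENTS)
    for size in range(len(names) + 1):
        for candidate in combinations(names, size):
            candidate = frozenset(candidate)
            if any(d <= candidate for d in found):
                continue  # a subset is already a diagnosis, so not minimal
            if consistent(candidate):
                found.append(candidate)
    return found
```

For this circuit the minimal diagnoses are {A1} and {O1}: a fault in A2 alone cannot explain the observation, because a correct A1 would still force the output of a correct O1 to 1.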


OFAI-TR-92-25

AI: Introduction, Paradigms, Applications (including CBR), Impacts, Visions

Robert Trappl

AI is one of the fastest growing scientific and technical disciplines in history. This chapter attempts to show why, by giving an overview of definitions of AI, areas of research, basic paradigms, applications with special emphasis on the new case-based reasoning (CBR) systems, impacts, and visions. An extensive list of references enables the reader to delve deeper into areas of special interest.

Citation: Trappl R.: AI: Introduction, Paradigms, Applications (including CBR), Impacts, Visions, In Advanced Topics in Artificial Intelligence (V. Marik, O. Stepankova, R. Trappl, Eds.), pp. 1-24, Springer-Verlag (LNAI 617), 1992.


OFAI-TR-92-24 ( 45kB g-zipped PostScript file)

DMCAI CLP Reference Manual

Christian Holzbaur

Within the version of SICStus Prolog documented in this manual, the unification mechanism has been changed in such a way that the user may introduce interpreted terms and specify their unification through Prolog predicates. Extensible unification, in turn, aims at the implementation of instances of the general constraint logic programming (CLP) scheme.

Keywords: Unification, Implementation, Constraints

Citation: Holzbaur C.: DMCAI CLP Reference Manual, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-24, 1992.


OFAI-TR-92-23 ( 42kB g-zipped PostScript file)

Metastructures vs. Attributed Variables in the Context of Extensible Unification

Christian Holzbaur

We relate two mechanisms which aim at the extension of logic programming languages. The first mechanism directly extends syntactic unification through the introduction of a data type, whose (unification) semantics are specified through user-defined predicates. The second mechanism was utilized for the implementation of coroutining facilities, and was independently derived with optimal memory management for various Prolog extensions in mind. Experience from the application of both mechanisms to the realization of CLP languages, without leaving the logic programming context, enables us to reveal similarities and the potential with respect to this task. Constructive measures that narrow or close the gap between the two conceptual schemes are provided.

Keywords: Unification, Implementation, Constraints

Citation: Holzbaur C.: Metastructures vs. Attributed Variables in the Context of Extensible Unification, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-23, 1992.


OFAI-TR-92-22

The Role of Artificial Intelligence in the Avoidance of War

Robert Trappl

Luckily, not all crises have led to war. This paper briefly reviews some AI applications like cognitive mapping, rule-based systems, machine learning, case-based reasoning to understand/predict geopolitical developments. The combination of several of them with a view to their potential role in the avoidance of war is proposed.

Citation: Trappl R.: The Role of Artificial Intelligence in the Avoidance of War, In Cybernetics and Systems '92 (R. Trappl, Ed.), pp. 1667-1772, World Scientific, Singapore, 1992.


OFAI-TR-92-21 ( 434kB g-zipped PostScript file)

Abstract Qualitative Perception Modelling and Intelligent Musical Learning

Gerhard Widmer

The research described in this article is concerned with one of the fundamental phenomena of intelligence, namely, the ability to learn. In particular, we are interested in intelligent systems that can learn, or be taught, to perform musical tasks and solve musical problems. This kind of research naturally leads to the question of what the prerequisites for effective learning are. In this article, it will be argued and demonstrated that fundamental knowledge about the domain is important and, indeed, indispensable if a system is to learn problem solving rules for a complex musical task effectively. That again leads us to ask what fundamental musical knowledge is and how it can be represented and reasoned about in a computer program. The research described in this article is a logical continuation of an earlier project that dealt with much simpler musical problems. It contributes to the above-mentioned goals in several ways: the article presents a general, abstract perception model for a comparatively simple sub-domain of tonal music; the model is meant to capture, in a qualitative way, some of the aspects that govern the way people `hear' harmonized melodies. This is the fundamental musical `knowledge' we want to equip our system with. We then present arguments for the importance of such knowledge for learning and describe a system that uses the perception model in the process of learning to harmonize given melodies from examples of correct and incorrect solutions. Accordingly, the article is divided into two parts; the first part is devoted to the general perception model; the second part then describes the learning system and illustrates its workings with an example.

Citation: Widmer G.: Abstract Qualitative Perception Modelling and Intelligent Musical Learning, Computer Music Journal 16 (2), pp. 51-68, MIT Press, 1992.


OFAI-TR-92-20 ( 363kB g-zipped PostScript file)

A Knowledge Intensive Approach to Machine Learning in Music

Gerhard Widmer

The article describes the first results of an ongoing project that is being pursued at the Austrian Research Institute for Artificial Intelligence. The long-term goal of the project is the development of a new generation of flexible and adaptive musical systems. The central concern is therefore with techniques of Machine Learning, and with research into the role that general musical knowledge plays in a system that is to learn new musical concepts. The first test domain for our system was two-voice counterpoint composition. The starting point for the project was the realization that `intelligent' learning requires a considerable amount of domain-specific knowledge. We will describe our approach to defining some basic knowledge about tonal music which can serve as the basis for learning processes. We will also briefly describe the integrated learning strategy which can take advantage of such knowledge during learning. Finally, the paper also suggests that intelligent, knowledge-based learning systems could be useful tools for testing general theories about music.

Citation: Widmer G.: A Knowledge Intensive Approach to Machine Learning in Music, In Understanding Music with AI: Perspectives on Music Cognition (M. Balaban, K. Ebcioglu, O. Laske, Eds.), AAAI Press, Menlo Park, CA, 1992.


OFAI-TR-92-19 ( 73kB g-zipped PostScript file)

Improving Diagnostic Efficiency in KARDIO: Abstractions, Constraint Propagation, and Model Compilation

Igor Mozetic, Bernhard Pfahringer

The KARDIO system deals with the problem of diagnosing cardiac arrhythmias from symbolic descriptions of electrocardiograms. The system incorporates a qualitative model which simulates the electrical activity of the heart. In the paper we outline three methods for an efficient application of a simulation model to diagnosis. First, through abstractions and refinements, the model is represented at several levels of detail. Second, the model is reformulated in terms of constraints which enable efficient propagation of relational dependencies and reduce backtracking. And finally, the model is `compiled' into surface diagnostic rules. Through simulation, a relational table is generated and subsequently compressed into efficient diagnostic rules by inductive learning. A novel contribution to KARDIO, presented here, includes a comparison of diagnostic efficiency and space complexity of five types of knowledge: a simulation model of the heart, a hierarchical four-level model, a model represented in terms of constraints, a relational table, and compressed diagnostic rules.

Citation: Mozetic I., Pfahringer B.: Improving Diagnostic Efficiency in KARDIO: Abstractions, Constraint Propagation, and Model Compilation, In Deep Models for Medical Knowledge Engineering (E. Keravnou, Ed.), pp. 1-25, Elsevier, Amsterdam, 1992.


OFAI-TR-92-18 ( 53kB g-zipped PostScript file)

Model-Based Diagnosis: An Overview

Igor Mozetic

Diagnosis is an important application area of Artificial Intelligence. First-generation expert diagnostic systems exhibited difficulties which motivated the development of model-based reasoning techniques. Model-based diagnosis is the activity of locating malfunctioning components of a system solely on the basis of its structure and behavior. The paper gives a brief overview of the main concepts, problems, and research results in this area.

Citation: Mozetic I.: Model-Based Diagnosis: An Overview, In Advanced Topics in Artificial Intelligence (V. Marik, O. Stepankova, R. Trappl, Eds.), pp. 419-430, Springer-Verlag (LNAI 617), 1992.


OFAI-TR-92-17

VIE-PNN: Ein Expertensystem zur Berechnung der parenteralen Ernährung von intensiv behandelten Früh- und Neugeborenen

Silvia Miksch, Christian Popow, Werner Horn, Maria Dobner

The aim of the project VIE-PNN (Vienna Expert System for Parenteral Nutrition for Neonates) was to model, in an expert system, the composition of parenteral nutrition for premature and newborn infants in intensive care. Calculating the parenteral nutrition of intensive care patients is a complex task that is very time-consuming and error-prone. The objective of the project was to develop a computer program that reduces the time required for manual calculation and the necessary error checking. The programs for nutrition planning known so far offer only simple calculation and documentation functions and cannot be adapted to the special needs of premature and newborn infants. We have therefore developed an expert system for composing the parenteral nutrition of premature and newborn infants in intensive care. The clinical knowledge about calculating parenteral nutrition was prepared and then represented in rule-based form in an expert system. In particular, the infant's current state of health, the current laboratory values, and those of the preceding days are taken into account. The program takes into account the reduced nutrient requirement under partial enteral nutrition and generates suggestions for any necessary correction and for a possible increase of the nutrient requirement. A knowledge acquisition module allows the oral nutrition products and drugs in the knowledge base to be updated. The system was implemented on a PC and is currently undergoing practical trials at the Vienna University Clinic for Pediatrics (AKH).

Citation: Miksch S., Popow C., Horn W., Dobner M.: VIE-PNN: Ein Expertensystem zur Berechnung der parenteralen Ernährung von intensiv behandelten Früh- und Neugeborenen, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-17, 1992.


OFAI-TR-92-16 ( 423kB g-zipped PostScript file)

Automatic Knowledge Base Refinement: Learning from Examples and Deep Knowledge in Rheumatology

Gerhard Widmer, Werner Horn, Bernhard Nagele

MESICAR is a second generation expert system which contains very general disease descriptions about rheumatological disorders in the primary medical care field. With the help of a detailed hierarchical description of the human anatomy the system is able to support diagnostic decisions. The current paper describes how machine learning techniques are used to automatically build more specific disease descriptions for common, frequently occurring cases. The system MESICAR-LEARN implements a learning method which integrates analytical and empirical learning techniques. Cases diagnosed by MESICAR form the training examples, and MESICAR's knowledge base is used as domain theory. The learned concepts are integrated into a hierarchy of disease descriptions. They support efficient and fast reasoning on common cases in addition to the general diagnostic support afforded by MESICAR's deep knowledge.

Keywords: Medical Expert System, Rheumatology, Deep Knowledge, Machine Learning, Knowledge Base Refinement

Citation: Widmer G., Horn W., Nagele B.: Automatic Knowledge Base Refinement: Learning from Examples and Deep Knowledge in Rheumatology, Artificial Intelligence in Medicine 5 (3), pp. 225-243, special issue on ``Expert Systems, Knowledge Acquisition, and Learning'', 1993.


OFAI-TR-92-15

Reanalyzing Similarity Measures in Neural Networks and Their Practical Consequences

Georg Dorffner, Herbert Wiklicky

Neural networks are said to treat inputs sensitively to their similarities. Therefore, success of a neural network application depends on whether the ``right'' similarities are recognized. We argue that, although some theory exists, many practical applications have neglected some, often surprising, limits based on the similarity measures employed by the networks. We then reanalyze some common measures from a practical point of view, revealing some of those limits, including a paradox of patterns that are ``more similar'' than the desired prototypes. We conclude this paper with a list of heuristics for network design and argue for a fruitful cross-fertilization of neural network subfields.

Keywords: Neural Networks, Similarity Measures, Practical Applications, Performance Analysis

Citation: Dorffner G., Wiklicky H.: Reanalyzing Similarity Measures in Neural Networks and Their Practical Consequences, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-15, 1992.


OFAI-TR-92-14 ( 77kB g-zipped PostScript file)

A Self-Learning Visual Pattern Explorer and Recognizer Using a Higher Order Neural Network

Günther Linhart, Georg Dorffner

Higher order neural networks, although known for their power, have not been acknowledged very much in literature, mainly due to their apparent computational complexity. In this paper a proposal (Reid et al. 1989) to improve the efficiency of such networks is taken and built into a pattern recognition system that autonomously learns to categorize and recognize patterns independently from their position in an input image. It does this by combining higher order with first order networks and the mechanisms known from ART. Its recognition is based on a 16 x 16 pixel input which contains a section of the image found by a separate centering mechanism. It can be shown that with this system position invariant recognition can be implemented efficiently, while combining all the advantages of the employed sub-systems.

Keywords: Neural Networks, Higher-Order Networks, Pattern Recognition, Position Invariance, Categorization, ART

Citation: Linhart G., Dorffner G.: A Self-Learning Visual Pattern Explorer and Recognizer Using a Higher Order Neural Network, Proc. Intl. Joint Conference on Neural Networks, IJCNN-92, Baltimore, IEEE 1992.


OFAI-TR-92-13

On Redefining Symbols and Reuniting Connectionism with Cognitively Plausible Symbol Manipulation

Georg Dorffner

This paper attempts to reunite distributed connectionist models with models of higher-level cognition such as symbol manipulation (as suggested in Hadley 1990). This is done by introducing a novel way of seeing symbols in their cognitively relevant sense. Symbol manipulation turns out to be a process cognitive models should be able to do, but not in the same way traditional AI programs do it. Therefore it is argued that higher-level (i.e. consciously accessible and sequential) processes have to be deeply grounded in connectionist models of the lower (i.e. instantaneous, associative and intuitive) level. For this, keeping the two levels apart in all discussions is an important issue. Some implemented connectionist models are briefly described to illustrate a few basic elements higher-level processing could be built on.

Keywords: Connectionism, Philosophical Issues, Symbols, Symbol Grounding, Symbol Manipulation, Cognitive Modeling

Citation: Dorffner G.: On Redefining Symbols and Reuniting Connectionism with Cognitively Plausible Symbol Manipulation, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-13, 1992.


OFAI-TR-92-12 ( 176kB g-zipped PostScript file)

``Winner-Take-More'' --- A Mechanism for Soft Competitive Learning

Georg Dorffner

In this paper a novel mechanism for ``soft'' competitive learning is introduced. It is designed to replace the brittle and often premature decisions or information reductions in winner-take-all mechanisms by a more relaxed learning scheme. In this scheme, not only the winner but also other highly active units are permitted to adapt their weights --- hence it has been dubbed ``winner-take-more.'' The inhibitory connections that are vital for competition are also left to change. By doing this, the learning mechanism discovers distinct classes of input patterns while permitting some (possibly useful) distributedness in the competitive layer to remain. An experiment highlighting the properties of the mechanism and some possible applications are described.

Keywords: Neural Networks, Connectionist Learning, Classification, Self-Organization, Competitive Learning

Citation: Dorffner G.: ``Winner-Take-More'' --- A Mechanism for Soft Competitive Learning, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-12, 1992.


OFAI-TR-92-11

Extensible Unification as Basis for the Implementation of CLP Languages

Christian Holzbaur

We address various aspects of the proposal to use user-defined extensible unification as the basic formalism for the implementation of constraint logic programming (CLP) languages. The close connection between unification theory and CLP, exhibited through the theoretical work of Jaffar et al., justifies the proposed step to make this link explicit and, particularly, operational. We sketch how this can be done via extensible unification, a single, simple mechanism which automatically induces a minimum of uniformity in the realization of CLP languages. If CLP languages are implemented via extensible unification, they will inherit the capability of being extended on a sound basis, leading to the attractive construction of towers of metacircular CLP languages. In the conclusion we report on earlier and recent work that provides empirical evidence for the feasibility of the mechanism.

Keywords: Logic Prog., Unification, Constraints

Citation: Holzbaur C.: Extensible Unification as Basis for the Implementation of CLP Languages, Austrian Research Institute for Artificial Intelligence, Vienna, TR-92-11, 1992.


OFAI-TR-92-10 ( 46kB g-zipped PostScript file)

Handling Sequences with a Competitive Recurrent Network

Claudia Ulbricht

The Competitive Recurrent Network presented is a neural network with competitive layers that is able to handle sequences of input patterns. The mechanism enabling the neural network to handle sequential aspects is a feedback loop. A modified form of Hebbian learning is employed together with several types of forgetting. The updating mechanisms used have been developed specifically for the network architecture. It is shown for both forecasting and classification tasks how various types of sequences can be handled by this neural network.

Keywords: Neurocomputing, Neural Network, Competition, Sequences, Time Series, Recurrence, Feedback

Citation: Ulbricht C.: Handling Sequences with a Competitive Recurrent Network, Extended version of a paper in Proc. Intl. Joint Conference on Neural Networks, IJCNN-92, Baltimore, USA.


OFAI-TR-92-09 ( 77kB g-zipped PostScript file)

Argument Structure and Case Assignment in German

Wolfgang Heinz, Johannes Matiasek

Case is a means of linking items in utterances. Its realization varies both within and between languages. Within a single language its realization may vary according to syntactic environment. Across languages different means (morphological, positional, lexical) are used to express case. In GB-Theory this has led to a distinction between structural and inherent argument positions and to differentiating between syntactic case and its (in German morphological) realization. Not only the realization of case but also the availability of argument positions may depend on the syntactic construction (consider argument reduction phenomena such as the passive) and on the morphological form of the heads (e.g. finite verb forms vs. participles). Argument structure and case assignment are thus topics which are closely related to each other. We investigate argument structure at different levels (syntactic, semantic, lexical) and show how the principles of case assignment can be stated in terms of the interaction between the representation of argument structure at these levels. The analysis is stated using the HPSG formalism and accounts for a broad range of phenomena (such as passivization, auxiliary selection, argument reduction, absolutive constructions). The framework also extends to different realizations of Case by morphological, positional and lexical means and thus is not confined to use in a grammar of German.

Citation: Heinz W., Matiasek J.: Argument Structure and Case Assignment in German, in: German in Head-Driven Phrase Structure Grammar (J. Nerbonne, C. Pollard, K. Netter, Eds.), CSLI-LN, Univ. of Chicago Press, 1994.


OFAI-TR-92-08 ( 50kB g-zipped PostScript file)

Comparison in NLIs --- Habitability and Database Reality

Wolfgang Heinz, Johannes Matiasek, Harald Trost, Ernst Buchberger

This paper describes the treatment of comparison and measures in Datenbank-DIALOG, a German language interface to relational databases. Besides giving a short overview of the system architecture, the paper shows how design strategies support the development of a habitable system, taking as examples comparison and measures, both of which are important for many application domains of NLIs and non-trivial from a linguistic point of view. In contrast to some earlier work, we pay attention not only to the purely linguistic part but equally to the mapping to the underlying database model. This way we arrive at a balanced treatment of syntax as well as semantics and pragmatics.

Citation: Heinz W., Matiasek J., Trost H., Buchberger E.: Comparison in NLIs --- Habitability and Database Reality, Proc. 10th European Conf. on Artificial Intelligence, ECAI-92, pp. 548-552, Vienna, Austria, John Wiley and Sons, 1992.


OFAI-TR-92-07 ( 33kB g-zipped PostScript file)

Structure Sharing Unification of Disjunctive Feature Descriptions

Johannes Matiasek

A method is presented which allows for unification of disjunctive feature descriptions with a minimal amount of copying. This is accomplished by using a lazy incremental copy technique in combination with a representation of feature descriptions that allows for distributed disjunctions. The use of context descriptions keeps disjunctions as local as possible and prevents independent alternatives from interacting unnecessarily, thus helping to avoid redundant copying. In that way structure sharing is possible between different feature descriptions as well as between disjuncts. Furthermore, the unification algorithm need not consider nondisjunctive parts of a feature description twice when dealing with alternatives, as is the case, e.g., in implementations employing backtracking. This allows for an efficient implementation of feature based systems.

Citation: Matiasek J.: Structure Sharing Unification of Disjunctive Feature Descriptions, Presented at the ECAI-92 workshop Coping with Linguistic Ambiguity in Typed Feature Formalisms, a revised and extended version appeared in Trost H. (ed.), Feature Formalisms and Linguistic Ambiguity, Ellis Horwood, Chichester, UK, 1993.


OFAI-TR-92-06 ( 412kB g-zipped PostScript file)

Learning with a Qualitative Domain Theory by Means of Plausible Explanations

Gerhard Widmer

This chapter describes an approach to learning on the basis of a qualitative domain theory. The theory consists of a mixture of strict rules and general dependency statements. The domain theory supports plausible explanations of training instances. These explanations are used to create initial concepts via a kind of `plausible EBG', and also to guide subsequent empirical generalization of learned concepts. The method has been implemented in a system that learns to solve complex problems in the domain of tonal music. This chapter presents the application domain, describes the learning method (with special emphasis on the plausible inference strategies used), presents empirical results, and shows how this approach naturally leads to a framework for multistrategy learning.

Citation: Widmer G.: Learning with a Qualitative Domain Theory by Means of Plausible Explanations, In Machine Learning: A Multistrategy Approach, Vol. IV, (R.S. Michalski, G. Tecuci, Eds.), San Mateo, CA, Morgan Kaufmann, 1994.


OFAI-TR-92-05 ( 45kB g-zipped PostScript file)

Learning Flexible Concepts from Streams of Examples: FLORA2

Gerhard Widmer, Miroslav Kubat

FLORA2 is a program for supervised learning of concepts that are subject to concept drift. The learning process is incremental in that the examples are processed one by one. A special feature of our program consists in keeping in memory a subset of examples --- a window. In time, new examples are being added to the window while other ones are considered outdated and are forgotten. In order to track the concept drift, the system keeps in memory not only valid descriptions of the concepts as they are derived from the objects currently present in the window, but also `candidate descriptions' that may turn into valid descriptions in the future.

Keywords: Empirical Learning, Incremental Learning, Concept Drift

Citation: Widmer G., Kubat M.: Learning Flexible Concepts from Streams of Examples: FLORA2, Proc. 10th European Conf. on Artificial Intelligence, ECAI-92, pp. 463-467, Vienna, Austria, John Wiley and Sons, 1992.


OFAI-TR-