FACILE: Classifying Texts Integrating Pattern Matching and Information Retrieval

Fabio Ciravegna
Alberto Lavelli
Nadia Mana
Johannes Matiasek
Luca Gilardoni
Silvia Mazza
Massimo Ferraro
William J. Black
Fabio Rinaldi
David Mowatt
ITC-irst
Loc. Pantè di Povo
38050 Trento Italy
ÖFAI
Schottengasse 3
1010 Vienna Austria
Quinary SpA
Via Fara 35
20124 Milan Italy
UMIST - PO Box 88
Manchester M60 1QD
United Kingdom

Abstract

Successfully managing information means being able to find relevant new information and to correctly integrate it with pre-existing knowledge. Much information is nowadays stored as multi-lingual textual data; therefore advanced classification systems are currently considered as strategic components for effective knowledge management. We describe an experience integrating different innovative AI technologies such as hierarchical pattern matching and information extraction to provide flexible multilingual classification adaptable to user needs. Pattern matching produces fairly accurate and fast categorisation over a large number of classes, while information extraction provides fine-grained classification for a reduced number of classes. The resulting system was adopted by the main Italian financial news agency providing a pay-to-view service.

Citation:

Fabio Ciravegna, Alberto Lavelli, Nadia Mana, Johannes Matiasek, Luca Gilardoni, Silvia Mazza, Massimo Ferraro, William J. Black, Fabio Rinaldi and David Mowatt: FACILE: Classifying Texts Integrating Pattern Matching and Information Retrieval. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 31 July - 6 August 1999. To appear.
Copyright of the paper has been transferred to IJCAI, thus IJCAI is the sole owner of the paper copyright. However, the authors have royalty-free permissions to:
  1. retain all proprietary rights (such as patent rights) other than copyright and the publication right transferred to IJCAI
  2. personally reuse all or portions of the paper in other works of their own authorship
  3. make oral presentation of the material in any forum
  4. reproduce - or have reproduced - the paper for the author's personal use, or for company use provided that IJCAI copyright and the source are indicated, and that the copies are not used in a way that implies IJCAI endorsement of a product or a service of an employer, and that the copies per se are not offered for sale. The foregoing right shall not permit the posting of the paper in electronic or digital form on any computer network, except by the author or the author's employer, and then only on the author's or the employer's own World Wide Web page or ftp site. Such Web page or ftp site, in addition to the aforementioned requirements of this Paragraph, must provide an electronic reference or link back to the IJCAI electronic server (http://www.ijcai.org), and shall not post other IJCAI copyrighted materials not of the author's or the employer's creation (including tables of contents with links to other papers) without IJCAI's written permission
  5. make limited distribution of all or portions of the above paper prior to publication

Johannes Matiasek
Last modified: Thu Apr 1 18:52:07 MET DST 1999