A. Albright, M.A., UCLA Linguistics Dept., Los Angeles

18 April 2002 Lecture

                                LECTURE
                                *******

Oesterreichisches Forschungsinstitut fuer Artificial Intelligence (OeFAI)
                      Schottengasse 3, A-1010 Wien
 Tel.: +43-1-53361120,  Fax: +43-1-5336112-77,  Email: sec@oefai.at
-------------------------------------------------------------------------

Adam Albright, M.A.
UCLA Linguistics Department, Los Angeles

     A MINIMAL GENERALIZATION APPROACH TO RULE INDUCTION                          

Numerous computational models of morphology have taken on the task 
of identifying morphemes and decomposing complex words into their 
constituent parts.  Relatively few models have taken on the reverse 
task of learning rules to compose novel complex forms.  Before a 
model can combine morphemes to create new forms, it must learn two 
things: (1) the contexts in which the morphemes occur (their 
distribution), and (2) the rates at which they occur (their 
productivity).  I present an inductive approach to learning the 
distribution and productivity of rules. The model starts by 
considering pairs of morphologically related forms (e.g., (present,past)), and 
comparing them to discover the rules that are needed to derive one 
form from the other.  Comparing (jump,jump[t]) and (sip,sip[t]), the 
model posits a rule suffixing [t] after stems ending in [p]; 
comparing further with (kick,kick[t]), it posits a rule suffixing [t] 
after stems ending in non-coronal voiceless stops, and so on.  This 
conservative strategy, which we refer to as "minimal 
generalization", can accurately discover the distribution of 
morphemes, because it never generalizes a process beyond the contexts 
in which it has been observed. In order to discover the productivity 
of rules, the model collects simple statistics about the reliability 
of rules in different environments.
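
As a rough illustration of the comparison step and the reliability 
statistic described above, here is a minimal Python sketch.  It is not 
the model presented in the talk: the feature table, the restriction to 
the stem-final segment, the toy spellings with the suffix written as a 
plain letter, and the names FEATURES, Rule, rule_from_pair, 
minimal_generalization and reliability are all assumptions made for 
this example.

```python
from dataclasses import dataclass

# Toy binary feature table: an assumption for illustration only.
FEATURE_NAMES = ("voiced", "coronal", "labial", "dorsal", "stop")


def seg(*values):
    """Build a feature dict from values given in FEATURE_NAMES order."""
    return dict(zip(FEATURE_NAMES, values))


FEATURES = {
    "p": seg(False, False, True, False, True),
    "t": seg(False, True, False, False, True),
    "k": seg(False, False, False, True, True),
    "b": seg(True, False, True, False, True),
    "d": seg(True, True, False, False, True),
    "g": seg(True, False, False, True, True),
    "n": seg(True, True, False, False, False),
}


@dataclass(frozen=True)
class Rule:
    suffix: str          # material added to the stem
    context: frozenset   # (feature, value) pairs required of the stem-final segment

    def applies_to(self, stem):
        final = FEATURES.get(stem[-1], {})
        return all(final.get(name) == value for name, value in self.context)


def rule_from_pair(stem, derived):
    """Most specific rule for one pair: the suffix plus the full final segment."""
    if not derived.startswith(stem):
        raise ValueError("this sketch only handles suffixation")
    return Rule(derived[len(stem):], frozenset(FEATURES[stem[-1]].items()))


def minimal_generalization(a, b):
    """Keep only the feature values that both contexts share."""
    if a.suffix != b.suffix:
        return None  # different changes are not collapsed
    return Rule(a.suffix, a.context & b.context)


def reliability(rule, pairs):
    """Share of the forms in the rule's context that the rule derives correctly."""
    scope = [(s, d) for s, d in pairs if rule.applies_to(s)]
    if not scope:
        return 0.0
    hits = sum(1 for s, d in scope if s + rule.suffix == d)
    return hits / len(scope)


pairs = [("jump", "jumpt"), ("sip", "sipt"), ("kick", "kickt")]
rule = rule_from_pair(*pairs[0])
for stem, past in pairs[1:]:
    rule = minimal_generalization(rule, rule_from_pair(stem, past))
print(sorted(rule.context))
# [('coronal', False), ('stop', True), ('voiced', False)]
```

With this toy feature set, the surviving context is "voiceless, 
non-coronal stop", which matches the generalization reached after 
(kick,kick[t]) in the text.  The same reliability function can be 
called on any rule and any list of pairs to compare environments.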

After describing the basic model, I discuss a common but neglected 
pattern of linguistic exceptions, in which a few exceptional forms 
take the "wrong" allomorph.  For example, the English pair 
(burn,burnt) takes the [t] suffix even though its stem ends in the 
voiced sound [n], a context that ordinarily takes the [d] allomorph.  I present 
an algorithm for identifying this type of exception and learning the 
"true" distribution of allomorphs, even in the presence of such 
exceptions.
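
The abstract does not spell out the algorithm, so the following Python 
sketch only illustrates what such an exception looks like in data.  It 
uses a simple majority count per voicing context, which is an assumed 
stand-in and not the method presented in the talk; VOICED_FINALS, 
flag_exceptions and the toy spellings are hypothetical.

```python
from collections import Counter, defaultdict

# Toy assumption: stem-final letters treated as voiced sounds.
VOICED_FINALS = set("bdgvzmnlr")


def flag_exceptions(pairs):
    """Return the majority suffix per voicing context and the pairs that deviate."""
    counts = defaultdict(Counter)
    for stem, past in pairs:
        counts[stem[-1] in VOICED_FINALS][past[len(stem):]] += 1
    # The most frequent suffix in each context is taken as the expected allomorph.
    expected = {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}
    exceptions = [
        (stem, past)
        for stem, past in pairs
        if past[len(stem):] != expected[stem[-1] in VOICED_FINALS]
    ]
    return expected, exceptions


data = [
    ("burn", "burnt"), ("lean", "leand"), ("plan", "pland"),
    ("rub", "rubd"), ("jump", "jumpt"), ("kick", "kickt"),
]
expected, exceptions = flag_exceptions(data)
print(expected[True], expected[False], exceptions)
```

On this toy data the voiced context is dominated by [d], so 
(burn,burnt) is the only pair flagged.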


Time:   Thursday, 18 April 2002, 18:30 sharp
Place:  Oesterreichisches Forschungsinstitut 
        fuer Artificial Intelligence    
        Schottengasse 3, 1010 Wien.


OESTERREICHISCHES FORSCHUNGSINSTITUT
FUER ARTIFICIAL INTELLIGENCE

o.Univ.-Prof.Ing.Dr. Robert Trappl