TreeTagger2 plugin for GATE
NOTE: This plugin has been tested under Linux only. It has not been tested under Windows, and it needs perl and other commands that might not work as required there. I have had reports that it does not work under Windows. If you want to help make it work under Windows, please contact me, send (detailed!) bug reports, or - even better - patches.
This plugin was created as part of the Aurex/W project conducted by Webintegration Gmbh and OFAI.
This is a modified version of the original TreeTagger plugin that comes with GATE. The main changes and improvements are:
- There is no need to change anything within the plugin directory in order to configure the location of the TreeTagger binary and support files. The new script included in the plugin will nearly always figure out these locations automatically, and if all else fails, they can be specified as a processing resource parameter.
- All processing resource parameters are now runtime parameters which makes it much easier to change settings without the need to re-create the processing resource.
- The plugin now also supports the use of TreeTagger as a chunker.
- The annotation type to be used can be specified.
- The plugin no longer reads back the temporary file and no longer does a line-by-line check of the original tokens against what comes back from the TreeTagger. Consistency is checked merely by comparing the number of lines returned with the number of tokens originally passed to the tagger.
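The line-count consistency check described in the last item can be sketched roughly as follows. This is an illustrative Java sketch, not the plugin's actual code: the class name LineCountCheck, the method countsMatch, and the tab-separated sample output lines are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these names do not come from the plugin source.
public class LineCountCheck {

    // Returns true when the tagger returned exactly one output line per input token.
    // The content of the lines is deliberately not compared against the tokens.
    static boolean countsMatch(List<String> tokens, List<String> taggerLines) {
        return tokens.size() == taggerLines.size();
    }

    public static void main(String[] args) {
        // Tokens as they would be passed to the tagger, one per line.
        List<String> tokens = Arrays.asList("The", "cat", "sleeps");

        // Assumed sample output: one "token<TAB>tag<TAB>lemma" line per token.
        List<String> taggerLines = Arrays.asList(
                "The\tDT\tthe",
                "cat\tNN\tcat",
                "sleeps\tVBZ\tsleep");

        if (countsMatch(tokens, taggerLines)) {
            System.out.println("Line count matches token count.");
        } else {
            System.out.println("Mismatch: " + tokens.size() + " tokens, "
                    + taggerLines.size() + " lines returned.");
        }
    }
}
```

A check of this kind is cheap, but it only detects dropped or extra lines. It would not notice if the tagger returned the right number of lines for altered tokens, which is the trade-off against the older line-by-line comparison.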
Current version: 2006-11-8
You can download the plugin as
INSTALLATION: Both the gzipped archive and the ZIP file contain a precompiled version, compiled with Sun JDK 1.5.0_06-b05 under Linux. This should work with newer Java versions, but if it does not, the package can be recompiled in the standard way with a simple "ant" command.
NOTE: This requires "perl" and "tee" and might not work under Windows.
Simply unpack the archive, then within GATE go to File->Manage Creole Plugins, press the "Add new CREOLE repository" button and select the directory you have just created.
After the plugin has been loaded this way, you should find the new processing resources "TreeTaggerPOS" and "TreeTaggerChunk" in the "New" menu for processing resources.
All parameters for this plugin are defined as runtime parameters. This means that you set them on the processing resource in the pipeline, not when you first create the processing resource in the GUI. This makes it much easier to modify the parameters of an existing processing resource (previously, the old one had to be discarded and a new one created for any parameter change).