ISCA Archive Eurospeech 2003

Named entity extraction from Japanese broadcast news

Akio Kobayashi, Franz J. Och, Hermann Ney

This paper describes a method for named entity extraction from Japanese broadcast news. The proposed named entity tagger assigns an entity category to every character, so that unknown words and entities are handled correctly. The models of this character-based tagger are trained by maximum entropy modeling. We evaluate the proposed tagger by comparing it with a conventional word-based tagger. The results indicate that the relative performance of the two taggers depends on the entity category. Features derived from both character and word contexts are therefore required to achieve high named entity extraction performance.
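To make the idea of character-level tagging concrete, the following is a minimal sketch and not the authors' implementation. It assumes a B-/I-/O tagging scheme, a category name `LOCATION`, and a simple window of neighboring characters as context features; none of these details are given in the abstract, and the function names `char_tags` and `char_features` are hypothetical. In a full system, feature dictionaries of this kind would be passed to a maximum entropy classifier.

```python
from typing import Dict, List, Optional, Tuple


def char_tags(tokens: List[Tuple[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """Expand word-level entity labels into one tag per character.

    Assumption: a B-/I-/O scheme; the abstract does not state the scheme used.
    """
    tagged = []
    for word, category in tokens:
        for pos, ch in enumerate(word):
            if category is None:
                tag = "O"                 # character outside any entity
            elif pos == 0:
                tag = "B-" + category     # first character of an entity word
            else:
                tag = "I-" + category     # later character of an entity word
            tagged.append((ch, tag))
    return tagged


def char_features(chars: List[str], i: int, window: int = 1) -> Dict[str, str]:
    """Collect the characters around position i as context features.

    Assumption: a symmetric character window with boundary symbols.
    """
    feats = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if j < 0:
            value = "<s>"                 # before the start of the sentence
        elif j >= len(chars):
            value = "</s>"                # past the end of the sentence
        else:
            value = chars[j]
        feats[f"c{offset:+d}"] = value
    return feats


# Hypothetical segmented input: (word, entity category or None).
tokens = [("東京", "LOCATION"), ("で", None), ("会議", None)]
tags = char_tags(tokens)
chars = [ch for ch, _ in tags]
print(tags)
print(char_features(chars, 0))
```

Because every character receives its own tag, an entity can be labeled even when the word segmenter does not know the word that contains it, which is the motivation the abstract gives for the character-based design.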