Two papers presented at Nodalida 2011

Two papers that I co-wrote were presented at the 18th Nordic Conference of Computational Linguistics (Nodalida 2011), held in Riga, Latvia, May 11–13, 2011. The conference started out focusing on NLP for the Nordic languages but now attracts participants from a score of countries. This year, 130 people attended, mostly from Europe.

The first paper, titled Something Old, Something New – Applying a Pre-trained Parsing Model to Clinical Swedish, was co-written with Aron Henriksson and Sumithra Velupillai.

Abstract: “Information access from clinical text is a research area which has gained a large amount of interest in recent years. Automatic syntactic analysis for the creation of deeper language models is potentially very useful for such methods. However, syntactic parsers that are tailored to accommodate for the distinctive properties of clinical language are rare and costly to build. We present an initial study on the applicability of an existing parser, pre-trained on general Swedish, to clinical text in Swedish. We manually evaluate twelve documents and obtain a 92.4% part-of-speech tagging accuracy and a 76.6% labeled attachment score for the syntactic dependency parsing.”
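For readers unfamiliar with the metric: the labeled attachment score is the share of tokens for which the parser gets both the syntactic head and the dependency label right. Here is a minimal sketch in Python. The function name, the toy sentence and the label names are made up for illustration, and this is not the evaluation code used in the paper.

```python
def labeled_attachment_score(gold, predicted):
    """Fraction of tokens whose (head, label) pair matches the gold standard.

    Both arguments are lists of (head_index, dependency_label) tuples,
    one tuple per token, in the same token order.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty annotation")
    # A token counts as correct only if head AND label both match.
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


# Toy four-token sentence (hypothetical data): the last token gets the
# right label but is attached to the wrong head, so it counts as an error.
gold = [(2, "SS"), (0, "ROOT"), (2, "OO"), (3, "ET")]
predicted = [(2, "SS"), (0, "ROOT"), (2, "OO"), (2, "ET")]

las = labeled_attachment_score(gold, predicted)
print(las)
```

Note that a token with the correct label but the wrong head still counts as an error, which is one reason the labeled attachment score is usually lower than the tagging accuracy.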

The second paper, titled The Impact of Part-of-Speech Filtering on Generation of a Swedish-Japanese Dictionary Using English as Pivot Language, was co-written with Ingemar Hjälmstad and Maria Skeppstedt. The paper is an adaptation of Ingemar’s Master’s thesis of the same title. Good work!

Abstract: “A common problem when combining two bilingual dictionaries to make a third, using one common language as a pivot language, is the emergence of false translations due to lexical ambiguity between words in the languages involved. This paper examines if the translation accuracy improves when using part-of-speech filtering of translation candidates. To examine this, two different Japanese-Swedish lexicons were created, one with part-of-speech filtering, one without. The results show 33% less translation candidates and a higher quality lexicon when using part-of-speech filtering. It also resulted in a free lexicon of Swedish translations to 40 716 Japanese entries with a 90% precision, and the conclusion that part-of-speech filtering is an easy way of improving the translation quality in this context.”
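To give a feel for the idea, here is a small sketch in Python of combining two dictionaries through an English pivot, with and without part-of-speech filtering. The data layout, the function name `build_lexicon` and the single example entry are my own assumptions for illustration; the lexicons and the matching procedure in the paper are considerably more involved.

```python
from collections import defaultdict

# Toy dictionaries (hypothetical entries).
# Japanese-English: (Japanese word, part of speech, English translation)
ja_en = [("熊", "noun", "bear")]
# English-Swedish: (English word, part of speech, Swedish translation)
en_sv = [("bear", "noun", "björn"), ("bear", "verb", "bära")]


def build_lexicon(ja_en, en_sv, pos_filter):
    """Join the two dictionaries on the English pivot word.

    With pos_filter=True, a Swedish candidate is kept only if its part of
    speech matches that of the Japanese entry.
    """
    by_english = defaultdict(list)
    for english, pos, swedish in en_sv:
        by_english[english].append((pos, swedish))

    lexicon = defaultdict(set)
    for japanese, ja_pos, english in ja_en:
        for en_pos, swedish in by_english.get(english, []):
            if pos_filter and ja_pos != en_pos:
                continue  # drop candidates whose part of speech differs
            lexicon[japanese].add(swedish)
    return dict(lexicon)


unfiltered = build_lexicon(ja_en, en_sv, pos_filter=False)
filtered = build_lexicon(ja_en, en_sv, pos_filter=True)
print(unfiltered)
print(filtered)
```

In this toy case the ambiguous pivot word "bear" yields both "björn" (the animal) and "bära" (to carry) without filtering, while the filtered lexicon keeps only the noun "björn".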