Tehničko veleučilište u Zagrebu · Zagreb

Text Summarization of XML documents in Croatian

original scientific paper

Category: conference contribution (in proceedings)
Type: original scientific paper
Year: 2008
Parent publication: Modern topics of computer science : proceedings of the 2nd WSEAS International Conference on Computer Engineering and Applications (CEA'08), Electrical and computer engineering series
Pages: 143-148
ISSN: 1790-5117
Status: published

Abstract

The paper describes automatic summarization of XML documents in the Croatian language. The goal of the summarizer is to generate extracts with a high percentage of extract-worthiness and a high similarity to the author's abstract. Our research shows that the extracts generated by our algorithm are well formed, but also that the algorithm is very domain dependent. The research led us to conclude that we should develop an implementation of Porter's stemming algorithm in order to improve text summarization for the Croatian language, which is currently at an early stage of development.

Keywords

automatic summarization; XML documents; Croatian language; Perl