Tehničko veleučilište u Zagrebu · Zagreb

CROXMLSUM – the System for XML Document Summarization in Croatian

original scientific paper

Category journal article
Type original scientific paper
Year 2007
Journal International Journal of Mathematics and Computers in Simulation
Parent publication International Journal of Mathematics and Computers in Simulation
Volume 1
Issue 1
Pages pp. 81-89
EISSN 1998-0159
Status published

Abstract

The paper describes automatic summarization of XML documents in the Croatian language. The goal of the summarizer is to generate extracts with a high percentage of extract-worthiness and high similarity to the author's abstract. Our research shows that extracts generated using our algorithm are well formed, but it also shows that the algorithm is very domain dependent. The research led us to the conclusion that we should develop an implementation of Porter's stemming algorithm in order to improve text summarization for the Croatian language, which is currently at an early stage of development.

Keywords

automatic summarization; XML documents; Croatian language; Perl