

Glossary

Abstract

  • a summary in which at least some of the material is not present in the input (Mani 2001, p. 77)

Automatic summarization

  • a reductive transformation of source text to summary text through content reduction by selection and/or generalisation on what is important in the source (Sparck Jones, 1999)

Compression rate

  • a measure of the length of a summary, defined as summary length divided by source length. The lengths of the source and summary can be measured in characters, words, sentences or paragraphs. A compression rate of 1% is considered high (the summary is far shorter than the source), whereas a compression rate of 99% is considered low.
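
The definition above can be illustrated with a minimal sketch. The function name `compression_rate` and its `unit` parameter are assumptions made for this example, and splitting on whitespace is only a rough way of counting words.

```python
def compression_rate(source: str, summary: str, unit: str = "words") -> float:
    """Return summary length divided by source length.

    `unit` selects how length is measured; only "characters" and "words"
    are handled in this sketch (sentences or paragraphs would need a
    segmenter).
    """
    if unit == "characters":
        source_len, summary_len = len(source), len(summary)
    elif unit == "words":
        # Whitespace tokenisation is a simplifying assumption.
        source_len, summary_len = len(source.split()), len(summary.split())
    else:
        raise ValueError(f"unsupported unit: {unit}")
    if source_len == 0:
        raise ValueError("source must not be empty")
    return summary_len / source_len


source = " ".join(["word"] * 200)   # a 200-word source
summary = " ".join(["word"] * 20)   # a 20-word summary
rate = compression_rate(source, summary)
print(f"{rate:.0%}")
```

Under the convention in the definition, a value close to 0 corresponds to a high compression rate and a value close to 1 to a low one.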

Cross-lingual summarisation

  • processing several languages, with summary in different language from input (Mani 2001, p. 22)

Document Understanding Conferences

  • series of evaluation conferences which took place between 2001 and 2007 and focused on evaluation of single and multi-document summarisation methods

Find out more: http://www-nlpir.nist.gov/projects/duc/


The Edmundsonian paradigm

  • a general framework proposed by Edmundson (1969) for extractive summarisation. The method combines several shallow indicators to determine the importance of a sentence. In the original paper Edmundson used cue words, title words, key words and sentence location. The approach is still widely used, often in combination with more sophisticated methods. More details can be found in (Mani 2001, pp. 47–53).
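
As a rough illustration of how the four indicators can be combined, the sketch below scores each sentence with a weighted sum of simple counts. This is not Edmundson's original implementation: the cue-word list, the equal weights, the location rule (first or last sentence) and all function names are assumptions chosen for the example.

```python
import re

# Hypothetical cue words and weights, chosen only for illustration.
CUE_WORDS = {"significant", "conclusion", "results"}
WEIGHTS = {"cue": 1.0, "title": 1.0, "key": 1.0, "location": 1.0}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_sentences(title, sentences, key_words):
    """Score each sentence as a weighted sum of four shallow indicators."""
    title_words = set(tokenize(title))
    last = len(sentences) - 1
    scores = []
    for i, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        features = {
            "cue": sum(t in CUE_WORDS for t in tokens),
            "title": sum(t in title_words for t in tokens),
            "key": sum(t in key_words for t in tokens),
            # Assumption: only the first and last sentences get a location bonus.
            "location": 1 if i in (0, last) else 0,
        }
        scores.append(sum(WEIGHTS[name] * value for name, value in features.items()))
    return scores


def extract(title, sentences, key_words, n=1):
    """Return the n best-scoring sentences in their original order."""
    scores = score_sentences(title, sentences, key_words)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]


sentences = [
    "Summarisation methods reduce text.",
    "The weather was nice.",
    "Lunch was served.",
]
print(extract("Summarisation methods", sentences, key_words={"text"}))
```

In practice the weights would be tuned on a corpus of documents with reference extracts rather than set by hand.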

Monolingual summarisation

Multilingual summarisation

  • processing several languages, with summary in same language as input (Mani 2001, p. 23)

ROUGE (Recall-Oriented Understudy for Gisting Evaluation)

  • an evaluation method that automatically determines the quality of a summary by comparing it to other (ideal) summaries produced by humans. It relies on counting overlapping units between the automatic summary and the ideal summaries: n-grams (ROUGE-N), longest common subsequences (ROUGE-L, and in weighted form ROUGE-W), and skip-bigrams, optionally together with unigrams (ROUGE-S and ROUGE-SU). (Lin and Hovy, 2003; Lin, 2004)
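
To make the overlap counting concrete, here is a simplified sketch of the recall variants of ROUGE-N and ROUGE-L against a single reference summary. It is not the official ROUGE toolkit: whitespace tokenisation, lowercasing, the absence of stemming and the function names are all assumptions of this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Share of reference n-grams that also occur in the candidate."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match at the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return lcs_length(cand, ref) / len(ref) if ref else 0.0


candidate = "the cat was found under the bed"
reference = "the cat was under the bed"
print(rouge_n_recall(candidate, reference, n=1))
print(rouge_n_recall(candidate, reference, n=2))
print(rouge_l_recall(candidate, reference))
```

With several reference summaries, the scores would additionally be aggregated across references, which this sketch leaves out.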

Find out more: http://berouge.com/default.aspx


Snippet

  • a short extract from a document which is not necessarily a complete sentence or paragraph and can contain gaps. This type of summary is usually found in the output of Web search engines, typically with the query words highlighted.

Summary

  • an abbreviated, accurate representation of the content of a document preferably prepared by its author(s) for publication with it. Such abstracts are also useful in access publications and machine-readable databases (American National Standards Institute Inc., 1979)
  • a summary is a text produced from one or more texts, that contains a significant portion of the information in the original text(s), and is not longer than half of the original text(s) (Hovy, 2003)
  • a concise representation of a document’s content to enable the reader to determine its relevance to a specific information (Johnson, 1995)


This work is licenced under a Creative Commons Licence.