Atire dictionary

Description

atire_dictionary lists every term in the index along with its document frequency, its collection frequency, and the number of bytes taken to store that term on disc. Optionally, the postings list of each term is also displayed. The output format is:

<word> <document_frequency> <collection_frequency> <bytes taken this word>

Terms starting with a ~ are reserved for internal use and are not index terms.
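
The output format above can be read back by a small script. The following is a minimal Python sketch, assuming the four fields are separated by whitespace, that the three numeric fields are plain integers, and that the tool was run without -p (so no postings follow the counts). The names DictionaryEntry and parse_dictionary_line are hypothetical helpers, not part of ATIRE, and the sample line in the usage note is made up for illustration.

```python
from typing import NamedTuple, Optional


class DictionaryEntry(NamedTuple):
    """One parsed line of atire_dictionary output (hypothetical helper type)."""
    term: str
    document_frequency: int
    collection_frequency: int
    size_in_bytes: int


def parse_dictionary_line(line: str) -> Optional[DictionaryEntry]:
    """Parse '<word> <df> <cf> <bytes>'; return None for anything else.

    Assumption: fields are whitespace-separated and the last three are integers.
    Terms starting with '~' are skipped because they are reserved for internal use.
    """
    # Split from the right so the three numeric fields are always the last three.
    parts = line.strip().rsplit(None, 3)
    if len(parts) != 4:
        return None
    term, df, cf, size = parts
    if term.startswith("~"):
        return None
    try:
        return DictionaryEntry(term, int(df), int(cf), int(size))
    except ValueError:
        # A non-numeric field means the line is not in the expected format.
        return None
```

For example, parse_dictionary_line("apple 12 40 96") yields an entry whose term is "apple" and whose document_frequency is 12, while any line whose term begins with ~ yields None.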

Usage

atire_dictionary [-s <start word> [-e <end word>]] [-d<oubleMetaphone>] [-x<soundex>] [-u<nicodeWideChars>] [-p<rintPostings>] [-l]

Parameters

  • -d
    • Display the double metaphone encoding of the term.
  • -e <end_word>
    • Stop listing after <end_word>.
  • -l
    • When used in conjunction with -p, this option causes atire_dictionary to print each posting pair {docid, term_frequency} on a separate line.
  • -p
    • Print the postings as a set of ordered pairs {docid, term_frequency}.
  • -s <start_word>
    • Start listing from <start_word>.
  • -u
    • Convert from UTF-8 to wide characters before printing, for operating systems whose printf does not support UTF-8 characters.
  • -x
    • Display the soundex encoding of the term.
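
The way these options combine can be sketched as a small Python helper that assembles an argument list suitable for subprocess.run. This is only an illustration: build_dictionary_command is a hypothetical name, the executable is assumed to be reachable as atire_dictionary, and the two checks reflect a cautious reading of the usage line (where -e appears nested inside -s) and of the -l description (which mentions only its use with -p), not documented error behaviour of the tool.

```python
from typing import List, Optional


def build_dictionary_command(
    start: Optional[str] = None,
    end: Optional[str] = None,
    print_postings: bool = False,
    one_posting_per_line: bool = False,
    executable: str = "atire_dictionary",  # assumption: the binary is on PATH
) -> List[str]:
    """Assemble an argument list using only the -s, -e, -p and -l options."""
    # Assumption: the usage line nests -e inside -s, so require -s with -e.
    if end is not None and start is None:
        raise ValueError("-e <end word> is only shown together with -s <start word>")
    # Assumption: -l is described only in conjunction with -p.
    if one_posting_per_line and not print_postings:
        raise ValueError("-l is described only in conjunction with -p")

    args = [executable]
    if start is not None:
        args += ["-s", start]
        if end is not None:
            args += ["-e", end]
    if print_postings:
        args.append("-p")
    if one_posting_per_line:
        args.append("-l")
    return args
```

Calling build_dictionary_command(start="apple", end="banana", print_postings=True) returns a list that can be passed directly to subprocess.run to list the terms from "apple" to "banana" with their postings.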