Class Zend_Search_Lucene_Analysis_Token

Description

Located in /Zend/Search/Lucene/Analysis/Token.php (line 29)

Variable Summary
 integer $_endOffset
 integer $_positionIncrement
 integer $_startOffset
 string $_termText
 string $_type
Method Summary
 Zend_Search_Lucene_Analysis_Token __construct (string $text, integer $start, integer $end, [string $type = 'word'])
 integer getEndOffset ()
 integer getPositionIncrement ()
 integer getStartOffset ()
 string getTermText ()
 string getType ()
 void setPositionIncrement (integer $positionIncrement)
Variables
integer $_endOffset (line 50)

End offset in the source text.

  • access: private
integer $_positionIncrement (line 81)

The position of this token relative to the previous Token.

The default value is one.

Some common uses for this are:

Set it to zero to put multiple terms in the same position. This is useful if, e.g., a word has multiple stems: searches for phrases including either stem will then match. In this case, the increment of the first stem should be one, and the increment of every other stem should be zero. Repeating a token with an increment of zero can also be used to boost the scores of matches on that token.

Set it to a value greater than one to inhibit exact phrase matches. If, for example, one does not want phrases to match across removed stop words, one could build a stop word filter that removes stop words and increases the increment of each following non-stop word by the number of stop words removed before it. Exact phrase queries will then match only when the terms occur with no intervening stop words.

  • access: private
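
The effect of the increment can be sketched by accumulating increments into absolute positions. The helper below is hypothetical (sketchAssignPositions() is not part of Zend_Search_Lucene), it works on plain arrays instead of Token objects, and it counts positions from 1 purely for illustration; the indexer's real bookkeeping may differ.

```php
<?php
/**
 * Hypothetical helper, not part of Zend_Search_Lucene: accumulate
 * per-token position increments into absolute positions.
 *
 * @param  array $tokens list of array(termText, positionIncrement) pairs
 * @return array         list of array(termText, position) pairs
 */
function sketchAssignPositions(array $tokens)
{
    $position = 0;
    $result   = array();
    foreach ($tokens as $token) {
        list($text, $increment) = $token;
        $position += $increment;           // 0 keeps the previous position
        $result[]  = array($text, $position);
    }
    return $result;
}

// Token stream for the source text "fox saw the hound" (assumed analysis):
$tokens = array(
    array('fox',   1),  // default increment
    array('saw',   1),  // first stem keeps an increment of one
    array('see',   0),  // second stem shares the position of "saw"
    array('hound', 2),  // one stop word ("the") was removed before it
);

$positions = sketchAssignPositions($tokens);
foreach ($positions as $pair) {
    echo $pair[0], ' @ ', $pair[1], "\n";
}
// Prints:
// fox @ 1
// saw @ 2
// see @ 2
// hound @ 4
```

Under these assumptions "saw" and "see" occupy the same position, so a phrase containing either stem can match, while "hound" is two positions after "saw", so the exact phrase "saw hound" does not match across the removed stop word.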
integer $_startOffset (line 43)

Start offset in the source text.

  • access: private
string $_termText (line 36)

The text of the term.

  • access: private
string $_type (line 57)

Lexical type.

  • access: private
Methods
Constructor __construct (line 92)

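Object constructor. Creates a token from its term text, its start and end offsets in the source text, and an optional lexical type (defaults to 'word').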

  • access: public
Zend_Search_Lucene_Analysis_Token __construct (string $text, integer $start, integer $end, [string $type = 'word'])
  • string $text
  • integer $start
  • integer $end
  • string $type
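
A minimal sketch of how the constructor arguments relate to the source text. SketchToken is a hypothetical stand-in that mirrors only the documented constructor signature and getters, so the example runs without Zend Framework on the include path; with the library available, Zend_Search_Lucene_Analysis_Token would take its place. The 'num' type label is likewise an assumption made for illustration.

```php
<?php
/**
 * Hypothetical stand-in mirroring the documented constructor and getters
 * of Zend_Search_Lucene_Analysis_Token. Not the library implementation.
 */
class SketchToken
{
    private $_termText;
    private $_startOffset;
    private $_endOffset;
    private $_type;

    public function __construct($text, $start, $end, $type = 'word')
    {
        $this->_termText    = $text;
        $this->_startOffset = $start;
        $this->_endOffset   = $end;
        $this->_type        = $type;
    }

    public function getTermText()    { return $this->_termText; }
    public function getStartOffset() { return $this->_startOffset; }
    public function getEndOffset()   { return $this->_endOffset; }
    public function getType()        { return $this->_type; }
}

$source = 'Hello world 42';

// "world" starts at offset 6; its last character is at offset 10,
// so the end offset is 11 (one greater than the last character).
$word = new SketchToken('world', 6, 11);

// Passing the optional fourth argument overrides the default type.
$number = new SketchToken('42', 12, 14, 'num');

echo $word->getTermText(), ' [', $word->getStartOffset(), ', ',
     $word->getEndOffset(), ') ', $word->getType(), "\n";
echo $number->getTermText(), ' [', $number->getStartOffset(), ', ',
     $number->getEndOffset(), ') ', $number->getType(), "\n";
```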
getEndOffset (line 155)

Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text.

  • access: public
integer getEndOffset ()
getPositionIncrement (line 118)

Returns the position increment of this Token.

  • access: public
integer getPositionIncrement ()
getStartOffset (line 144)

Returns this Token's starting offset, the position of the first character corresponding to this token in the source text.

Note: The difference between getEndOffset() and getStartOffset() may not be equal to strlen(Zend_Search_Lucene_Analysis_Token::getTermText()), as the term text may have been altered by a stemmer or some other filter.

  • access: public
integer getStartOffset ()
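
The note above can be illustrated with plain values standing in for what getStartOffset(), getEndOffset() and getTermText() would return. The stemmed term text 'run' is an assumed result of some stemming filter, not the output of a specific Zend_Search_Lucene class.

```php
<?php
$source = 'The foxes were running';

// Assumed token for the last word after a hypothetical stemming filter:
$startOffset = 15;     // as returned by getStartOffset()
$endOffset   = 22;     // as returned by getEndOffset(), one past the last character
$termText    = 'run';  // as returned by getTermText(), altered by the stemmer

// The offsets still describe the original word in the source text,
// which is what e.g. a highlighter would need.
$original = substr($source, $startOffset, $endOffset - $startOffset);

echo $original, "\n";                    // running
echo $endOffset - $startOffset, "\n";    // 7
echo strlen($termText), "\n";            // 3
```

Because the offsets refer to the source text and not to the term text, the original word should be recovered from the source with the offsets instead of being derived from getTermText().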
getTermText (line 128)

Returns the Token's term text.

  • access: public
string getTermText ()
getType (line 165)

Returns this Token's lexical type. Defaults to 'word'.

  • access: public
string getType ()
setPositionIncrement (line 108)

Sets the position increment of this Token.

  • access: public
void setPositionIncrement (integer $positionIncrement)
  • integer $positionIncrement

Documentation generated on Tue, 18 Apr 2006 11:55:46 -0700 by phpDocumentor 1.3.0RC3