Function LangTokeniser_EN->text_to_ngrams

Definitions

sources/lang_tokeniser_EN.php

  • Convert text into ngrams (words or word sequences) to be indexed.
  • Visibility: public
  • Is abstract?: No
  • Is static?: No
  • Is final?: No
  • Return: array

Parameters

Name                          Type      Default             Set  Range  Description
$text                         string    required parameter  N/A  N/A    Text to be indexed
$max_ngram_size               integer   1                   N/A  N/A    The maximum number of singular ngrams (typically words) to sequence together into one ngram
$total_singular_ngram_tokens  ?integer  Null                N/A  N/A    Receives the count of singular ngrams (typically words); null: do not count

Return

  • A list of ngrams, each accompanied by a boolean indicating whether it is a boolean ngram
  • Type: array
  • Set: N/A
  • Range: N/A
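
The following is a minimal, self-contained sketch of how the parameters above could interact. It is not the LangTokeniser_EN implementation. The function name sketch_text_to_ngrams, the tokenisation rule (lowercase, split on non-alphanumeric characters), and the by-reference handling of $total_singular_ngram_tokens are all assumptions made for illustration. The sketch returns plain strings and omits the boolean flag, because this page does not define what makes an ngram a "boolean ngram".

```php
<?php

/**
 * Hypothetical stand-in for illustration only; not the real method.
 */
function sketch_text_to_ngrams(string $text, int $max_ngram_size = 1, ?int &$total_singular_ngram_tokens = null): array
{
    // Assumed tokenisation: lowercase, then split on anything that is not a-z or 0-9.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $num_words = count($words);

    // Assumed convention: only write the count when the caller passed a non-null variable.
    if ($total_singular_ngram_tokens !== null) {
        $total_singular_ngram_tokens = $num_words;
    }

    // From each start position, emit every sequence of 1..$max_ngram_size words.
    $ngrams = [];
    for ($start = 0; $start < $num_words; $start++) {
        for ($size = 1; $size <= $max_ngram_size && $start + $size <= $num_words; $size++) {
            $ngrams[] = implode(' ', array_slice($words, $start, $size));
        }
    }

    return $ngrams;
}

$total = 0; // non-null, so the sketch writes the word count into it
$ngrams = sketch_text_to_ngrams('The quick brown fox', 2, $total);
// $ngrams: the, the quick, quick, quick brown, brown, brown fox, fox
// $total: 4
```

With $max_ngram_size left at its default of 1, the sketch returns single words only, which matches the "typically words" wording in the parameter description.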