Function LangTokeniser_EN->text_to_ngrams

Definitions

sources/lang_tokeniser_EN.php

  • Convert some text into some ngrams (basically words or word sequences) to be indexed.
  • Visibility: public
  • Is abstract?: No
  • Is static?: No
  • Is final?: No
  • Returns: array

Parameters

Name | Type | Passed by reference? | Variadic? | Default | Set | Range | Description
$text | string | No | No | required parameter | N/A | N/A | Text to be indexed
$max_ngram_size | integer | No | No | 1 | N/A | N/A | The maximum number of ngrams (typically words) to sequence together
&$total_singular_ngram_tokens | ?integer | Yes | No | Null | N/A | N/A | Variable that receives the count of singular ngrams (typically words); null: do not count

Returns

  • A list of ngrams (each with a boolean indicating whether it is a boolean ngram)
  • Type: array
  • Set: N/A
  • Range: N/A

Preview

Code (PHP)

/**
 * Convert some text into some ngrams (basically words or word sequences) to be indexed.
 *
 * @param  string $text Text to be indexed
 * @param  integer $max_ngram_size The maximum number of ngrams (typically words) to sequence together
 * @param  ?integer $total_singular_ngram_tokens Write into a count of singular ngrams (typically, words) in here (null: do not count)
 * @return array A list of ngrams (along with a boolean to indicate whether they are a boolean ngram)
 */

public function text_to_ngrams(string $text, int $max_ngram_size = 1, ?int &$total_singular_ngram_tokens = null) : array
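
Usage sketch

The signature above combines a default argument with a nullable by-reference counter, which can be easier to follow with a call in front of you. The sketch below is an illustration only: DemoTokeniser is a hypothetical stand-in, not the real LangTokeniser_EN class. Its splitting rule (lowercase, split on non-alphanumerics), its return shape (ngram => boolean map, always false here), and the choice to add to the counter rather than overwrite it are all assumptions made for the example; the real method body is not shown on this page.

```php
<?php

/**
 * Hypothetical stand-in that copies only the signature of
 * LangTokeniser_EN::text_to_ngrams. All logic below is assumed.
 */
class DemoTokeniser
{
    public function text_to_ngrams(string $text, int $max_ngram_size = 1, ?int &$total_singular_ngram_tokens = null) : array
    {
        // Assumed tokenisation: lowercase, then split on anything that is not a-z or 0-9
        $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        $num_words = count($words);

        // Only count when the caller supplied a non-null variable (null: do not count)
        if ($total_singular_ngram_tokens !== null) {
            $total_singular_ngram_tokens += $num_words; // assumption: accumulate, not overwrite
        }

        // Build every run of 1..$max_ngram_size consecutive words
        $ngrams = [];
        for ($i = 0; $i < $num_words; $i++) {
            for ($size = 1; $size <= $max_ngram_size && $i + $size <= $num_words; $size++) {
                $ngram = implode(' ', array_slice($words, $i, $size));
                $ngrams[$ngram] = false; // assumption: this sketch never produces boolean ngrams
            }
        }

        return $ngrams;
    }
}

$tokeniser = new DemoTokeniser();

// Initialise the counter to 0 (non-null) to request counting
$count = 0;
$ngrams = $tokeniser->text_to_ngrams('The quick brown fox', 2, $count);

// Omit the third argument to skip counting altogether
$single_words = $tokeniser->text_to_ngrams('The quick brown fox');
```

With a maximum size of 2, the stand-in yields both single words and adjacent word pairs as array keys, and $count holds the number of single words seen. Whether the real tokeniser normalises case, filters stop words, or handles punctuation differently cannot be told from the signature alone.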