Function Fast_custom_index->tokenise_text

Definitions

sources/database_search.php

  • Tokenise some text, so it can be indexed by token.
  • Visibility: protected
  • Is abstract?: No
  • Is static?: No
  • Is final?: No
  • Returns: array

Parameters

Name | Type | Passed by reference? | Variadic? | Default | Set | Range | Description
$text | string | No | No | required parameter | N/A | N/A | The text
$lang | LANGUAGE_NAME | No | No | required parameter | N/A | N/A | Language codename
$ngrams_exclude | ?array | No | No | Null | N/A | N/A | A list of ngrams to explicitly exclude (used internally to stop repetitions across multiple APPEARANCE_CONTEXTs, ultimately required to stop row repetition in output) (null: none)
&$total_singular_ngram_tokens | ?integer | Yes | No | Null | N/A | N/A | Maintain a count of singular ngrams (typically words) in here (null: do not maintain)
&$statistics_map | ?array | Yes | No | Null | N/A | N/A | Write into this map of singular ngram (typically, words) to number of occurrences (null: do not maintain a map)

Returns

  • Map between ngrams and number of occurrences
  • Type: array
  • Set: N/A
  • Range: N/A
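
Illustration

The following is a minimal sketch of how the parameters and return value described above could fit together. It is not the actual Fast_custom_index implementation. The function name sketch_tokenise_text is hypothetical, $lang is omitted, and the tokenisation rules are assumptions made for illustration only: split on anything that is not a letter or digit, lowercase ASCII only, emit one-word and two-word ngrams, and count excluded words in the statistics anyway.

```php
<?php
// Hypothetical stand-in for illustration only; NOT the Composr implementation.
function sketch_tokenise_text(string $text, ?array $ngrams_exclude = null, ?int &$total_singular_ngram_tokens = null, ?array &$statistics_map = null): array
{
    // Assumed rule: words are runs of letters/digits; strtolower() is ASCII-only.
    $words = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $exclude = ($ngrams_exclude === null) ? [] : array_flip($ngrams_exclude);

    $ngrams = [];
    foreach ($words as $i => $word) {
        // Singular ngram (one word): update the optional by-reference outputs.
        if ($total_singular_ngram_tokens !== null) {
            $total_singular_ngram_tokens++;
        }
        if ($statistics_map !== null) {
            $statistics_map[$word] = ($statistics_map[$word] ?? 0) + 1;
        }
        if (!isset($exclude[$word])) {
            $ngrams[$word] = ($ngrams[$word] ?? 0) + 1;
        }

        // Two-word ngram, formed with the following word if there is one.
        if (isset($words[$i + 1])) {
            $pair = $word . ' ' . $words[$i + 1];
            if (!isset($exclude[$pair])) {
                $ngrams[$pair] = ($ngrams[$pair] ?? 0) + 1;
            }
        }
    }

    return $ngrams; // Map between ngrams and number of occurrences
}

// Usage: pass initialised variables to have the count and map maintained.
$total = 0;
$stats = [];
$map = sketch_tokenise_text('The cat saw the cat', ['saw'], $total, $stats);
// $total is 5; $map['the cat'] is 2; 'saw' is absent from $map
```

Passing null (or omitting) the last two arguments skips the bookkeeping entirely, which mirrors the "(null: do not maintain)" convention in the parameter descriptions.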

Preview

Code (PHP)

/**
 * Tokenise some text, so it can be indexed by token.
 *
 * @param  string $text The text
 * @param  LANGUAGE_NAME $lang Language codename
 * @param  ?array $ngrams_exclude A list of ngrams to explicitly exclude (used internally to stop repetitions across multiple APPEARANCE_CONTEXTs, ultimately required to stop row repetition in output) (null: none)
 * @param  ?integer $total_singular_ngram_tokens Maintain a count of singular ngrams (typically words) in here (null: do not maintain)
 * @param  ?array $statistics_map Write into this map of singular ngram (typically, words) to number of occurrences (null: do not maintain a map)
 * @return array Map between ngrams and number of occurrences
 */
protected function tokenise_text(string $text, string $lang, ?array $ngrams_exclude = null, ?int &$total_singular_ngram_tokens = null, ?array &$statistics_map = null) : array