Function LangTokeniser_EN->text_to_ngrams
Definitions
sources/lang_tokeniser_EN.php
- Converts text into ngrams (words or word sequences) to be indexed.
- Visibility: public
- Is abstract?: No
- Is static?: No
- Is final?: No
- Return: array
Parameters
Name | Type | Default | Set | Range | Description |
---|---|---|---|---|---|
$text | string | required parameter | N/A | N/A | Text to be indexed |
$max_ngram_size | integer | 1 | N/A | N/A | The maximum number of singular ngrams (typically words) to sequence together into one ngram |
$total_singular_ngram_tokens | ?integer | Null | N/A | N/A | Receives the count of singular ngrams (typically words) found in the text (null: do not count) |
Return
- A list of ngrams (along with a boolean to indicate whether they are a boolean ngram)
- Type: array
- Set: N/A
- Range: N/A
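Example
The sketch below illustrates the general idea of turning text into ngrams of up to a given size. It is not the code in sources/lang_tokeniser_EN.php: the function name sketch_text_to_ngrams, the lowercasing, and the split on non-word characters are all assumptions made for illustration. It returns plain strings and omits the per-ngram boolean that the real method returns, and the by-reference third argument is likewise an assumption about how the count is written back.

```php
<?php

/**
 * Hypothetical stand-in for illustration only; NOT the LangTokeniser_EN implementation.
 *
 * @param  string $text Text to be indexed
 * @param  integer $max_ngram_size Maximum number of words to sequence together
 * @param  ?integer $total_singular_ngram_tokens Receives the word count (assumed to be by reference)
 * @return array List of ngram strings (the real method also carries a boolean per ngram)
 */
function sketch_text_to_ngrams(string $text, int $max_ngram_size = 1, ?int &$total_singular_ngram_tokens = null): array
{
    // Assumed tokenisation: lowercase, then split on runs of non-word characters
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // This sketch always writes the count of single-word tokens back to the caller
    $total_singular_ngram_tokens = count($words);

    $ngrams = [];
    $num_words = count($words);
    for ($i = 0; $i < $num_words; $i++) {
        // From each starting word, emit sequences of 1..$max_ngram_size words
        for ($size = 1; $size <= $max_ngram_size && $i + $size <= $num_words; $size++) {
            $ngrams[] = implode(' ', array_slice($words, $i, $size));
        }
    }

    return $ngrams;
}

$count = null;
$ngrams = sketch_text_to_ngrams('The quick brown fox', 2, $count);
// $ngrams: the, the quick, quick, quick brown, brown, brown fox, fox
// $count: 4
```

With $max_ngram_size left at its default of 1, the sketch returns only the individual words.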