Multi-Words
Many terms, especially in English, consist of several separate words, such as "ice cream". Most of the time these behave as one would expect, but there are a few cases to be aware of.
Example
- Synonym: ice cream → glace
- Product1: vanilla ice cream
- Product2: ice vanilla cream
A search for the word glace will only find Product1, because in Product2 the words ice and cream are separated by vanilla.
This means that when ingesting product data, the words that make up a multi-word must appear next to each other and must not be separated by another word.
The diagnostics page has a pipeline step for MultiWord. It shows how multiple tokens are combined into a single token, joined with _. This is done to ensure multi-words are handled correctly.
Synonyms, hypernyms or irregular words can all be created as multi-words.
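The behaviour described above can be illustrated with a small Python sketch. This is not the product's actual implementation. The names MULTI_WORDS, SYNONYMS, combine_multi_words, index_terms and matches are hypothetical, and the sketch assumes whitespace tokenization, lowercase matching, and that the synonym is applied to the product text at ingestion time.

```python
# Hypothetical configuration: one multi-word and its synonym.
MULTI_WORDS = [("ice", "cream")]
SYNONYMS = {"ice_cream": "glace"}


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def combine_multi_words(tokens, multi_words=MULTI_WORDS):
    """Merge adjacent tokens that form a known multi-word into one token joined by '_'."""
    # Try longer multi-words first so they take priority over shorter ones.
    ordered = sorted(multi_words, key=len, reverse=True)
    result = []
    i = 0
    while i < len(tokens):
        for multi_word in ordered:
            n = len(multi_word)
            if tuple(tokens[i:i + n]) == multi_word:
                result.append("_".join(multi_word))
                i += n
                break
        else:
            # No multi-word starts at this position; keep the token as it is.
            result.append(tokens[i])
            i += 1
    return result


def index_terms(text):
    """Terms a product is searchable by: its combined tokens plus their synonyms."""
    tokens = combine_multi_words(tokenize(text))
    terms = set(tokens)
    for token in tokens:
        if token in SYNONYMS:
            terms.add(SYNONYMS[token])
    return terms


def matches(query, product_name):
    """True if every query term is among the product's searchable terms."""
    query_terms = set(combine_multi_words(tokenize(query)))
    return query_terms <= index_terms(product_name)


print(combine_multi_words(tokenize("vanilla ice cream")))  # ['vanilla', 'ice_cream']
print(combine_multi_words(tokenize("ice vanilla cream")))  # ['ice', 'vanilla', 'cream']
print(matches("glace", "vanilla ice cream"))               # True
print(matches("glace", "ice vanilla cream"))               # False
```

In this sketch, Product2 never produces the combined token because ice and cream are not adjacent, so the synonym glace is never attached to it.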