Multi-Words

Many terms, especially in English, are written as multiple words, such as "ice cream". Most of the time, these multi-words work as one would expect, but there are a few cases to be aware of.

Example

  • Synonym: ice cream = glace
  • Product1: vanilla ice cream
  • Product2: ice vanilla cream

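A search for the word glace will only find Product1. In Product2, the words ice and cream are separated by vanilla, so they no longer form the multi-word ice cream.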

This means that when ingesting product data, the parts of a multi-word must not be separated by another word.
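As a rough illustration of how this could be checked before ingestion, the sketch below flags product names in which every part of a multi-word is present but the parts are not adjacent. The function names (tokenize, contains_phrase, find_split_multiwords) and the whitespace tokenizer are assumptions made for this example and are not part of the product.

```python
def tokenize(text: str) -> list[str]:
    # Simplified tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    # True if the phrase occurs in tokens as one contiguous run.
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def find_split_multiwords(text: str, multi_words: list[str]) -> list[str]:
    # Return every multi-word whose parts all occur in the text
    # but never next to each other in the right order.
    tokens = tokenize(text)
    flagged = []
    for multi_word in multi_words:
        parts = tokenize(multi_word)
        if not parts:
            continue
        all_present = all(part in tokens for part in parts)
        if all_present and not contains_phrase(tokens, parts):
            flagged.append(multi_word)
    return flagged


print(find_split_multiwords("vanilla ice cream", ["ice cream"]))
print(find_split_multiwords("ice vanilla cream", ["ice cream"]))
```

The first call flags nothing, while the second flags ice cream, which suggests the product name should be reworded before it is ingested.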

The diagnostics page has a pipeline step for MultiWord. It shows how multiple tokens are combined into a single token joined by an underscore (_). This is done to ensure multi-words are handled correctly.
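The combining step can be pictured with a minimal sketch such as the following. It is not the actual pipeline implementation: merge_multiwords, matches, and the synonym dictionary are hypothetical names chosen for this example, and matching on any shared token is a simplification.

```python
def merge_multiwords(tokens: list[str], multi_words: list[str]) -> list[str]:
    # Split each multi-word into its parts; try longer phrases first.
    phrases = sorted(
        (mw.lower().split() for mw in multi_words if mw.strip()),
        key=len,
        reverse=True,
    )
    merged = []
    i = 0
    while i < len(tokens):
        for phrase in phrases:
            n = len(phrase)
            if tokens[i:i + n] == phrase:
                # Adjacent parts become one token joined with "_".
                merged.append("_".join(phrase))
                i += n
                break
        else:
            # No multi-word starts here; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


def matches(query: str, product: str, multi_words: list[str],
            synonyms: dict[str, str]) -> bool:
    # Simplified matching (assumption): a product matches if it shares
    # at least one token with the synonym-expanded query.
    product_tokens = merge_multiwords(product.lower().split(), multi_words)
    query_tokens = merge_multiwords(query.lower().split(), multi_words)
    expanded = {synonyms.get(token, token) for token in query_tokens}
    return bool(expanded & set(product_tokens))


multi_words = ["ice cream"]
synonyms = {"glace": "ice_cream"}  # hypothetical synonym table

print(merge_multiwords(["vanilla", "ice", "cream"], multi_words))
print(merge_multiwords(["ice", "vanilla", "cream"], multi_words))
print(matches("glace", "vanilla ice cream", multi_words, synonyms))
print(matches("glace", "ice vanilla cream", multi_words, synonyms))
```

In this sketch, only the first product yields the combined token ice_cream, so only it matches a search for glace, which mirrors the example above.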

Synonyms, hypernyms, and irregular words can all be created as multi-words.