Models

Introduction to Raffle Models

Model Training

All models are checked automatically every day and fully retrained every 7 days, whenever the underlying data has changed.

To ensure that the model trains with more and better data regularly, it is recommended to resolve knowledge gaps with Instant Answers or Existing Sources.

Note

Adding questions with Rules Engine has an immediate effect: the boosted pages come up as the top results in searches.

Question Generation

Model re-training ensures that the model relearns with new or updated data (both answers and questions) and auto-generates search-improving questions behind the scenes.

GDPR Compliance

Raffle is GDPR-compliant: the search does not use, set or store cookies, and it anonymizes content in search questions that are passed through any Raffle Search UI.

When a user searches for a phrase, the original question is used to search for relevant answers. The question is then anonymized with “tags” using Raffle’s anonymization Search UI before it is saved for search analytics. Simply put, Raffle does not keep questions that contain sensitive information in the database. The anonymized questions appear in the Trending Questions and Knowledge Gaps lists.

Question tags:

CPR/SSN/ID numbers are replaced with “<ID>”
E-mails are replaced with “<EMAIL>”
Phone numbers are replaced with “<NUMBER>”

Note

Only the content in questions is anonymized; content in answers is not included in the anonymization process.
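
The tag replacement described above can be sketched in a few lines of Python. This is an illustration only: the regular expressions, their names, and the order in which they are applied are assumptions and do not reflect how Raffle actually detects sensitive content.

```python
import re

# Hypothetical patterns for illustration; Raffle's real detection rules
# are not documented here.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# CPR-style ID number: six digits, an optional hyphen, four digits.
ID_RE = re.compile(r"\b\d{6}-?\d{4}\b")
# Phone-like run: optional "+", then 8 to 15 digits, single spaces allowed.
PHONE_RE = re.compile(r"(?<!\w)\+?\d(?:[ ]?\d){7,14}(?!\w)")


def anonymize(question: str) -> str:
    """Replace sensitive values in a search question with tags."""
    question = EMAIL_RE.sub("<EMAIL>", question)
    # IDs are replaced before phone numbers, so a bare 10-digit number
    # is tagged as <ID> in this sketch.
    question = ID_RE.sub("<ID>", question)
    question = PHONE_RE.sub("<NUMBER>", question)
    return question


print(anonymize("Call me on +45 12 34 56 78 or write to jane.doe@example.com"))
```

In this sketch only the question string is processed, which mirrors the note that answer content is left out of anonymization.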

Data Clusters

Clusters are groups organized as a folder structure so that customers can explore data (per Search UI) at different levels.

User searches in the Trending Questions and Knowledge Gaps lists are clustered according to relevance, which means that a question has to be related to other questions in order to be part of a cluster and displayed on the list.

Trending Question clusters are sorted according to the frequency of asked questions, while Knowledge Gaps are sorted according to the similarity between the question and answers.
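
As a rough sketch of the two sort orders, the Python below models a cluster with a question count and a similarity score. The `Cluster` class, its field names, and the sort directions (most frequent first, lowest similarity first) are assumptions made for illustration; the text above does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    title: str
    question_count: int       # how often questions in the cluster were asked
    answer_similarity: float  # hypothetical question-to-answer similarity


def sort_trending(clusters):
    # Assumption: the most frequently asked clusters come first.
    return sorted(clusters, key=lambda c: c.question_count, reverse=True)


def sort_knowledge_gaps(clusters):
    # Assumption: the lowest similarity (the largest gap) comes first.
    return sorted(clusters, key=lambda c: c.answer_similarity)


clusters = [
    Cluster("Billing", 12, 0.8),
    Cluster("Login", 30, 0.4),
    Cluster("Shipping", 5, 0.6),
]
print([c.title for c in sort_trending(clusters)])
print([c.title for c in sort_knowledge_gaps(clusters)])
```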

Cluster titles are based on the most common topics of a group, taking into account all the data Raffle has collected. For example, if a Search UI has been live for a year, the titles are based on the data for that whole year. This also means that some titles may include words that are not currently present in the list of questions for the selected time frame.

Question Generation

Once a page has been scraped, it is divided into sections, which the Raffle model searches through for relevant answers. The model then picks a section and shows it as an answer snippet in the list of search answers. Questions are autogenerated after model training and can be viewed only in the Raffle backstage.

Synonyms and Misspellings

Synonyms and misspellings are automatically handled by the semantic aspect of Raffle Search. Raffle understands the context of a search query, and returns answers accordingly.

To train the Raffle model associated with your account on specific keywords, phrases, custom synonyms or abbreviations, refer to Rules Engine or provide a list of search phrases (with their target URLs or pages) to Raffle Support.

Answer Sections

The Raffle model searches through all connected indexes, finds and displays the most important and relevant section in a page, and ranks answers accordingly.

Note that the section displayed in the snippet may or may not contain an exact match of the words in the search phrase.

Order and Ranking

Raffle Search results are model-based. A page could be manually boosted to come up as the top result, but the exact section displayed or the order in which answers are shown CANNOT be manually set, as this is decided by the Raffle model itself.
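
The relationship between boosting and model ranking can be pictured with a small Python sketch. It assumes each page has a single model score and that boosted pages are simply placed ahead of the rest; `rank_answers` and its arguments are hypothetical names, and how Raffle combines the two internally is not described here.

```python
def rank_answers(scored_pages, boosted_urls):
    """Order (url, model_score) pairs: boosted pages first, then by score.

    Within each group the order comes only from the model score, so the
    caller cannot pin an exact position for a page.
    """
    return sorted(
        scored_pages,
        key=lambda page: (page[0] not in boosted_urls, -page[1]),
    )


pages = [("/pricing", 0.9), ("/returns", 0.5), ("/contact", 0.7)]
print(rank_answers(pages, boosted_urls={"/returns"}))
```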

Search Data

Raffle Search, Chat and Summary use the connected indexes as the sole basis for the answers displayed in a Search UI.

Additional Notes:

Chat and Summary both use the top answers in the search results (done in the background) as references for the generated answers
The generated answers cannot be manually changed but can be influenced with Rules Engine
The reference limit can be further adjusted to refine the generated response
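
The effect of the reference limit can be illustrated with a minimal Python sketch. The function name, the default value of 3, and the plain list of answers are assumptions made for the example, not Raffle's actual settings or API.

```python
def select_references(ranked_answers: list[str], reference_limit: int = 3) -> list[str]:
    """Return the top search answers to use as references, best first."""
    if reference_limit < 1:
        raise ValueError("reference_limit must be at least 1")
    return ranked_answers[:reference_limit]


answers = ["Refund policy", "Return shipping", "Warranty terms", "Contact us"]
print(select_references(answers, reference_limit=2))
```

A lower limit keeps the generated answer focused on the highest-ranked results, while a higher limit lets it draw on more of them.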
