Important Words & Phrases Report

This report ranks the most important words or phrases based on their TF-IDF score.

TF-IDF (and variations thereof) has been used by search engines for many years to better understand content. In essence, it assigns each word or phrase a weight based on:

  • the number of times the word or phrase appears in a call (increases the weight); and

  • the number of calls that contain the word or phrase (decreases the weight)

Words and phrases mentioned frequently in only a few calls are assigned a greater weight than those that appear in many calls. The TF-IDF score therefore helps to surface words and phrases that were mentioned in the calls but would otherwise be hidden in the rankings by words that are numerous within the dataset yet add little or no meaning to the analysis (words such as “what”, “is” and “that”). These common words and phrases are assigned low weights, which effectively removes them from the analysis and leaves the more important words and phrases.
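
The weighting described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the textbook form of TF-IDF (term count multiplied by the logarithm of total calls divided by calls containing the term), which may differ from the exact variant the report uses. The function name `tf_idf` and the sample calls are hypothetical.

```python
import math
from collections import Counter

def tf_idf(calls):
    """Return one {term: weight} dict per call, using count * log(N / df).

    Assumption: this is the textbook formula, not necessarily the
    variant used by the report.
    """
    n_calls = len(calls)
    # Document frequency: number of calls that contain each term at least once.
    doc_freq = Counter()
    for words in calls:
        doc_freq.update(set(words))
    weights = []
    for words in calls:
        counts = Counter(words)
        weights.append({
            term: count * math.log(n_calls / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical transcripts, already split into words.
calls = [
    "what is the refund policy refund please".split(),
    "what is the delivery time".split(),
    "what is that charge".split(),
]
scores = tf_idf(calls)
```

In this sketch, “what” and “is” occur in every call, so their weight is zero, while “refund” occurs twice in a single call and receives the highest weight in that call.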

If you are interested in the most frequent words and phrases without any weights, then refer to the Trending Words & Phrases report.

TF-IDF Score

In information retrieval, tf–idf, TF*IDF, or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.[1] It is often used as a weighting factor in searches of information retrieval, text mining, and user modeling. The tf–idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general. tf–idf is one of the most popular term-weighting schemes today. A survey conducted in 2015 showed that 83% of text-based recommender systems in digital libraries use tf–idf.[2]

Variations of the tf–idf weighting scheme are often used by search engines as a central tool in scoring and ranking a document's relevance given a user query. tf–idf can be successfully used for stop-words filtering in various subject fields, including text summarization and classification.

Source: Wikipedia

The chart ranks the most important words or phrases in descending order based on their TF-IDF score.

The percentage of calls that contain the word or phrase is shown on the right.

Hover over any phrase for more details or select the phrase to list the calls in which the phrase appears.
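
As a rough illustration of how such a ranking could be assembled, the sketch below sorts terms by score in descending order and computes the percentage of calls that contain each term. The function name `rank_terms`, the sample calls and the score values are all hypothetical; how the report derives a single score per word or phrase is not described here.

```python
def rank_terms(term_scores, calls):
    """Return (term, score, percent_of_calls) rows, highest score first."""
    n_calls = len(calls)
    call_sets = [set(words) for words in calls]
    rows = []
    for term, score in term_scores.items():
        # Count the calls in which the term appears at least once.
        containing = sum(1 for words in call_sets if term in words)
        rows.append((term, score, 100.0 * containing / n_calls))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows

# Hypothetical data: four calls and made-up scores for three terms.
calls = [
    ["refund", "policy", "refund"],
    ["delivery", "time"],
    ["refund", "charge"],
    ["delivery", "charge"],
]
term_scores = {"delivery": 1.4, "refund": 2.1, "policy": 0.9}
ranking = rank_terms(term_scores, calls)
```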

The table lists the calls in which the word or phrase appears.

You can sort the table by selecting any of the column headers:

  • Filename

  • Duration (in minutes)

  • Score: TF-IDF score

  • Count: number of the top 100 words or phrases that appear in the call

  • URL: link to Wordbench for a deep-dive analysis

Selecting a call within the table highlights the top 100 words or phrases included in the call.

The panel on the right of the report shows an alphabetical list of words or phrases and provides a search feature to quickly locate a particular word or phrase.

If “Words” is selected from the Top Level filter, then further filtering is provided by word type.

Hold down the CTRL key to select multiple categories.

Additional Specific Type filtering is available when “Words” is selected.

Hover over the field names to show the full name of the field.

Hold down the CTRL key to select multiple categories.

Copyright © VoiceAI Pty Ltd 2021, All Rights Reserved