Discovering new intents

Discovering intents with uncertainty

In this video we explain how to use the `uncertainty` metric to discover new intents in the unlabeled data of an existing workspace.

  1. You'll need a workspace with at least a dozen intents, each with a significant amount of training data.
  2. Make sure your NLU models have been trained.
  3. Sort your unlabeled data by uncertainty.
  4. Select one of the top results that looks promising.
  5. Sort by similarity to stash and select more relevant results.
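To make the last step more concrete, here is a minimal sketch of what sorting by similarity to the stash could look like. It assumes each utterance has an embedding vector and that the stash is summarized by the average of its vectors; the function names and the use of cosine similarity are illustrative assumptions, not the product's documented implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity_to_stash(candidates, stash):
    """Hypothetical helper: order utterances by similarity to the stash centroid.

    candidates: dict mapping utterance text -> embedding vector
    stash: non-empty list of embedding vectors for the stashed utterances
    """
    dims = len(stash[0])
    # Assumption: the stash is represented by the mean of its vectors.
    centroid = [sum(vec[i] for vec in stash) / len(stash) for i in range(dims)]
    return sorted(
        candidates,
        key=lambda text: cosine(candidates[text], centroid),
        reverse=True,
    )


# Toy two-dimensional embeddings, made up for illustration only.
candidates = {
    "cancel my order": [1.0, 0.1],
    "what's the weather": [0.0, 1.0],
    "stop my order": [0.9, 0.2],
}
stash = [[1.0, 0.0]]
ranked = rank_by_similarity_to_stash(candidates, stash)
```

With these toy vectors, the two order-related utterances rank ahead of the unrelated one, which is the behavior you would rely on when growing a stash into a candidate intent.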

`Uncertainty` surfaces utterances that your model has a very difficult time understanding. One caveat of this metric is that it does not take frequency or density into account. As a result, it may surface outlier utterances that offer little to no business value when labeled, because they occur only once in your entire data set.
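The exact formula behind `uncertainty` is not described here. As one plausible illustration, the sketch below scores each utterance by the normalized entropy of the model's predicted intent probabilities: a flat distribution scores near 1, a confident prediction scores near 0. The function names and sample probabilities are hypothetical.

```python
import math


def uncertainty(probs):
    """Normalized entropy of a probability distribution, in the range [0, 1].

    Assumption: entropy stands in for the uncertainty metric; the real
    metric may be computed differently.
    """
    total = sum(probs)
    entropy = 0.0
    for p in probs:
        q = p / total  # renormalize in case the scores do not sum to 1
        if q > 0:
            entropy -= q * math.log(q)
    max_entropy = math.log(len(probs))
    return entropy / max_entropy if max_entropy > 0 else 0.0


def sort_by_uncertainty(predictions):
    """Hypothetical helper: utterances the model is least sure about come first.

    predictions: dict mapping utterance text -> list of intent probabilities
    """
    return sorted(
        predictions,
        key=lambda text: uncertainty(predictions[text]),
        reverse=True,
    )


# Made-up predictions over three existing intents.
predictions = {
    "I want to return my shoes": [0.90, 0.05, 0.05],
    "can my cat use this app": [0.34, 0.33, 0.33],
}
ordered = sort_by_uncertainty(predictions)
```

Note that the score depends only on the model's prediction for a single utterance, which is why a one-off outlier can rank just as high as a frequently occurring request.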

Discovering intents with density

In this video we explain how to use `density` sorting with clustering to discover new intents in your unlabeled data.

  1. Sort your unlabeled data by density.
  2. Select one of the top items.
  3. Sort by similarity to stash and activate clustering.
  4. Select relevant clusters.

`Density` is an interesting metric because it surfaces utterances that have many siblings (_that is, they occur often in your data_).
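The idea of "many siblings" can be sketched as a neighbor count: for each utterance, count how many other utterances lie within a similarity threshold. This is only an illustration under assumed details (embedding vectors, cosine similarity, a threshold of 0.9); the product's actual density computation is not specified here, and all names below are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def density_scores(vectors, threshold=0.9):
    """Hypothetical helper: number of siblings per utterance.

    vectors: dict mapping utterance text -> embedding vector
    threshold: assumed minimum cosine similarity for two utterances to be siblings
    """
    scores = {}
    for text, vec in vectors.items():
        scores[text] = sum(
            1
            for other, other_vec in vectors.items()
            if other != text and cosine(vec, other_vec) >= threshold
        )
    return scores


# Toy two-dimensional embeddings, made up for illustration only.
vectors = {
    "reset password": [1.0, 0.0],
    "forgot my password": [0.95, 0.1],
    "password reset please": [0.9, 0.05],
    "do you sell llamas": [0.0, 1.0],
}
scores = density_scores(vectors)
by_density = sorted(scores, key=scores.get, reverse=True)
```

In this toy data the three password utterances each have two siblings, while the llama question has none and sorts last, even though it might rank high on uncertainty.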