Label Data

Bottom-up

Basics

This labeling flow starts from your raw data (e.g., historical conversation logs).

It lets you review your data efficiently, utterance by utterance, and either assign each utterance to an existing label or create a new label on the fly.

Similarity search accelerates this process by allowing you to find and group all similar utterances together instantly, and label them all in one shot.
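As a rough illustration of the "find similar utterances and label them in one shot" idea, here is a minimal sketch. It uses token-overlap (Jaccard) similarity purely as a stand-in; the similarity measure HumanFirst actually uses is not assumed here, and `jaccard`, `label_similar`, and the `min_similarity` value are hypothetical names chosen for this example.

```python
def jaccard(a, b):
    """Token-overlap similarity between two utterances (stand-in measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def label_similar(seed, utterances, label, labels, min_similarity=0.5):
    """Assign `label` to every unlabeled utterance similar enough to `seed`.

    `labels` maps utterance -> label and is updated in place.
    Returns the utterances that were labeled by this call.
    """
    matched = [
        u for u in utterances
        if u not in labels and jaccard(seed, u) >= min_similarity
    ]
    for u in matched:
        labels[u] = label
    return matched


# Hypothetical raw data: unlabeled utterances from conversation logs.
utterances = [
    "reset my password",
    "please reset my password",
    "i forgot my password",
    "what are your opening hours",
]
labels = {}
matched = label_similar("reset my password", utterances, "password_reset", labels)
```

Utterances that fall below the cutoff stay unlabeled, so they can be reviewed later and assigned to another existing label or a new one.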

continuous improvement

If you have a deployed conversational AI, this workflow lets you continuously improve its coverage and accuracy by identifying new intents and sourcing training examples for existing ones.

Top-down

Basics

This labeling flow starts from existing labeled data (i.e., intents).

It allows you to quickly find and label additional training examples for a given intent from your unlabeled data, using Similarity search.

Start this workflow from the labeled data view, by selecting an intent and clicking Get Suggestions.

Similarity Search runs a nearest-neighbor search across your unlabeled data and suggests relevant training examples. You can accept suggestions individually or in bulk, and reject any that don't apply. This feature is powered by active learning: suggestions improve based on how you interact with them, i.e., the more suggestions you accept and reject, the more accurate they become.
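To make the nearest-neighbor idea concrete, here is a minimal sketch that ranks unlabeled utterances by cosine similarity to the average (centroid) of an intent's example embeddings. The toy 2-D vectors, the centroid approach, and the names `cosine_similarity` and `suggest` are assumptions for illustration only; they do not describe HumanFirst's internal implementation, and the active-learning feedback loop is not modeled.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest(intent_vectors, unlabeled, k=3):
    """Return the k unlabeled utterances closest to the intent's centroid.

    `unlabeled` is a list of (text, vector) pairs.
    The result is a list of (score, text) pairs, highest score first.
    """
    dims = len(intent_vectors[0])
    centroid = [
        sum(v[i] for v in intent_vectors) / len(intent_vectors)
        for i in range(dims)
    ]
    scored = [(cosine_similarity(centroid, vec), text) for text, vec in unlabeled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical embeddings: two labeled examples of an "order status" intent
# and three unlabeled utterances.
intent_vectors = [[1.0, 0.0], [0.9, 0.1]]
unlabeled = [
    ("where is my order", [0.8, 0.2]),
    ("cancel my subscription", [0.0, 1.0]),
    ("track my package", [1.0, 0.1]),
]
top = suggest(intent_vectors, unlabeled, k=2)
```

In practice the vectors would come from a sentence-embedding model, and an exhaustive scan like this one would typically be replaced by an approximate nearest-neighbor index for large datasets.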

Advanced: you can adjust the cosine similarity threshold for Similarity Search suggestions with the similarity slider. This discards suggestions whose similarity is above the chosen threshold, which is useful for discovering broader training examples.
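The effect of the slider can be sketched as a simple filter over scored suggestions. The function name `filter_by_threshold`, the `max_similarity` parameter, and the scores below are hypothetical; whether a suggestion exactly at the threshold is kept is also an assumption.

```python
def filter_by_threshold(scored_suggestions, max_similarity):
    """Keep only suggestions whose cosine similarity is at or below the threshold.

    `scored_suggestions` is a list of (score, text) pairs.
    """
    return [
        (score, text)
        for score, text in scored_suggestions
        if score <= max_similarity
    ]


# Hypothetical suggestions for an "order status" intent.
scored = [
    (0.98, "track my package"),
    (0.81, "has my parcel shipped yet"),
    (0.55, "I never got my delivery"),
]
broader = filter_by_threshold(scored, max_similarity=0.9)
```

Lowering the threshold drops near-duplicates of existing training examples first, leaving suggestions that phrase the intent differently.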

tip

You must first upload unlabeled data for HumanFirst's Similarity Search to work.

Unsupervised labeling

Unsupervised

This feature is currently in closed beta. If you're interested in using it, please let us know!