ABCD is a human (client) to human (expert) conversation set. People often talk differently to bots than to other humans, depending on their previous bot experiences. You may find they are more abrupt, terser, and much more directive. Introducing buttons tends to accentuate this: utterances shift from longer, free-flowing messages to many more short, staccato utterances of 1-3 words.
We need to expand our training data for the new case, and check that the labels we've given to the existing data set still represent the ground truth well to other annotators who weren't involved in developing it.
This is a great use case for exploring tags and testing our labels with a dedicated test set.
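One way to check whether this shift is happening in your own data is to measure the share of very short utterances in each source. The following is a minimal sketch in plain Python. The function name `short_utterance_share` and the example utterances are hypothetical and are not taken from ABCD or from any tool's API.

```python
def short_utterance_share(utterances, max_words=3):
    """Return the fraction of non-empty utterances with at most max_words words."""
    # Count whitespace-separated words, skipping blank utterances.
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n <= max_words) / len(lengths)


# Hypothetical examples for illustration only (not real ABCD data).
human_to_human = [
    "Hi, I ordered a jacket last week and it still has not arrived",
    "Could you check where my order is please",
    "Thanks so much",
]
human_to_bot = ["order status", "refund", "talk to agent", "where is my jacket"]

print(f"human-to-human: {short_utterance_share(human_to_human):.2f}")
print(f"human-to-bot:   {short_utterance_share(human_to_bot):.2f}")
```

A noticeably higher share in the human-to-bot data suggests the training set needs more short, directive phrases.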
Helpful core doc references:
- Avoid pitfalls going from bootstrapped human-to-human chat to human-to-bot conversations
- Tag a dedicated test set to check your label names
- Add tags
- Remove tags
- Rename and recolor tags
- Verify your labels are representative
- Get a full report of a test set or blind set
- Find the false positives
- Fix the labels
- Understand where to fix under-represented phrases in the test set
- Use your test and dev phrases to see how people really interact with the bot
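The "find the false positives" step in the list above can be sketched outside the tool as well. The example below assumes you have exported a test set as (phrase, expected label, predicted label) triples. That export shape, the function names, and the sample rows are all assumptions made for illustration, not the product's actual format.

```python
from collections import defaultdict


def false_positives_by_label(examples):
    """Group misclassified phrases under the label that was wrongly predicted."""
    fps = defaultdict(list)
    for text, expected, predicted in examples:
        if predicted != expected:
            fps[predicted].append((text, expected))
    return dict(fps)


def precision_by_label(examples):
    """For each predicted label, return the fraction of predictions that were correct."""
    predicted_counts = defaultdict(int)
    correct_counts = defaultdict(int)
    for _, expected, predicted in examples:
        predicted_counts[predicted] += 1
        if predicted == expected:
            correct_counts[predicted] += 1
    return {label: correct_counts[label] / n for label, n in predicted_counts.items()}


# Hypothetical test set rows: (phrase, expected label, predicted label).
test_set = [
    ("where is my order", "order_status", "order_status"),
    ("i want my money back", "refund", "refund"),
    ("cancel it", "cancel_order", "refund"),
    ("refund status", "refund", "order_status"),
]

for label, misses in false_positives_by_label(test_set).items():
    print(label, misses)
print(precision_by_label(test_set))
```

A label that collects many false positives from one expected label is a candidate for renaming, splitting, or relabeling.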
In "Create a workspace" > "From a demo workspace" you will find a demo workspace for Academy Ex04: BlindSets. This is the workspace used in the exercise above. A different annotator has created an intent, but the label doesn't seem very representative. Can you find it and fix it?
You may notice some other labels for test-regression. It looks like someone is preparing a bigger blind set as well...
8 minutes - Video 1
3 minutes - Video 2