Build and strengthen models

60 minutes

We will jointly plan to have some data loaded prior to the session.

The HumanFirst Team can help prepare a loading example - please send through your data in whichever format it is currently in.

  • Building a model from scratch

    • Bootstrapping intents
    • Using fuzzy search (see the fuzzy-matching sketch after this list)
    • Semantic similarity search (see the nearest-neighbor sketch after this list)
    • Divide and Conquer
      • Starting with large, high-level concepts
      • Ensuring that we capture the diversity of issues and user phrasing
        • Avoiding taking the Top-n
        • Seeding the Stash with variety
        • The importance of using similarity
        • The importance of aiming for a clear intent definition, and why more examples are not always better
  • Building at least 2 intents with at least 6 examples each

    • Understanding the advantages of a hierarchical classifier
    • How parent- and child-level matching works and how to manage the settings
    • The recommended minimum examples needed for each intent
    • Maintaining balance between intents and keeping no more than one order of magnitude between the smallest and largest intents (see the balance-check sketch after this list)
  • Exploring different NLU engines

    • KNN - fast and requires no training (see the nearest-neighbor sketch after this list)
    • HumanFirst NLU - an example-trained NLU engine based on embeddings
    • Understanding the difference between Evaluation and Train and Infer in the NLU tab
      • Understanding similarity to the Stash and its nuances
  • An introduction to coverage

    • Navigating the intents pane
    • Understanding the difference between total and unique coverage
    • Maximizing space for better visualization
    • Out of scope topics
  • Strengthening intents using HumanFirst NLU

    • Accessing suggestions from the intent pane
    • Pinning intents and reviewing their contents
    • Using intent similarity matching to find examples
  • Fallback analysis

    • Filtering unlabeled data using metadata such as fallback tags or conversational turn (see the filtering sketch after this list)
    • Understanding filtering, sorting and clustering
  • Identifying poorly covered concepts

    • Understanding the uncertainty metric, 1 - confidence (see the uncertainty sketch after this list)
    • Using the testing panel for evaluation
    • Comparing and understanding the difference between testing similarity and Stash similarity
    • Understanding the causes of differences (rounding, parent intents, normalization)
    • Adding data to an intent or creating a new one
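
To make some of the items above more concrete, the short Python sketches that follow are illustrative only and are not HumanFirst's implementation. This first one shows why fuzzy search on its own is not enough: it scores character overlap (here with Python's difflib), so it ranks a misspelled variant highly but scores a paraphrase low, which is where semantic similarity search takes over. The query and candidate utterances are made-up examples.

```python
# Illustrative only: fuzzy search matches on surface characters, so it catches
# typos but misses paraphrases. The utterances below are invented examples.
import difflib

query = "cancel my subscription"
candidates = [
    "cancle my subsciption",       # misspelled, same wording -> high fuzzy score
    "I no longer want this plan",  # same meaning, different words -> low fuzzy score
]

for text in candidates:
    score = difflib.SequenceMatcher(None, query, text).ratio()
    print(f"fuzzy score {score:.2f}: {text!r}")
```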
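
As a rough check on the "one order of magnitude" guideline for intent balance, the balance-check sketch below compares the largest and smallest intents by example count. The intent names and counts are placeholders, not data from a real workspace.

```python
# Illustrative only: flag a training set whose largest and smallest intents
# differ by more than one order of magnitude. Counts are placeholder values.
intent_counts = {
    "billing.cancel_subscription": 64,
    "billing.request_refund": 23,
    "account.reset_password": 8,
}

ratio = max(intent_counts.values()) / min(intent_counts.values())
print(f"largest/smallest = {ratio:.1f}")
if ratio > 10:
    print("Unbalanced: more than one order of magnitude between intents")
else:
    print("Balanced: intents are within one order of magnitude of each other")
```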
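
The nearest-neighbor sketch below illustrates, in miniature, why a KNN engine is fast and needs no training step: labeled examples are embedded once, and a new utterance is matched to its closest neighbors by cosine similarity. This is not the HumanFirst NLU itself; the two-dimensional vectors are hand-made stand-ins for the output of a real embedding model.

```python
# Illustrative only: KNN over sentence embeddings. Classification is just
# "find the closest labeled examples", so there is no training step.
import numpy as np

labeled_examples = [
    ("billing.cancel", "cancel my subscription", np.array([0.9, 0.1])),
    ("billing.refund", "I want my money back",   np.array([0.8, 0.3])),
    ("account.reset",  "reset my password",      np.array([0.1, 0.9])),
]

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def knn(query_vector, k=2):
    # Rank every labeled example by similarity to the query and keep the top k.
    scored = sorted(
        ((cosine(query_vector, vec), intent) for intent, _, vec in labeled_examples),
        reverse=True,
    )
    return scored[:k]

# A new utterance whose (made-up) embedding lands near the billing examples.
print(knn(np.array([0.85, 0.2])))
```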
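
The filtering sketch below approximates, outside the tool, the kind of fallback analysis done on unlabeled data: keep only utterances with no intent label whose metadata marks a fallback, ordered by conversational turn. The column names (intent, is_fallback, turn) are hypothetical and depend on how your logs were exported.

```python
# Illustrative only: filter unlabeled log data down to fallback turns.
# Column names and rows are invented; real exports will differ.
import pandas as pd

logs = pd.DataFrame([
    {"text": "let me talk to a person",  "intent": None,             "is_fallback": True,  "turn": 4},
    {"text": "cancel my subscription",   "intent": "billing.cancel", "is_fallback": False, "turn": 1},
    {"text": "what about the extra fee", "intent": None,             "is_fallback": True,  "turn": 6},
])

# Keep only unlabeled utterances that triggered a fallback, ordered by turn.
fallbacks = logs[logs["intent"].isna() & logs["is_fallback"]].sort_values("turn")
print(fallbacks[["turn", "text"]])
```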
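
The uncertainty sketch below uses the definition given above, uncertainty = 1 - confidence, to rank utterances so that the ones the model is least sure about come first. The utterances and confidence scores are made-up values, not real model output.

```python
# Illustrative only: rank utterances by uncertainty = 1 - confidence.
predictions = [
    ("I want to cancel my subscription", 0.94),
    ("uh the thing from before again",   0.31),
    ("reset my password please",         0.88),
]

ranked = sorted(
    ((1.0 - confidence, text) for text, confidence in predictions),
    reverse=True,
)

for uncertainty, text in ranked:
    print(f"uncertainty {uncertainty:.2f}: {text!r}")
```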

By the session's end, you'll be fully familiar with the tool's controls, adept at finding poorly covered utterances in production logs, and able to close those coverage gaps by reinforcing existing intents or creating new ones.