# Search
Once a workspace has a trained classifier, you can configure the workspace to run this classifier on all linked unlabelled data sets. This classifies every utterance in every conversation that is uploaded, and exposes the results through a search API capable of including intents in the search terms.
This can easily be hooked into custom dashboards that showcase activity from a conversation corpus.
## Concepts

In this API, everything maps to a conversation (whether you uploaded an utterances file, conversations from a CSV file, or data from a supported integration). A search request targets a specific workspace and returns results in the form of annotated conversations.
The annotated conversation contains:
- The list of inputs containing the text of every utterance
- Metadata about the conversation (when it took place, what file it is contained in)
- Various annotation objects, the most important one containing the result of the last trained classifier on each input of the conversation
All URLs in this document are relative to https://api.humanfirst.ai/
## Intents

Some predicates refer to intent ids; get them via a GET call to `/v1alpha1/workspaces/${namespace}/${workspaceId}/intents`
The response will look like this:
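The exact payload isn't reproduced here; as a sketch, the response contains a list of intent objects whose ids feed predicates such as `intentMatch` (the field names below are assumptions, not the authoritative schema):

```python
# Hypothetical shape of the intents response; field names are assumptions.
sample_response = {
    "intents": [
        {"id": "intent-1234", "name": "billing_question"},
        {"id": "intent-5678", "name": "cancel_subscription"},
    ]
}

# Collect the intent ids referenced by search predicates such as intentMatch.
intent_ids = [intent["id"] for intent in sample_response["intents"]]
print(intent_ids)  # → ['intent-1234', 'intent-5678']
```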
## Querying

`POST /v1alpha1/conversations/${namespace}/${workspaceId}/query`
This is where you'd start to make queries: you can combine predicates that expose the output of the classifier with full text search and time windowing. The request is built by stacking a series of predicates; a conversation has to match them all to be returned.

Most API calls depend on a namespace and workspace id. You can find both in the URL bar: the workspace id has a format of `playbook-...`, and the namespace is part of the query string as `?namespace=...`. From the command line, look at `hf namespace list` and `hf workspace list`.
Let's start with the case of doing a full text search within conversations:
info
This uses our hf command line tool to generate an access token. Refer to the authentication section for instructions on how to obtain them programmatically.
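As a minimal sketch, a full text search request could be built and sent with the standard library as below. The predicate shape is an assumption inferred from the predicate names documented later in this section, and the placeholder namespace, workspace id, and token are not real values:

```python
import json
import urllib.request

namespace = "my-namespace"      # assumed placeholder; see the URL bar
workspace_id = "playbook-..."   # workspace ids have this format
access_token = "..."            # e.g. generated with the hf CLI

# A query is a list of stacked predicates; a conversation must match all of them.
# inputMatch performs a full text search over conversation inputs (shape assumed).
body = {"predicates": [{"inputMatch": {"text": "refund"}}]}

request = urllib.request.Request(
    f"https://api.humanfirst.ai/v1alpha1/conversations/{namespace}/{workspace_id}/query",
    data=json.dumps(body).encode(),
    headers={
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(request)  # returns annotated conversations
```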
## Predicate: timeRange

Specifies a time window in which the conversation must have taken place. Both the start and end times are optional, so unbounded windows are possible.
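For illustration only (the field names inside the predicate are assumptions), a window that is unbounded at the start might look like:

```python
from datetime import datetime, timezone

# timeRange predicate: both bounds are optional, so omitting "start"
# leaves the window open-ended at the beginning.
time_range = {
    "timeRange": {
        "end": datetime(2023, 6, 1, tzinfo=timezone.utc).isoformat(),
    }
}
print(time_range)
```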
## Predicate: intentMatch

Specifies a series of intent ids that have to be matched in order. An optional minimum match score can be set to threshold the quality of the matches.
note
Only the classifier's top result is indexed, but the full probability distribution will be returned as part of the response.
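As a hedged sketch (field names assumed, ids hypothetical), matching two intents in order with a score threshold:

```python
# intentMatch predicate: the intent ids must be matched in the given order.
# minScore thresholds match quality; 0 includes everything.
intent_match = {
    "intentMatch": {
        "intentIds": ["intent-1234", "intent-5678"],  # hypothetical ids from the intents endpoint
        "minScore": 0.5,
    }
}
print(intent_match)
```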
## Predicate: conversationSource

Specifies the filename and file type to match (if you only want conversations and not flat utterances).
note
Depending on the format, certain suffixes have to be appended to the filename in order for results to match our internal representation.
- `[filename].fmt1` represents CSV data
- `[filename].fmt2` represents a flat utterance file
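A small sketch of the suffix rule, with the predicate shape itself an assumption (only the suffixes come from the note above):

```python
def source_filename(filename: str, is_csv: bool) -> str:
    """Append the internal-representation suffix:
    .fmt1 for CSV data, .fmt2 for a flat utterance file."""
    return filename + (".fmt1" if is_csv else ".fmt2")

# conversationSource predicate (field names are assumptions).
conversation_source = {
    "conversationSource": {"filename": source_filename("support_logs", is_csv=True)}
}
print(conversation_source)  # → {'conversationSource': {'filename': 'support_logs.fmt1'}}
```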
## Predicate: inputMatch

Specifies a full text search predicate in order to search all conversation inputs.
## Response format

Each conversation is represented as the original converted conversation, plus additional annotations containing the results of various components in our pipeline.
note
Some of the annotations won't be present if no predicate requires the classifier's output. For example, `inputs_intents` and `distribution` will not be present unless `intentMatch` is requested. A minimum score of `0` can be used to include everything.
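To make the shape concrete, here is a sketch of reading the classifier annotations out of a returned conversation. The nesting and field names are assumptions; only the annotation names (`inputs_intents`, `distribution`) come from this documentation:

```python
# Hypothetical annotated conversation; the structure is assumed.
conversation = {
    "inputs": [{"text": "I want a refund"}],
    "annotations": {
        "inputs_intents": [{"intentId": "intent-1234", "score": 0.91}],
        "distribution": [{"intentId": "intent-1234", "probability": 0.91}],
    },
}

annotations = conversation.get("annotations", {})
# These keys are absent unless the query included an intentMatch predicate,
# so default to an empty list when reading them.
top_matches = annotations.get("inputs_intents", [])
print(top_matches)
```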