Helper Packages and Scripts

Intro and Walkthrough of HumanFirst Package module

00:00: Introduction
00:22: PYPI HumanFirst Module
01:32: NLG and Object Helper Module
02:19: APIs Helper Module

Conversion of Conversations in CSV to HumanFirst JSON Using Academy Script

00:00: Introduction
00:22: Academy Repo
00:54: csv_to_json_unlabelled Script
01:24: Example Conversation CSV
01:41: Important Fields in the CSV
03:33: Specifying Metadata Columns
03:49: Python Environment
04:42: Command to Execute Script
05:23: Command Line Arguments
07:27: Script Execution
08:47: Resultant HumanFirst JSON

Acquiring and Running Loading Script Examples

We provide a humanfirst Python module, which represents the objects in the HumanFirst JSON format and wraps the HumanFirst APIs, along with example scripts that load data into HumanFirst in various ways (a minimal usage sketch follows the list below).

  • Example scripts are provided for converting popular formats into the HumanFirst JSON format with Python
  • They are kept in this repository: https://github.com/zia-ai/academy
  • The repository provides an ubuntu:focal based Dockerfile to run the scripts
  • The scripts should run fine in a Linux python3 environment
  • macOS users have reported being able to run the scripts directly, but we do not support this
  • We have made the scripts run on Windows for clients, but you will need to adjust the commands for path names, ENV variables and other Windows specifics
  • Using the Dockerfile and mounting your Windows directory is much easier
  • For instructions see the README; the licence is in the repository
  • If you are not using the academy Docker container, make sure to install the libraries required to run the academy scripts
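To give a flavour of the object helper, here is a minimal sketch that builds a one-utterance unlabelled workspace and writes it out as HumanFirst JSON. It assumes the humanfirst package installed from PyPI and HFWorkspace, HFContext and HFExample helpers in humanfirst.objects; treat the exact class names and signatures as assumptions and verify them against the module documentation.

# Minimal sketch, assuming `pip install humanfirst` and that
# humanfirst.objects exposes HFWorkspace, HFContext and HFExample
# with the signatures used below -- verify against the module docs.
import datetime
import humanfirst

workspace = humanfirst.objects.HFWorkspace()

# the context ties the utterance to a conversation and a speaker role
context = humanfirst.objects.HFContext(
    "convo-001",       # context id: groups utterances into one conversation
    "conversation",    # context type
    "client",          # role: "client" or "expert"
)
example = humanfirst.objects.HFExample(
    text="I'd like to check my order status",
    id="convo-001-utt-0",
    created_at=datetime.datetime.now().isoformat(),
    intents=[],        # unlabelled data, so no intent references
    tags=[],
    metadata={"intent": "order_status", "confidence_score": "0.92"},
    context=context,
)
workspace.add_example(example)

# write the workspace out in HumanFirst unlabelled JSON format
with open("unlabelled.json", mode="w", encoding="utf8") as file_out:
    workspace.write_json(file_out)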

General usage

The example conversation used for this demonstration can be found here. It is in CSV format and has the following columns: convoid, conv_created_at, utterance, speaker, confidence_score, intent, fallback_indicator, escalation_indicator. We limit ourselves to these columns for demonstration purposes; you can have as many columns as you want and include them as metadata.
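For illustration, the first rows of a file in that shape might look like the following (values invented for this sketch, not taken from the actual example file):

convoid,conv_created_at,utterance,speaker,confidence_score,intent,fallback_indicator,escalation_indicator
convo-001,2023-01-05 10:01:22,I'd like to check my order status,client,0.92,order_status,0,0
convo-001,2023-01-05 10:01:40,"Sure, can I have your order number?",expert,,,0,0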

The following is the Python command for converting a CSV file containing conversations into the HumanFirst JSON format:

python csv_to_json_unlabelled.py
Options:
  -f, --filename TEXT        Input File Path [required]
  -m, --metadata_keys TEXT   <metadata_col_1,metadata_col_2,...,metadata_col_n>
  -u, --utterance_col TEXT   Column name containing utterances [required]
  -d, --delimiter TEXT       Delimiter for the csv file
  -c, --convo_id_col TEXT    If conversations, which column is the id;
                             otherwise utterances, and defaults to hash of
                             utterance_col
  -t, --created_at_col TEXT  If there is a created date for each utterance;
                             otherwise defaults to now
  -x, --unix_date            If created_at column is in unix epoch format
  -r, --role_col TEXT        Which column the role is in
  -p, --role_mapper TEXT     If role column, then role mapper in format
                             "source_client:client,source_expert:expert,*:expert"
  -e, --encoding TEXT        Input CSV encoding
  --filtering TEXT           column:value,column:value;column:value,column:value
  -h, --striphtml            Whether to strip html tags from the utterance col
  --help                     Show this message and exit.
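As an aside, if your role column used values like caller and agent rather than client and expert (hypothetical values and file path, shown only to illustrate the mapping format documented above), the -p/--role_mapper option would map them to HumanFirst's roles, with * as a catch-all:

python csv_to_json_unlabelled.py \
-f "./examples/your_file.csv" \
-u utterance \
-r speaker \
-p "caller:client,agent:expert,*:expert"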

Make sure your Python venv is set up, and then run the csv_to_json_unlabelled script.

pyenv activate venv
python csv_to_json_unlabelled.py \
-f "./examples/load_using_toolbox_demo_data.csv" \
-c convoid -t conv_created_at \
-r speaker \
-m "confidence_score,intent,fallback_indicator,escalation_indicator" \
-u utterance

This produces the conversations in HumanFirst JSON format.
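As a rough illustration of what the script writes, the unlabelled HumanFirst JSON contains an examples array in which each utterance carries its conversation id and speaker role in a context block, with the extra CSV columns carried as metadata. The field names below follow that shape but are a sketch, not the authoritative schema:

{
  "examples": [
    {
      "id": "convo-001-utt-0",
      "text": "I'd like to check my order status",
      "created_at": "2023-01-05T10:01:22",
      "context": {
        "context_id": "convo-001",
        "type": "conversation",
        "role": "client"
      },
      "metadata": {
        "confidence_score": "0.92",
        "intent": "order_status",
        "fallback_indicator": "0",
        "escalation_indicator": "0"
      }
    }
  ]
}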