Helper Packages and Scripts

Intro and Walkthrough of HumanFirst Package module

00:00: Introduction
00:22: PYPI HumanFirst Module
01:32: NLG and Object Helper Module
02:19: APIs Helper Module

Conversion of Conversations in CSV to HumanFirst JSON Using Academy Script

00:00: Introduction
00:22: Academy Repo
00:54: csv_to_json_unlabelled Script
01:24: Example Conversation CSV
01:41: Important Fields in the CSV
03:33: Specifying Metadata Columns
03:49: Python Environment
04:42: Command to Execute Script
05:23: Command Line Arguments
07:27: Script Execution
08:47: Resultant HumanFirst JSON

Acquiring and Running Loading Script Examples

We provide a humanfirst Python module, which represents the objects in the HumanFirst JSON format and wraps the HumanFirst APIs, along with example scripts that load data into HumanFirst in various ways (a minimal usage sketch follows the list below).

  • Example scripts are provided for converting popular formats into the HumanFirst JSON format with Python
  • They are kept in this repository: https://github.com/zia-ai/academy
  • The repository provides an ubuntu:focal based Dockerfile to run the scripts
  • The scripts should run fine in a Linux python3 environment
  • macOS users have reported being able to run the scripts directly, but we do not support this
  • We have made the scripts run on Windows for clients, but you will need to adjust the commands for path names, ENV variables and other Windows specifics
  • Using the Dockerfile and mounting your Windows directory is much easier
  • For instructions see the README; the licence is in the repository
  • If you are not using the academy Docker container, make sure to install the libraries required to run the academy scripts
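To give a flavour of the object helper, here is a minimal sketch that builds a one-utterance unlabelled workspace and writes it out as HumanFirst JSON. It assumes the humanfirst package installed from PyPI and HFWorkspace, HFContext and HFExample helpers in humanfirst.objects; treat the exact class names and signatures as assumptions and verify them against the module documentation.

# Minimal sketch, assuming `pip install humanfirst` and that
# humanfirst.objects exposes HFWorkspace, HFContext and HFExample
# with the signatures used below -- verify against the module docs.
import datetime
import humanfirst

workspace = humanfirst.objects.HFWorkspace()

# the context ties the utterance to a conversation and a speaker role
context = humanfirst.objects.HFContext(
    "convo-001",       # context id: groups utterances into one conversation
    "conversation",    # context type
    "client",          # role: "client" or "expert"
)
example = humanfirst.objects.HFExample(
    text="I'd like to check my order status",
    id="convo-001-utt-0",
    created_at=datetime.datetime.now().isoformat(),
    intents=[],        # unlabelled data, so no intent references
    tags=[],
    metadata={"intent": "order_status", "confidence_score": "0.92"},
    context=context,
)
workspace.add_example(example)

# write the workspace out in HumanFirst unlabelled JSON format
with open("unlabelled.json", mode="w", encoding="utf8") as file_out:
    workspace.write_json(file_out)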

General usage

The example conversation used for this demonstration can be found here. It is in CSV format and has the following columns: convoid, conv_created_at, utterance, speaker, confidence_score, intent, fallback_indicator, escalation_indicator. We limit ourselves to these columns for demonstration purposes; you can have as many columns as you want and include them as metadata.
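For illustration, the first rows of a file in that shape might look like the following (values invented for this sketch, not taken from the actual example file):

convoid,conv_created_at,utterance,speaker,confidence_score,intent,fallback_indicator,escalation_indicator
convo-001,2023-01-05 10:01:22,I'd like to check my order status,client,0.92,order_status,0,0
convo-001,2023-01-05 10:01:40,"Sure, can I have your order number?",expert,,,0,0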

The following is the Python command for converting a CSV file containing conversations into the HumanFirst JSON format:

python csv_to_json_unlabelled.py
Options:
  -f, --filename TEXT        Input File Path [required]
  -m, --metadata_keys TEXT   <metadata_col_1,metadata_col_2,...,metadata_col_n>
  -u, --utterance_col TEXT   Column name containing utterances [required]
  -d, --delimiter TEXT       Delimiter for the csv file
  -c, --convo_id_col TEXT    If conversations, which column is the id;
                             otherwise utterances, and defaults to hash of
                             utterance_col
  -t, --created_at_col TEXT  If there is a created date for each utterance;
                             otherwise defaults to now
  -x, --unix_date            If created_at column is in unix epoch format
  -r, --role_col TEXT        Which column the role is in
  -p, --role_mapper TEXT     If role column, then role mapper in format
                             "source_client:client,source_expert:expert,*:expert"
  -e, --encoding TEXT        Input CSV encoding
  --filtering TEXT           column:value,column:value;column:value,column:value
  -h, --striphtml            Whether to strip html tags from the utterance col
  --help                     Show this message and exit.
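As an aside, if your role column used values like caller and agent rather than client and expert (hypothetical values and file path, shown only to illustrate the mapping format documented above), the -p/--role_mapper option would map them to HumanFirst's roles, with * as a catch-all:

python csv_to_json_unlabelled.py \
-f "./examples/your_file.csv" \
-u utterance \
-r speaker \
-p "caller:client,agent:expert,*:expert"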

Make sure your Python venv is set up, and then run the csv_to_json_unlabelled script.

pyenv activate venv
python csv_to_json_unlabelled.py \
-f "./examples/load_using_toolbox_demo_data.csv" \
-c convoid -t conv_created_at \
-r speaker \
-m "confidence_score,intent,fallback_indicator,escalation_indicator" \
-u utterance

This produces the conversations in HumanFirst JSON format.
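As a rough illustration of what the script writes, the unlabelled HumanFirst JSON contains an examples array in which each utterance carries its conversation id and speaker role in a context block, with the extra CSV columns carried as metadata. The field names below follow that shape but are a sketch, not the authoritative schema:

{
  "examples": [
    {
      "id": "convo-001-utt-0",
      "text": "I'd like to check my order status",
      "created_at": "2023-01-05T10:01:22",
      "context": {
        "context_id": "convo-001",
        "type": "conversation",
        "role": "client"
      },
      "metadata": {
        "confidence_score": "0.92",
        "intent": "order_status",
        "fallback_indicator": "0",
        "escalation_indicator": "0"
      }
    }
  ]
}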