Skip to content

Commit

Permalink
README and organized notebooks + slides with markdown in section 2
Browse files Browse the repository at this point in the history
  • Loading branch information
Hobson Lane committed Nov 3, 2017
1 parent 55ef38c commit b21c53d
Show file tree
Hide file tree
Showing 11 changed files with 1,731 additions and 102 deletions.
41 changes: 39 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,39 @@
# prosocial-chatbot
Building a Prosocial Chatbot, Workshop, ODSC West November 2017 in San Francisco
# Training a Prosocial Chatbot

2017 ODSC West Workshop
San Francisco

Hobson Lane
*Head of Artificial Intelligence*
**Aira**, a Visual Interpreter for the Blind​
+1-503-974-6274 | [email protected]
[aira.io](aira.io) | [@airaio](twitter.com/airaio)

## Application

[![Aira, a visual interpreter for the blind](../data/aira_video_demo_blind_person_512.png)](https://vimeo.com/143070863)

## Slides

* [Training a Prosocial Chatbot](https://docs.google.com/presentation/d/1IND6PXOxgYb2IVmXnaSoNBcIOaiUVU-iZ6vnYz6J6-E/edit?usp=sharing)
* [Exploring Hyperspace](https://docs.google.com/presentation/d/1SEU8VL0KWPDKKZnBSaMxUBDDwI8yqIxu9RQtq2bpnNg/edit?usp=sharing)

## Ubuntu Dialog Corpus

* Ubuntu IRC channel (for developers and contributors)
* ~1.5 M Dialogs (interactions)
* ~10 M Utterances (statements)

## Application

[![Aira, a visual interpreter for the blind](../data/aira_video_demo_blind_person_512.png)](https://vimeo.com/143070863)

### References

* [Aira](http://aira.io), A Visual Interpreter for the Blind
* 2018, Lane, Howard and Hapke: [Natural Language Processing in Action](https://www.manning.com/books/natural-language-processing-in-action/?a_aid=totalgood)
* 2016, Lowe, et al: ["The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems"](https://arxiv.org/pdf/1506.08909.pdf)
* [training set generator v1](https://github.com/rkadlec/ubuntu-ranking-dataset-creator)
* Lowe, [training set generator v2](https://github.com/ryan-lowe/Ubuntu-Dialogue-Generationv2)
* Shmalko, [lexica and slang normalizers](https://github.com/rasendubi/noisy-text/tree/master/data

Binary file added data/aira_video_demo_blind_person_512.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,25 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from nlpia.data.loaders import get_data\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# df = get_data('ubuntu_dialog')\n",
"# df = pd.read_csv('/data/ubuntu-dialog/trainset.csv', header=0)\n",
"df = pd.read_csv('ubuntu_dialog.csv', header=0)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
Expand All @@ -28,16 +39,6 @@
"!python create_ubuntu_dataset.py --data_root . -o ubuntu_dialog.csv train -p 1 -e 2000000"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# df = pd.read_csv('/data/ubuntu-dialog/trainset.csv', header=0)\n",
"df = pd.read_csv('ubuntu_dialog.csv', header=0)"
]
},
{
"cell_type": "code",
"execution_count": 3,
Expand Down Expand Up @@ -267,38 +268,11 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": null,
"metadata": {},
"outputs": [
{
"ename": "KeyError",
"evalue": "'Label'",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m~/.virtualenvs/civicu/lib/python3.6/site-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36mget_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m 2441\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2442\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2443\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32mpandas/_libs/index.pyx\u001b[0m in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mpandas/_libs/index.pyx\u001b[0m in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mpandas/_libs/hashtable_class_helper.pxi\u001b[0m in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mpandas/_libs/hashtable_class_helper.pxi\u001b[0m in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mKeyError\u001b[0m: 'Label'",
"\nDuring handling of the above exception, another exception occurred:\n",
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-13-b7c92b54d87a>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mdel\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdf\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'Label'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mdf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto_csv\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'/home/hobs/Dropbox/data/ubuntu_dialog.csv.gz'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcompression\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'gzip'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/civicu/lib/python3.6/site-packages/pandas/core/generic.py\u001b[0m in \u001b[0;36m__delitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 1899\u001b[0m \u001b[0;31m# there was no match, this call should raise the appropriate\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1900\u001b[0m \u001b[0;31m# exception:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1901\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_data\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdelete\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1902\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1903\u001b[0m \u001b[0;31m# delete from the caches\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/civicu/lib/python3.6/site-packages/pandas/core/internals.py\u001b[0m in \u001b[0;36mdelete\u001b[0;34m(self, item)\u001b[0m\n\u001b[1;32m 3647\u001b[0m \u001b[0mDelete\u001b[0m \u001b[0mselected\u001b[0m \u001b[0mitem\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mitems\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mnon\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0munique\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32min\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0mplace\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3648\u001b[0m \"\"\"\n\u001b[0;32m-> 3649\u001b[0;31m \u001b[0mindexer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mitems\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mitem\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3650\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3651\u001b[0m \u001b[0mis_deleted\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mzeros\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbool_\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/.virtualenvs/civicu/lib/python3.6/site-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36mget_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m 2442\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2443\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2444\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_maybe_cast_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2445\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2446\u001b[0m \u001b[0mindexer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_indexer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mmethod\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtolerance\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtolerance\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32mpandas/_libs/index.pyx\u001b[0m in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mpandas/_libs/index.pyx\u001b[0m in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mpandas/_libs/hashtable_class_helper.pxi\u001b[0m in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mpandas/_libs/hashtable_class_helper.pxi\u001b[0m in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mKeyError\u001b[0m: 'Label'"
]
}
],
"outputs": [],
"source": [
"del(df['Label'])\n"
"del(df['Label'])"
]
},
{
Expand Down
Loading

0 comments on commit b21c53d

Please sign in to comment.