Updated notebook comments

setu4993 · setu4993 · commit dc335b4b2400 · 2018-08-21T20:53:22.000-04:00
diff --git a/.ipynb_checkpoints/predict_example-checkpoint.ipynb b/.ipynb_checkpoints/predict_example-checkpoint.ipynb
@@ -1,6 +1,269 @@
 {
- "cells": [],
- "metadata": {},
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Example for predicting using the RNN\n",
+    "\n",
+    "This is an example for running the RNN and get water demand predictions.\n",
+    "\n",
+    "Requirements:\n",
+    "- Python 3.6+\n",
+    "- NumPy 1.14+ (http://www.numpy.org/)\n",
+    "\n",
+    "To run this Notebook locally:\n",
+    "- Jupyter (https://jupyter.org/)\n",
+    "\n",
+    "___Note:___ This model is an online prediction model. The current version of code does provides a separate option for training the RNN. Refer to [training_example.ipynb](./training_example.ipynb) for training the RNN."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 1: Import external libraries and the `rnn` package"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "from rnn import *"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 2: Specify the location on disk of the data directory and the input CSV file (optional)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "DATA_LOCATION = os.getcwd()\n",
+    "INPUT_FILE = os.path.join(DATA_LOCATION, 'input_files', 'water_6_17_reduced_2_cap_rs.csv')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 3: To initialize the predictor RNN, create a `rnn_predict` object.\n",
+    "\n",
+    "Required parameters:\n",
+    "- `weights`: Location on disk of the weights file. Typically a Python serialized object (Pickle) file.\n",
+    "\n",
+    "Optional parameters:\n",
+    "- `alpha`: Learning rate for the network, typically a value in the range `(0, 1)`. Default value: `0.38`.\n",
+    "- `log`: Boolean value specifying if the execution should be logged to the console. Recommended if running from the command line or in an IDE. Default value: `False`.\n",
+    "- `h_bias`: Boolean value specifying if hidden layer should have a bias node. Currently trained models include a hidden layer bias. Default value: `True`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "predictor = rnn_predict(os.path.join(DATA_LOCATION, 'input_files', 'daily_rnn_retrained_16.pickle'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "At this point, the model is ready to make predictions.\n",
+    "\n",
+    "#### Step 4a: Give the network a list of inputs to make predictions\n",
+    "\n",
+    "Required parameters:\n",
+    "- `inputs`: A list of `[day_of_the_year, maximum_temperature, precipitation]` lists. These should be real world observed or predicted values.\n",
+    "\n",
+    "Optional parameters:\n",
+    "- `networks`: A list of the same length as `inputs` list with values of the range `[0, 11]`, specifying what network each prediction to run on, typically `num_month - 1`. If not specified, all predictions will be made for the month of January.\n",
+    "- `targets`: A list of the same length as `inputs` of the observed values for the day. The target values are required for updating the weights of the model.\n",
+    "- `recal`: Boolean value specifying if the network weights should be recalculated. Recommended if using observed demand as target, to improve future predictions. If set to `True`, requires `targets` to be specified. Default value: `False`.\n",
+    "\n",
+    "Output:\n",
+    "- `predictions`: A list of the same length as `inputs` of the predictions made by the network."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "scrolled": false
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "The prediction for inputs  [1, 71, 0]  for network  0  is 110.72\n",
+      "The prediction for inputs  [2, 101, 121]  for network  0  is 110.71\n"
+     ]
+    }
+   ],
+   "source": [
+    "inputs = [[1, 71, 0], [2, 101, 121]]\n",
+    "networks = [0, 0]\n",
+    "\n",
+    "predictions = predictor.predict(inputs, networks=networks)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 4b: Or give the network a list of inputs to make predictions for and targets to continue learning from\n",
+    "\n",
+    "___Note:___ Once retrained, the weights are updated in the object immediately and cannot be undone, unless restored from a previously stored weights file."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "The target for inputs  [1, 71, 0]  for network  0  was  106.53 and the prediction was 110.72, error observed was 3.93%. \n",
+      "The target for inputs  [2, 101, 121]  for network  0  was  110.72 and the prediction was 115.86, error observed was 4.64%. \n"
+     ]
+    }
+   ],
+   "source": [
+    "inputs = [[1, 71, 0], [2, 101, 121]]\n",
+    "networks = [0, 0]\n",
+    "targets = [106.53, 110.72]\n",
+    "\n",
+    "predictions = predictor.predict(inputs, networks=networks, targets=targets, recal=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The network is now retrained and has updated the weights.\n",
+    "\n",
+    "(Observe the difference in the second prediction because the weights were updated after predicting the first value.)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 4c: Or give the network a CSV file of inputs, targets and networks to make predictions for and to continue learning from\n",
+    "\n",
+    "___Note:___ Once retrained, the weights are updated in the object immediately and cannot be undone, unless restored from a previously stored weights file."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "The target for inputs  [1, 71, 0]  for network  0  was  106.53 and the prediction was 115.19, error observed was 8.13%. \n",
+      "The target for inputs  [2, 101, 121]  for network  0  was  114.81 and the prediction was 114.44, error observed was 0.32%. \n",
+      "The target for inputs  [3, 107, 116]  for network  0  was  111.63 and the prediction was 113.52, error observed was 1.70%. \n",
+      "The target for inputs  [4, 24, 81]  for network  0  was  111.85 and the prediction was 113.78, error observed was 1.72%. \n",
+      "The target for inputs  [5, -69, 39]  for network  0  was  111.87 and the prediction was 113.83, error observed was 1.75%. \n"
+     ]
+    }
+   ],
+   "source": [
+    "inputs, networks, targets = read_input_file(INPUT_FILE)\n",
+    "\n",
+    "predictions = predictor.predict(inputs[:5], networks=networks[:5], targets=targets[:5], recal=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 5: Save the predictions to a CSV file\n",
+    "\n",
+    "Required parameters:\n",
+    "- `targets`: As above.\n",
+    "- `predictions`: As above.\n",
+    "- `output_file`: Location on disk of the output file, with the name."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "write_output_to_file(targets, predictions, os.path.join(DATA_LOCATION, 'predictions_2017.csv'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 6: Store the weights for future use\n",
+    "\n",
+    "Store the network state for future re-use."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "predictor.save(os.path.join(DATA_LOCATION, 'daily_rnn_retrained_17.pickle'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Further documentation about other functions implemented is available within the source files. The shared variables and parameters are documented in `__init__.py`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.5"
+  }
+ },
  "nbformat": 4,
  "nbformat_minor": 2
 }
diff --git a/predict_example.ipynb b/predict_example.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Example for running the RNN\n",
+    "### Example for predicting using the RNN\n",
     "\n",
     "This is an example for running the RNN and get water demand predictions.\n",
     "\n",
@@ -15,7 +15,7 @@
     "To run this Notebook locally:\n",
     "- Jupyter (https://jupyter.org/)\n",
     "\n",
-    "___Note:___ Current model is an online prediction model. The current version of code does __not__ provide an option for training."
+    "___Note:___ This model is an online prediction model. The current version of code does provides a separate option for training the RNN. Refer to [training_example.ipynb](./training_example.ipynb) for training the RNN."
    ]
   },
   {
@@ -56,7 +56,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Step 3: To initialize the predictor RNN, create a `recurrent_neural_network` object.\n",
+    "#### Step 3: To initialize the predictor RNN, create a `rnn_predict` object.\n",
     "\n",
     "Required parameters:\n",
     "- `weights`: Location on disk of the weights file. Typically a Python serialized object (Pickle) file.\n",
@@ -88,7 +88,7 @@
     "- `inputs`: A list of `[day_of_the_year, maximum_temperature, precipitation]` lists. These should be real world observed or predicted values.\n",
     "\n",
     "Optional parameters:\n",
-    "- `networks`: A list of the same length as `inputs` list with values of the range `[0, 1]`, specifying what network each prediction to run on, typically `num_month - 1`. If not specified, all predictions will be made for the month of January.\n",
+    "- `networks`: A list of the same length as `inputs` list with values of the range `[0, 11]`, specifying what network each prediction to run on, typically `num_month - 1`. If not specified, all predictions will be made for the month of January.\n",
     "- `targets`: A list of the same length as `inputs` of the observed values for the day. The target values are required for updating the weights of the model.\n",
     "- `recal`: Boolean value specifying if the network weights should be recalculated. Recommended if using observed demand as target, to improve future predictions. If set to `True`, requires `targets` to be specified. Default value: `False`.\n",
     "\n",
@@ -123,7 +123,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Step 4b: Or give the network a list of inputs to make predictions for and targets to learn from\n",
+    "#### Step 4b: Or give the network a list of inputs to make predictions for and targets to continue learning from\n",
     "\n",
     "___Note:___ Once retrained, the weights are updated in the object immediately and cannot be undone, unless restored from a previously stored weights file."
    ]
@@ -159,6 +159,38 @@
     "(Observe the difference in the second prediction because the weights were updated after predicting the first value.)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Step 4c: Or give the network a CSV file of inputs, targets and networks to make predictions for and to continue learning from\n",
+    "\n",
+    "___Note:___ Once retrained, the weights are updated in the object immediately and cannot be undone, unless restored from a previously stored weights file."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "The target for inputs  [1, 71, 0]  for network  0  was  106.53 and the prediction was 115.19, error observed was 8.13%. \n",
+      "The target for inputs  [2, 101, 121]  for network  0  was  114.81 and the prediction was 114.44, error observed was 0.32%. \n",
+      "The target for inputs  [3, 107, 116]  for network  0  was  111.63 and the prediction was 113.52, error observed was 1.70%. \n",
+      "The target for inputs  [4, 24, 81]  for network  0  was  111.85 and the prediction was 113.78, error observed was 1.72%. \n",
+      "The target for inputs  [5, -69, 39]  for network  0  was  111.87 and the prediction was 113.83, error observed was 1.75%. \n"
+     ]
+    }
+   ],
+   "source": [
+    "inputs, networks, targets = read_input_file(INPUT_FILE)\n",
+    "\n",
+    "predictions = predictor.predict(inputs[:5], networks=networks[:5], targets=targets[:5], recal=True)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -173,7 +205,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -191,7 +223,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [],
    "source": [