Initial commit

sharma-n · May 10, 2020 · commit 570497a

Showing 35 changed files with 3,085 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,4 @@
+.ipynb_checkpoints/*
+**/__pycache__/*
+8. Tabu Search.ipynb
+**/*.mp4
diff --git a/1. Greedy Search.ipynb b/1. Greedy Search.ipynb
diff --git a/2. Simulated Annealing.ipynb b/2. Simulated Annealing.ipynb
diff --git a/3. Genetic Algorithms Theory.ipynb b/3. Genetic Algorithms Theory.ipynb
diff --git a/4. Genetic Algorithms Examples.ipynb b/4. Genetic Algorithms Examples.ipynb
diff --git a/5. Theoretical Analysis of Heuristic Algorithms.ipynb b/5. Theoretical Analysis of Heuristic Algorithms.ipynb
@@ -0,0 +1,169 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Theoretical Analysis of Heuristic Algorithms\n",
+    "We would like to understand the convergence behavior of heuristic algorithms. These algorithms are usually stochastic and therefore require a probabilistic framework for analysis. For the case of discrete decision variables, Markov chain analysis can be used.\n",
+    "\n",
+    "## I. Markov Chains\n",
+    "Let $S(t)$ be a mapping from outcomes of a random experiment to a function of the (discrete) iteration number $t$. For a fixed $t_k$, $S(t_k)$ is a random variable whose outcomes are countable/discrete. A **Markov Process** is one where the future of the process depends only on the present and is independent of the past. Hence,\n",
+    "$$\\mathbb{P}(S^{t+1}=i|S^t=k)\\text{ does not depend on }S^{t-1}, S^{t-2}, \\ldots$$\n",
+    "\n",
+    "We define $p_{ij}$ as the probability, at any time $t$, of going from state $i$ to state $j$. For the case of a random walk (equal chance of moving to every neighbor):\n",
+    "$$p_{ij} = \\begin{cases}\n",
+    "\\frac{1}{|N(S_i)|}\\quad\\text{, if }S_j\\in N(S_i) \\\\\n",
+    "0\\quad\\text{, otherwise}\n",
+    "\\end{cases}$$\n",
+    "\n",
+    "### Configuration Graph\n",
+    "* The configuration graph in the $t^{th}$ iteration is $C_t = (\\Omega, E)$, where $\\Omega$ is the set of legal configurations and $E$ is the set of all edges. A “configuration”, also called a “state”, is a specific value of the decision vector.\n",
+    "* An edge between two states indicates that they are neighbors. The edges are defined as $E = \\{(S,Z)|S\\in \\Omega, Z\\in\\Omega, Z\\in N(S)\\}$\n",
+    "* A state $S$ is called a “**local minimum**” if $Cost(S)\\leq Cost(T)\\quad\\forall T \\in N(S)$, where $N(S)$ is the neighborhood of $S$, defined as $N(S) = \\{Z\\in\\Omega|(S,Z)\\in E\\}$ for all $S\\in\\Omega$.\n",
+    "* If $Cost(S)\\leq Cost(Z),\\quad \\forall Z\\in\\Omega$, then $S$ is a \"**global minimum**\".\n",
+    "\n",
+    "The drawback is that, for reasonably sized problems, the configuration graph is too large to be useful for computation.\n",
+    "\n",
+    "### Defining and Computing $\\pi(t)$\n",
+    "Let $\\pi(j,t)$ be the probability of being in state $j$ in iteration $t$. With $N$ the number of states,\n",
+    "$$\\pi(t) = (\\pi(1,t), \\pi(2,t), \\ldots, \\pi(N,t))$$\n",
+    "We have $\\sum_{i=1}^N \\pi(i,t) = 1$. We would like to define an $(N\\times N)$ matrix $[P]$ such that $\\pi(t+1) = \\pi(t) \\times [P]$. Such a matrix $P$ is called the **perturbation probability matrix**. Componentwise, we therefore have:\n",
+    "$$\\pi(i,t+1) = \\sum_{j=1}^N \\pi(j,t)\\cdot p_{ji}$$\n",
+    "where $p_{ji}$ is the entry in the $j$th row and $i$th column of $P$, corresponding to the transition from state $j$ to state $i$.\n",
+    "\n",
+    "For **Ergodic Markov Chains**, there is a unique stationary distribution $\\pi^*$ which is a solution to the equation:\n",
+    "$$\\pi^* = \\pi^* \\cdot P$$\n",
+    "where $\\pi^* = \\lim_{t\\rightarrow\\infty} \\pi(t)$. We have\n",
+    "$$\\pi(t+1) = \\pi(t)\\cdot P = \\pi(0)\\cdot P^{t+1} \\Rightarrow \\pi^* = \\lim_{t\\rightarrow\\infty}\\pi(0)\\cdot P^t = \\pi(0)\\cdot \\lim_{t\\rightarrow\\infty} P^t$$\n",
+    "\n",
+    "A stationary distribution (a vector of dimension $N$) satisfies $\\pi^* = \\pi^* \\cdot P$.\n",
+    "\n",
+    "### Ergodic Markov Chains\n",
+    "A Markov Chain is **ergodic** if and only if it is:\n",
+    "1. irreducible, that is all states are reachable from all other states.\n",
+    "2. aperiodic, that is for each state, the probability of returning to that state is positive for all sufficiently large numbers of steps (the greatest common divisor of its possible return times is 1).\n",
+    "3. recurrent, that is for each state of the chain, the probability of returning to that state at some time in the future is equal to one.\n",
+    "4. non-null, that is the expected number of steps to return to a state is finite.\n",
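+    "\n",
+    "To illustrate why aperiodicity is needed, consider a hypothetical two-state chain (not taken from the examples in this notebook) that always switches state. It is irreducible and has the stationary distribution $(\\frac{1}{2}, \\frac{1}{2})$, but started from $\\pi(0) = (1, 0)$ the distribution oscillates, so $\\lim_{t\\rightarrow\\infty}\\pi(t)$ does not exist:\n",
+    "$$P = \\begin{bmatrix} 0 & 1 \\\\ 1 & 0 \\end{bmatrix},\\quad \\pi(0) = (1, 0),\\quad \\pi(1) = (0, 1),\\quad \\pi(2) = (1, 0),\\quad \\ldots$$\n",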
+    "\n",
+    "## II. Analysis of Random and Greedy Search\n",
+    "With algorithms like greedy search or simulated annealing, a proposed change must also be accepted (in greedy search, for example, a move is accepted only if it does not increase the cost). We call $A_{jk}$ the acceptance probability and define the transition probability as $\\theta_{jk} = p_{jk}\\cdot A_{jk}$. $\\theta_{jk}$, an entry of the matrix $\\Theta$, is the probability of going from $j$ to $k$: state $k$ must be picked as the neighbor and the move must then be accepted. Since these are independent events, the probability is the product of the individual probabilities.\n",
+    "\n",
+    "### Random Search / Walk\n",
+    "For random search,\n",
+    "$$\\theta_{ij} = p_{ij}\\cdot A_{ij} = p_{ij}$$\n",
+    "since $\\forall i,j, A_{ij}=1$ (\"always accept\"). For a **random walk**, the transition matrix has equal probability over the neighborhood of $i$. For example, with 5 states arranged in a ring, where the neighborhood of a state is the state itself and the states 1 step away on either side,\n",
+    "$$\\Theta = P = \\begin{bmatrix}\n",
+    " \\frac{1}{3}& \\frac{1}{3} & 0 & 0 & \\frac{1}{3}\\\\ \n",
+    " \\frac{1}{3}& \\frac{1}{3} &  \\frac{1}{3}& 0 & 0\\\\ \n",
+    " 0& \\frac{1}{3} & \\frac{1}{3} &  \\frac{1}{3}& 0\\\\ \n",
+    " 0& 0 & \\frac{1}{3} & \\frac{1}{3} & \\frac{1}{3}\\\\ \n",
+    "\\frac{1}{3} & 0 & 0 & \\frac{1}{3} & \\frac{1}{3}\n",
+    "\\end{bmatrix}$$\n",
+    "\n",
+    "Solving the equation $\\pi = \\pi \\cdot \\Theta$ using the $\\Theta$ above gives us $\\pi = [1/5, 1/5, 1/5, 1/5, 1/5]$.\n",
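+    "\n",
+    "As a quick check of this result (a worked step added for illustration), note that every column of $\\Theta$ above sums to 1, so for the uniform vector each component of $\\pi\\cdot\\Theta$ is\n",
+    "$$(\\pi\\cdot\\Theta)_i = \\sum_{j=1}^5 \\frac{1}{5}\\,\\theta_{ji} = \\frac{1}{5}\\left(\\frac{1}{3}+\\frac{1}{3}+\\frac{1}{3}\\right) = \\frac{1}{5} = \\pi_i$$\n",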
+    "\n",
+    "### Greedy Search\n",
+    "For greedy search, the acceptance probability is defined as\n",
+    "$$A_{ij} = \\begin{cases} 1\\quad\\text{, if }\\Delta Cost_{ij}\\leq 0\\\\\n",
+    "0\\quad\\text{, otherwise}\n",
+    "\\end{cases}$$\n",
+    "For a simple case like below, where $Cost(S) = S$:\n",
+    "![Simple Markov Process](simpleMC.png)\n",
+    "The transition matrix $\\Theta$ for the above system will be:\n",
+    "$$\\Theta = \\begin{bmatrix}\n",
+    "1 & 0 \\\\\n",
+    "\\frac{1}{2} & \\frac{1}{2}\n",
+    "\\end{bmatrix}$$\n",
+    "Therefore, if initially $\\pi(0) = [1, 0]$ (in state 1), then $\\forall t, \\pi(t) = [1, 0]$. If initially $\\pi(0) = [0, 1]$ (in state 2), then $\\pi(1) = [\\frac{1}{2}, \\frac{1}{2}]$, $\\pi(2) = [\\frac{3}{4}, \\frac{1}{4}]$, and $\\lim_{t\\rightarrow\\infty} \\pi(t) = [1, 0]$.\n",
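+    "\n",
+    "The pattern can be written in closed form. As a sketch, taking the $2\\times 2$ matrix above as the one-step transition matrix $\\Theta$ and starting from $\\pi(0) = [0, 1]$:\n",
+    "$$\\Theta^t = \\begin{bmatrix} 1 & 0 \\\\ 1-2^{-t} & 2^{-t} \\end{bmatrix} \\Rightarrow \\pi(t) = \\pi(0)\\cdot\\Theta^t = [1-2^{-t},\\ 2^{-t}] \\rightarrow [1, 0] \\text{ as } t\\rightarrow\\infty$$\n",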
+    "\n",
+    "## III. Analysis of Simulated Annealing\n",
+    "For simulated annealing (SA), the probability of accepting an uphill move depends on the value of the \"temperature\" $T$. Therefore,\n",
+    "$$p_{ij} = \\begin{cases}\n",
+    "\\frac{1}{|N(S_i)|}\\quad\\text{, if }S_j\\in N(S_i)\\\\\n",
+    "0\\quad\\text{, otherwise}\n",
+    "\\end{cases}$$\n",
+    "$$A_{ij} = \\begin{cases} f(Cost_i, Cost_j, T)\\quad\\text{, if }\\Delta Cost_{ij}>0\\\\\n",
+    "1\\quad\\text{, if }\\Delta Cost_{ij}\\leq 0\n",
+    "\\end{cases}$$\n",
+    "From these, we get the transition probabilities $\\theta_{ij}$ (the entries of $\\Theta$) as:\n",
+    "$$\\theta_{ij}(T) = \\begin{cases}\n",
+    "A_{ij}(T)\\cdot p_{ij}\\quad\\text{, if } i\\neq j\\\\\n",
+    "1-\\sum_{k\\neq i}A_{ik}(T)\\cdot p_{ik}\\quad\\text{, if } i=j\n",
+    "\\end{cases}$$\n",
+    "The second case above is the probability of not moving ($\\theta_{ii}$). For simulated annealing, we can have 4 different possibilities:\n",
+    "1. Downhill move\n",
+    "2. Uphill move accepted: probability of accepting uphill move from $i$ to $j$ is $\\exp{(\\frac{-\\Delta Cost_{ij}}{T})}$\n",
+    "3. No move\n",
+    "4. No edge so no possibility to move from $i$ to $j$\n",
+    "\n",
+    "Corresponding to each possibility we have:\n",
+    "$$\\theta_{ij} = \\begin{cases}\n",
+    "\\frac{1}{|N(S_i)|}\\quad\\text{, if } \\Delta Cost_{ij}\\leq 0, S_j\\in N(S_i), i\\neq j\\\\\n",
+    "\\frac{1}{|N(S_i)|}\\cdot \\exp{(\\frac{-\\Delta Cost_{ij}}{T})}\\quad\\text{, if } \\Delta Cost_{ij}> 0, S_j\\in N(S_i)\\\\\n",
+    "1-\\sum_{k, k\\neq i}p_{ik}\\cdot A_{ik}(T)\\quad\\text{, if } i=j\\\\\n",
+    "0\\quad\\text{, if }  S_j\\notin N(S_i), i\\neq j\n",
+    "\\end{cases}$$\n",
+    "\n",
+    "Theoretically, it can be shown that in the limit as the number of iterations goes to infinity, with the temperature $T$ held constant, the probability of being in state $j$ is\n",
+    "$$\\pi_j(T) = \\pi_0(T)\\exp{(-\\frac{\\Delta Cost_{i_0j}}{T})}$$\n",
+    "where $S_{i_0}$ is the optimal configuration, $Cost_{i_0}$ is the cost of the optimal configuration and $\\pi_0(T)$ is a normalization factor given by:\n",
+    "$$\\pi_0(T) = \\frac{1}{\\sum_{k=1}^N \\exp{(-\\frac{\\Delta Cost_{i_0k}}{T})}}$$\n",
+    "Here $\\Delta Cost_{i_0j} = Cost_j - Cost_{i_0}$ is the difference between the cost at $j$ and the best solution cost $Cost_{i_0}$.\n",
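+    "\n",
+    "As a hedged sanity check (not part of the original derivation), assume the neighborhoods are symmetric and all have the same size $|N|$, and take an uphill pair with $Cost_j > Cost_i$. Under those assumptions a distribution of the Boltzmann form $\\pi_j(T) = \\pi_0(T)\\, e^{-(Cost_j - Cost_{i_0})/T}$ satisfies detailed balance, $\\pi_i\\theta_{ij} = \\pi_j\\theta_{ji}$, which is sufficient for stationarity:\n",
+    "$$\\pi_i(T)\\,\\theta_{ij}(T) = \\pi_0(T)\\, e^{-\\frac{Cost_i - Cost_{i_0}}{T}}\\cdot\\frac{1}{|N|}\\, e^{-\\frac{Cost_j - Cost_i}{T}} = \\pi_0(T)\\, e^{-\\frac{Cost_j - Cost_{i_0}}{T}}\\cdot\\frac{1}{|N|} = \\pi_j(T)\\,\\theta_{ji}(T)$$\n",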
+    "\n",
+    "Furthermore, we can capture the **essence of simulated annealing** by observing\n",
+    "$$\\lim_{T\\rightarrow 0, L\\rightarrow\\infty}(\\epsilon_i \\Theta^L(T))_j = \\lim_{T\\rightarrow 0}\\pi_j(T)\\\\\n",
+    "=\\begin{cases}\\frac{1}{|\\Omega_0|}\\quad\\text{, if }S_j\\in\\Omega_0 \\\\\n",
+    "0\\quad\\text{, if }S_j\\notin \\Omega_0\n",
+    "\\end{cases}$$\n",
+    "where $L$ is the number of iterations performed at temperature $T$, $\\epsilon_i$ is the initial distribution concentrated on the starting state $i$, and $\\Omega_0$ is the set of optimal configurations, i.e., $\\Omega_0 = \\{S_i\\in\\Omega | Cost_i = Cost_{i_0}\\}$. Thus if SA is given unlimited time, the algorithm will reach each of the optimal configurations with equal probability (a uniform distribution over $\\Omega_0$), and the probability of ending in a suboptimal configuration is zero.\n",
+    "\n",
+    "### Double Limit Analysis for SA\n",
+    "To find the solution that the SA search eventually reaches, we do a double limit analysis:\n",
+    "1. Holding $T$ constant, let $t$ (no. of iterations) go in the limit to infinity. This is denoted as $\\pi^*(T) = \\lim_{t\\rightarrow\\infty}\\pi(t|T)$.\n",
+    "2. Given the limit #1 above, let $T$ go in the limit to 0, so the process will converge to $\\lim_{T\\rightarrow 0}\\pi^*(T) = \\lim_{T\\rightarrow 0}\\lim_{t\\rightarrow\\infty}\\pi(t|T)$\n",
+    "\n",
+    "As an example problem, take the 4-state Markov chain below, with the transition probabilities already calculated, assuming $Cost(S) = S$.\n",
+    "![Example problem for SA analysis](exampleSA.png)\n",
+    "\n",
+    "If we solve $\\pi = \\pi\\Theta$ (using MATLAB) we get\n",
+    "$$\\pi_i = \\frac{\\exp{(-\\frac{Cost_i}{T})}}{\\sum_{j=1}^4 \\exp{(-\\frac{Cost_j}{T})}}$$\n",
+    "Since here $Cost_i = i$, we have $\\pi = (\\frac{\\exp{(-\\frac{1}{T})}}{N_0(T)}, \\frac{\\exp{(-\\frac{2}{T})}}{N_0(T)}, \\frac{\\exp{(-\\frac{3}{T})}}{N_0(T)},\\frac{\\exp{(-\\frac{4}{T})}}{N_0(T)})$ where $N_0(T) = \\sum_{i=1}^4 \\exp{(-\\frac{i}{T})}$.\n",
+    "\n",
+    "We can re-write $N_0(T) = e^{-\\frac{1}{T}}\\cdot (1+e^{-\\frac{1}{T}}+e^{-\\frac{2}{T}}+e^{-\\frac{3}{T}}) = e^{-\\frac{1}{T}}\\cdot R(T)$\n",
+    "$$\\Rightarrow \\pi = (\\frac{1}{R(T)}, \\frac{e^{-\\frac{1}{T}}}{R(T)}, \\frac{e^{-\\frac{2}{T}}}{R(T)}, \\frac{e^{-\\frac{3}{T}}}{R(T)})$$\n",
+    "$$\\text{But, }\\lim_{T\\rightarrow 0}e^{-\\frac{i}{T}} = 0\\text{ for } i>0,\\quad \\lim_{T\\rightarrow 0}R(T) = 1$$\n",
+    "$$\\Rightarrow \\lim_{T\\rightarrow 0}\\pi = (1,0,0,0)$$\n",
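+    "\n",
+    "To make the limit concrete, here is an illustrative numerical evaluation of the expression above at two arbitrarily chosen temperatures (values rounded to four decimals):\n",
+    "$$T = 1:\\ R \\approx 1.5530,\\ \\pi \\approx (0.6439, 0.2369, 0.0871, 0.0321) \\qquad T = 0.25:\\ R \\approx 1.0187,\\ \\pi \\approx (0.9817, 0.0180, 0.0003, 0.0000)$$\n",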
+    "\n",
+    "Therefore, for the example problem, the probability of being in the best state is 1 in the limit. Hence these convergence results are consistent with the general results."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}