This repository contains a simple example of black-box function approximation using reinforcement learning. The technique is based on the Deep Deterministic Policy Gradient (DDPG) algorithm, but the algorithm can be simplified for this application because there are no future states: each decision is evaluated on its own, so the critic only has to predict the immediate reward. The final solution is a simple actor-critic algorithm that approximates a continuous black-box function with a neural network.
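For reference, the simplification can be written out as follows. This is a sketch in standard DDPG notation, none of which is defined in the text itself: $s$ is the state, $a$ the action, $r$ the reward, $\gamma$ the discount factor, $Q_\phi$ the critic, $\mu_\theta$ the actor, and primed parameters belong to the target networks.

$$
\begin{aligned}
\text{DDPG critic target:}\quad & y = r + \gamma\, Q_{\phi'}\!\big(s', \mu_{\theta'}(s')\big) \\
\text{no future state:}\quad & y = r \\
\text{critic loss:}\quad & L(\phi) = \mathbb{E}\big[\big(Q_\phi(s, a) - r\big)^2\big] \\
\text{actor gradient:}\quad & \nabla_\theta J = \mathbb{E}\big[\nabla_a Q_\phi(s, a)\big|_{a = \mu_\theta(s)}\, \nabla_\theta \mu_\theta(s)\big]
\end{aligned}
$$

With the bootstrapped term gone, the critic becomes a plain regression on observed rewards, and the actor is still updated by following the critic's gradient with respect to the action.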
A car has a leaking fuel tank and is driving at a constant speed v toward the mechanic's shop. The air resistance is proportional to the cube of the velocity, and the leakage rate is constant. Given the air resistance and the leakage rate, find the speed that minimizes fuel consumption and therefore maximizes the chance of reaching the mechanic before the car stops.
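One plausible way to make this precise is sketched below. It rests on assumptions that the problem statement does not spell out: the fuel burned per unit time to overcome air resistance is taken to be $k v^3$, with $k$ a hypothetical air-resistance coefficient, and the leak loses fuel at a constant rate $l$ per unit time. Covering a distance $d$ at speed $v$ takes time $d / v$, so the total fuel used is

$$
F(v) = \big(k v^3 + l\big)\,\frac{d}{v} = d\left(k v^2 + \frac{l}{v}\right).
$$

Setting the derivative to zero gives

$$
\frac{dF}{dv} = d\left(2 k v - \frac{l}{v^2}\right) = 0
\quad\Longrightarrow\quad
v^{*} = \left(\frac{l}{2k}\right)^{1/3},
$$

and since $\frac{d^2F}{dv^2} = d\left(2k + \frac{2l}{v^3}\right) > 0$ for $v > 0$, this is a minimum. Under these assumptions the optimal speed does not depend on the distance $d$.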
The function that describes fuel consumption is treated as a black box that takes the air resistance and the leakage rate as inputs. We train a neural network to approximate this function using a simple actor-critic algorithm and compare the result to the analytical solution of the problem. The code is based on TensorFlow.
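The idea can be illustrated with a minimal, self-contained sketch. It is not the repository's code and makes several assumptions, all labeled in the comments: the black box is taken to be fuel per unit distance, `k * v**2 + l / v`, with `k` a hypothetical air-resistance coefficient and `l` the leak rate; NumPy stands in for TensorFlow; and a linear actor with a quadratic-feature critic stands in for the neural networks, working in log space where the optimum happens to be linear in the inputs. All function names are hypothetical. The critic is fitted to the immediate cost only (no next-state term), and the actor follows the critic's gradient with respect to the action.

```python
import numpy as np

def fuel_per_distance(k, l, v):
    """Black box (assumed form): fuel used per unit distance at speed v."""
    return k * v**2 + l / v

def analytic_speed(k, l):
    """Closed-form minimiser of k*v**2 + l/v, used only for comparison."""
    return (l / (2.0 * k)) ** (1.0 / 3.0)

def critic_features(s, a):
    """Quadratic features of the state s (N, 2) and the action a (N,)."""
    s1, s2 = s[:, 0], s[:, 1]
    return np.stack([np.ones_like(a), s1, s2, a,
                     s1 * s1, s2 * s2, a * a,
                     s1 * s2, s1 * a, s2 * a], axis=1)

def train(iterations=300, batch=512, lr=0.3, noise=0.1, seed=0):
    """One-step actor-critic: state s = (log k, log l), action a = log v."""
    rng = np.random.default_rng(seed)
    W, b = np.zeros(2), 0.0  # linear actor: a = W . s + b
    for _ in range(iterations):
        s = rng.uniform(-1.0, 1.0, size=(batch, 2))
        mu = s @ W + b                               # actor's action
        a = mu + noise * rng.standard_normal(batch)  # exploration noise
        # Query the black box; the log keeps the cost well scaled.
        cost = np.log(fuel_per_distance(np.exp(s[:, 0]),
                                        np.exp(s[:, 1]),
                                        np.exp(a)))
        # Critic: least-squares regression on the immediate cost,
        # with no bootstrapped next-state value.
        theta, *_ = np.linalg.lstsq(critic_features(s, a), cost, rcond=None)
        # Gradient of the critic with respect to the action, at a = mu.
        dq_da = (theta[3] + 2.0 * theta[6] * mu
                 + theta[8] * s[:, 0] + theta[9] * s[:, 1])
        # Actor: gradient descent on the critic's predicted cost.
        W -= lr * (dq_da[:, None] * s).mean(axis=0)
        b -= lr * dq_da.mean()
    return W, b

def actor_speed(W, b, k, l):
    """Speed proposed by the trained actor for given k and l."""
    return float(np.exp(W @ np.log([k, l]) + b))

if __name__ == "__main__":
    W, b = train()
    k, l = 0.5, 2.0
    print("actor speed:   ", actor_speed(W, b, k, l))
    print("analytic speed:", analytic_speed(k, l))
```

The actor never sees the formula inside `fuel_per_distance`; it only learns from the costs returned for the actions it tried. The exploration noise slightly smooths the critic's gradient, so the learned speed should be expected to match the closed-form value only approximately.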