TODO: Build a better test harness that runs every benchmark for each language, accumulates the runtime metrics, and averages them per language across all benchmarks.
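
A minimal Python sketch of what such a harness might look like. Everything here is an assumption made for illustration, not part of the existing project: the helper names `time_command` and `average_by_language`, the `(language, benchmark, seconds)` tuple layout, and the choice of wall-clock time as the only metric.

```python
import statistics
import subprocess
import time
from collections import defaultdict


def time_command(cmd, repeats=3):
    """Run cmd (a list of argv strings) `repeats` times.

    Hypothetical helper: returns the wall-clock seconds of each run.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the benchmark exits with a non-zero status.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def average_by_language(results):
    """Average runtimes per language.

    `results` is an iterable of (language, benchmark, seconds) tuples
    (an assumed layout). Repeated runs are averaged within each benchmark
    first, then the per-benchmark means are averaged, so every benchmark
    carries equal weight regardless of how many times it was run.
    """
    per_benchmark = defaultdict(lambda: defaultdict(list))
    for language, benchmark, seconds in results:
        per_benchmark[language][benchmark].append(seconds)
    return {
        language: statistics.mean(
            statistics.mean(runs) for runs in benchmarks.values()
        )
        for language, benchmarks in per_benchmark.items()
    }
```

A driver would call `time_command` for each language/benchmark pair, collect the tuples, and pass them to `average_by_language`. Note that a plain arithmetic mean is dominated by the longest-running benchmarks; normalizing each benchmark against a baseline language and taking a geometric mean is a common alternative if that skew matters.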