sampling

Tony Van Eerd edited this page Dec 4, 2019 · 2 revisions

Source: sampling.h

Similar to C++17's std::sample(), but I don't have C++17 yet, and I like the out() function better than an output iterator.
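For comparison, here is roughly what the C++17 route looks like when the sampled items also need transforming. This is only a sketch: sample_abs_cpp17 is a made-up name for illustration and is not part of sampling.h.

```cpp
#include <algorithm>
#include <cmath>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical helper, for comparison only; requires C++17 for std::sample.
std::vector<float> sample_abs_cpp17(std::vector<float> const & in, int count, std::mt19937 & gen)
{
    std::vector<float> picked;
    // std::sample writes the chosen items through an output iterator
    std::sample(in.begin(), in.end(), std::back_inserter(picked), count, gen);
    // the transform then needs its own pass over the sample
    std::transform(picked.begin(), picked.end(), picked.begin(),
                   [](float val) { return std::abs(val); });
    return picked;
}
```

With an out() callback, the push_back and the transform can happen in the same call, so there is no second pass.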

The underlying algorithm is stable_sample(), which is similar to Reservoir Sampling (https://en.wikipedia.org/wiki/Reservoir_sampling).
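To give a feel for what an order-preserving sampler with this kind of interface can look like, here is a minimal sketch. It is not the code in sampling.h: it uses selection sampling (Knuth's Algorithm S) rather than reservoir sampling, it assumes forward iterators so the range can be measured up front, and the name stable_sample_sketch is made up for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <random>

// Hypothetical sketch, not the implementation in sampling.h.
// Selection sampling: visit the items in order and keep each one with
// probability (items still needed) / (items still remaining).
// Kept items come out in their original relative order.
template<typename FwdIt, typename URNG, typename Out>
void stable_sample_sketch(FwdIt beg, FwdIt end, std::ptrdiff_t count, URNG & urng, Out out)
{
    std::ptrdiff_t remaining = std::distance(beg, end);            // items not yet visited
    std::ptrdiff_t needed = count < remaining ? count : remaining; // items still to pick
    for (; needed > 0; ++beg, --remaining) {
        // draw from [0, remaining - 1]; keep the item if the draw is below needed
        std::uniform_int_distribution<std::ptrdiff_t> dist(0, remaining - 1);
        if (dist(urng) < needed) {
            out(*beg);
            --needed;
        }
    }
}
```

If count is at least the size of the range, every item is passed to out(), in order.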

The following is one of the functions in sampling.h; it also shows how to use the generic stable_sample(beg, end, count, urng, out) function.

#include <random>
#include <vector>

// sample count items from a vector
// return the sample in a new vector
// each item sampled is also passed through the transform function,
// and the result of the transform is what is put in the vector
// (we found that we often transformed after sampling, so why not do both in one pass)
template<typename T, typename Transform>
std::vector<T> sample(std::vector<T> const & in, int count, Transform const & trans)
{
    std::vector<T> out;   // the result vector
    out.reserve(count);
    std::random_device rd;
    std::mt19937 gen(rd());
    // the out() callback transforms each sampled item and appends it
    stable_sample(in.begin(), in.end(), count, gen,
           [&trans, &out](T const & elem) { out.push_back(trans(elem)); });
    return out;
}

And then how to use the nice and easy version:

float estimateThreshold(std::vector<float> const & errors)
{
    const int maxSamples = 30000;
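    // std::abs (from <cmath>) picks the float overload; plain abs() may resolve to the int version
    std::vector<float> absErrs = sample(errors, maxSamples, [](float val){return std::abs(val);});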
      
    ... do stuff with sample ...
}