Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combinations avoiding patterns #6

Open
alandefreitas opened this issue Jul 23, 2019 · 3 comments
Open

Combinations avoiding patterns #6

alandefreitas opened this issue Jul 23, 2019 · 3 comments

Comments

@alandefreitas
Copy link

alandefreitas commented Jul 23, 2019

Thank you for the great library.

Given your experiment, is there (or is it possible to implement) any algorithm or container interface that enumerates all combinations in a pseudo-random order (or at least avoiding patterns) without resampling?

Of course, apart from the impracticable alternative of keeping all possible combinations in memory and shuffling them.

This can be very useful for hypothesis testing and generate-and-test algorithms.

For instance, if we had these combinations:

$c_1$ $c_2$ $c_3$
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2

I'm wondering if there would be a container interface that would return these combinations in random order (or at least an order that avoids patterns) but still being a container interface (without keeping all combinations in memory and shuffling a list of indexes).

In statistics, we could use that for randomized experiment control. In generate-and-test heuristics, this would be a systematic way of enumerating the search space without getting stuck in a region of the objective function.

Thank you.

@mraggi
Copy link
Owner

mraggi commented Jul 24, 2019

Well, I don't know of a fast method, but combinations right now supports indexing. That is, you can ask it for the i-th combination, so if you simply shuffle the integers {0,1,...,binomial(n,k)-1} and then do

auto X = combinations(n,k);
for (size_t i = 0; i < X.size(); ++i)
{
    auto combination = X[S[i]];
}

where S = shuffled indices.

It won't be fast (only about a million combinations per second) but maybe it doesn't matter.

@alandefreitas
Copy link
Author

Thanks. I just realized what I meant was "non-decreasing sequences"/"combinations with repetition" but I don't think it makes much difference in this context.

If there were a way to access these sequences/combinations in random order (or pseudo-random, or at least avoiding patterns) without the shuffled indexes, this could be useful in many environments.

Whenever the number of sequences grows exponentially on the number of items (factorial hypothesis tests and generate-and-test algorithms), it's not viable to keep a list of random indexes and, because of that, people usually recur to algorithms that resample, which is very inefficient and it's impossible to know if the algorithm been through all sequences/combinations.

If you know of any algorithm we can put in a container interface to do that, please let me know. Maybe I can help implement this container.

@jwood000
Copy link

jwood000 commented Jul 25, 2019

Hello @alandefreitas,

As mentioned in the README, combinations with repetitions isn't currently available. Until then, your problem can easily be attacked by doing the following:

  1. Get the total number of combinations w/ repetition (assuming from a set of size n chosen k at a time). This is given by the binomial coefficient (n + k - 1, k).
  2. Next, you can generate a random sample S of integers using the number from point 1 as the upper bound.
  3. Finally, you can generate the jth combination with repetition for all j in S.

You can alter the method below for the case with repetition:

static inline void construct_combination(combination& data, size_type m)
{
IntType k = data.size();
// this is the biggest for which binomial is still well defined.
// Hopefully it's enough for most use cases.
size_type upper = 68;
for (IntType r = k; r > 1; --r)
{
IntType t;
big_integer_interval NR(r, upper);
t = NR.partition_point(
[m, r](auto x) { return binomial<size_type>(x, r) <= m; }) -
1;
data[r - 1] = t;
upper = t;
m -= binomial<size_type>(t, r);
}
if (k > 0)
data[0] = m;
}

@mraggi, please correct me if I am wrong.

Hope that helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants