This isn't an error per say, but isterminal in GenerativeBeliefMDP is only true when the entire support is terminal, but when @gen is called, we sample single state from the belief, meaning isterminal(bmdp, b = false), but @gen(b, a) may fail because a terminal state is sampled. Is there a good reason to define "terminal belief" this way? I agree that checking if any terminal state is in the support might be overly reactive. This becomes a problem, for instance, when we want to generate a bunch of samples and early exit on a "terminal belief."
This isn't an error per say, but
isterminalinGenerativeBeliefMDPis only true when the entire support is terminal, but when@genis called, we sample single state from the belief, meaningisterminal(bmdp, b = false), but@gen(b, a)may fail because a terminal state is sampled. Is there a good reason to define "terminal belief" this way? I agree that checking if any terminal state is in the support might be overly reactive. This becomes a problem, for instance, when we want to generate a bunch of samples and early exit on a "terminal belief."