Expand model correctness tests #15

Closed
3 tasks done
null-a opened this issue Oct 1, 2019 · 1 comment

null-a commented Oct 1, 2019

  • Check that mu is computed as the correct function of latents & data? (Add tests to check correctness of model's mu computation #52)
  • Check that the response distribution has the correct parameters? (test_expected_response_codegen might already do this to some degree. Perhaps it could be reworked along the lines of test_mu_correctness -- using .fitted('expectation') to generate actual values and comparing them against the output of the mean method of a Pyro distribution; a rough sketch follows this list.)
  • Extend code gen tests to check that the response is observed, and that it comes from the expected family?
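
A rough sketch of what the reworked test for the second item might look like. This is only an illustration: the 'expectation' mode of fitted, the Gaussian response family and the 'sigma' parameter name are assumptions here rather than the actual API, and fitted / get_scalar_param are the helpers used in the snippets in the comment below.

import torch
import pyro.distributions as dist

def check_response_mean(fit):
    # What the package reports as the expected response.
    actual = torch.tensor(fitted(fit, what='expectation')[0])
    # Rebuild the response distribution from the fitted parameters and compare
    # against its mean. A Gaussian response with a scalar 'sigma' parameter is
    # assumed here.
    mu = torch.tensor(fitted(fit, what='linear')[0])
    sigma = torch.tensor(get_scalar_param(fit, 'sigma'))
    expected = dist.Normal(mu, sigma).mean
    assert torch.allclose(actual, expected)
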
null-a added this to the v0.0.1 milestone Oct 10, 2019

null-a commented Oct 11, 2019

Check that mu is computed as the correct function of latents & data?

Here's one way we might do this: define test cases that contain something like the following:

import pandas as pd

# `defm` comes from the package under test; its import is assumed here.
df = pd.DataFrame({
    'y': [0., 0.],
    'a': pd.Categorical(['a0', 'a1']),
    'b': pd.Categorical(['b0', 'b1']),
})
model = defm('y ~ 1 | a:b', df)

# Expected value of mu, written directly as a function of the data and the
# sampled coefficients.
def expected(df, coef):
    return (((df['a'] == 'a0') & (df['b'] == 'b0')) * coef('r_a:b[a0_b0,intercept]') +
            ((df['a'] == 'a1') & (df['b'] == 'b1')) * coef('r_a:b[a1_b1,intercept]'))

... which would allow us to check that generated models correctly compute the location parameter of the response distribution with something like:

from functools import partial
import numpy as np

# `fitted`, `get_scalar_param` and the `numpyro` backend handle come from the package under test.
fit = model.generate(backend=numpyro).prior(num_samples=1)
actual_mu = fitted(fit, what='linear')[0]
expected_mu = expected(df, partial(get_scalar_param, fit)).to_numpy()
print(np.all(np.equal(actual_mu, expected_mu)))

I like this because such tests are essentially deterministic, and they're easy to write. While it wouldn't guarantee that generated models have the correct semantics, it would give us confidence that all backends compute mu in the same way, and provide reassurance when making changes to things like code generation (e.g. #10).
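
To make the cross-backend point concrete, cases like the one above could be collected and parametrized over backends, roughly as follows. This is a sketch only: it assumes pytest, a pyro backend handle alongside numpyro, and that defm, fitted and get_scalar_param can be imported from the package.

from functools import partial

import numpy as np
import pandas as pd
import pytest

# Each case pairs a formula and a data frame with a function computing the
# expected mu from the data and the sampled coefficients.
cases = [
    ('y ~ 1 | a:b',
     pd.DataFrame({
         'y': [0., 0.],
         'a': pd.Categorical(['a0', 'a1']),
         'b': pd.Categorical(['b0', 'b1']),
     }),
     lambda df, coef: (((df['a'] == 'a0') & (df['b'] == 'b0')) * coef('r_a:b[a0_b0,intercept]') +
                       ((df['a'] == 'a1') & (df['b'] == 'b1')) * coef('r_a:b[a1_b1,intercept]'))),
]

# `pyro` and `numpyro` are assumed backend handles; `defm`, `fitted` and
# `get_scalar_param` are assumed to be importable from the package under test.
@pytest.mark.parametrize('backend', [pyro, numpyro])
@pytest.mark.parametrize('formula, df, expected', cases)
def test_mu_correctness(formula, df, expected, backend):
    fit = defm(formula, df).generate(backend=backend).prior(num_samples=1)
    actual_mu = fitted(fit, what='linear')[0]
    expected_mu = expected(df, partial(get_scalar_param, fit)).to_numpy()
    assert np.allclose(actual_mu, expected_mu)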

Eventually we might even consider generating the expected functions from the model definition itself. I guess this would be of most interest if we were also generating model descriptions in statistical notation (#33). If these shared a common implementation (you'd need to generate something like the expected function when generating the math description), then these tests would help convince us that the math and the code we generate are consistent.
