
Add optimization exercise notebook #32

Merged
merged 4 commits into main from optimize
Feb 3, 2025
Conversation

@jessegrabowski (Member) commented Feb 1, 2025

Closes #4

It also touches heavily on #26, so maybe have a look from that perspective?


@@ -0,0 +1,1162 @@
@ricardoV94 (Member) Feb 1, 2025


I'm a bit confused here: what exactly is the problem? You can't vectorize in a way that the batch dimension of x and a/lam broadcast with each other?



@jessegrabowski (Member, Author) commented Feb 2, 2025


Yeah, maybe I'm being dumb? But what we want is an output of shape (n, c), following the signature in the comment. If you make all of x, alpha, and lam vectors, this doesn't work for obvious reasons. If you make x = pt.dmatrix(), it will whine about runtime broadcasting (because x has shape (None, None)). I guess we could specifically define x = pt.tensor('x', shape=(None, 1)), but this also sucks somewhat, because now the user has to constantly remember to pass in 2d inputs for x.

I found that x_vector = pt.dvector('x') together with vectorize_graph(output, {x: x_vector[:, None]}) was the most ergonomic. But it's also not perfect, because sometimes you don't want to broadcast (e.g. with just one row of data), yet in that case you still get a matrix out. That's why I use trace for the utility function below: you end up with a bunch of wasted computation due to the unnecessary broadcasting.

I hope you can find an easy solution to this, and I'll update everything accordingly.

@ricardoV94 (Member)


x[:, None] sounds correct to me: you want to broadcast multiple xs with the vector of alpha and lam. If you wanted the 3d combinations, you would do x[:, None, None], alpha[None, :, None], and lam[None, None, :]. This is exactly how np.vectorize would work if you think of the whole graph as a function.

Alternatively, you could call vectorize 3 times, one argument at a time?
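The three-way combination mentioned above is ordinary broadcasting; a plain NumPy analogue (toy numbers, chosen just to make the shapes visible):

```python
import numpy as np

# Each argument gets its own axis; broadcasting then produces the full
# grid of combinations, with shape (n, n_alpha, n_lam).
x = np.array([0.0, 1.0, 2.0])            # n = 3
alpha = np.array([1.0, 2.0])             # 2 values
lam = np.array([0.5, 1.0, 1.5, 2.0])     # 4 values

grid = alpha[None, :, None] * np.exp(-lam[None, None, :] * x[:, None, None])
print(grid.shape)  # (3, 2, 4)
```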

@ricardoV94 (Member)


If you found this odd, we should definitely add some examples to the docstring of vectorize_graph.

@ricardoV94 (Member) left a comment


Notebook looks amazing, I'll try to solve it when I get the chance!

Thanks for the pointer to the vectorization, will try to add cross-references when those materials get written. There's also a small application in the page-rank problem.

I was just confused by one of the paragraphs; I left a comment in the NB review.

@jessegrabowski jessegrabowski merged commit 1e4f527 into main Feb 3, 2025
@jessegrabowski jessegrabowski deleted the optimize branch February 3, 2025 07:55
Successfully merging this pull request may close these issues.

Use PyTensor for root finding with Scipy
2 participants