
Commit edfa9a4

Add JIT and minor text corrections

1 parent 1955059 commit edfa9a4

File tree

docs/interfacing_with_surrogates.rst

1 file changed: 19 additions & 7 deletions
@@ -68,7 +68,7 @@ Consider a PyTorch neural network,
 
     torch_model = PyTorchMLP(hidden_dim, n_hidden, output_dim, input_dim)
 
-This model can be converted to a Flax model as follows:
+This model can be replicated in Flax as follows:
 
 .. code-block:: python
 
@@ -117,15 +117,19 @@ For loading weights from a PyTorch checkpoint, you might do something like:
 
     params = {'params': params}
 
-
 The model can then be called like any Flax model,
 
 .. code-block:: python
 
-    output_tensor = flax_model.apply(params, input_tensor)
+    output_tensor = jax.jit(flax_model.apply)(params, input_tensor)
+
+
+.. warning::
+    You need to be very careful when loading from a PyTorch state dict, as Flax and PyTorch may have slightly different representations of the weights (for example, one could be the transpose of the other). It's worth validating the output of your PyTorch model against your JAX model to make sure they agree.
+
 
 
-Option 2: converting a Pytorch model to a JAX model
+Option 2: converting a PyTorch model to a JAX model
 ===================================================
 
 .. warning::
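A minimal sketch of the validation the new warning recommends, assuming the ``torch_model``, ``flax_model``, ``params``, and ``input_dim`` defined earlier on this page (not part of this commit):

.. code-block:: python

    import jax
    import numpy as np
    import torch

    # Feed the same random input to both frameworks
    x = np.random.default_rng(0).standard_normal((1, input_dim)).astype(np.float32)

    with torch.no_grad():
        torch_output = torch_model(torch.from_numpy(x)).numpy()
    flax_output = jax.jit(flax_model.apply)(params, x)

    # Raises if the outputs disagree, e.g. because a kernel was loaded transposed
    np.testing.assert_allclose(torch_output, np.asarray(flax_output), rtol=1e-5, atol=1e-6)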
@@ -145,7 +149,7 @@ The model can then be called as a pure JAX function:
 
 .. code-block:: python
 
-    output_tensor = jax_model_from_torch(params, input_tensor)
+    output_tensor = jax.jit(jax_model_from_torch)(params, input_tensor)
 
 To remove the need for performing the conversion every time the model is loaded, you might want to save a JAX-compatible version of the weights and model to disk:
 
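To see the effect of the jitted call, one could time it roughly as follows (a sketch, assuming the model returns a single array; ``block_until_ready()`` forces JAX's asynchronous dispatch to finish before the timer stops):

.. code-block:: python

    import time
    import jax

    jitted_model = jax.jit(jax_model_from_torch)
    jitted_model(params, input_tensor).block_until_ready()  # warm-up: triggers compilation

    start = time.perf_counter()
    jitted_model(params, input_tensor).block_until_ready()
    print(f"jitted call took {time.perf_counter() - start:.6f} s")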
@@ -155,7 +159,7 @@ To remove the need for performing the conversion every time the model is loaded,
     import numpy as np
 
     # jax.export uses StableHLO to serialize the model to a binary format
-    exported_model = jax.export(jax_model_from_torch)
+    exported_model = jax.export.export(jax.jit(jax_model_from_torch))(params, input_tensor)
     with open("model.hlo", "wb") as f:
         f.write(exported_model.serialize())
 
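For completeness, the loading side of this round trip might look like the following sketch (not part of this commit), using ``jax.export.deserialize`` to rebuild the exported computation from the ``model.hlo`` file written above:

.. code-block:: python

    import jax

    with open("model.hlo", "rb") as f:
        restored_model = jax.export.deserialize(bytearray(f.read()))

    # Exported.call() runs the deserialized computation; the arguments must
    # match the shapes and dtypes the model was traced with at export time.
    output_tensor = restored_model.call(params, input_tensor)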
@@ -210,7 +214,7 @@ To convert the ONNX model to a JAX representation, you can use the `jaxonnxrunti
 
     jax_model_from_onnx = ONNXJaxBackend.prepare(onnx_model)
     # NOTE: run() returns a list of output tensors, in order of the output nodes
-    output_tensors = jax_model_from_onnx.run({"input": jnp.asarray(input_tensor, dtype=jnp.float32)})
+    output_tensors = jax.jit(jax_model_from_onnx.run)({"input": jnp.asarray(input_tensor, dtype=jnp.float32)})
 
 
 Best practices
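Since ``run()`` returns a list, a single-output model would typically be unwrapped as in this sketch (assuming the ``jax_model_from_onnx`` object and the ``"input"`` node name used above; not part of this commit):

.. code-block:: python

    import jax
    import jax.numpy as jnp

    jitted_run = jax.jit(jax_model_from_onnx.run)
    outputs = jitted_run({"input": jnp.asarray(input_tensor, dtype=jnp.float32)})
    output_tensor = outputs[0]  # first (and only) output node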
@@ -239,6 +243,14 @@ where
 
 By decorating with ``functools.lru_cache(maxsize=1)``, the result of this function - the loaded model - is stored in the cache and is only re-loaded if the function is called with a different ``path``.
 
+**JITting model calls**: In general, you should make sure that the forward call of your model is JITted:
+
+.. code-block:: python
+
+    output_tensor = jax.jit(flax_model.apply)(params, input_tensor)  # Good
+    output_tensor = flax_model.apply(params, input_tensor)  # Bad
+
+This is vital for fast performance.
 
 .. _Flax Linen: https://flax-linen.readthedocs.io/en/latest/index.html
 .. _Flax documentation: https://flax-linen.readthedocs.io/en/latest/guides/flax_fundamentals/flax_basics.html#defining-your-own-models
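The caching pattern that the ``functools.lru_cache(maxsize=1)`` paragraph refers to might look like this sketch (the ``load_model`` name and the use of ``model.hlo`` are illustrative, not part of this commit):

.. code-block:: python

    import functools
    import jax

    @functools.lru_cache(maxsize=1)
    def load_model(path: str):
        # The body runs only on a cache miss, i.e. the first call for a given path
        with open(path, "rb") as f:
            return jax.export.deserialize(bytearray(f.read()))

    model = load_model("model.hlo")  # reads from disk and caches the result
    model = load_model("model.hlo")  # returns the cached object, no disk I/O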
