Add support for arbitrary linear combination gradient recipes (PennyLaneAI#909)

* Have positive and negative multiplier and shift values

* No print

* Formatting

* 3 element terms for grad_recipes; qubit okay; CV draft

* CV for tape mode

* Comments

* Remove unused

* Formatting

* Solve casting by specifying dtype at creation

* No casting needed for shifted

* Update module docstring and Operation.grad_recipe docstring

* Development guide update

* Wording

* Adding tests; adding error raised for unsupported logic for tape second-order CV case

* No f strings

* Update pennylane/qnodes/cv.py

Co-authored-by: Josh Izaac <[email protected]>

* Update pennylane/tape/tapes/cv_param_shift.py

* Simplify using np.dot in CV param shift tape

* Update tests/qnodes/test_qnode_cv.py

Co-authored-by: Josh Izaac <[email protected]>

* get_parameter_shift in tape mode as per Josh's suggestion; use that

* Changelog

* Update tests/tape/tapes/test_cv_param_shift.py

Co-authored-by: Josh Izaac <[email protected]>

* Update .github/CHANGELOG.md

Co-authored-by: Tom Bromley <[email protected]>

* merge in changes from 915

* Update pennylane/operation.py

Co-authored-by: Tom Bromley <[email protected]>

* Update grad recipe formulae as per Tom's suggestions

* Update other formula in comment

* CHANGELOG

* Add rendering img url approach

* Plus

* Update pennylane/operation.py

Co-authored-by: Tom Bromley <[email protected]>

* Applying review suggestions

* Update doc/development/plugins.rst

* Update pennylane/operation.py

* equation formatting fixes

Co-authored-by: Josh Izaac <[email protected]>
Co-authored-by: Tom Bromley <[email protected]>
3 people authored and alejomonbar committed Dec 1, 2020
1 parent 9a85e96 commit 63fe205
Showing 12 changed files with 263 additions and 96 deletions.
14 changes: 14 additions & 0 deletions .github/CHANGELOG.md
@@ -305,6 +305,20 @@

<h3>Breaking changes</h3>

* Updated how parameter-shift gradient recipes are defined for operations, allowing
gradient recipes to be specified as an arbitrary number of terms.
[(#909)](https://github.com/PennyLaneAI/pennylane/pull/909)

Previously, `Operation.grad_recipe` was restricted to two-term parameter-shift formulas.
With this change, a gradient recipe contains elements of the form
:math:`[c_i, a_i, s_i]`, resulting in a gradient recipe of
:math:`\frac{\partial}{\partial\phi_k}f(\phi_k) = \sum_{i} c_i f(a_i \phi_k + s_i)`.

As this is a breaking change, all custom operations with defined gradient recipes must be
updated to continue working with PennyLane 0.13. Note, however, that if `grad_recipe = None`,
the default gradient recipe remains unchanged, corresponding to the two terms
:math:`[c_0, a_0, s_0]=[1/2, 1, \pi/2]` and :math:`[c_1, a_1, s_1]=[-1/2, 1, -\pi/2]`
for every parameter.
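
To illustrate the new format, here is a minimal sketch of a custom operation (the gate itself is a
hypothetical `RX` clone; only the `grad_recipe` layout follows the change described here):

```python
import numpy as np
from pennylane.operation import Operation

class MyRotation(Operation):
    """Hypothetical single-parameter gate illustrating the new recipe layout."""
    num_params = 1
    num_wires = 1
    par_domain = "R"
    grad_method = "A"
    # one nested list per parameter; each term [c_i, a_i, s_i] contributes
    # c_i * f(a_i * phi + s_i) to the derivative
    grad_recipe = ([[0.5, 1.0, np.pi / 2], [-0.5, 1.0, -np.pi / 2]],)

    @classmethod
    def _matrix(cls, *params):
        phi = params[0]
        c, s = np.cos(phi / 2), np.sin(phi / 2)
        return np.array([[c, -1j * s], [-1j * s, c]])
```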

- The `VQECost` class has been renamed to `ExpvalCost` to reflect its general applicability
beyond VQE. Use of `VQECost` is still possible but will result in a deprecation warning.
[(#913)](https://github.com/PennyLaneAI/pennylane/pull/913)
22 changes: 10 additions & 12 deletions doc/development/plugins.rst
@@ -466,21 +466,19 @@ where
* :attr:`~.Operation.grad_method`: the gradient computation method; ``'A'`` for the analytic
method, ``'F'`` for finite differences, and ``None`` if the operation may not be differentiated

* :attr:`~.Operation.grad_recipe`: The gradient recipe for the analytic ``'A'`` method.
This is a list with one tuple per operation parameter. For parameter :math:`k`, the tuple is of
the form :math:`(c_k, s_k)`, resulting in a gradient recipe of
* :attr:`~.Operation.grad_recipe`: The gradient recipe for the analytic ``'A'``
method. This is a tuple with one nested list per operation parameter. For
parameter :math:`\phi_k`, the nested list contains elements of the form
:math:`[c_i, a_i, s_i]`, resulting in a gradient recipe of

.. math:: \frac{d}{d\phi_k}f(O(\phi_k)) = c_k\left[f(O(\phi_k+s_k))-f(O(\phi_k-s_k))\right].
.. math:: \frac{\partial}{\partial\phi_k}f(\phi_k) = \sum_{i} c_i f(a_i \phi_k+s_i),

where :math:`f` is an expectation value that depends on :math:`O(\phi_k)`, an example being
where :math:`f` is the expectation value of an observable, measured on a circuit to which the
operation with parameter :math:`\phi_k` has been applied.

.. math:: f(O(\phi_k)) = \braket{0 | O^{\dagger}(\phi_k) \hat{B} O(\phi_k) | 0}

which is the simple expectation value of the operator :math:`\hat{B}` evolved via the gate
:math:`O(\phi_k)`.

Note that if ``grad_recipe = None``, the default gradient recipe is
:math:`(c_k, s_k)=(1/2, \pi/2)` for every parameter.
Note that if ``grad_recipe = None``, the default gradient recipe containing
the two terms :math:`[c_0, a_0, s_0]=[1/2, 1, \pi/2]` and :math:`[c_1, a_1,
s_1]=[-1/2, 1, -\pi/2]` is assumed for every parameter.
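
As a quick numerical check (a sketch using the built-in ``default.qubit`` device), the default
two-term recipe reproduces the exact derivative of :math:`\langle Z\rangle = \cos\phi` for
``qml.RX``:

```python
import numpy as np
import pennylane as qml

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def f(phi):
    qml.RX(phi, wires=0)
    return qml.expval(qml.PauliZ(0))

phi = 0.37
terms = [[0.5, 1, np.pi / 2], [-0.5, 1, -np.pi / 2]]  # default recipe
grad = sum(c * f(a * phi + s) for c, a, s in terms)
assert np.allclose(grad, -np.sin(phi))  # d/dphi cos(phi) = -sin(phi)
```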

The user can then import this operation directly from your plugin, and use it when defining a QNode:

103 changes: 77 additions & 26 deletions pennylane/operation.py
@@ -65,10 +65,25 @@
transformation on the quadrature operators.
For gates that *are* supported via the analytic method, the gradient recipe
(with multiplier :math:`c_k`, parameter shift :math:`s_k` for parameter :math:`\phi_k`)
works as follows:
.. math:: \frac{\partial}{\partial\phi_k}O = c_k\left[O(\phi_k+s_k)-O(\phi_k-s_k)\right].
.. math:: \frac{\partial}{\partial\phi_k}f = \sum_{i} c_i f(a_i \phi_k+s_i).
where :math:`f` is the expectation value of an observable, measured on a circuit to which the
operation with parameter :math:`\phi_k` has been applied. Each parameter may contribute
multiple terms, indexed by :math:`i`, and the coefficients :math:`[c_i, a_i, s_i]` are
specific to the gate.
For example, qubit operations generated by one of the Pauli matrices reduce to the following
special case, with one positive and one negative shift:
.. math::
\frac{\partial}{\partial\phi_k}f = \frac{1}{2}\left[f \left( \phi_k+\frac{\pi}{2} \right) - f
\left( \phi_k-\frac{\pi}{2} \right)\right],
i.e., so that :math:`[c_0, a_0, s_0]=[1/2, 1, \pi/2]` and :math:`[c_1, a_1, s_1]=[-1/2, 1, -\pi/2]`.
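
As a quick check of this special case: for a gate generated by a Pauli matrix, the expectation
value takes the form :math:`f(\phi) = A + B\cos\phi + C\sin\phi` (a standard fact assumed here), so

.. math::

    \frac{1}{2}\left[f\left(\phi+\frac{\pi}{2}\right) - f\left(\phi-\frac{\pi}{2}\right)\right]
    = \frac{1}{2}\left[(A - B\sin\phi + C\cos\phi) - (A + B\sin\phi - C\cos\phi)\right]
    = -B\sin\phi + C\cos\phi
    = \frac{\partial f}{\partial\phi},

i.e., the two default terms give the exact derivative.
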
CV Operation base classes
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -613,19 +628,22 @@ def grad_method(self):
return None if self.num_params == 0 else "F"

grad_recipe = None
r"""list[tuple[float]] or None: Gradient recipe for the parameter-shift method.
r"""tuple(Union(list[list[float]], None)) or None: Gradient recipe for the
parameter-shift method.
This is a list with one tuple per operation parameter. For parameter
:math:`k`, the tuple is of the form :math:`(c_k, s_k)`, resulting in
a gradient recipe of
This is a tuple with one nested list per operation parameter. For
parameter :math:`\phi_k`, the nested list contains elements of the form
:math:`[c_i, a_i, s_i]` where :math:`i` is the index of the
term, resulting in a gradient recipe of
.. math:: \frac{\partial}{\partial\phi_k}O = c_k\left[O(\phi_k+s_k)-O(\phi_k-s_k)\right].
.. math:: \frac{\partial}{\partial\phi_k}f = \sum_{i} c_i f(a_i \phi_k + s_i).
If ``None``, the default gradient recipe
:math:`(c_k, s_k)=(1/2, \pi/2)` is assumed for every parameter.
If ``None``, the default gradient recipe containing the two terms
:math:`[c_0, a_0, s_0]=[1/2, 1, \pi/2]` and :math:`[c_1, a_1,
s_1]=[-1/2, 1, -\pi/2]` is assumed for every parameter.
"""

def get_parameter_shift(self, idx):
def get_parameter_shift(self, idx, shift=np.pi / 2):
"""Multiplier and shift for the given parameter, based on its gradient recipe.
Args:
@@ -636,16 +654,32 @@ def get_parameter_shift(self, idx):
"""
# get the gradient recipe for this parameter
recipe = self.grad_recipe[idx]
multiplier, shift = (0.5, np.pi / 2) if recipe is None else recipe

# internal multiplier in the Variable
var_mult = self.data[idx].mult
# Default values
multiplier = 0.5 / np.sin(shift)
a = 1

# We set the default recipe following:
# ∂f(x)/∂x = c*f(x+s) - c*f(x-s)
# where we express a positive and a negative shift by default
default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
param_shift = default_param_shift if recipe is None else recipe

if hasattr(self.data[idx], "mult"):
# Parameter is a variable, we are in non-tape mode
# Need to use the internal multiplier in the Variable to update the
# multiplier and the shift
var_mult = self.data[idx].mult

for elem in param_shift:

multiplier *= var_mult
if var_mult != 0:
# zero multiplier means the shift is unimportant
shift /= var_mult
return multiplier, shift
# Update the multiplier
elem[0] *= var_mult
if var_mult != 0:
# Update the shift
# zero multiplier means the shift is unimportant
elem[2] /= var_mult
return param_shift

@property
def generator(self):
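
A small sketch of what the updated ``get_parameter_shift`` returns for an operation that falls
back to the default recipe (assumes a PennyLane build containing this change; the operation is
constructed outside a QNode, so no ``Variable`` rescaling applies):

```python
import pennylane as qml

op = qml.RX(0.3, wires=0)          # RX leaves grad_recipe = None
print(op.get_parameter_shift(0))   # expected: [[0.5, 1, pi/2], [-0.5, 1, -pi/2]]
```
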
@@ -1588,16 +1622,33 @@ def heisenberg_pd(self, idx):
"""
# get the gradient recipe for this parameter
recipe = self.grad_recipe[idx]
multiplier = 0.5 if recipe is None else recipe[0]
shift = np.pi / 2 if recipe is None else recipe[1]

# Default values
multiplier = 0.5
a = 1
shift = np.pi / 2

# We set the default recipe as follows:
# ∂f(x)/∂x = c*f(x+s) - c*f(x-s)
default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
param_shift = default_param_shift if recipe is None else recipe

pd = None # partial derivative of the transformation

p = self.parameters
# evaluate the transform at the shifted parameter values
p[idx] += shift
U2 = self._heisenberg_rep(p) # pylint: disable=assignment-from-none
p[idx] -= 2 * shift
U1 = self._heisenberg_rep(p) # pylint: disable=assignment-from-none
return (U2 - U1) * multiplier # partial derivative of the transformation

original_p_idx = p[idx]
for c, _a, s in param_shift:
# evaluate the transform at the shifted parameter values
p[idx] = _a * original_p_idx + s
U = self._heisenberg_rep(p) # pylint: disable=assignment-from-none

if pd is None:
pd = c * U
else:
pd += c * U

return pd

def heisenberg_tr(self, wires, inverse=False):
r"""Heisenberg picture representation of the linear transformation carried
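
To see what the generalized accumulation in ``heisenberg_pd`` computes, here is a standalone sketch
(assuming the standard phase-space rotation matrix on :math:`(\mathbb{1}, \hat{x}, \hat{p})` as
the Heisenberg representation):

```python
import numpy as np

def heisenberg_rep(phi):
    # Heisenberg-picture matrix of a phase-space rotation acting on (1, x, p)
    return np.array([
        [1, 0, 0],
        [0, np.cos(phi), -np.sin(phi)],
        [0, np.sin(phi), np.cos(phi)],
    ])

phi = 0.7
terms = [[0.5, 1.0, np.pi / 2], [-0.5, 1.0, -np.pi / 2]]  # default recipe
pd = sum(c * heisenberg_rep(a * phi + s) for c, a, s in terms)

# exact elementwise derivative of the matrix with respect to phi
exact = np.array([
    [0, 0, 0],
    [0, -np.sin(phi), -np.cos(phi)],
    [0, np.cos(phi), -np.sin(phi)],
])
assert np.allclose(pd, exact)
```
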
28 changes: 22 additions & 6 deletions pennylane/ops/cv.py
@@ -138,7 +138,9 @@ class Squeezing(CVOperation):
grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / math.sinh(shift), shift), None]
multiplier = 0.5 / math.sinh(shift)
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]], None)

@staticmethod
def _heisenberg_rep(p):
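
The :math:`1/(2\sinh s)` multiplier used here is exact despite the finite shift :math:`s = 0.1`:
the squeezing Heisenberg matrix elements are linear combinations of :math:`e^{\pm r}` (a standard
fact assumed here), and

.. math::

    \frac{e^{-(r+s)} - e^{-(r-s)}}{2\sinh s} = \frac{e^{-r}\left(e^{-s} - e^{s}\right)}{2\sinh s} = -e^{-r} = \frac{d}{dr}e^{-r},

with the analogous identity holding for :math:`e^{+r}`, so the two terms
:math:`[\pm 1/(2\sinh s),\, 1,\, \pm s]` reproduce the exact derivative.
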
@@ -180,7 +182,9 @@ class Displacement(CVOperation):
grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift), None]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]], None)

@staticmethod
def _heisenberg_rep(p):
@@ -278,8 +282,11 @@ class TwoModeSqueezing(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / math.sinh(shift), shift), None]
multiplier = 0.5 / math.sinh(shift)
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]], None)

@staticmethod
def _heisenberg_rep(p):
@@ -326,8 +333,11 @@ class QuadraticPhase(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift)]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]],)

@staticmethod
def _heisenberg_rep(p):
@@ -371,8 +381,11 @@ class ControlledAddition(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift)]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]],)

@staticmethod
def _heisenberg_rep(p):
@@ -417,8 +430,11 @@ class ControlledPhase(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift)]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]],)

@staticmethod
def _heisenberg_rep(p):
37 changes: 27 additions & 10 deletions pennylane/qnodes/cv.py
@@ -181,20 +181,37 @@ def _pd_analytic(self, idx, args, kwargs, **options):
temp_var.idx = n
op.data[p_idx] = temp_var

multiplier, shift = op.get_parameter_shift(p_idx)

# shifted parameter values
shift_p1 = np.r_[args, args[idx] + shift]
shift_p2 = np.r_[args, args[idx] - shift]
param_shift = op.get_parameter_shift(p_idx)

if not force_order2 and op.use_method != "B":
# basic parameter-shift method, for Gaussian CV gates
# succeeded by order-1 observables
# evaluate the circuit at two points with shifted parameter values
y2 = np.asarray(self.evaluate(shift_p1, kwargs))
y1 = np.asarray(self.evaluate(shift_p2, kwargs))
pd += (y2 - y1) * multiplier
# evaluate the circuit at the shifted parameter values and accumulate
# the linear combination of the results (in most cases two points)
for multiplier, a, shift in param_shift:

# shifted parameter values
shift_p = np.r_[args, a * args[idx] + shift]

term = multiplier * np.asarray(self.evaluate(shift_p, kwargs))
pd += term
else:
if len(param_shift) != 2:
# The 2nd order CV parameter-shift rule only accepts two-term shifts
raise NotImplementedError(
"Taking the analytic gradient for order-2 operators is "
"unsupported for {op} which contains a parameter with a "
"gradient recipe of more than two terms."
)

# Get the shifts and the multipliers
pos_multiplier, a1, pos_shift = param_shift[0]
neg_multiplier, a2, neg_shift = param_shift[1]

# shifted parameter values
shift_p1 = np.r_[args, a1 * args[idx] + pos_shift]
shift_p2 = np.r_[args, a2 * args[idx] + neg_shift]

# order-2 parameter-shift method, for gaussian CV gates
# succeeded by order-2 observables
# evaluate transformed observables at the original parameter point
@@ -203,7 +220,7 @@ def _pd_analytic(self, idx, args, kwargs, **options):
Z2 = op.heisenberg_tr(self.device.wires)
self._set_variables(shift_p2, kwargs)
Z1 = op.heisenberg_tr(self.device.wires)
Z = (Z2 - Z1) * multiplier # derivative of the operation
Z = pos_multiplier * Z2 + neg_multiplier * Z1 # derivative of the operation

unshifted_args = np.r_[args, args[idx]]
self._set_variables(unshifted_args, kwargs)
18 changes: 10 additions & 8 deletions pennylane/qnodes/qubit.py
@@ -128,16 +128,18 @@ def _pd_analytic(self, idx, args, kwargs, **options):
temp_var.idx = n
op.data[p_idx] = temp_var

multiplier, shift = op.get_parameter_shift(p_idx)
param_shift = op.get_parameter_shift(p_idx)

# shifted parameter values
shift_p1 = np.r_[args, args[idx] + shift]
shift_p2 = np.r_[args, args[idx] - shift]
for multiplier, a, shift in param_shift:

# evaluate the circuit at two points with shifted parameter values
y2 = np.asarray(self.evaluate(shift_p1, kwargs))
y1 = np.asarray(self.evaluate(shift_p2, kwargs))
pd += (y2 - y1) * multiplier
# shifted parameter values
shift_p = np.r_[args, a * args[idx] + shift]

# evaluate the circuit at point with shifted parameter values
y = np.asarray(self.evaluate(shift_p, kwargs))

# add the contribution to the partial derivative
pd += multiplier * y

# restore the original parameter
op.data[p_idx] = orig
1 change: 1 addition & 0 deletions pennylane/tape/qnode.py
@@ -112,6 +112,7 @@ class QNode:
h=1e-7 (float): step size for the finite difference method
order=1 (int): The order of the finite difference method to use. ``1`` corresponds
to forward finite differences, ``2`` to centered finite differences.
shift=pi/2 (float): the size of the shift for two-term parameter-shift gradient computations
**Example**
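
A hedged usage sketch of these options (assumes PennyLane 0.13-style tape mode and that gradient
options such as ``shift`` are forwarded to the parameter-shift method, as the docstring entry
above suggests):

```python
import pennylane as qml

qml.enable_tape()  # tape mode, as used in PennyLane 0.13
dev = qml.device("default.qubit", wires=1)

# the shift keyword here is the assumption being illustrated
@qml.qnode(dev, diff_method="parameter-shift", shift=0.3)
def circuit(phi):
    qml.RX(phi, wires=0)
    return qml.expval(qml.PauliZ(0))
```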
