Add support for arbitrary linear combination gradient recipes #909

Merged
merged 48 commits into from
Nov 25, 2020
Changes from 11 commits
6246431
Have positive and negative multiplier and shift values
antalszava Nov 18, 2020
8f4f5d3
No print
antalszava Nov 18, 2020
19d5ad9
Formatting
antalszava Nov 18, 2020
a77d73f
Merge branch 'master' into multiple_shifts
antalszava Nov 18, 2020
7eeb013
3 element terms for grad_recipes; qubit okay; CV draft
antalszava Nov 18, 2020
50db4f9
CV for tape mode
antalszava Nov 18, 2020
e166e5c
Merge branch 'master' into multiple_shifts
antalszava Nov 18, 2020
e34c034
Comments
antalszava Nov 18, 2020
b966cd9
Merge branch 'multiple_shifts' of https://github.com/XanaduAI/pennyla…
antalszava Nov 18, 2020
ef78263
Remove unused
antalszava Nov 18, 2020
0a0dbdf
Formatting
antalszava Nov 18, 2020
95d73f7
Solve casting by specifying dtype at creation
antalszava Nov 19, 2020
538bec5
No casting needed for shifted
antalszava Nov 19, 2020
b1b4334
Update module docstring and Operation.grad_recipe docstring
antalszava Nov 19, 2020
8b30ce5
Development guide update
antalszava Nov 19, 2020
8e201b1
Wording
antalszava Nov 19, 2020
6f08762
Adding tests; adding error raised for unsupported logic for tape seco…
antalszava Nov 19, 2020
9741a6b
No f strings
antalszava Nov 19, 2020
573795a
Merge branch 'master' into multiple_shifts
antalszava Nov 20, 2020
9489dcd
Update pennylane/qnodes/cv.py
antalszava Nov 20, 2020
a92decf
Update pennylane/tape/tapes/cv_param_shift.py
antalszava Nov 20, 2020
9d7096a
Simplify using np.dot in CV param shift tape
antalszava Nov 20, 2020
34debc9
Merge branch 'master' into multiple_shifts
antalszava Nov 20, 2020
7680d25
Merge branch 'master' into multiple_shifts
josh146 Nov 21, 2020
1dc4234
Update tests/qnodes/test_qnode_cv.py
antalszava Nov 23, 2020
db02026
get_parameter_shift in tape mode as per Josh's suggestion; use that
antalszava Nov 23, 2020
32de6d2
Merge branch 'multiple_shifts' of https://github.com/XanaduAI/pennyla…
antalszava Nov 23, 2020
aa7d6ea
Changelog
antalszava Nov 23, 2020
b0916a0
Update tests/tape/tapes/test_cv_param_shift.py
antalszava Nov 23, 2020
5970497
Update .github/CHANGELOG.md
antalszava Nov 23, 2020
19783eb
Merge branch 'master' into multiple_shifts
antalszava Nov 23, 2020
3d155c5
merge in changes from 915
josh146 Nov 24, 2020
9c425b4
Merge branch 'master' into multiple_shifts
josh146 Nov 24, 2020
debf057
Update pennylane/operation.py
antalszava Nov 24, 2020
1bac591
Update grad recipe formulae as per Tom's suggestions
antalszava Nov 24, 2020
ba443c0
Merge branch 'multiple_shifts' of https://github.com/XanaduAI/pennyla…
antalszava Nov 24, 2020
456b563
Update other formula in comment
antalszava Nov 24, 2020
4c00dc1
Merge branch 'master' into multiple_shifts
josh146 Nov 24, 2020
c5ca866
CHANGELOG
antalszava Nov 24, 2020
59454bf
Add rendering img url approach
antalszava Nov 24, 2020
30b3314
Plus
antalszava Nov 24, 2020
e8c849c
Update pennylane/operation.py
antalszava Nov 24, 2020
a3a821d
Applying review suggestions
antalszava Nov 24, 2020
85881e3
Merge branch 'multiple_shifts' of https://github.com/XanaduAI/pennyla…
antalszava Nov 24, 2020
4a03b07
Update doc/development/plugins.rst
antalszava Nov 24, 2020
834801e
Update pennylane/operation.py
antalszava Nov 24, 2020
b0989c2
Merge branch 'master' into multiple_shifts
antalszava Nov 24, 2020
a9a8122
equation formatting fixes
josh146 Nov 25, 2020
59 changes: 45 additions & 14 deletions pennylane/operation.py
@@ -636,16 +636,30 @@ def get_parameter_shift(self, idx):
"""
# get the gradient recipe for this parameter
recipe = self.grad_recipe[idx]
multiplier, shift = (0.5, np.pi / 2) if recipe is None else recipe

# Default values
multiplier = 0.5
a = 1
shift = np.pi / 2

# We set the default recipe following:
# ∂f(x) = c1*f(a1*x+s1) + c2*f(a2*x+s2)
# where we express a positive and a negative shift by default
default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
param_shift = default_param_shift if recipe is None else recipe

# internal multiplier in the Variable
var_mult = self.data[idx].mult

multiplier *= var_mult
if var_mult != 0:
# zero multiplier means the shift is unimportant
shift /= var_mult
return multiplier, shift
for elem in param_shift:

# Update the multiplier
elem[0] *= var_mult
if var_mult != 0:
# Update the shift
# zero multiplier means the shift is unimportant
elem[2] /= var_mult
return param_shift
Member

Apart from this block here, it's a shame we can't re-use this method for the tape-mode gradient 🙁

Do you think it makes sense to do the following?

def get_parameter_shift(self, idx, shift=np.pi/2):
    recipe = self.grad_recipe[idx]

    # Default values
    multiplier = 0.5 / np.sin(shift)
    a = 1

    # We set the default recipe following:
    # ∂f(x) = c1*f(a1*x+s1) + c2*f(a2*x+s2)
    # where we express a positive and a negative shift by default
    default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
    param_shift = default_param_shift if recipe is None else recipe

    if hasattr(self.data[idx], "mult"):
        # Parameter is a variable, we are in non-tape mode
        ...

    return param_shift

This way:

  • We can call this method inside both the old QNode gradient methods, and the new tape gradient methods.

  • We can pass a different default shift value, for the qubit parameter-shift rule

  • The branch that handles variables is only executed in non-tape mode. Note that the above is safer than using qml.tape_active(), which only checks if the user has activated tape mode, not if the object itself is a tape.
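The suggestion above can be sketched outside PennyLane as follows. This is a minimal illustration and not the library's implementation: `default_recipe` and `apply_recipe` are hypothetical helper names, and a plain sine stands in for an expectation value that depends sinusoidally on the gate parameter. Under that assumption, the multiplier `0.5 / np.sin(shift)` makes the two-term rule exact for any shift where `np.sin(shift)` is non-zero.

```python
import numpy as np


def default_recipe(shift=np.pi / 2):
    """Two-term recipe [[c, a, s], [-c, a, -s]] (hypothetical helper)."""
    multiplier = 0.5 / np.sin(shift)
    a = 1
    return [[multiplier, a, shift], [-multiplier, a, -shift]]


def apply_recipe(f, x, recipe):
    """Evaluate the linear combination sum_i c_i * f(a_i * x + s_i)."""
    return sum(c * f(a * x + s) for c, a, s in recipe)


x = 0.3
# Default shift of pi/2 and a custom shift of pi/4 give the same derivative.
grad_default = apply_recipe(np.sin, x, default_recipe())
grad_custom = apply_recipe(np.sin, x, default_recipe(shift=np.pi / 4))
```

Both values agree with `np.cos(x)`, since (sin(x+s) - sin(x-s)) / (2 sin(s)) = cos(x).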

Contributor Author

Hi Josh, thanks so much (forgot to reply here 😅). Worked great, adjusted the method!
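For readers following the non-tape branch, here is a small sketch of the rescaling by the Variable's internal multiplier. It assumes the gate receives `var_mult * x` for a free parameter `x`; `rescale_recipe` is a hypothetical helper, and unlike the in-place loop in the diff it returns a copy, which is a design choice of this sketch only.

```python
import numpy as np


def rescale_recipe(recipe, var_mult):
    """Return a rescaled copy of `recipe` for a parameter used as var_mult * x.

    Hypothetical helper; the input recipe is left unmodified.
    """
    rescaled = []
    for c, a, s in recipe:
        c = c * var_mult
        if var_mult != 0:
            # a zero multiplier means the shift is unimportant
            s = s / var_mult
        rescaled.append([c, a, s])
    return rescaled


base = [[0.5, 1, np.pi / 2], [-0.5, 1, -np.pi / 2]]
m = 3.0
recipe = rescale_recipe(base, m)


def g(x):
    """Output as a function of the free parameter: the gate sees m * x."""
    return np.sin(m * x)


x = 0.2
grad = sum(c * g(a * x + s) for c, a, s in recipe)
```

With these assumptions `grad` matches the chain-rule result `m * np.cos(m * x)`: the multiplier picks up a factor `m`, and the shift is divided by `m` so that the gate argument still moves by the original shift.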


@property
def generator(self):
@@ -1588,16 +1602,33 @@ def heisenberg_pd(self, idx):
"""
# get the gradient recipe for this parameter
recipe = self.grad_recipe[idx]
multiplier = 0.5 if recipe is None else recipe[0]
shift = np.pi / 2 if recipe is None else recipe[1]

# Default values
multiplier = 0.5
a = 1
shift = np.pi / 2

# We set the default recipe as follows:
# ∂f(x) = c1*f(a1*x+s1) + c2*f(a2*x+s2)
default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
param_shift = default_param_shift if recipe is None else recipe

pd = None # partial derivative of the transformation

p = self.parameters
# evaluate the transform at the shifted parameter values
p[idx] += shift
U2 = self._heisenberg_rep(p) # pylint: disable=assignment-from-none
p[idx] -= 2 * shift
U1 = self._heisenberg_rep(p) # pylint: disable=assignment-from-none
return (U2 - U1) * multiplier # partial derivative of the transformation

original_p_idx = p[idx]
for c, _a, s in param_shift:
# evaluate the transform at the shifted parameter values
p[idx] = _a * original_p_idx + s
U = self._heisenberg_rep(p) # pylint: disable=assignment-from-none

if pd is None:
pd = c * U
else:
pd += c * U

return pd
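As an illustration of the accumulation loop above, the following standalone sketch applies a recipe to a matrix-valued function. The 3x3 rotation matrix is a toy stand-in assumed for this example (it is not taken from PennyLane), and `recipe_derivative` is a hypothetical name.

```python
import numpy as np


def rotation_rep(phi):
    """Toy 3x3 matrix acting on (1, x, p) that rotates the quadratures by phi."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def recipe_derivative(rep, theta, recipe):
    """Accumulate sum_i c_i * rep(a_i * theta + s_i)."""
    pd = None
    for c, a, s in recipe:
        U = rep(a * theta + s)
        pd = c * U if pd is None else pd + c * U
    return pd


recipe = [[0.5, 1, np.pi / 2], [-0.5, 1, -np.pi / 2]]
theta = 0.4
pd = recipe_derivative(rotation_rep, theta, recipe)

# Analytic derivative of rotation_rep with respect to phi, for comparison.
expected = np.array(
    [
        [0.0, 0.0, 0.0],
        [0.0, -np.sin(theta), -np.cos(theta)],
        [0.0, np.cos(theta), -np.sin(theta)],
    ]
)
```

The constant entry cancels because the two coefficients sum to zero, and the trigonometric entries reproduce the analytic derivative.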

def heisenberg_tr(self, wires, inverse=False):
r"""Heisenberg picture representation of the linear transformation carried
28 changes: 22 additions & 6 deletions pennylane/ops/cv.py
@@ -138,7 +138,9 @@ class Squeezing(CVOperation):
grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / math.sinh(shift), shift), None]
multiplier = 0.5 / math.sinh(shift)
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]], None)

@staticmethod
def _heisenberg_rep(p):
@@ -180,7 +182,9 @@ class Displacement(CVOperation):
grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift), None]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]], None)
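A quick numerical check of the two multipliers used in these recipes. This sketch assumes the relevant quantity depends on the parameter either through exponentials such as exp(-r) (squeezing-like) or linearly (displacement-like); `two_term` and the test functions are hypothetical stand-ins, not PennyLane code.

```python
import math

shift = 0.1


def two_term(f, x, multiplier, shift):
    """Evaluate multiplier * f(x + shift) - multiplier * f(x - shift)."""
    return multiplier * f(x + shift) - multiplier * f(x - shift)


r = 0.7

# Exponential dependence: 0.5 / sinh(shift) gives the exact derivative.
grad_exp = two_term(lambda t: math.exp(-t), r, 0.5 / math.sinh(shift), shift)

# Linear dependence: 0.5 / shift gives the exact derivative (the slope, 2.0).
grad_lin = two_term(lambda t: 2.0 * t + 1.0, r, 0.5 / shift, shift)
```

Here `grad_exp` agrees with `-math.exp(-r)` and `grad_lin` with the slope `2.0`, for any non-zero `shift`.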

@staticmethod
def _heisenberg_rep(p):
@@ -278,8 +282,11 @@ class TwoModeSqueezing(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / math.sinh(shift), shift), None]
multiplier = 0.5 / math.sinh(shift)
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]], None)

@staticmethod
def _heisenberg_rep(p):
@@ -326,8 +333,11 @@ class QuadraticPhase(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift)]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]],)

@staticmethod
def _heisenberg_rep(p):
@@ -371,8 +381,11 @@ class ControlledAddition(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift)]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]],)

@staticmethod
def _heisenberg_rep(p):
@@ -417,8 +430,11 @@ class ControlledPhase(CVOperation):
par_domain = "R"

grad_method = "A"

shift = 0.1
grad_recipe = [(0.5 / shift, shift)]
multiplier = 0.5 / shift
a = 1
grad_recipe = ([[multiplier, a, shift], [-multiplier, a, -shift]],)

@staticmethod
def _heisenberg_rep(p):
37 changes: 27 additions & 10 deletions pennylane/qnodes/cv.py
@@ -181,20 +181,37 @@ def _pd_analytic(self, idx, args, kwargs, **options):
temp_var.idx = n
op.data[p_idx] = temp_var

multiplier, shift = op.get_parameter_shift(p_idx)

# shifted parameter values
shift_p1 = np.r_[args, args[idx] + shift]
shift_p2 = np.r_[args, args[idx] - shift]
param_shift = op.get_parameter_shift(p_idx)

if not force_order2 and op.use_method != "B":
# basic parameter-shift method, for Gaussian CV gates
# succeeded by order-1 observables
# evaluate the circuit at two points with shifted parameter values
y2 = np.asarray(self.evaluate(shift_p1, kwargs))
y1 = np.asarray(self.evaluate(shift_p2, kwargs))
pd += (y2 - y1) * multiplier
# evaluate the circuit at multiple points with the linear
# combination of parameter values (in most cases at two points)
for multiplier, a, shift in param_shift:

# shifted parameter values
shift_p = np.r_[args, a * args[idx] + shift]

term = multiplier * np.asarray(self.evaluate(shift_p, kwargs))
pd += term
else:
if len(param_shift) != 2:
# TODO: check if more than two terms is supported
raise NotImplementedError(
f"Taking the analytic gradient for order-2 operators is\
unsupported for {op} which contains a parameter with a\
gradient recipe of more than two terms."
)

# Get the shifts and the multipliers
pos_multiplier, a1, pos_shift = param_shift[0]
neg_multiplier, a2, neg_shift = param_shift[1]

# shifted parameter values
shift_p1 = np.r_[args, a1 * args[idx] + pos_shift]
shift_p2 = np.r_[args, a2 * args[idx] + neg_shift]

# order-2 parameter-shift method, for gaussian CV gates
# succeeded by order-2 observables
# evaluate transformed observables at the original parameter point
@@ -203,7 +220,7 @@ def _pd_analytic(self, idx, args, kwargs, **options):
Z2 = op.heisenberg_tr(self.device.wires)
self._set_variables(shift_p2, kwargs)
Z1 = op.heisenberg_tr(self.device.wires)
Z = (Z2 - Z1) * multiplier # derivative of the operation
Z = pos_multiplier * Z2 + neg_multiplier * Z1 # derivative of the operation

unshifted_args = np.r_[args, args[idx]]
self._set_variables(unshifted_args, kwargs)
18 changes: 10 additions & 8 deletions pennylane/qnodes/qubit.py
@@ -128,16 +128,18 @@ def _pd_analytic(self, idx, args, kwargs, **options):
temp_var.idx = n
op.data[p_idx] = temp_var

multiplier, shift = op.get_parameter_shift(p_idx)
param_shift = op.get_parameter_shift(p_idx)

# shifted parameter values
shift_p1 = np.r_[args, args[idx] + shift]
shift_p2 = np.r_[args, args[idx] - shift]
for multiplier, a, shift in param_shift:

# evaluate the circuit at two points with shifted parameter values
y2 = np.asarray(self.evaluate(shift_p1, kwargs))
y1 = np.asarray(self.evaluate(shift_p2, kwargs))
pd += (y2 - y1) * multiplier
# shifted parameter values
shift_p = np.r_[args, a * args[idx] + shift]
Contributor

Think I need to understand what r_ is doing here 🤔

Member

np.r_ is a really nice way of doing efficient row-wise concatenation using slices :)

Member

It's essentially a slightly more efficient way of doing np.concatenate([args, np.array([a * args[idx] + shift])]), avoiding the need to create intermediate arrays

Contributor

Oh nice! My follow up thought was why this is a use case for doing that 🤔
E.g.
params = [1, 2, 3]
idx = 1, shift = 0.5
This means that
shift_p = [1, 2, 3, 2.5]? I guess I was expecting [1, 2.5, 3].
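The array in this question can be reproduced directly with NumPy. The values below are the ones from the question; `a = 1` is assumed, matching the default recipe.

```python
import numpy as np

args = np.array([1.0, 2.0, 3.0])
idx, a, shift = 1, 1, 0.5

# np.r_ appends the shifted value as a new trailing entry ...
shift_p = np.r_[args, a * args[idx] + shift]

# ... which matches an explicit concatenation.
same = np.concatenate([args, np.array([a * args[idx] + shift])])
```

So the shifted value is appended as a fourth entry, and the original entry at `idx` is left unchanged.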

Member

Yeah, the gradient rules in non-tape mode are written a bit strangely 😆 Before shifting a parameter, a new 'temp' parameter is added to the circuit. It is this parameter that is shifted; it is then deleted once the gradient has been computed.

The tape mode implementation is a lot 'cleaner' imho
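The temporary-parameter idea described above can be sketched with a toy model. Everything here is hypothetical scaffolding rather than PennyLane code: the "circuit" is just `sin(p0) * cos(p1)`, and `deps` lists which entry of the argument array each gate reads. When one free parameter feeds several gates, each gate in turn is pointed at an appended temporary entry, only that entry is shifted, and the contributions are summed (the product rule).

```python
import numpy as np


def evaluate(values, deps):
    """Toy 'circuit': sin(p0) * cos(p1), with gate k reading values[deps[k]]."""
    return np.sin(values[deps[0]]) * np.cos(values[deps[1]])


def pd_analytic(idx, args, deps):
    """Partial derivative with respect to args[idx] via a temporary parameter."""
    args = np.asarray(args, dtype=float)
    n = len(args)  # index of the temporary parameter
    recipe = [[0.5, 1, np.pi / 2], [-0.5, 1, -np.pi / 2]]
    pd = 0.0
    for gate, dep in enumerate(deps):
        if dep != idx:
            continue
        temp_deps = list(deps)
        temp_deps[gate] = n  # only this gate sees the shifted value
        for c, a, s in recipe:
            shift_p = np.r_[args, a * args[idx] + s]
            pd += c * evaluate(shift_p, temp_deps)
    return pd


x = 0.35
# Both gates depend on the same free parameter, so f(x) = sin(x) * cos(x).
grad = pd_analytic(0, [x], deps=[0, 0])
```

In this toy model `grad` equals `np.cos(2 * x)`, the derivative of `sin(x) * cos(x)`, which is why the shifted value is appended instead of overwriting the original entry.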

Member

josh146 Nov 24, 2020

See the corresponding logic in QubitQNode._pd_analytic:

def _pd_analytic(self, idx, args, kwargs, **options):
    n = self.num_variables
    pd = 0.0
    # find the Operators in which the free parameter appears, use the product rule
    for op, p_idx in self.variable_deps[idx]:

        # We temporarily edit the Operator such that parameter p_idx is replaced by a new one,
        # which we can modify without affecting other Operators depending on the original.
        orig = op.data[p_idx]
        assert orig.idx == idx

        # reference to a new, temporary parameter with index n, otherwise identical with orig
        temp_var = copy.copy(orig)
        temp_var.idx = n
        op.data[p_idx] = temp_var

        param_shift = op.get_parameter_shift(p_idx)

Contributor Author

Here I mainly kept the implementation as it was before.


# evaluate the circuit at point with shifted parameter values
y = np.asarray(self.evaluate(shift_p, kwargs))

# add the contribution to the partial derivative
pd += multiplier * y

# restore the original parameter
op.data[p_idx] = orig
59 changes: 44 additions & 15 deletions pennylane/tape/tapes/cv_param_shift.py
@@ -236,18 +236,30 @@ def parameter_shift_first_order(
p_idx = self._par_info[t_idx]["p_idx"]

recipe = op.grad_recipe[p_idx]
c, s = (0.5, np.pi / 2) if recipe is None else recipe

# Default values
multiplier = 0.5
a = 1
shift = np.pi / 2

# We set the default recipe as follows:
# ∂f(x) = c1*f(a1*x+s1) + c2*f(a2*x+s2)
default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
param_shift = default_param_shift if recipe is None else recipe
Member

With the changes in my previous comment, this could be replaced with simply op.get_parameter_shift() (unless I'm missing something?)

Contributor Author

Opened #924


shift = np.zeros_like(params)
shift[idx] = s

shifted_forward = self.copy(copy_operations=True, tape_cls=QuantumTape)
shifted_forward.set_parameters(params + shift)
coeffs = []
tapes = []
for c, _a, s in param_shift:

shifted_backward = self.copy(copy_operations=True, tape_cls=QuantumTape)
shifted_backward.set_parameters(params - shift)
shift[idx] = s

tapes = [shifted_forward, shifted_backward]
# shifted parameter values
shifted_tape = self.copy(copy_operations=True, tape_cls=QuantumTape)
shifted_tape.set_parameters(params + shift)
coeffs.append(c)
tapes.append(shifted_tape)

def processing_fn(results):
"""Computes the gradient of the parameter at index idx via the
@@ -260,10 +272,13 @@ def processing_fn(results):
array[float]: 1-dimensional array of length determined by the tape output
measurement statistics
"""
shifted_forward = np.array(results[0])
shifted_backward = np.array(results[1])
stat = np.zeros_like(results[0])

for c, res in zip(coeffs, results):
shifted = np.array(res)
np.add(stat, c * shifted, out=stat, casting="unsafe")
Contributor Author

The workaround proposed by numpy/numpy#7225 (comment).

Member

🤔 I'm guessing this is because c has a dtype that is causing casting issues?

Would it be better to simply cast/ensure that c and shifted are the correct dtype?
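One way such a casting issue can arise is when the accumulator inherits an integer dtype from the results. The sketch below uses made-up integer-valued results to show the failure of an in-place float update, and how specifying the dtype when the accumulator is created avoids the need for `casting="unsafe"`. The data and variable names are assumptions for illustration.

```python
import numpy as np

results = [np.array([1, 0]), np.array([0, 1])]  # integer-valued results
coeffs = [0.5, -0.5]

# Integer accumulator: writing a float result into it is rejected by
# NumPy's default "same_kind" casting rule.
stat_int = np.zeros_like(results[0])
try:
    np.add(stat_int, coeffs[0] * results[0], out=stat_int)
    raised = False
except TypeError:
    raised = True

# Float accumulator: the dtype is fixed at creation, so no cast is needed.
stat = np.zeros_like(results[0], dtype=float)
for c, res in zip(coeffs, results):
    stat += c * np.asarray(res)
```

With a float accumulator the loop yields `[0.5, -0.5]`, whereas an unsafe cast into the integer accumulator would have truncated the fractional parts.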


return c * (shifted_forward - shifted_backward)
return stat

return tapes, processing_fn
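The pattern of returning the shifted evaluations together with a post-processing function can be sketched without tapes. In this hypothetical version, `first_order_shift` returns shifted parameter vectors instead of tape copies, `circuit` is a made-up function standing in for tape execution, and the scaling factor `a` is applied to the shifted entry, which is an assumption of this sketch.

```python
import numpy as np


def first_order_shift(params, idx, recipe):
    """Return shifted parameter vectors and a function combining their results."""
    params = np.asarray(params, dtype=float)
    coeffs, shifted_params = [], []
    for c, a, s in recipe:
        shifted = params.copy()
        shifted[idx] = a * params[idx] + s
        coeffs.append(c)
        shifted_params.append(shifted)

    def processing_fn(results):
        # linear combination sum_i c_i * results[i]
        return np.dot(coeffs, np.asarray(results, dtype=float))

    return shifted_params, processing_fn


def circuit(p):
    """Made-up stand-in for executing a tape; returns two 'statistics'."""
    return np.array([np.sin(p[0]) * np.cos(p[1]), np.cos(p[0])])


params = [0.3, 0.8]
recipe = [[0.5, 1, np.pi / 2], [-0.5, 1, -np.pi / 2]]
shifted_params, fn = first_order_shift(params, 0, recipe)
grad = fn([circuit(p) for p in shifted_params])
```

For this toy circuit, `grad` matches the analytic derivative with respect to the first parameter, `[cos(0.3) * cos(0.8), -sin(0.3)]`. Keeping the coefficients in a list and combining with `np.dot` means recipes with more than two terms need no special handling.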

@@ -291,21 +306,35 @@ def parameter_shift_second_order(self, idx, params, **options):
dev_wires = options["dev_wires"]

recipe = op.grad_recipe[p_idx]
c, s = (0.5, np.pi / 2) if recipe is None else recipe

# Default values
multiplier = 0.5
a = 1
shift = np.pi / 2

# We set the default recipe following:
# ∂f(x) = c1*f(a1*x+s1) + c2*f(a2*x+s2)
# where we express a positive and a negative shift by default
default_param_shift = [[multiplier, a, shift], [-multiplier, a, -shift]]
param_shift = default_param_shift if recipe is None else recipe
Member

(same here)


c1, a1, s1 = param_shift[0]
c2, a2, s2 = param_shift[1]

shift = np.zeros_like(params)
shift[idx] = s
shift[idx] = s1

# evaluate transformed observables at the original parameter point
# first build the Heisenberg picture transformation matrix Z
self.set_parameters(params + shift)
self.set_parameters(a1 * params + shift)
Z2 = op.heisenberg_tr(dev_wires)

self.set_parameters(params - shift)
shift[idx] = s2
self.set_parameters(a2 * params + shift)
Z1 = op.heisenberg_tr(dev_wires)

# derivative of the operation
Z = (Z2 - Z1) * c
Z = Z2 * c1 + Z1 * c2

self.set_parameters(params)
Z0 = op.heisenberg_tr(dev_wires, inverse=True)