Implementing the Greenkhorn #159
base: master
Conversation
Pull Request Test Coverage Report for Build 1704769432

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
💛 - Coveralls
Codecov Report
@@ Coverage Diff @@
## master #159 +/- ##
==========================================
+ Coverage 95.42% 95.58% +0.16%
==========================================
Files 14 15 +1
Lines 634 680 +46
==========================================
+ Hits 605 650 +45
- Misses 29 30 +1
Continue to review full report at Codecov.
# Greenkhorn is a greedy version of the Sinkhorn algorithm
# This method is from https://arxiv.org/pdf/1705.09634.pdf
# Code is based on implementation from package POT
It would be helpful to describe what the differences are (if there are any apart from implementation details).
Yeah, there are. The paper's version actually differs by just a couple of lines, which are commented out. I'll point it out in the code.
Just realized that it's already in the code, inside the step! function.
u::U
v::V
K::KT
Kv::U # placeholder
Why "placeholder"?
This is a partial solution. Without this Kv, I got an error. It seems that the Sinkhorn structs further on require it. I could not find out how to get rid of it without changing the code for Sinkhorn.
I mean, you use it explicitly below for checking convergence?
Yeah, it's used to check convergence. I think the convergence check could be done more efficiently in another way, but I was having trouble getting that to work with the existing "API" for Sinkhorn, so I update Kv and use the convergence verification already in place. You are right that, as is, Kv is not actually just a placeholder, since it is used in the convergence check.
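For context, a minimal sketch of how a cached Kv buffer can feed a marginal-based convergence check. The helper and its names are hypothetical and only illustrate the idea; they are not the package's actual internals:

```julia
using LinearAlgebra

# Hypothetical helper, not the package's API: reuse the preallocated buffer Kv
# to measure how far the current first marginal u .* (K * v) is from μ.
function marginal_error!(Kv, u, v, K, μ)
    mul!(Kv, K, v)                  # Kv = K * v, computed in place
    return sum(abs, u .* Kv .- μ)   # L1 norm of the marginal violation
end

# convergence could then be declared when marginal_error!(Kv, u, v, K, μ) < atol
```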
i₁ = argmax(abs.(Δμ))
i₂ = argmax(abs.(Δν))
Suggested change:
- i₁ = argmax(abs.(Δμ))
- i₂ = argmax(abs.(Δν))
+ Δμ_max, Δμ_max_idx = findmax(abs, Δμ)
+ Δν_max, Δν_max_idx = findmax(abs, Δν)
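For reference, findmax with a function argument (available since Julia 1.7) returns both the maximum of the transformed values and the index where it is attained:

```julia
Δμ = [-0.3, 0.1, 0.2]
Δμ_max, Δμ_max_idx = findmax(abs, Δμ)  # (0.3, 1): largest |Δμ[i]| and its index
```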
# if ρμ[i₁]> ρν[i₂]
if abs(Δμ[i₁]) > abs(Δν[i₂])
Suggested change:
- # if ρμ[i₁]> ρν[i₂]
- if abs(Δμ[i₁]) > abs(Δν[i₂])
+ if Δμ_max > Δν_max
|
# if ρμ[i₁]> ρν[i₂]
if abs(Δμ[i₁]) > abs(Δν[i₂])
old_u = u[i₁]
Also has to be changed for batch support it seems.
Suggested change:
- old_u = u[i₁]
+ old_u = u[Δμ_max_idx]
# if ρμ[i₁]> ρν[i₂]
if abs(Δμ[i₁]) > abs(Δν[i₂])
old_u = u[i₁]
u[i₁] = μ[i₁]/ (K[i₁,:] ⋅ v)
Suggested change:
- u[i₁] = μ[i₁]/ (K[i₁,:] ⋅ v)
+ u[Δμ_max_idx] = μ[Δμ_max_idx] / dot(K[Δμ_max_idx, :], v)
It would be better to select columns instead of rows and to use views. Julia uses column major order, so a column is close in memory and hence accessing columns is faster.
True! I'll try to come up with something.
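An illustrative sketch of that idea, with made-up sizes, assuming the kernel is stored transposed so that the former row access becomes a contiguous column access:

```julia
using LinearAlgebra

# Illustrative only: with Julia's column-major layout, a column slice is contiguous
# in memory, and @views avoids allocating a copy of the slice.
K = rand(4, 3); v = rand(3); μ = rand(4); u = ones(4); i₁ = 2

Kt = permutedims(K)                        # Kt[:, i] is row i of K, now contiguous
@views u[i₁] = μ[i₁] / dot(Kt[:, i₁], v)   # no temporary row vector is allocated
```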
if abs(Δμ[i₁]) > abs(Δν[i₂])
old_u = u[i₁]
u[i₁] = μ[i₁]/ (K[i₁,:] ⋅ v)
Δ = u[i₁] - old_u
Suggested change:
- Δ = u[i₁] - old_u
+ Δ = u[Δμ_max_idx] - old_u
old_u = u[i₁]
u[i₁] = μ[i₁]/ (K[i₁,:] ⋅ v)
Δ = u[i₁] - old_u
G[i₁, :] = u[i₁] * K[i₁,:] .* v
Again, better to work with columns than with rows. Also some unnecessary allocations here:
Suggested change:
- G[i₁, :] = u[i₁] * K[i₁,:] .* v
+ G[Δμ_max_idx, :] .= u[Δμ_max_idx] .* K[Δμ_max_idx, :] .* v
What is the unnecessary allocation?
Oh, there are multiple unnecessary allocations. First of all, K[i₁, :] creates, i.e. allocates, a row vector. Then u[i₁] * K[i₁, :] scales it and allocates a new row vector. And finally u[i₁] * K[i₁,:] .* v multiplies the entries of u[i₁] * K[i₁,:] elementwise with v and allocates yet another row vector. The alternative suggestion, on the other hand, allocates only K[i₁, :] (which could be avoided by using a view) and then fuses all multiplications and writes the result directly into G without allocating any other row vector.
Thanks for the answer, and sorry for the bad code. I'm still very crude with code optimization.
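To make the allocation argument concrete, a small illustrative comparison (sizes made up for the example):

```julia
n = 1_000
K = rand(n, n); u = rand(n); v = rand(n); G = zeros(n, n); i₁ = 1

# Allocates three temporaries: K[i₁, :], u[i₁] * K[i₁, :], and their product with v.
G[i₁, :] = u[i₁] * K[i₁, :] .* v

# Fused broadcast: the multiplications are fused and written directly into G,
# and @views avoids even the copy of the slice K[i₁, :].
@views G[i₁, :] .= u[i₁] .* K[i₁, :] .* v
```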
u[i₁] = μ[i₁]/ (K[i₁,:] ⋅ v)
Δ = u[i₁] - old_u
G[i₁, :] = u[i₁] * K[i₁,:] .* v
Δμ[i₁] = u[i₁] * (K[i₁,:] ⋅ v) - μ[i₁]
Same here.
Δ = u[i₁] - old_u
G[i₁, :] = u[i₁] * K[i₁,:] .* v
Δμ[i₁] = u[i₁] * (K[i₁,:] ⋅ v) - μ[i₁]
@. Δν = Δν + Δ * K[i₁,:] * v
And here.
G[i₁, :] = u[i₁] * K[i₁,:] .* v
Δμ[i₁] = u[i₁] * (K[i₁,:] ⋅ v) - μ[i₁]
@. Δν = Δν + Δ * K[i₁,:] * v
else
Same comments in the second branch.
This PR is related to #151.

I've implemented the Greenkhorn algorithm, which is a greedy version of the Sinkhorn algorithm. The method I've implemented is actually the one in POT, which is a bit different from the one in the original paper (https://arxiv.org/pdf/1705.09634.pdf).

The implementation needs improvement: I was not able to get it to work with AD or with the batch tests. I was not very involved in the coding of the original Sinkhorn algorithm and had some difficulty getting around the whole step!, solve! and cache structure.

Another point: each Greenkhorn iteration only updates a single entry of u or v, so it needs many more iterations to converge than the original Sinkhorn. Some preliminary benchmarks showed that this implementation of Greenkhorn is slower than the original Sinkhorn_Gibbs, which seems to contradict the claims in the paper. I believe the reason might be that the package's Sinkhorn implementation is much more optimized than my version of Greenkhorn. Another possibility is that the Sinkhorn version used in the paper was not very efficient (if you read the paper, they present a Sinkhorn algorithm that computes diagm(u) K diagm(v) in each iteration).

I've compared the results from my algorithm against POT, and it indeed returns the exact same result at each iteration, i.e. the Greenkhorn implementation seems to be correct but not optimal.
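For illustration, a self-contained sketch of the greedy update idea. The names and structure are made up and do not follow the package's step!/solve!/cache design; in particular, the marginal violations are recomputed from scratch for clarity, whereas an efficient implementation updates them incrementally after each single-entry change:

```julia
using LinearAlgebra

# Illustrative Greenkhorn-style update: pick the worst-violated marginal entry
# and rescale the corresponding entry of u or v.
function greenkhorn_step!(u, v, K, μ, ν)
    G = Diagonal(u) * K * Diagonal(v)      # current transport plan estimate
    Δμ = vec(sum(G; dims=2)) .- μ          # violation of the first marginal
    Δν = vec(sum(G; dims=1)) .- ν          # violation of the second marginal
    Δμ_max, i₁ = findmax(abs, Δμ)
    Δν_max, i₂ = findmax(abs, Δν)
    if Δμ_max > Δν_max
        u[i₁] = μ[i₁] / dot(K[i₁, :], v)   # fix the worst-violated row marginal
    else
        v[i₂] = ν[i₂] / dot(K[:, i₂], u)   # fix the worst-violated column marginal
    end
    return u, v
end

# Tiny usage example
μ = fill(0.25, 4); ν = fill(0.25, 4)
K = exp.(-rand(4, 4) ./ 0.1)
u = ones(4); v = ones(4)
for _ in 1:500
    greenkhorn_step!(u, v, K, μ, ν)
end
```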