This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

forky: benchmark badger vs forky #2143

Open
wants to merge 11 commits into
base: fcds-teenage-mutants

Conversation

jmozah
Collaborator

@jmozah jmozah commented Mar 24, 2020

This PR replaces the storage in the latest fcds branch with badger in order to run benchmarks.
Below are the benchmark results for writing, reading and deleting 1 million chunks, in seconds:

Benchmark                                 | Badger              | fcds-teenage-mutants
------------------------------------------+---------------------+----------------------
BenchmarkWriteOverClean                   | 17.2                | 37.7
BenchmarkWriteOver1Million                | 20.2                | 46.1
BenchmarkReadOver1Million                 | 5.1                 | 31.4
BenchmarkDeleteOver1Million               | 9.5                 | 25.0
BenchmarkWriteReadOver1Million            | 59.1 / 29.1         | 99.8 / 46.4
BenchmarkWriteReadDeleteOver1Million      | 91.7 / 38.3 / 119.4 | 122.5 / 70.8 / 110.5

@jmozah jmozah changed the base branch from fcds-teenage-mutants to master March 24, 2020 11:05
@jmozah jmozah changed the base branch from master to fcds-teenage-mutants March 24, 2020 11:05
@jmozah jmozah self-assigned this Mar 24, 2020
@jmozah jmozah requested review from janos, acud and zelig March 24, 2020 11:31
@janos
Member

janos commented Mar 24, 2020

Can you send instructions on how you ran the benchmarks referenced in the description on the fcds-teenage-mutants branch? These benchmarks do not exist there, and a lot of testing code is changed in this PR.

I think that it is important for a reviewer to be able to reproduce the results you presented.

@janos
Member

janos commented Mar 24, 2020

Could you make this code buildable? It fails with the following errors when running go run build/ci.go install:

go run build/ci.go install                                                                                                  (git)-[fcds-badger] 
util.go:169: package listing failed: exit status 1
go: inconsistent vendoring in /Users/janos/go/src/github.com/ethersphere/swarm:
        github.com/golang/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates github.com/golang/[email protected]
        github.com/pkg/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates github.com/pkg/[email protected]
        golang.org/x/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates golang.org/x/[email protected]
        golang.org/x/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates golang.org/x/[email protected]

run 'go mod vendor' to sync, or use -mod=mod or -mod=readonly to ignore the vendor directory
storage/localstore/mode_put.go:146:14: assignment mismatch: 2 variables but db.data.Put returns 1 values

@jmozah
Collaborator Author

jmozah commented Mar 25, 2020

Can you send instructions on how you ran the benchmarks referenced in the description on the fcds-teenage-mutants branch?

cd to storage/fcds/leveldb directory
go test -bench=. -timeout=60m

Every test has benchmarks for 10k, 100k and 1 million chunks. I have commented out the 10k and 100k tests for brevity. If you want, enable them too and run them to see the performance for smaller sets.
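As an illustration of how such size variants can be organised, here is a minimal sketch that registers one sub-benchmark per chunk count with b.Run. It is not the benchmark code from either branch: the map-based store, the sizes list and the subName helper are assumptions made for the example.

```go
package main

import (
	"fmt"
	"testing"
)

// sizes lists the chunk counts to benchmark (hypothetical values taken
// from the comment above: 10k, 100k and 1 million).
var sizes = []int{10000, 100000, 1000000}

// subName builds the sub-benchmark label for a chunk count.
func subName(n int) string { return fmt.Sprintf("chunks_%d", n) }

// BenchmarkWrite registers one sub-benchmark per entry in sizes.
// A plain map stands in for the real chunk store.
func BenchmarkWrite(b *testing.B) {
	chunk := make([]byte, 4096)
	for _, n := range sizes {
		n := n // capture the loop variable for the closure
		b.Run(subName(n), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				store := make(map[int][]byte, n)
				for k := 0; k < n; k++ {
					store[k] = chunk
				}
			}
		})
	}
}

// main only prints the labels so the sketch runs standalone; under
// "go test -bench" it is BenchmarkWrite that executes.
func main() {
	for _, n := range sizes {
		fmt.Println("BenchmarkWrite/" + subName(n))
	}
}
```

With this layout a single size can be selected from the command line instead of commenting code out, since the -bench pattern is matched per slash-separated element, for example go test -bench 'BenchmarkWrite/chunks_10000$'.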

@jmozah
Collaborator Author

jmozah commented Mar 25, 2020

go: inconsistent vendoring in /Users/janos/go/src/github.com/ethersphere/swarm:
github.com/golang/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates github.com/golang/[email protected]
github.com/pkg/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates github.com/pkg/[email protected]
golang.org/x/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates golang.org/x/[email protected]
golang.org/x/[email protected]: is explicitly required in go.mod, but vendor/modules.txt indicates golang.org/x/[email protected]

You have to run go get -u github.com/dgraph-io/badger to pull all of badger's dependency packages into your vendor directory and get rid of these errors.

storage/localstore/mode_put.go:146:14: assignment mismatch: 2 variables but db.data.Put returns 1 values

Fixed it

@jmozah
Collaborator Author

jmozah commented Mar 25, 2020

@janos This PR is just for benchmarking and not for merging. So please ignore the conflicts.

@acud acud changed the title from "replaced forky-teenage-mutants with badger" to "forky: benchmark badger vs forky" Mar 25, 2020
@janos
Member

janos commented Mar 25, 2020

cd to storage/fcds/leveldb directory
go test -bench=. -timeout=60m

Every test has benchmarks for 10k, 100k and 1 million chunks. I have commented out the 10k and 100k tests for brevity. If you want, enable them too and run them to see the performance for smaller sets.

@jmozah

  • in the fcds-teenage-mutants branch there are no benchmarks in the storage/fcds/leveldb directory, and I cannot find these benchmarks in any other directory
  • in fcds-badger there is no storage/fcds/leveldb directory; everything is removed and the benchmarks are in storage/fcds

I am sorry, but this PR is not reviewable and benchmarks are not reproducible without your direct assistance.

If now can say that I trust your measurements even if I cannot reproduce them.

@acud
Member

acud commented Mar 25, 2020

It is also unclear to me how the benchmarks were done. It's very difficult to compare head to head, as is usual with benchmarks (check out different branches and run the same benchmarks in the same directory).

Also, from the benchmarks that I ran on the branch I saw that the actual benchmark analysis of operation per timeframe is not used but some other CLI printouts say how long it took to insert/read/do some operation. I'm guessing you just divided that by the number of chunks that you were measuring within that run (this was also evident since the results you've pasted were not in the golang benchmark tool output format). Having the benchmark tool measure how long per operation has more significance in my opinion.

@jmozah
Collaborator Author

jmozah commented Mar 25, 2020

in the fcds-teenage-mutants branch there are no benchmarks in the storage/fcds/leveldb directory, and I cannot find these benchmarks in any other directory

oops.. My mistake. I pushed the benchmarks in fcds-teenage-mutants branch now. Please check.

in fcds-badger there is no storage/fcds/leveldb directory; everything is removed and the benchmarks are in storage/fcds

Yes, you don't need any of the leveldb stuff you used to store meta. Badger stores both meta and chunks in different places, similar to your forky implementation. That's why you find only badger-related stuff; I have removed everything related to forky.

The benchmarks (as in fcds-teenage-mutants) are in the fcds-test.go file. This benchmark file is exactly the same as the one in the fcds-teenage-mutants branch.

@janos
Member

janos commented Mar 25, 2020

Thanks @jmozah. It would be very nice if all required code and instructions were shared up front, so that reviewers do not waste time figuring out how and why something is measured or not working.

@jmozah
Collaborator Author

jmozah commented Mar 25, 2020

It is also unclear to me how the benchmarks were done. It's very difficult to compare head to head, as is usual with benchmarks (check out different branches and run the same benchmarks in the same directory).

I am not sure I understand this. I checked out the 2 branches and ran the same benchmarks on them.

Also, from the benchmarks that I ran on the branch I saw that the actual benchmark analysis of operation per timeframe is not used but some other CLI printouts say how long it took to insert/read/do some operation. I'm guessing you just divided that by the number of chunks that you were measuring within that run (this was also evident since the results you've pasted were not in the golang benchmark tool output format). Having the benchmark tool measure how long per operation has more significance in my opinion.

The benchmarks contain other preparation steps, like adding a base of 1 million items before starting the benchmark. Since I wanted to avoid skewing the benchmarks with that preparation, I calculated the time myself.

BTW, The actual benchmark tool also outputs the time per operation and you can check that too.

@acud
Member

acud commented Mar 25, 2020

The benchmarks contain other preparation steps, like adding a base of 1 million items before starting the benchmark. Since I wanted to avoid skewing the benchmarks with that preparation, I calculated the time myself.

Having looked briefly at the benchmarks: the measurement now includes the setup stage. This should be mitigated by stopping the benchmark timer during setup and starting it again after the setup stage.
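A minimal sketch of that idea, assuming a setup that runs once per benchmark invocation: the baseline chunks are written first, and b.ResetTimer then discards the time spent on them so only the measured writes count. The fakeStore type and the benchmarkWrite helper are hypothetical stand-ins, not code from this PR.

```go
package main

import (
	"fmt"
	"testing"
)

// fakeStore is a hypothetical in-memory stand-in for the chunk store under test.
type fakeStore struct{ data map[int][]byte }

func newFakeStore() *fakeStore { return &fakeStore{data: make(map[int][]byte)} }

// Put stores val under key, overwriting any previous value.
func (s *fakeStore) Put(key int, val []byte) { s.data[key] = val }

// benchmarkWrite preloads baseline chunks, then times only the add writes.
func benchmarkWrite(b *testing.B, baseline, add int) {
	chunk := make([]byte, 4096)
	s := newFakeStore()
	for k := 0; k < baseline; k++ { // setup: must not be measured
		s.Put(k, chunk)
	}
	b.ResetTimer() // discard the time spent in setup
	for i := 0; i < b.N; i++ {
		// Later iterations overwrite the same keys; a simplification
		// that keeps memory bounded in this sketch.
		for k := 0; k < add; k++ {
			s.Put(baseline+k, chunk)
		}
	}
}

func main() {
	testing.Init() // needed when calling testing.Benchmark outside "go test"
	res := testing.Benchmark(func(b *testing.B) { benchmarkWrite(b, 10000, 1000) })
	fmt.Printf("%d iterations, %d ns/op\n", res.N, res.NsPerOp())
}
```

If the setup has to be repeated inside the b.N loop, b.StopTimer and b.StartTimer can bracket it instead, at the cost of a slower benchmark run.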

@acud
Member

acud commented Mar 26, 2020

How to benchmark:
Check out the fcds-badger branch, go to the fcds directory and run go test -bench .

Compare to the latest forky on fcds-teenage-mutants:
Check out the fcds-teenage-mutants branch, go to the fcds/leveldb directory and run the benchmarks using the same command.

Measurements were done on a general-purpose DigitalOcean droplet with 32 GB RAM and a 100 GB SSD that demonstrated a steady throughput of 1 GB per second when executing dd if=/dev/zero oflag=direct of=/tmp/test

Results:

fcds (original branch):

goos: linux
goarch: amd64
pkg: github.com/ethersphere/swarm/storage/fcds/leveldb
BenchmarkWrite/baseline_10000/add_10000-8                      5         207771471 ns/op
BenchmarkWrite/baseline_10000/add_20000-8                      3         390771814 ns/op
BenchmarkWrite/baseline_10000/add_50000-8                      1        1082049034 ns/op
BenchmarkWrite/baseline_10000/add_100000-8                     1        2522677795 ns/op
BenchmarkWrite/baseline_100000/add_10000-8                     4         284383422 ns/op
BenchmarkWrite/baseline_100000/add_20000-8                     2         534588746 ns/op
BenchmarkWrite/baseline_100000/add_50000-8                     1        1419511730 ns/op
BenchmarkWrite/baseline_100000/add_100000-8                    1        2991614321 ns/op
BenchmarkWrite/baseline_1000000/add_10000-8                    3         351387130 ns/op
BenchmarkWrite/baseline_1000000/add_20000-8                    2        1244932378 ns/op
BenchmarkWrite/baseline_1000000/add_50000-8                    1        1912965611 ns/op
BenchmarkWrite/baseline_1000000/add_100000-8                   1        3549441460 ns/op
BenchmarkRead/baseline_10000/read_10000-8                     49          23632382 ns/op
BenchmarkRead/baseline_100000/read_10000-8                    28          45459948 ns/op
BenchmarkRead/baseline_100000/read_100000-8                    3         355956945 ns/op
BenchmarkRead/baseline_1000000/read_10000-8                   22          47685787 ns/op
BenchmarkRead/baseline_1000000/read_100000-8                   3         467433824 ns/op
BenchmarkRead/baseline_1000000/read_1000000-8                  1        5064705218 ns/op
PASS
ok      github.com/ethersphere/swarm/storage/fcds/leveldb       483.645s

fcds-badger:

go: finding github.com/dustin/go-humanize v1.0.0
goos: linux
goarch: amd64
pkg: github.com/ethersphere/swarm/storage/fcds
BenchmarkWrite/baseline_10000/add_10000-8         	       5	 218338108 ns/op
BenchmarkWrite/baseline_10000/add_20000-8         	       2	 542248470 ns/op
BenchmarkWrite/baseline_10000/add_50000-8         	       1	1385523605 ns/op
BenchmarkWrite/baseline_10000/add_100000-8        	       1	2683787806 ns/op
BenchmarkWrite/baseline_100000/add_10000-8        	       4	 332439304 ns/op
BenchmarkWrite/baseline_100000/add_20000-8        	       2	 519824904 ns/op
BenchmarkWrite/baseline_100000/add_50000-8        	       1	1261284643 ns/op
BenchmarkWrite/baseline_100000/add_100000-8       	       1	2652950044 ns/op
BenchmarkWrite/baseline_1000000/add_10000-8       	       4	 316898512 ns/op
BenchmarkWrite/baseline_1000000/add_20000-8       	       2	 513771005 ns/op
BenchmarkWrite/baseline_1000000/add_50000-8       	       1	1185518167 ns/op
BenchmarkWrite/baseline_1000000/add_100000-8      	       1	2617738577 ns/op
BenchmarkRead/baseline_10000/read_10000-8         	      30	  37316037 ns/op
BenchmarkRead/baseline_100000/read_10000-8        	      21	  53231185 ns/op
BenchmarkRead/baseline_100000/read_100000-8       	       2	 547170612 ns/op
BenchmarkRead/baseline_1000000/read_10000-8       	      18	  64539537 ns/op
BenchmarkRead/baseline_1000000/read_100000-8      	       2	 659616124 ns/op
BenchmarkRead/baseline_1000000/read_1000000-8     	       1	7065391462 ns/op
PASS
ok  	github.com/ethersphere/swarm/storage/fcds	478.670s

fcds-teenage-mutants:

goos: linux
goarch: amd64
pkg: github.com/ethersphere/swarm/storage/fcds/leveldb
BenchmarkWrite/baseline_10000/add_10000-8         	       4	 352686643 ns/op
BenchmarkWrite/baseline_10000/add_20000-8         	       2	 720940594 ns/op
BenchmarkWrite/baseline_10000/add_50000-8         	       1	1587398136 ns/op
BenchmarkWrite/baseline_10000/add_100000-8        	       1	3548349808 ns/op
BenchmarkWrite/baseline_100000/add_10000-8        	       3	 409889821 ns/op
BenchmarkWrite/baseline_100000/add_20000-8        	       2	 801824238 ns/op
BenchmarkWrite/baseline_100000/add_50000-8        	       1	2071715877 ns/op
BenchmarkWrite/baseline_100000/add_100000-8       	       1	3826236892 ns/op
BenchmarkWrite/baseline_1000000/add_10000-8       	       3	 411788621 ns/op
BenchmarkWrite/baseline_1000000/add_20000-8       	       2	 749673024 ns/op
BenchmarkWrite/baseline_1000000/add_50000-8       	       1	2118999953 ns/op
BenchmarkWrite/baseline_1000000/add_100000-8      	       1	4211655511 ns/op
BenchmarkRead/baseline_10000/read_10000-8         	      69	  24283470 ns/op
BenchmarkRead/baseline_100000/read_10000-8        	      36	  38975640 ns/op
BenchmarkRead/baseline_100000/read_100000-8       	       3	 343036374 ns/op
BenchmarkRead/baseline_1000000/read_10000-8       	      15	  73963726 ns/op
BenchmarkRead/baseline_1000000/read_100000-8      	       2	 845639862 ns/op
BenchmarkRead/baseline_1000000/read_1000000-8     	       1	8387259867 ns/op
PASS
ok  	github.com/ethersphere/swarm/storage/fcds/leveldb	606.165s
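To compare runs like the ones above line by line, the ns/op column can be extracted and divided. The following is a rough sketch only: parseNsPerOp is a hypothetical helper, and it assumes each result line has the shape name, iteration count, value, "ns/op". The two sample lines are copied from the fcds-badger and fcds-teenage-mutants results.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseNsPerOp maps benchmark names to their ns/op value, skipping any
// line that is not a benchmark result (goos, pkg, PASS, ok, ...).
func parseNsPerOp(output string) map[string]float64 {
	results := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		// Assumed shape: name, iterations, value, "ns/op".
		if len(f) < 4 || !strings.HasPrefix(f[0], "Benchmark") || f[3] != "ns/op" {
			continue
		}
		ns, err := strconv.ParseFloat(f[2], 64)
		if err != nil {
			continue
		}
		results[f[0]] = ns
	}
	return results
}

func main() {
	badger := parseNsPerOp("BenchmarkRead/baseline_1000000/read_1000000-8  1  7065391462 ns/op")
	mutants := parseNsPerOp("BenchmarkRead/baseline_1000000/read_1000000-8  1  8387259867 ns/op")
	for name, b := range badger {
		if m, ok := mutants[name]; ok {
			// A ratio below 1 means the badger run took less time per op.
			fmt.Printf("%s: badger/teenage-mutants = %.2f\n", name, b/m)
		}
	}
}
```

Many of these results come from a single iteration, so a ratio computed this way is indicative at best; repeated runs would be needed before drawing conclusions.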

3 participants