Optimize checking gates #172

Open · wants to merge 1 commit into master
Conversation

up2jj commented Apr 5, 2024

We use actor-based gates pretty extensively during our feature rollout process, and we noticed significant slowdowns when a flag contains a large number of gates (>20k) and is checked multiple times.

The current approach relies on an intermediate list created by Enum.filter/2. With the new approach, only a single reduce pass is used.
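
Roughly, the change is the following. This is a minimal sketch of the pattern only, not the actual diff: the gate fields, the module, and the use of reduce_while (which also allows an early exit) are assumptions for illustration.

```elixir
defmodule GateCheckSketch do
  @moduledoc """
  Illustrative only: gates are assumed to be flat maps such as
  %{type: :actor, for: "user:42", enabled: true}. The field names and this
  module are made up for the example and are not FunWithFlags internals.
  """

  # Old approach: Enum.filter/2 first builds an intermediate list of all
  # actor gates, which is then searched for the one matching this actor.
  def check_actor_gate_old(gates, actor_id) do
    gates
    |> Enum.filter(fn gate -> gate.type == :actor end)
    |> Enum.find(fn gate -> gate.for == actor_id end)
    |> case do
      nil -> :ignore
      gate -> {:ok, gate.enabled}
    end
  end

  # New approach: a single pass over the gates, no intermediate list,
  # halting as soon as the matching actor gate is found.
  def check_actor_gate_new(gates, actor_id) do
    Enum.reduce_while(gates, :ignore, fn
      %{type: :actor, for: ^actor_id, enabled: enabled}, _acc -> {:halt, {:ok, enabled}}
      _gate, acc -> {:cont, acc}
    end)
  end
end
```
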

Benchmark below (5k gates):

--------------------------------------------------------------
$CACHE_ENABLED=true
$PERSISTENCE=ecto
$RDBMS=postgres
$PUBSUB_BROKER=phoenix_pubsub
$CI=
--------------------------------------------------------------
Elixir version:        1.16.2
Erlang/OTP version:    Erlang/OTP 26 [erts-14.0] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1] [jit]
Logger level:          :error
Cache enabled:         true
Persistence adapter:   FunWithFlags.Store.Persistent.Ecto
RDBMS driver:          Ecto.Adapters.Postgres
Notifications adapter: FunWithFlags.Notifications.PhoenixPubSub
Anything using Redis:  false
--------------------------------------------------------------
Excluding tags: [:redis_pubsub, :redis_persistence]

Operating System: macOS
CPU Information: Apple M1 Max
Number of Available Cores: 10
Available memory: 64 GB
Elixir 1.16.2
Erlang 26.0
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 2 s
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking new ...
Benchmarking old ...
Calculating statistics...
Formatting results...

Name                 ips        average  deviation         median         99th %
new        1.49 K      669.04 μs     ±1.22%      667.92 μs      692.98 μs
old        1.29 K      775.76 μs    ±12.13%      736.63 μs     1028.36 μs

Comparison:
new        1.49 K
old        1.29 K - 1.16x slower +106.73 μs

Memory usage statistics:

Name          Memory usage
new       468.84 KB
old       546.98 KB - 1.17x memory usage +78.14 KB

**All measurements for memory usage were the same**

tompave (Owner) commented Apr 5, 2024

Hey there, thank you for using the library and for the PR.

This seems a sensible change, thanks!
I'll run some local benchmarks. In the meantime, do you think it would make sense to extend this change to the group gates?

As an aside, I'm glad to hear that the library scales to over 20k actor gates 😆, but in that case would it make sense to use a group gate instead? I doubt that you're managing 20k actors with the web UI, you must be doing so programmatically (maybe with a remote console). In that case, you could also flip some attribute on your users to make them part of a beta-testers group.
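
For reference, the group-gate route suggested above could look roughly like this. The FunWithFlags.Group protocol and its in?/2 callback are part of the library's public API; the MyApp.User struct and its beta_tester field are made-up examples.

```elixir
# Assumed example struct; any struct carrying a boolean attribute would do.
defmodule MyApp.User do
  defstruct [:id, :beta_tester]
end

defimpl FunWithFlags.Group, for: MyApp.User do
  # A user belongs to the beta-testers group when the (hypothetical)
  # beta_tester attribute is set on the record. Match the group name as
  # either an atom or a binary, depending on how the gate was created.
  def in?(%MyApp.User{beta_tester: beta?}, group)
      when group in [:beta_testers, "beta_testers"],
      do: !!beta?

  def in?(_user, _group_name), do: false
end

# Then a single group gate replaces tens of thousands of actor gates:
#
#   FunWithFlags.enable(:new_feature, for_group: :beta_testers)
#   FunWithFlags.enabled?(:new_feature, for: user)
```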

mentero commented Apr 5, 2024

Hey, we often use fun_with_flags to coordinate feature rollouts. We have data migration tasks that go over our actors one by one, perform the data migration necessary for the feature to work, and, if the migration succeeds, enable the flag for that actor.
After that, we usually create a boolean gate and start the code clean-up so we can remove the flag completely.

This usually worked fine, and we really enjoyed the fine-grained control over the release process. In this particular case, though, the flag was heavily used inside the codebase, and we were taken by surprise by the accumulated performance hit of the cache lookups as the flag filled up with more and more actor gates. This of course made a lot of sense once we discovered it was the issue, and the hotfix was quite simple: we replaced all the actor gates with a single boolean gate (as the migration was done by that time).

We are working on a coldfix right now, with a few ideas in mind, such as moving the flag checks higher up the call stack or introducing yet another cache layer that remembers all the flags for a given actor, which should be a more suitable strategy for our case (see the sketch below).
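
As an illustration of that last idea (purely hypothetical, not something fun_with_flags ships): a request-scoped memoization layer that remembers flag check results per actor. The module name is made up; FunWithFlags.enabled?/2 and FunWithFlags.Actor.id/1 are the library's public API.

```elixir
defmodule MyApp.ActorFlagCache do
  @moduledoc """
  Hypothetical sketch: memoize flag checks for an actor in the process
  dictionary, so repeated checks within one process (e.g. one Phoenix
  request) hit the flag store only once. Entries die with the process,
  so there is no invalidation to manage.
  """

  @spec enabled?(atom, actor :: any) :: boolean
  def enabled?(flag_name, actor) do
    key = {__MODULE__, flag_name, FunWithFlags.Actor.id(actor)}

    case Process.get(key) do
      nil ->
        # First check for this flag/actor in this process: ask the library
        # and remember the boolean result.
        result = FunWithFlags.enabled?(flag_name, for: actor)
        Process.put(key, result)
        result

      cached ->
        cached
    end
  end
end
```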

up2jj (Author) commented Apr 8, 2024

> Hey there, thank you for using the library and for the PR.
>
> This seems a sensible change, thanks! I'll run some local benchmarks. In the meantime, do you think it would make sense to extend this change to the group gates?
>
> As an aside, I'm glad to hear that the library scales to over 20k actor gates 😆, but in that case would it make sense to use a group gate instead? I doubt that you're managing 20k actors with the web UI, you must be doing so programmatically (maybe with a remote console). In that case, you could also flip some attribute on your users to make them part of a beta-testers group.

@tompave Taking into account that all these kinds of gates are stored in the same structure (a list), a large number of actor gates can also affect checking the other gate types. I would like to take the same approach for group gates, so I will update the PR accordingly. Please let me know if there is anything else I can do.

up2jj changed the title from "Optimize checking actor gate" to "Optimize checking gates" on Apr 8, 2024
tompave (Owner) commented Apr 12, 2024

Hello, thank you for iterating on this.

I just wanted to align expectations on the timeline. This week I've been too busy to look at this, and I'll be travelling for the next few weeks. I'll look at this when I'm back, in May.

up2jj (Author) commented May 22, 2024

Hello @tompave

I hope your trip went well 😄 Could I help you in any way with this issue?
