Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebase fork with upstream #4

Closed
wants to merge 45 commits into from
Closed

Rebase fork with upstream #4

wants to merge 45 commits into from

Conversation

eaguilera23
Copy link

@eaguilera23 eaguilera23 commented Oct 18, 2024

The commits we were using got merged into master

kafka4beam#588
kafka4beam#580
kafka4beam#578

zmstone and others added 30 commits June 15, 2024 12:37
Mostly due to Kafka 0.9 does not cope well when max_bytes is 0.
Also, it does not make sense to fetch less than 12 bytes because
it needs at least 12 bytes to figure out the remaining bytes
of a message set.
Approaches OTP 27 -based development
I noticed rebar.lock was outdated against rebar.config
format_status/1 was introduced in OTP 25

format_status/2 is now having brod compilation fail,
because of warnings_as_errors (in OTP 27)

This commit presents one possibility to solve
the issue: adopt the new format_status, which means
that effectively brod becomes >= OTP 25

Another possibility is to remove warnings_as_errors
and keep current behaviour until the function
is removed from OTP

I opted to remove format_status/2 from brod_supervisor/3
as it's doing mostly nothing relevant except "flagging"
state.module as "Callback"
I've observed a data race when starting a consumer from within another
consumer (using co-partitioned topics), where the call to
`get_partition_worker` fails due to badarg in `is_process_alive`.

It seems that 1f2290b ("Verify partition worker process is alive.",
2022-05-19) already tried to resolve the data race, but did not consider
the possibility that the lookup returns `undefined`.

Example exit report from Elixir logger:
```
Last message: {:EXIT, #PID<0.2613.0>, {:badarg, [{:erlang, :is_process_alive, [:undefined], [error_info: %{module: :erl_erts_errors}]}, {:brod_client, :get_partition_worker, 2, [file: ~c"/app/deps/brod/src/brod_client.erl", line: 496]}, ...MyCallChain]}}
```
…d_pid_in_get_partition_worker

Guard against undefined pid in get partition worker
Remove snappyer from default dependency
`brod:transaction` spec is:

```erlang
-spec transaction(client(), transactional_id(), transaction_config()) -> {ok, transaction()}.
```

and brod_transaction:init also has:

```erlang
init({Client, TxId, PropListConfig}) ->
  ClientPid = pid(Client),
```

but new and start_link were expecting only a pid. The client() spec
itself is:

```erlang
-type client() :: client_id() | pid().
```
Fix type spec in brod_transaction
`commit_fun` is already exposed, but `ack_fun` is not. This is useful
when acking does not happen inside the worker process.

E.g. We divide messages between multiple processes and later in a
separate process we create a transaction and commit a batch. Now we need
to let the consumer know that it's okay to fetch more messages.
Expose ack_fun through the init_info
Fix typo in brod_consumer documentation
indrekj and others added 15 commits August 12, 2024 13:16
The brod:fold/8 did not work correctly when the last message in the
message set a transaction commit.

The output looked like this:
```
%%% brod_consumer_SUITE ==> t_fold_transactions: FAILED
%%% brod_consumer_SUITE ==> {function_clause,
    [{lists,last,[[]],[{file,"lists.erl"},{line,228}]},
     {brod_utils,handle_fetch_rsp,7,
         [{file,"/home/indrek/gems/brod/src/brod_utils.erl"},{line,661}]},
     {brod_consumer_SUITE,t_fold_transactions,1,
         [{file,"/home/indrek/gems/brod/test/brod_consumer_SUITE.erl"},
          {line,443}]},
     {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1783}]},
     {test_server,run_test_case_eval1,6,
         [{file,"test_server.erl"},{line,1292}]},
     {test_server,run_test_case_eval,9,
         [{file,"test_server.erl"},{line,1224}]}]}
```

The issue was that `#kafka_message{offset = LastOffset} = lists:last(Msgs),`
was used, when the Msgs list was empty.

Now instead of trying to infer next offset from the kafka_message, we
pass the NextFetchOffset from the brod_utils:fetch_one_batch/4 function.

Fixes kafka4beam#588
Add sub-section to authentication section with a link to brod_oauth
for oauth bearer support.
…plugins-section

Add additional auth plugins section
ci: use upload-artifacts v4 remove coveralls
Introduced `share_leader_conn` consumer config option.
Set to `true' to consume less TCP connections towards Kafka,
but may lead to higher fetch latency. This is because Kafka can
ony accumulate messages for the oldest fetch request, later
requests behind it may get blocked until `max_wait_time' expires
for the oldest one
…e-own-connection

feat: add boolean flag 'share_leader_conn' in consumer config
Copy link
Member

@indrekj indrekj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ranvier does not use our custom fork anymore https://github.com/salemove/ranvier/blob/master/elixir/mix.exs#L32 so I don't think we need to do anything in this repo

@eaguilera23
Copy link
Author

Should we delete the fork? @indrekj

@eaguilera23 eaguilera23 deleted the ea-rebase-fork branch October 18, 2024 11:45
@indrekj
Copy link
Member

indrekj commented Oct 18, 2024

Depends, If all applications use ranvier that doesn't use the fork any more, then yes we can delete it. Otherwise no.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

8 participants