Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix line skipping issue in receive_lines method #4491

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

yugeeklab
Copy link

@yugeeklab yugeeklab commented May 10, 2024

Which issue(s) this PR fixes:
Fixes #4494

What this PR does / why we need it:
Before this patch, long lines could cause breakdowns in fluentd, potentially posing a vulnerability. With this patch, max_line_size will be integrated into the FIFO, enabling the system to skip lines exceeding the maximum size before executing receive_lines.

Docs Changes:

Release Note:

@daipom daipom self-requested a review May 13, 2024 02:54
@daipom
Copy link
Contributor

daipom commented May 13, 2024

@yugeeklab Thanks for this fix!
CI is currently unstable because of #4487. We will fix it. Sorry for the trouble.

I see the intent of this fix as follows.

  • In the current implementation, large lines that would eventually be discarded in receive_lines are temporarily held in IOHandler's @lines.
  • This is a waste of memory.
  • This PR resolves the waste.

Surely, such a fix would allow us to limit memory consumption by the max_line_size setting to some extent!

This PR would be effective to some extent, however I believe the problem of memory consumption will remain.
It would be possible that FIFO's @buffer becomes unlimitedly large if the @eol does not appear in the data.

Are these my understandings correct?

@yugeeklab yugeeklab marked this pull request as ready for review May 13, 2024 09:18
@yugeeklab
Copy link
Author

yugeeklab commented May 13, 2024

Hi, @daipom

I've just published an issue #4491 for more information.

This PR would be effective to some extent, however I believe the problem of memory consumption will remain.
It would be possible that FIFO's @buffer becomes unlimitedly large if the @EOL does not appear in the data.

When max_line_size isn't set, FIFO's @buffer can grow indefinitely. Or if max_line_size has large value, FIFO's buffer will be limited, but there's still a possibility of fluentd experiencing slowdowns.

Summary:

as-is: max_line_size helps you avoid buffer overflow configuring via buffer section.
to-be: max_line_size helps prevent buffer overflow by configuring the buffer section and also ensures FIFO's buffer size remains limited.

If you have any suggestions, such as the fifo_buffer_size parameter or any other ideas, please feel free to discuss them with me.

Thank you for your review!

@daipom
Copy link
Contributor

daipom commented May 13, 2024

@yugeeklab

I've just published an issue #4491 for more information.

Thanks so much!

as-is: max_line_size helps you avoid buffer overflow configuring via buffer section. to-be: max_line_size helps prevent buffer overflow by configuring the buffer section and also ensures FIFO's buffer size remains limited.

Now I understand!
The following understanding was not correct.

This PR would be effective to some extent, however I believe the problem of memory consumption will remain.
It would be possible that FIFO's @buffer becomes unlimitedly large if the @eol does not appear in the data.

This fix clears FIFO's @buffer when read_lines.
So, this fix ensures FIFO's buffer size remains limited.

If you have any suggestions, such as the fifo_buffer_size parameter or any other ideas, please feel free to discuss them with me.

Thanks!
Basically, it seems to be a very good idea to limit the FIFO's buffer.

@yugeeklab
Copy link
Author

Basically, it seems to be a very good idea to limit the FIFO's buffer.

Thank you for your comment!! @daipom

Please let me know if there's any feedback on my code or idea. I'll review and accept your feedback as soon as possible.

Thank you.

@daipom
Copy link
Contributor

daipom commented May 17, 2024

About CI failures, although #4493 has been resolved, we still need to resolve #4487.

@daipom
Copy link
Contributor

daipom commented May 17, 2024

@yugeeklab The CI issue has been resolved. Sorry for the trouble.
Could you please rebase this branch on the latest master?

Before this patch, long lines could cause breakdowns in fluentd, potentially posing a vulnerability. With this patch, max_line_size will be integrated into the FIFO, enabling the system to skip lines exceeding the maximum size before executing receive_lines.

Co-authored-by: yugeeklab <[email protected]>
Co-authored-by: moggaa <[email protected]>
Signed-off-by: been.zino <[email protected]>
@yugeeklab
Copy link
Author

Hi, @daipom

Rebase is done!!

Thank you for your review!!

@daipom
Copy link
Contributor

daipom commented May 27, 2024

Sorry for waiting.
I will review this soon.

Copy link
Contributor

@daipom daipom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yugeeklab Thanks for this fix!
This fix basically looks good to me!
I've commented on some minor details (about the following), please check!

  • Keeping the same debug log as before
  • Improving codes
  • Improving tests

lib/fluent/plugin/in_tail.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/in_tail.rb Outdated Show resolved Hide resolved
lib/fluent/plugin/in_tail.rb Outdated Show resolved Hide resolved
test/plugin/test_in_tail.rb Outdated Show resolved Hide resolved
test/plugin/in_tail/test_fifo.rb Outdated Show resolved Hide resolved
@@ -146,5 +146,36 @@ def create_watcher
assert_equal 5, returned_lines[0].size
assert_equal 3, returned_lines[1].size
end

test 'call receive_lines some times when long line(more than 8192) and max_line_size is 4096' do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like this test to be more clear about what is being tested.

This test does not seem to be intended to test if receive_lines is called more than once (some times).
(Actually, it is called only once in this test.)

Perhaps it would be better to make this test independent of the when limit is 5 sub test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

in_tail plugin can cause breakdowns in fluentd
3 participants