Tlog is recording rsync data #327

Open
vincentwolsink opened this issue Apr 2, 2021 · 10 comments

Comments

@vincentwolsink

When using tlog to record sessions, it will actually capture rsync data. This will seriously grow the log files, since it is effectively copying all the files being transferred to the logs.

tlog     23556 23555 86 08:02 ?        00:03:46 tlog-rec-session -c rsync --server --sender -logDtpre.iLsfxC . /data/analytic_events/authorization/logdate...
-rw------- 1 root root 642G Apr  1 19:20 /var/log/tlog.log

Using a non-recorded user for rsyncing is not really a feasible solution/workaround because of file ownership and permissions.

Is there any way in which tlog can detect this is not a shell session? Can we exclude sessions from being logged when a certain command is invoked when starting the session, in this case rsync --server for example?

@justin-stephenson
Collaborator

Tlog intercepts and forwards streams of I/O data across a pseudoterminal, from an outsider's point of view. Handling the I/O is done in an abstracted way, by design. For this reason there is no functionality to have tlog "filter" out certain commands. Some filtering could be allowed in tlog-play when reading back the recorded messages, but that does not help in this case.

You might look into rate limiting. See this section of the tlog-rec-session.conf(5) man page:

   limit - Logging limit object
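
For readers unfamiliar with that section, here is a minimal sketch of what such a limit object might look like and what it would mean for log size. The key names (`rate`, `burst`, `action`) and the values are my reading of the man page and should be checked against your installed version. The size estimate assumes a generic token-bucket limiter that drops excess data, which may not match tlog's exact accounting.

```python
import json

# Assumed key names and values; verify against tlog-rec-session.conf(5).
limit = {
    "rate": 16384,     # bytes per second allowed into the log
    "burst": 32768,    # bytes allowed above the rate before the action applies
    "action": "drop",  # one of "pass", "delay", "drop"
}


def max_logged_bytes(duration_s: int, rate: int, burst: int) -> int:
    """Rough upper bound on logged bytes for a token bucket that drops excess."""
    return burst + rate * duration_s


if __name__ == "__main__":
    # Fragment that would go inside the JSON configuration file.
    print(json.dumps({"limit": limit}, indent=4))
    # A one-hour transfer under these assumed settings.
    print(max_logged_bytes(3600, limit["rate"], limit["burst"]))
```

Under these assumptions a one-hour transfer would still put roughly 59 MB into the log, so rate limiting bounds the growth but does not remove it.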

@vincentwolsink
Author

vincentwolsink commented Apr 12, 2021

Thanks for your answer. I understand. Rate limiting does make things a bit better, but still a lot of data will end up in the logs. The thing I am suggesting is not "filtering" certain commands, but not logging the specific session at all.

In the example, the -c flag makes the user's shell run rsync and exit afterwards, so the entire session will be rsync data and can be ignored. Since this -c rsync --server ... argument is passed to tlog as a regular argument, I assume it should be easy to use it to decide whether to log or not?

Of course the invocation arguments to exclude logging on should be chosen very carefully by the system administrator.

I might be able to craft some PR for this myself. But just discussing here to see if it makes any sense.

@justin-stephenson
Collaborator

The most common case is for tlog-rec-session to be used as the user's shell. It then starts the user's real shell underneath itself in an interactive login session, in which tlog has no way to isolate file descriptor I/O by command or process (it sees only input, output, and window size changes). Sorry, I don't think it is something that we can resolve on the tlog side.

@vincentwolsink
Author

vincentwolsink commented Apr 13, 2021

I agree that what you describe is the most common use case for tlog. But if you enable session recording in SSSD, tlog will sit in between every session, not only interactive ones. And since rsyncing or scping files (scp suffers from exactly the same issue) is also a very common use of a Linux system, in my opinion these sessions cannot be ignored.

I am not talking about interactive sessions where you need to isolate file descriptors or filter a stream; I understand that is very difficult. The issue is with non-interactive sessions where either rsync or scp is spawned directly without any tty, and the only purpose of the input/output is to transfer file data. These sessions should not be logged at all.

Another example:

root     30256  0.0  0.0 180384  5640 ?        Ss   09:35   0:00  \_ sshd: user2481 [priv]
user2481 30276 11.9  0.0 180524  2568 ?        S    09:35   0:01      \_ sshd: user2481@notty
tlog     30277  3.4  0.0 227124  3812 ?        Ss   09:35   0:00          \_ tlog-rec-session -c scp -t /tmp/
user2481 30278  3.8  0.0 186676  2848 ?        S    09:35   0:00              \_ scp -t /tmp/

@justin-stephenson
Collaborator

Sorry for the delayed response. If you have some solution to make a configurable list of commands (read at startup from the tlog config file) which can be ignored by tlog-rec-session in the '-c' invocation case, then feel free to submit a PR for that.
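
To make the idea concrete, here is a rough sketch of the decision logic such a PR might implement, written in Python for brevity rather than in tlog's own C. Everything in it is hypothetical: tlog has no such setting in this thread, and `EXCLUDED_PREFIXES` and `should_record` are illustrative names only. It assumes tlog-rec-session receives the command as a single string after `-c`, as in the process listings above.

```python
import shlex

# Hypothetical exclusion list an administrator might configure.
EXCLUDED_PREFIXES = [
    ["rsync", "--server"],
    ["scp", "-t"],
    ["scp", "-f"],
]

# Characters that let a shell string run more than one command.
SHELL_METACHARS = set(";|&`$()<>\n")


def should_record(argv: list[str]) -> bool:
    """Return False only for a '-c' command that starts with an excluded prefix."""
    if len(argv) < 2 or argv[0] != "-c":
        return True  # no '-c' command, e.g. interactive login: always record
    raw = argv[1]
    if SHELL_METACHARS & set(raw):
        return True  # could chain extra commands: record to be safe
    try:
        command = shlex.split(raw)
    except ValueError:
        return True  # unparsable quoting: record to be safe
    return not any(command[: len(p)] == p for p in EXCLUDED_PREFIXES)


if __name__ == "__main__":
    print(should_record(["-c", "scp -t /tmp/"]))   # excluded prefix
    print(should_record(["-c", "ls -l"]))          # ordinary command
```

The sketch records by default and skips recording only on an exact prefix match, because the `-c` string is handed to a shell and something like `rsync --server ...; bash` would otherwise slip through unrecorded.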

@bluikko

bluikko commented Apr 9, 2022

This is quite an important limitation. I do not yet know anything about the architecture, since I hit this issue while trying this system for the first time, but could it possibly be resolved at some other layer, for example sssd?

@NeilHanlon

NeilHanlon commented Dec 13, 2023

I'm also quite interested in a solution for this, as disabling output logging is the only reasonable solution, and that sorta.. defeats the purpose of tlog. In my case this delta is quite severe. ~4MB/s with output logging on, 230MB/s with logging off.

logging on:

file
     81,100,800   0%    4.82MB/s    0:34:29

no logging:

file
    715,587,584   6%  227.63MB/s    0:00:41
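
As a quick check on the size of that gap, the two rates reported by rsync above can be compared directly. This uses only the figures from the two progress lines; nothing else is measured.

```python
# Transfer rates copied from the two rsync progress lines above.
logged_mb_s = 4.82
unlogged_mb_s = 227.63

slowdown = unlogged_mb_s / logged_mb_s
print(f"{slowdown:.1f}x slower with output logging on")
```

That works out to a slowdown of roughly 47 times for this one transfer.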

@kees-closed

Does that work for both directions of an rsync? Because you can pull and push files, maybe that uses either input or output for those scenarios?

@NeilHanlon

Good question. It does not seem to matter if I am pushing or pulling, as long as the target machine has tlog disabled.

@kees-closed

A different trade-off is that by logging input instead of output, you log passwords as well.

#77 (comment)

So maybe the sweet spot is to log only the terminal size and output, and apply congestion control. Of course, the trade-off then is that you can saturate the session and type things without them being logged. From my experiments I conclude that no tlog setup solves all of the security risks and performance issues.
