-
Notifications
You must be signed in to change notification settings - Fork 26
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
3 changed files
with
6 additions
and
2 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
[package] | ||
name = "termchat" | ||
version = "1.0.0" | ||
version = "1.1.0" | ||
authors = ["lemunozm <[email protected]>"] | ||
description = "Distributed LAN chat application in the terminal" | ||
edition = "2018" | ||
|
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, nice refactor!
There are a couple of issue on this release:
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sigmaSd
You are right 🤦🏻♂️ , I left that. I had tested with the stderr redirected and it silently survived.
The second error is more serious. I will fix it, thanks!!
I hope that with the new command-action interface, the addition of new features and command will be simpler :)
Thanks for notifying this!
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I test the receiver and in my case it is not blocked...
I run an intensive test: 1GB file in both directions at the same time, and writing in the meantime.
What is exactly your test? To be sure, all termchat instance use the same version?
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes its the same git version, Just sending any file block the receiver, I'll debug it when I have time and report back.
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm... You let me crazy :)
The only place where other entity can lock the controller at the same time is in: https://github.com/lemunozm/message-io/blob/master/src/network_adapter.rs#L301
The only case I can imagine is that the receiver is always reading from the network.j
Could you confirm that when the file transfer is completed for the receiver (the file are entirely downloaded in
/tmp
directory), the controller unlocks and the app can be usable again?d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is rare because this means that you are reading a file and sending the data in one application faster than only the action of reading from the socket in the other app.
I will investigate, Thanks!
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I can confirm that its like you described, it unlocks and resumes after the sending is finished.
One more issue that I noticed, the files sent are modified, diff file /tmp/termchat/user/file show it differs, apparently termchat somehow appends zeroes at the end, with hexdump I can see this at the end
0000 0000 0000 0000 0000 0000 0000 0000
which does no exist on the original filesd873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note you need to remove the file after each send, so the send doesn't append to it, this is ok its an expected behaviour we can check for it later and maybe error if a file exist.
But the error I'm describing is different termchat is actually appends 0 bytes to the end of the original file.
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the information & help, @sigmaSd, I will analyze :)
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the appending 0s issue https://github.com/lemunozm/termchat/blob/master/src/commands/send_file.rs#L65 needs to be data[..bytes_end]
But there is another issue, when sending a file now sometimes its not totally received, its like some messages are lost:
If I add dbg!("send") here https://github.com/lemunozm/termchat/blob/master/src/commands/send_file.rs#L65
and dbg!("recv") here https://github.com/lemunozm/termchat/blob/master/src/application.rs#L135
I get send a lot more then recv
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/lemunozm/termchat/blob/master/src/commands/send_file.rs#L76 using unwrap here instead of ok reveled the issue
Side note: those .ok() makes debugging harder, you should use expect so you don't silently ignore the error
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It errors here to be exact https://github.com/lemunozm/message-io , maybe you should handle wouldblock
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the deep investigation :)
The "WouldBlock at send" is a pending task... It is hard to debug it because very rarely I get it on my PC. I have to try to develop an integration test that can cause it in message-io.
The idea behind with the
send.ok()
andsend_all.ok()
is that this function should never fail (In the future, even there is no Result). The user should no control those errors at that point. Instead, a Disconnection event (with maybe some kind of extra information) should be dispatched. Leaving the API cleaned to the user. And only treating the errors on one side.Regarding the solution, It is not trivial: When Wouldblock appears, an internal buffer should be created and the remaining data content should be copied to it and try to send later. Along with it, a POLLIN event for trying to send later should be added to the internal message-io poll.
As a temporal fix... only I can found to put the send() send_all() methods into a loop until the result was Ok, or maybe increasing/reducing the chunk size.
Regarding it, the only difference with the previous termchat version is that the CHUNK size is a little bit smaller (in order don't split the message in the messge-io receiver input buffer and increase the performance). Maybe this size means something in your computer and generates those errors that I have not.
Thanks!!
Is the link ok?
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the link https://github.com/lemunozm/message-io/blob/master/src/network_adapter.rs#L249
I think looping and checking for would block is not that bad and its a good idea till this is handled in message-io at some-point,
Regarding ok() I understand what you're going for, but I'm just suggestion to change it to expect till message-io make it not return a result, since as you can see in this issue sometime it does fail, and those errors becomes harder to debug when they're ignored
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I am going to make the fix (https://github.com/lemunozm/message-io/tree/fix/fast_fix_send_would_block) and update termchat to work with it.
When I have time, I want to investigate to create a better implementation.
I agree with the
send()
method. The problem is withsend_all()
, that maybe one endpoint fails but others could still receiving data. Anyway, with the previous fix, no errors will be generated from send() (If it occurs even with the internal send loop, a disconnection event will be generated).d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sigmaSd ,
I think this PR can solve your issue: lemunozm/message-io#12
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that fixes the panic when sending,
And I saw that you fixed the data sent in termchat
Now the issue that remains is the blocking of the receiver
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The blocking is here https://github.com/lemunozm/message-io/blob/master/src/network_adapter.rs#L342
I'm not receiving would block while the file is being sent, so that thread never yield the controller lock.
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why it is blocked but not why it happens:
From my point of view, the sending process should be slower than the receiving process because of reading the disk. What kind of hard drive do you have? 😄
I restored the original chunk size to 65535, maybe this implies something because it is the only real difference from the previous version.
The solution to this error implies an architectural modification (non-trivial) in message-io to allow simultaneously write-read, which should exist.
Thanks to these problems
message-io
get better :Dd873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well increasing the chunk size works like magic (I tested with your commit on maser)
I have an old pc with an HDD drive, maybe that explains it
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is dark magic that we are not ready to understand yet... 😃
I hope it can work until I fix message-io. Thanks for your help!
d873968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I upload message-io 0.4.6 and when point to that the issue comes for me and blocks the receiver :|
I reduced, even more, the chunk to make the reading file process relatively more expensive.
I will fix the problem in newer message-io versions.
Added to termchat v1.1.2