-
Notifications
You must be signed in to change notification settings - Fork 86
add async support using bisync crate #176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
d7ba16b
to
8c5f264
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An interesting approach! I want to dig into the docs in more detail but it seems workable - modulo that AFIT feature-flag that has appeared.
It might also be good to move the example import renames to another PR to keep this one cleaner so we can see precisely only that which is affected by the async additions. |
That goes towards a bigger question: do you want to break the public API by moving the existing API into a new There's the whole blocking API of the crate, then the I think this would be rather confusing. It's difficult to even tell that I clicked a link and navigated to another page because it looks so similar to the documentation's front page. By moving the existing API into a |
I've avoided running |
I had to use the The other places I used |
Yes, leaking a file handle means you can't reopen it, and handles are a finite resource. Plus certain context is only written to disk on file close so the disk ends up needing a fsck/scandisk pass. As long as that only affects the async users that's fine. Moving the blocking API into the blocking module is fine with me. |
Can you rebase? #177 might make this smaller. |
Is it possible to give the async version a blocking drop? Better than doing nothing. |
d75eb0f
to
de430ba
Compare
Okay, rebased.
Unfortunately not. That would require this crate to bring in its own async executor and set that up just within |
Is there no no_std block_on! ? Can't you just spin polling the Future? |
I just found embassy_futures::block_on, however, I don't think it's a good idea to use it to impl Drop for async Files/Volumes. We can't know if a user would want that and there's no way to opt out of Drop::drop getting called if it's implemented. For example, the user could have strict power consumption requirements that require putting the MCU to sleep on every |
But it's probably better than a corrupt file system, and any correct program will close them asynchronously before dropping. What do other async File APIs do? |
impl Drop for File in async-std: impl Drop for File {
fn drop(&mut self) {
// We need to flush the file on drop. Unfortunately, that is not possible to do in a
// non-blocking fashion, but our only other option here is losing data remaining in the
// write cache. Good task schedulers should be resilient to occasional blocking hiccups in
// file destructors so we don't expect this to be a common problem in practice.
let _ = futures_lite::future::block_on(self.flush());
}
} Suggestion: impl Drop for async File/Volume using |
3c49859
to
7b15cf5
Compare
Nevermind, I forgot |
Some blogs about async drop: TL;DR: It's complicated and Rust is a long way from async drop being implemented. Probably not for some more years. Blocking in Drop is an ugly workaround, but I think it's sufficient for now. |
You'll need to handle dropping Directories too. |
Directory::drop doesn't make any async calls, so there's nothing to do there. The existing implementation is already enough. |
That would have definitely been helpful and save quite some time. I would have never thought of the close if I didn't go through this discussion before and even with that I initially missed the await, only the warning made me notice that. Other than that, could missing the proper close be the cause for the failures I saw about CardNotFound? The drops come later so I wonder if the failures were due to not closing properly on earlier executions but I think it took place also right after power up of the device, only it doesn't any longer. |
Unlikely. Closing a file just involves writing out a new directory entry with the updated file size. But the card doesn't care what you write to it - it just sees block reads and block writes. |
What about the Volume? Could the missing closure of the Volume cause what @yanshay encountered? |
Nope. Nothing is written to the disk when you close a volume. The open volume just caches some filesystem metadata to save us looking it up. You can verify this by looking at the code that is executed when you close a volume. |
SD Card initialisation is messy, and I've only tested this crate (and only in sync mode) on a limited number of cards. My recommendation is to strip everything down to the bare minimum, and put a logic analyser on the bus so you can see what's going to and from the card. And try the hardware with a known good SD Card implementation like fatfs. If fatfs works and this crate doesn't, we can look at the traces to try and work out why. I speak from ~25 years of experience when I say that complex issues are not usually solved by trying things at random. You have to go back to the most basic state you can until you hit something solid, and then build up from there. If nothing seems solid, go lower down. |
The thing is that now it's working, not sure what changes I did got it working. So are you basically saying that the SDCard and not the Board hosting the SDCard is meaningful? |
Definitely. Each SD Card is a tiny computer running its own firmware, and they all do things slightly differently. |
Yes. My motivation for working on this is rewrite the firmware for a little music player gadget in Rust. At first I just wanted to do this for fun, but I've been struggling with bizarre, extreme slowdowns in read speeds with the C++ firmware, but only with one particular SD card. After trying everything I can with that firmware, my only idea left before concluding some interaction between the board and my particular SD card is cursed is trying a completely different software stack. |
I think I ran into another problem or limitation. My SD card shares the SPI bus with another device using I haven't had the time to probe it with a logic analyzer yet so this is all still speculation at this point. I put together this repoducer that triggers the error. The blocking version works fine. |
let mut delay = Delay::new_command();
loop {
let result = self.read_byte().await?;
if (result & 0x80) == ERROR_OK {
return Ok(result);
}
delay
.delay(&mut self.delayer, Error::TimeoutCommand(command))
.await?;
} |
Are you sure the if self.card_command(CMD13, 0).await? != 0x00 {
return Err(Error::WriteError);
}
if self.read_byte().await? != 0x00 {
return Err(Error::WriteError);
} |
Yes. I added some logs like this if self.card_command(CMD13, 0).await? != 0x00 {
warn!("CMD13 failed");
return Err(Error::WriteError);
}
if self.read_byte().await? != 0x00 {
warn!("Read byte oyther than 0x00");
return Err(Error::WriteError);
} and it's consistently the the first one. |
I forgot to mention that the reproducer occasionally shows a |
Is it reproducible with multiple SD cards? |
I'm seeing the same error with a SanDisk Extreme U3 32GB and an off-brand U1 16GB card. Presumably the SanDisk card is of decent quality. |
Interesting. Yeah, if the task running the SD card can be stalled for an arbitrary amount of time, we'll need to make sure we're robust to timeouts and make sure we retry things appropriately. That hasn't been an issue for me at all, because I'm single-tasking and using blocking mode. |
I found a solution to the According to this Stack Overflow answer (which gave me this idea), cards always start sending their responses within 1-8 bytes after the command bytes are sent. This puts an upper bound on the number of bytes in the transfer (19 to be precise). I have a commit here that demonstrates this approach. I confirmed that this solves the problem shown in the reproducer. The blocking version also still works fine. The way I changed the I can make a PR against this branch or feel free to cherry pick the commit. |
That SO answer is fantastic. I am worried about the advice to send a dummy 0xFF byte with CS high in order to clear the output register on buggy cards. Perhaps we should own two SpiDevice handles - one for the card and one for 'not the card'. Then we can bring the dummy byte clocking at start up back in-house. |
It's slightly annoying that the
I guess that could work but would that require the user to use a physical pin for the the "not the card" CS? That might not always be practical.
In |
I gave the dummy spi device approach a try but I encountered two problems:
|
See #126. If you take the whole bus, no one else can use it? Edit: oh, I see. They borrow it and only to send the 74 clock cycles. That seems fine, but can you do that whilst an SpiDevice also exists? |
Yeah, that function only serves as a convenient way to send the 74 clock cycles. It doesn't solve the problem that we need to do stuff outside of the Btw, I'm not sure if I've ever seen the lack of |
If this is working with multiple SPI devices on a bus without doing that, do we need to bother implementing that? |
A nice thing about this API is that it allows the caller to get back a mutable reference to the SPI device to change the clock speed after initialization, which the embedded-hal[-async] traits do not provide a means to do. |
if command == CMD12 { | ||
let _result = self.read_byte()?; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add this check to the buffer scan. Do we know what the value of this byte is? I can't find anything about it in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my reading this is not an official spec but it is commonly discarded
i.e. http://elm-chan.org/docs/mmc/mmc_e.html
"The received byte immediataly following CMD12 is a stuff byte, it should be discarded prior to receive the response of the CMD12"
Seems unlikely that it will contain 0x80 and it seems other implementations allow this byte to be check anyway like micropython sdcard.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, then this is probably fine. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually sorry, that example does show that byte being skipped with self.cmd(12, 0, 0xFF, skip1=True). Perhaps this is important for some cards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure that when issuing CMD12 it is checking for errors well (here or main). If a transfer is not actually stopped then it may get stuck in block reading unexpectedly and not pass an error. But checking for a valid response from CMD12 may not really be an issue here since it will just read a bunch of bytes that should get the card back to a r1 state anyway.
This moves the existing API into a
blocking
module and duplicates the API into a newasynchronous
module, usingasync fn
where applicable. This uses the bisync crate rather than maybe_async used by #154. This involved moving everything in the src directory into a new src/inner directory, then exposing blocking/async versions via how they are imported in src/lib.rs.I think this resolves the concerns brought up in #154: sync and async can both be built at the same time, usage of bisync by other crates in the dependency graph doesn't affect this, and changes to the example and test code are minimal (just adjusting import path).
I moved submodules that have no difference between blocking & async to a new
common
module. That code gets re-exported under theblocking
/asynchronous
modules in the same relative path in the module hierarchy as before. There may be opportunities to split more code into that common module, but I have tried to keep changes to a minimum for now to make it easier to review.I have not added Cargo features for sync/async; they're both built. If requested, I could put each behind a feature gate, though I don't think it's necessary.
Fixes #50