Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for File Provider file access #629

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

2xsaiko
Copy link

@2xsaiko 2xsaiko commented Feb 27, 2024

Reading files from File Provider file systems such as iCloud using the normal blocking API will fail with EDEADLK 'Resource deadlock avoided.' if the file is not currently locally downloaded. Therefore, do file access asynchronously inside NSFileCoordinator coordinate(with:queue:byAccessor:) to avoid this.

@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

Interestingly I've also recently been investigating asynchronous processing of native messages.

I was wondering if you have tested your branch? Because I found it doesn't compile in Xcode.

The reason is that await is missing here:

if let files = getAllFiles() {

I haven't reviewed your changes carefully.

But I have a curious question, if we wait for the download of files in iCloud through an asynchronous method, will this waiting significantly delay the loading of the extension UI and lead to even later script injection?


Task {
guard
let request = context.inputItems[0] as? NSExtensionItem,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capture of 'context' with non-sendable type 'NSExtensionContext' in a @Sendable closure

There is a warning here, but I believe this is a difficult issue to reconcile in Swift's asynchronous.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was looking for something like Rust's block_on but it seems like Swift really doesn't want you to do something like that, and it also seems like you can't control which thread an async task runs on. Hence this awful code. It should be safe though, considering the semaphore should ensure the object is only handled by one thread at a time.

Copy link
Collaborator

@ACTCD ACTCD Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem as safe as one might think: https://stackoverflow.com/a/71971635
But there seems to be no obvious problem in this scenario, since there are no repeated calls in the logic, and there should be no problem even without using a semaphore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this answer seems to me to specifically be about calling a blocking function inside an async context (from a thread used by the async runtime). Doing that can lead to a deadlock in probably any language with async functions, including Rust, if you have limited threads. I'm pretty sure beginRequest(with:) here is not called from an async context, though, so none of the async runtime threads will ever be blocked.

The semaphore is just in case the function calling beginRequest does something else with the NSExtensionContext afterwards. I don't assume so, it did work without it but dealing with data races in the past has made me careful :P

@2xsaiko
Copy link
Author

2xsaiko commented Feb 29, 2024

I was wondering if you have tested your branch? Because I found it doesn't compile in Xcode.

Oh oops, that's iOS code. I only tested the macOS build so I didn't catch that.

But I have a curious question, if we wait for the download of files in iCloud through an asynchronous method, will this waiting significantly delay the loading of the extension UI and lead to even later script injection?

Yes, right now this is basic support, this will absolutely block loading the script list until the files are completely downloaded. It behaves the same as over any potentially slow FS (such as webdavfs or nfs)

In fact, those are probably orders of magnitude worse since getAllFiles gets called a lot and at least I'm pretty sure nfs will read the file from the server every time. I recommend proactively loading the files and watching that directory for changes, only then reloading, instead of re-reading every time.

Reading files from File Provider file systems such as iCloud using
the normal blocking API will fail with EDEADLK 'Resource deadlock
avoided.' if the file is not currently locally downloaded. Therefore,
do file access asynchronously inside NSFileCoordinator
coordinate(with:queue:byAccessor:) to avoid this.
@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

Yes, right now this is basic support, this will absolutely block loading the script list until the files are completely downloaded. It behaves the same as over any potentially slow FS (such as webdavfs or nfs)

I think this might not be an appropriate way to implement it in this extension.
We should not cause UI blocking or unexpected script injection delays.

I think the appropriate way to do this might be to notify iCloud to download those needed files and return the existing ones. Include the contents of those newly downloaded files the next time if they complete the download.

Maybe use this API: startDownloadingUbiquitousItem
I'm currently not sure if this requires the App to enable iCloud related support.

@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

In fact, those are probably orders of magnitude worse since getAllFiles gets called a lot and at least I'm pretty sure nfs will read the file from the server every time. I recommend proactively loading the files and watching that directory for changes, only then reloading, instead of re-reading every time.

Yes, this is a known issue and one of the main back-end refactoring faced.
We documented related ideas and improvement options in issues #451 #452.

@2xsaiko
Copy link
Author

2xsaiko commented Feb 29, 2024

I think this might not be an appropriate way to implement it in this extension.
We should not cause UI blocking or unexpected script injection delays.

That's what already happens right now though. None of the behavior changes (except for concurrent reads which I added because it was straightforward), it just additionally makes it work for File Providers. It will currently also block if the file system is slow for whatever reason, that's kinda the nature of blocking I/O function calls.

I think the appropriate way to do this might be to notify iCloud to download those needed files and return the existing ones. Include the contents of those newly downloaded files the next time if they complete the download.

With that, a script may not be injected at all, as opposed to injected maybe a couple seconds late (which may, of course, happen for other unrelated reasons as well including simple system overload). How is never better than late?

I'm absolutely not saying this is best solution, it was intended to be a quick fix so that I could put the scripts on iCloud without having to be paranoid about them disappearing. The real fix requires re-designing the script loader to cache scripts in memory.

@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

That's an unfair idea, please calm down and think about it carefully.

I was excited to see your solution and even spent several more hours investigating the async issue.

But then I found that question that made me curious, if adding async support would further increase unexpected latency, which may not be what we want.

I don't mean async concurrent reading of the files, I'm not sure what the energy efficiency change would be.
I'm referring to the waiting time for iCloud network downloads.

All asynchronous processing will add uncontrollable latency, including of course those I/O processes.

But to be fair, reading files from a local hard drive (usually an SSD) is pretty fast, at least much faster than downloading from iCloud (depending on the network).

Like you said, reading the file from the local hard drive could be synchronous (even if it's in the iCloud path but has been downloaded), whereas waiting for the system to download from iCloud requires asynchronous support, which could be a significant time difference.

The crux of the issue lies in iCloud's eviction mechanism.
We just need to make sure those files are downloaded to the local drive instead of waiting for the system to download them from iCloud at any unexpected time. That's why I came up with that idea.

How is never better than late?

To be more clear, if there are 5 scripts in the iCloud path, 3 of them are not local due to the iCloud eviction mechanism.
I'd rather just read these 2 this time than wait for the delay of downloading another 3 scripts. They will be read on the next request after the download is complete.

That's what I'm trying to say. I'm not talking about extreme efficiency (using memory), And I'm not sure the complexity is worth it.
At the same time we know there are a lot of processes there that are worth redesigning, that's what we need to face when refactoring.

@2xsaiko
Copy link
Author

2xsaiko commented Feb 29, 2024

I'm sorry if I came across as being mad. I just think what you're proposing is a trade-off that would bring back the original issue that I have (though at least in a less annoying way), and also doesn't ensure that injection never delays because of FS access, so it's not really something I'd want to implement. These are kinda my two points:

  1. You cannot (including currently) rely on filesystem access times. For example, macOS comes with webdavfs for mounting WebDAV remote file systems, which from what I can tell behaves the same as iCloud (downloads files when opened first, then caches them on the local disk for the duration of the mount). And for FUSE file systems (think rclone mount) you can't make any guarantees whatsoever. Both of these look like local file systems however. So, if scripts should be injected as fast as possible, it should be handled for those as well.
  2. Skipping files that aren't yet locally available makes it unreliable, and this is the same situation as right now. IMO telling iCloud to materialize the file just automates me opening Finder and clicking "Download Now", but doesn't really solve the actual issue.

In my opinion, the preferable way to fix both of these at once is by not putting disk I/O in the script injection code path.

I can definitely solve this though, it might just take a bit longer. What I have in mind shouldn't be that complex, I just need to get familiar with Swift and the OS APIs. I'll mark this as draft in the meantime.

@2xsaiko 2xsaiko marked this pull request as draft February 29, 2024 15:50
@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

I basically agree with you. And I have no experience with Rust, so I assume you are more expert than me in some aspects.

It should be noted that we are not need to solve all related issues/processes in this issue/PR, It's just about iCloud access.

My main point with this PR is that putting the iCloud download process into the UI loading/script injection code path creates new delays. Yes, although it seems solves the issue of iCloud file access.

And due to the impact of uncertain network quality, this delay can be very terrible. We never want to see it spinning in circles for 5 seconds or more when opening the UI while waiting for iCloud to download, right?

So combined with that top-level async problem that can't be solved perfectly, I don't think we have full confidence in adopting this implementation for now. Do you think so?

This is also why I thought when I first raised the issue that we might need to consider fully supporting iCloud and redesigning the synchronization process to better address this issue.

Because using CloudKit apis we can control the iCloud sync process and avoid eviction of required documents. (I haven't investigated in depth and am not sure if the same with iCloud Drive)

@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

Because using CloudKit apis we can control the iCloud sync process and avoid eviction of required documents.

It seems that things are different from what I imagined.
There doesn't seem to be an API available there to prevent iCloud from evicting a specified file.
As always with Apple, everything is hosted and decided by the system, not the user nor developer.

It can be seen that the correct and feasible approach may be to copy the files to the App Sandbox document and update them through some synchronization mechanism.
I presume this is how Apple wants you to use with the file provider.

@ACTCD
Copy link
Collaborator

ACTCD commented Feb 29, 2024

I don't know if this could be a possible simple fix with no noticeable side effects, just check via isUbiquitousItem(at:) if it's in the iCloud path and call startDownloadingUbiquitousItem every time anyway.

Based on frequency of use, it may be less likely to be evicted from the system, and it also facilitates synchronization with the cloud version.

If a cloud-based file or directory has not been downloaded yet, calling this method starts the download process. If the item exists locally, calling this method synchronizes the local copy with the version in the cloud.

While this by no means completely solves the issue, and may not appear in the extension during the first few requests. However, it may significantly reduce the edge case of complete unavailability due to eviction for lack of space.

Of course, the better approach would still be to completely redesign the storage and synchronization logic, which would be a huge development and refactoring effort.

And I can see that the correct storage and synchronization mechanism does not require us to have a top-level asynchronous architecture, because we do not need to wait for the data returned asynchronously.

If you're so inclined, try this out and see how it performs in daily use. Since I don't currently use iCloud for storage and syncing, I can't observe and test its actual performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants