-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use sys_get_temp_dir() instead of vendor-dir as temp dir for downloading and extracting #10361
base: main
Are you sure you want to change the base?
Conversation
This caused issues in the past, when the temp directory and the vendor directory are not on the same drive, when attempting to move the directory.
PHPUnit 10 is not released yet, so I doubt that this is the case. |
If place for the temp dir is different from one system to an other, and that can't be detected automatically, what about a new config? |
What kind of issues? I didn't see any tests for this. Also, why not extract to the desired path right away, instead of extracting to a temp dir and then moving?
This is on the latest stable (v9.5), and that's why they are warnings and not fatal errors:
|
But this is set in the PHP config, so Composer doesn't need to determine where this is. |
To expand on this, we could just download the file straight to the cache dir and extract it straight to the final dir in the vendor dir (without moving). This would remove the need for a temp dir in both cases, and fix the filesystem delay issues. |
We cannot extract directly in the right location. For instance, the zip archive generated by GitHub (which are the majority of zips downloaded by composer, as the majority of packages registered on packagist.org are on GitHub) have a top-level folder wrapping the content of the repo. So we need to extract the archive into a temp location and then move the subfolder to the final location. |
Yes, on the level of move this makes sense (it may not be possible to move files across different file-systems). But on the level of copy (not move) it should not cause any issues. As the directory is temporary, a copy operation looks more correct in my eyes. The temporary store can be lightened up during the copy transaction or afterwards. I'm aware this might not be as simple as it sounds on paper.
Sounds reasonable but still thought most unzipping utility should have that functionality without dumping into a temporary directory. If it helps I'd say perhaps here is one point where an improvement can be achieved: if it helps removing the temporary location from the equation overall. I can imagine this depends on different zip adapters (unzip(1), ext-zip etc.) so again might not be as simple as it sounds, just saying. |
So then, how about an option to set the tmp folder used for extracting? |
Just to correct this statement, what really causes this problem is VirtualBox and co shipping broken filesystem drivers which do not allow you to read a file right after writing it. Composer is merely trying to work around these issues.
It seems like I am overall not so sure what to do here. What we have works as long as the filesystem isn't borked.. @Sarke have you tried setting |
I'm also a bit puzzled and still wondering. And if it helps and this patch with
Just tried it with composer in a development setup, this is a working example on linux, e.g. you should be able to test it within the guest. I could imagine such a procedure can be made more portable with a composer script and then you have it also directly in the project configuration (thanks to Same procedure can be applied for |
Installing outside the project tree isn't as simple as it looks, with things like composer/installers, packages get installed outside the vendor dir, blended with existing project files etc. It's easy to come up with happy path solutions but those don't cover all cases. |
Someone mentioned they had trouble with a NFS volume as well. Isn't the issue that any filesystem with enough lag will have problems? Also, the cause can be out of your hands, but the reason for using the vendor dir as a temp dir is not.
Yes, and I linked the issue detailing that solution. I also mentioned that the delay is arbitrary and will change from system to system; so how do we know if the delay is long enough?
Perhaps an option to set the desired unziper would be enough? |
One question is whether forcing ZipArchive usage isn't enough to fix it. My understanding (which is really just stabbing-in-the-dark understanding) is that it is not only a timing issue but one of forking a new unzip process doing the extraction of the files, and then the files aren't available to the parent process for a while.. But I would imagine if the PHP process does the file writing it might be directly available to it. Otherwise I can't see how these filesystems could be used in any real world conditions. As I don't have a setup to repro this, it'd be great if you can try it @Sarke. Essentially a patch would look like:
Turned into:
Then set the env and see if it helps? |
Sure, I will try in a bit. Of note though, it's not just the files being available after reading, but lag when deleting as well. If trying to delete a file that was just written, it will error. |
Right, the Composer\Util\Filesystem::removeDirectory has a removeDirectoryPhp fallback that could be used too if using external processes is an issue, this is similar as it uses |
Yes, the working directory remains unchanged. Therefore those installers and plugins that blend into it run into the same issue as composer install before when extracting into the projects'
Yes, the general problem could be described as lag or that the file-system is out of sync. These configurations easily cause problems with various utilities, process managers and (application) servers. Sometimes reloading the configuration already helps, sometimes you need to restart the VM because the system can not recover on its own. Depends on the file-system activity and host/guest operating system. I once had a throughput optimized file system removing some barriers on a Linux and when doing a composer install with a magento or spryker project using virtual box, the host systems kernel panicked due to the file-system activity during install. lessons learned ;). So just saying, perhaps seeing lag sometimes is not that bad compared with trashing the VM host. And if this thread finds some conclusions turning such setups into slower but correct is perhaps the best way to deal with it. Thinking about this and as the file-system activity is high for download and extracting an option If the moving fails - or anything from the reasons to have it inside vendor in the first place - you can at least decide which death you'd like to die ;) |
It's a bit finicky to test because it doesn't always error. When I made the suggested change, it errored on composer/src/Composer/Util/Filesystem.php Lines 261 to 271 in 239638e
I also noticed that the unmodified Composer code errored on composer/src/Composer/Util/Filesystem.php Lines 319 to 323 in 239638e
Since I'm using a Linux VM on a Windows host, this might be changed to this? Since it's actually a Windows filesystem, but it's not detected as such. if (Platform::isWindows() || Platform::getEnv('COMPOSER_RUNTIME_ENV') === 'virtualbox') { I will continue testing. |
|
Yeah, I was just don't want the "It's VirtualBox's fault, not on us" to be the conclusion, since filesystem lag can happen for a multitude of reasons.
Since there seem to be several instances where a delay and/or retry is added (like the Windows delete issue above), perhaps a more robust filesystem retry could be used in all situations? If it fails, it failed, doesn't always have to be on Windows or on VirtualBox. Detect the failure and retry after a 10ms, then 250ms, then 1s. It will only slow down if it has an issue, so why not?
100% agree. |
Because it is a cascading workaround, those normally cause more (and then more) problems. They also often look nice on paper but are a nightmare in practice. I mean I'm in for all the fun but you normally don't want that in production, this needs real testing during development and continuously in the application life-cycle. Just saying, composer comes with no strings attached, you can do with it whatever you want, so feel free to wrap up an implementation ;) Jokes aside, you see yourself how hard these timings problems are to deal with just reproducing them is a game of luck and patience. I think the pointers Jordi gave are good and as you actually can reproduce sometimes you're most likely the person who is in the position to get the best insights. If I were you I'd go on with testing and learning about what that lag is in the first place and why the processes are out of sync. Or if not shelling out is the solution in your case as well etc. . |
I was just thinking out loud. Using the temp folder like this PR is the ideal solution for me, but it wasn't acceptable, so... |
Double checked and can confirm there is no such command line argument/switch for However: On a system that supports symbolic links I was able to extract a zip package from Github (1.10.24.zip) into a dedicated Standard unzip operation worked like a breeze, I also tested on a dated alpine container (3.3). This is how I tested. Instead of moving/renaming - if I understood the original problem right - this is creating a symbolic link before unzip and removing it afterwards. Unzipping then directly goes into the packages subfolder in vendor. This might not solve any fs latency issues directly but could spare to go into a temporary extract location and moving/copying from there. Does this sound right? |
@Sarke I am not strictly against changing the temp dir we use, but it has to keep working for other scenarios and not break things to fix the virtualbox use case. What @stof mentioned about different windows drive is one issue, which might be detectable. I wonder if there are other cases where things could go bad. open_basedir restrictions could be a problem.. I am not sure if permissions could be problematic on linux where permissions are inherited from the project dir right now as we extract in vendor, but if extracted in /tmp and then moved over the permissions might suddenly differ. It's all a rather complex issue touching many different environments, so it's hard to test for everything or even think of all the possible cases. @ktomk that's an interesting hack, might be workable to do this only on linux, and keep the current extract-then-move for other platforms. |
Be careful when using symlinks: this will change the relative location of the source code compare to the vendor folder, which will break cases where the code does things like |
The point is to use a symlink to trick unzip into the right path, then you remove the symlink and are left with the real files in vendor/foo/bar, so this is not a concern. |
That trick still requires knowing the name of the root folder, which is currently discovered by composer after unzipping. |
I have not yet looked much into the current implementation. Albeit for that it is certainly to extend my test case with an example how to do it on command-line beforehand (I experimented with infozip mode yesterday). From what I've seen, the test-case should also handle at least one of the other unzip utilities (e.g. 7z) as the unpackager entity as far as I've seen is looking for multiple and I'd like to better understand what this means in context of the hack. |
We could use A few other options are to prefer using the 7z binary if it's installed, or prefer |
Any news on that? |
@djirik Not from my end, sorry. I could not even update the test so far, but if, you could have seen that in my fork. Thanks for your interest though. |
So this is because I scratched my head several times regarding #9627, #9945, etc. This is a small merge that solved the problem.
I'm using Docker, inside Virtualbox, on a Windows host, with Dropbox syncing the shared folder going across all. Yes, not ideal, but the issue of the shared folder not keeping up, be it a VM shared folder, a network drive, or whatever.
The above mentioned bugs are marked fixed, but still some people are having issues (incl. myself). The fix was to add a short delay for the filesystem operations to allow it to catch up. This is quite arbitrary because the length of the required delay will vary from system to system.
But the problem is one layer deeper: why is Composer using the vendor dir as a temp dir? This is really what causes the problem: it will download a temp file, extract the temp file to a temp dir, delete the temp file, and move the temp dir to the actual install path.
Since many people are having issues due to the above mentioned scenarios (VMs, NFS drives, etc), this is not ideal. Composer uses
sys_get_temp_dir()
in other situations, so I am unsure of why it wasn't used for the downloader.So, I put
sys_get_temp_dir()
in two places that fixes this. I am new to the Composer code so I'm sure this merge can use some tweaks.One thought is why does FileDownloader not use the cache dir?
Another is that I only made the changes that affect me, but I see there are several different archives supported so some of them could be changed to use the system tmp dir as well. Perhaps a central temp dir helper would be useful, and this would also allow this to more easily be a configurable option (if so desired).
Unrelated, but I found setting up the test environment a bit of a hassle with the 100+ warning about deprecations in PHPUnit 10 on PHP 8.1. Perhaps a Docker test environment could be made available to ease the entry for potential contributors? Looking at the github CI files it looks like PHP 7.3 is the preferred version to test on?