I got it to run in a distributed manner, but it's only using 1 worker while I had 2 workers on #29
Comments
This is due to a race condition between the three threads in the master. I wrote a thread-safe task distribution controller and updated it to |
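For readers following the race-condition explanation above, here is a minimal sketch of what a thread-safe task dispatcher could look like. It is not the project's actual controller: the names `TaskDispatcher`, `next_task`, `run_worker`, and `demo` are hypothetical, and the workers are simulated with local threads rather than remote machines. The idea is only that removing a task from the pending queue and recording its owner happen under one lock, so two threads can never hand the same cluster to different workers or skip one.

```python
import threading
from collections import deque


class TaskDispatcher:
    """Hands each pending task to exactly one worker (hypothetical sketch)."""

    def __init__(self, tasks):
        self._pending = deque(tasks)
        self._assigned = {}  # task -> worker_id
        self._lock = threading.Lock()

    def next_task(self, worker_id):
        # Pop and record ownership atomically; without the lock, two
        # threads could both see the same task at the head of the queue.
        with self._lock:
            if not self._pending:
                return None
            task = self._pending.popleft()
            self._assigned[task] = worker_id
            return task

    def assignments(self):
        with self._lock:
            return dict(self._assigned)


def run_worker(dispatcher, worker_id, done):
    # Simulated worker: keep asking for tasks until none are left.
    while True:
        task = dispatcher.next_task(worker_id)
        if task is None:
            break
        done.append((worker_id, task))


def demo(num_tasks=100, num_workers=2):
    dispatcher = TaskDispatcher(range(num_tasks))
    done = []
    threads = [
        threading.Thread(target=run_worker, args=(dispatcher, w, done))
        for w in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dispatcher.assignments(), done
```

Calling `demo()` returns the task-to-worker mapping and the list of completed `(worker_id, task)` pairs. Every task appears exactly once, although how the tasks are split between the workers depends on thread scheduling.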
I forgot to commit some code. Try the newest branch! |
I merged the code manually, and I currently don't have more than one machine to test the distributed code on. Keep this issue open if you run into any problems. |
Also check your logs in the log directory. If you started the master and workers correctly, you should see info like the following:
Make sure your output has |
About the feature extraction error with the Gerrard Hall dataset from COLMAP (https://colmap.github.io/datasets.html#): the feature extraction process got killed. |
Try the GPU version of feature extraction. Otherwise it takes a long time on large-scale datasets and could be killed by the operating system. |
I ran into the same problem. I have 3 computers in total: one is set as the master and the other two as workers. But when the master started, both workers' status was IDLE. (Status table columns: Cluster Id, IP, Worker Status, Progress, Task Status, Time.) Have you solved this problem? After approximately 10 minutes, worker 0 started running. However, worker 2 remained IDLE. |
Could you show me the workers' running information? Make sure the command from the master has been sent to the workers and that the workers received it. From the |
My config.txt:
running information of workers:
|
It seems the data was not sent to the workers, since the worker status is |
After approximately 15 minutes, the worker started to reconstruct:
I have 133 images in total, which are divided into 2 clusters. The first cluster has 89 images. I set the |
I've been busy recently, so it will take me a while to reproduce this issue. You're encouraged to debug the code, and feel free to fix it. |
Okay, I will try to debug the code and my settings. Thank you! |
The problem was solved after I set --transfer_images_to_server to 1. I am wondering how to save the image transfer time by using a storage server's shared folder? |
I used 50 images per cluster with 100 images. It grouped them into 2 clusters of 66 images each. But the clusters only ran on 1 worker, not on both workers in parallel.
The second worker is idle in the picture.
It is stuck at 0% with 2 workers. With 1 worker on localhost, the maximum time was 40 minutes. This lasted longer than 40 minutes if we sum both workers' time.