Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SIFT GPU support #623

Open
wants to merge 31 commits into
base: main
Choose a base branch
from
Open

Add SIFT GPU support #623

wants to merge 31 commits into from

Conversation

cojosef96
Copy link

This PR adds SIFT GPU feature detection and matching to improve performance drastically with GPU.

@YanNoun
Copy link
Member

YanNoun commented Jul 31, 2020

Hi @cojosef96 !

Thank you for your PR, it's great to try to speed-up the OpenSfM pipeline by introducing GPU image processing. This has been requested quite sometimes by people over the years.

As it is now, the PR would add a bit of code duplication that seems not needed. It would be better if :

  • The GPU extractor would be just another feature type, as you already did.
  • Same for the matching, it would just another another matching type (that one is to add). So other feature types would benefits the GPU matching
  • From the above, then we wouldn't need the duplicated matching code.
  • There is also some debug stuff remaining (config override, ptyhon2.7 enforce, depth command override).

Let me know if you can rework this, or when we'll find some time, we can also do the changes on our side.

Yann

@JakeSmarter
Copy link
Contributor

While I appreciate GPU support for OpenSfM, I am not sure that this PR is structurally suited to be the way to go. It adds yet another dependency to an already over boarding long list of OpenSfM's dependencies. If anything the SIFT GPU extractor should be a weak dependency (hence plug-in like).

@cojosef96
Copy link
Author

Hey, @YanNoun
Thank you for reviewing my PR.
I will gladly help to change the PR to be more suitable for the OpenSfM Master Branch.
The Matching functions were copied and changed because the GPU matching does not work with parallel computing. There are two ugly ways and one clever way to solve this as I see it.

  1. Change the number of processes to 1 when GPU matching is used.
  2. Initialize the gpu_sift_matching for every process (this takes a long time so I don't recommend it)
  3. A more complicated but clever way is to create N sift_gpu_matching functions (N is the number of processes) and assign sift_gpu_matching function 1 to process 1 and the same with the others.
    Another Problem with the matching is the structure of the keywords is NumPy record array. In order to use the GPU matching on None sift_GPU descriptors, I need to create a function that transfers pos, features, color to keypoints NumPy record array.
    As for now, I will change the number of processes to 1 when gpu_matching is on. And I will create a function to change the p, f, c to np.recarray.
    After these changes I will try to fix some debug stuff, please explain specifically what debugging stuff is needed.
    If there are some tasks or changes you guys want to do yourselves feel free to do so.
    Best Regards,
    Yossi

@cojosef96
Copy link
Author

Hey, @gitne
In order to run a script on GPU, we need low-level GPU programming language. Silx uses pyopencl which is a bridge between python to the OpenCL GPU programming language.
If GPU usage is needed than we have to add some dependencies that can use GPU to the project.
If you have a suggestion for good dependency. Please tell me and I will consider using it.

@JakeSmarter
Copy link
Contributor

In order to run a script on GPU, we need low-level GPU programming language. Silx uses pyopencl which is a bridge between python to the OpenCL GPU programming language.

I know, I am aware of that. And, I am also all for GPU support. So, this is not the point.

The point with your current implementation is that one cannot run OpenSfM unless one also has the Silx dependency, even if one configures OpenSfM to use a pure CPU or pure Python features extractor. Python's import system requires one to satisfy the silx dependency, even or especially if you do not need it.

So what I am saying is that silx should be a weak/late binding/lazy loading/lazy binding—what ever you want to call it, Python calls it programmatic importing—dependency. See Python's importlib module. Structurally, there are basically two ways to do this in Python, on which I am not going in to detail here because good examples are available on the web. One big issue with native code is that sometimes it is not readily available for all platforms. So, users on those platforms are stuck on a version of OpenSfM until these particular native code dependencies get ported to a platform the users run.

@@ -3,7 +3,7 @@
set -e

DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
PYTHON=${2:-python3}
PYTHON=python2.7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not look right. Looks like debug stuff.

$PYTHON $DIR/opensfm undistort $1
$PYTHON $DIR/opensfm compute_depthmaps $1
#$PYTHON $DIR/opensfm undistort $1
#$PYTHON $DIR/opensfm compute_depthmaps $1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like debug stuff too.

@edgarriba
Copy link
Contributor

Hi guys, not sure about your timings here but as part of a GSoC project, by the end of the summer OpenCV is going to introduce back sift to the main repo after its patent expiration.

@JakeSmarter
Copy link
Contributor

@edgarriba Do you know if OpenCV's implementation is also going to have GPU support?

…xternal functions. added conversion function from pos, desc to GPU keypoints meaning we can now run gpu matching with different features, not only SIFT_GPU features. Tested Import Error of Silx.
@cojosef96
Copy link
Author

Hey,
I fixed the issues discussed above, please check them and see if there are more issues to fix, feel free to let me know and I will do my best to fix them.

@JakeSmarter
Copy link
Contributor

JakeSmarter commented Aug 6, 2020

So far, looks great to me.

66168c3
Looks like still some debug remnants. Please also squash/rebase off any debug commits because otherwise they will end up in the git index, especially the csfm.so binary, which is quite big and thus will bloat the index.

@pierotofy
Copy link
Contributor

pierotofy commented Aug 7, 2020

Hey 👋 all, noticed this PR. I've also been researching ways to add GPU support to OpenSfM. So far I've only added the feature detection part, but the work I started on this repository https://github.com/uav4geo/pypopsift/ can be used to swap the SIFT feature extractor. pierotofy@00bb120

It's a WIP and performance is not ideal though.

@edgarriba
Copy link
Contributor

@gitne not for this coming release, but if there's special interest into this feature probably the dev team could take a look at it. /cc @vpisarev

@cojosef96
Copy link
Author

Hey, I fixed all the issues noted above. I hope that this PR will be helpful to you.
Please let me know if there are more things needed to add/fix.

@YanNoun
Copy link
Member

YanNoun commented Aug 10, 2020

Hi @cojosef96 ,

I reworked a bit your branch here : master...feat-polish-sift-gpu
More specifically, I refactored a few things :

  • Unified the Root-SIFT normalisation taking best-of-both-worlds (especially the zero-norm case), so we have one function
  • Removed the GPU-specific storage, so all features type are stored the same way : in the end, everything ends-up in a numpy array. Greatly simplify the code : no more need for the keypoin' arg/return.
  • Following the above, create_gpu_keypoints_from_features does proper normalisation of the descriptor
  • I added Lowes ratio, peak/edge threshold to be passed to Silx (and added in config.yaml), so we have consistent results between all matching types. I tested it a bit and matching results were slightly better with GPU_MATCHING due to the fact it is exact NN (and not approximate as for FLANN). Peak/edge threshold is a bit capricious in Slix, # feature can vary hugely, I might add some bucketing.
  • Removed the testing dataset : we want to keep the repo light.
  • Adding slix as requirements is still in discussion, we might not enforce it. Following that, we'll rework the documentation.

After benchmarking on a 16-inches Macbook (16 logical cores), feature extraction is 2x faster, but matching is much slower than the CPU-16-cores one.

Let me know what you think of the following changes.

Yann

@cojosef96
Copy link
Author

cojosef96 commented Aug 10, 2020

Hey @YanNoun ,
I am glad you used my PR for the OpenSFM project.
The changes you made seem good to me.

  • I saved the keypoints format in order to save time in the loading. but one saving format is cleaner and more logical.
  • The berlin_gpu folder is for running the demo and not necessary.
  • The root_Sift normalization looks good to me.
  • The function create_gpu_keypoints_from_features doesn't look so efficient to me (but I didn't check ) so I used both the keypoints and the regular features form.

About the running on Mac:

  • The GPU implementation doesn't work efficiently when running using parallel computing.
    when parallel computing is done, the gpu_sift feature extractor and the gpu_sift_matching run the initializing process every time.
    Please check the code when using 1 process, I hope it will run faster.
  • As I wrote in the ReadMe, the matching process without initializing took 0.025 sec and the feature extraction without the initializing took 0.28 sec on Nvidia 1080-Ti.

This parallel problem can be fixed by demanding 1 process when using GPU functions.

I hope you will be able to merge the GPU implementation to the master repository.

Best Regards,
Yossi

@facebook-github-bot
Copy link
Contributor

Hi @cojosef96!

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@cojosef96
Copy link
Author

Hey,
I sign the facebook contributor license agreement,
there is the ReadMe.md conflict we should fix, my readme contains info about the GPU update which is not necessary for the master repository so you can use the basic readme.
Best Regards,
Yossi

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@pierotofy
Copy link
Contributor

pierotofy commented Feb 11, 2021

feature extraction is 2x faster, but matching is much slower than the CPU-16-cores one.

It's worth pointing out that feature extraction using the proposed GPU method does not account for feature_min_frames (and often times the CPU implementation does need to recompute SIFT multiple times by lowering the peak threshold to achieve the desired number of keypoints).

Feature extraction is nonetheless quite fast using the GPU, if feature_min_frames is not needed.

Feature matching speed can be improved a bit if one removes the code to load the image data (check_gpu_initialization(config, im1, data)), which is unnecessary. After these changes, the GPU implementation is slightly faster on a one-to-one process comparison (using only one core), but obviously doesn't scale, whereas the CPU implementation does.

@YanNoun YanNoun mentioned this pull request Mar 15, 2021
@theUpsider
Copy link

Is there going to be a merge in the near future? I think many people are waiting for this to be in an official release.

@edgarriba
Copy link
Contributor

edgarriba commented Aug 25, 2021

@cojosef96 you might consider using kornia implementation (based on pytorch) or even more powerful features like HardNet /cc @ducha-aiki

@pierotofy
Copy link
Contributor

Temporarily you can also check out this fork/branch https://github.com/OpenDroneMap/OpenSfM/tree/261 that has GPU support merged (mostly adapted from this PR).

@dlebauer
Copy link

@pierotofy would https://github.com/OpenDroneMap/OpenSfM/tree/261 make this PR obsolete? Would it make sense to close this PR and open yours?

@pierotofy
Copy link
Contributor

pierotofy commented Sep 14, 2021

No, 261 has several features/changes that are probably outside of the scope/general usage of OpenSfM. You would have to cherry-pick the relevant changes that pertain to GPU support and create a new branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants