-
-
Notifications
You must be signed in to change notification settings - Fork 403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to bucket sort in igraph_i_is_bigraphical_simple
#2546
Comments
I think the problem is that we need to cater for cases when the degree distribution is extremely skewed (like, many vertices with degree 1 and one vertex with a high degree). This would need lots of empty buckets, which might be prohibitive in terms of memory usage. One thing that I was considering for a while now though is to switch from the built-in comparison-based sort of the standard C library to timsort. Or, sort routines on integer vectors could use the difference between the minimum and maximum of the vector as a proxy to decide whether a bucket sort would be feasible, and fall back to timsort if needed. |
When considering graphs without multi-edges, the max degree is no greater than the vertex count. Thus for a bucket sort we only need an array of the same size as the degree vector (we only really need the degree counts, as vertex indices are irrelevant). This does double the memory usage compared to an in-place sort, but perhaps that's not so bad. Some of these graphicality checks can in fact be implemented by working directly with an array containing degree counts. We don't need to keep track of the vertices' identities, just how many vertices are there of degree 1, 2, etc. Reordering the degree sequences doesn't affect their graphicality. (This is not true for directed graphs where we do need to keep track of which in-degree belongs to which out-degree.) This is indeed the approach currently used in |
#2286 is related: when we do need to sort vectors of plain numbers (as opposed to more complex objects), we can make an improvement by (1) using better algorithms such as timsort or (2) inlining the comparator for special cases such as integer vectors. |
Actually we aren't even using an in-place sort, we're copying the degree vector (which is |
igraph/src/misc/graphicality.c
Lines 770 to 776 in 2728866
Switching the sort here from a comparison based sort to a bucket sort improves the time complexity to$\mathcal{O}(N_1 + N_2)$ .
Then all the implemented graphicality checks run in linear time, as far as I can tell.
The text was updated successfully, but these errors were encountered: