Adjacency matrix for only some vertices #2196

szhorvat · 2022-09-11T11:01:24Z

What is the feature or improvement you would like to see?

Currently there is igraph_get_adjacency(_sparse)() to construct the adjacency matrix of a graph. There is also igraph_get_incidence() for a bipartite incidence matrix.

I propose a generalization of both where the vertices corresponding to the matrix rows/column can be specified:

igraph_error_t igraph_get_adjaceny_submatrix(
        const igraph_t *graph, igraph_matrix_t *res,
        const igraph_vs_t rows, const igraph_vs_t cols,
        const igraph_vector_t *weights, igraph_loops_t loops);

This would generate the submatrix corresponding the specified row-vertices and column-vertices.

There should be a sparse version as well.

Use cases for the feature

Fast way to obtain the adjacency matrix of a small subgraph (when rows and cols are the same).
A single function handles both the adjacency matrix and bipartite incidence matrix cases.
Obtain connectivity between different blocks, e.g. two clusters out of many, obtained with a community detection algorithm.
There are matrix-based algorithms that must be applied per-connected-component, such as Seidel's shortest path algorithm. This is a convenient way to get a matrix for each component.

One of the purposes of this function would be to offer better performance and better memory usage than generating the entire adjacency matrix first then subsetting it.

References

—

The text was updated successfully, but these errors were encountered:

GroteGnoom · 2023-07-23T10:10:59Z

This has the same snag as subset betweenness: partially overlapping vertex selections for the undirected case need special care.
(see #2352 (comment))

szhorvat · 2023-07-23T11:02:40Z

Does it? I believe this is a simple matrix subsetting operation. I.e. the result should be the same as taking only some rows and only some columns in the adjacency matrix.

The fact that in the undirected partially overlapping case some connections will appear twice (below and above diagonal) and some once should not be a problem here.

Let me know if I misunderstood something.

GroteGnoom · 2023-07-23T11:42:55Z

Doesn't that take an unnecessary amount of time? igraph_get_adjacency() loops through all edges to take their to and froms.

For the submatrix variant I think going through all edges and then check if their to and froms are in the vertex selectors is not the most efficient way of handling this. I want to go through the vertex selectors and get the edges between them. Then you hit this problem.

szhorvat · 2023-07-23T14:01:51Z

Doesn't that take an unnecessary amount of time?

I didn't mean that it should be implemented by subsetting the complete matrix. That makes no sense, as there would be no performance gain in using this function. I simply meant to describe the expected result.

I want to go through the vertex selectors and get the edges between them. Then you hit this problem.

I see what you mean.

How about this?

Create a bool array is_target where is_target[v] is true is v is a to vertex
For each from vertex s:
- Get the neighbour list, and for each neighbour t:
  - If is_target[t] then matrix[s, t] += 1.

I think this gives the correct result both for directed and undirected graphs. However, its performance characteristics are not invariant to the exchange of to and from. I'm not sure if this is an issue in practice. If we wanted to try to squeeze more performance, we have two version, one where the outer loop is on from and one where it is on to. Then choose the version where the outer loop is shorter.

An alternative implementation will not construct is_target. Instead, it creates a sorted list of to vertex IDs. Then it takes the intersection of this sorted list with the neighbour list. Note that the neighbour list is guaranteed to be sorted.

We already have such sorted intersection functions,

igraph_i_neisets_intersect in cocitation.c
In the cycle basis calculation I computed a symmetric difference in a similar way

Maybe we should have helper functions for union, intersection, difference, symmetric difference for sorted integer lists.

I suspect that one of these two approaches should be faster than starting with a source and target vertex ID, then getting the edges inbetween.

But this is difficult to predict ...

What do you think @GroteGnoom ?

szhorvat · 2023-07-23T14:15:38Z

These are the two common ways to intersect vertex sets, and I'm never sure which is faster, or under what conditions each is faster. E.g. igraph_ecc() uses the sort method and igraph_i_local_relative_density() uses the Boolean mask method.

Fixes igraph#2196

GroteGnoom · 2023-07-29T07:18:50Z

How about this?

Create a bool array is_target where is_target[v] is true is v is a to vertex

For each from vertex s:

Get the neighbour list, and for each neighbour t:

If is_target[t] then matrix[s, t] += 1.

Lets use an example to clear things up. I think the simplest case where we have problems is a 3-vertex full graph with no loops, from = [0,1,2], to = [0,1].

The full GET_ADJACECNY_BOTH adjacency matrix for the whole graph would look like this:

0	1	1
1	0	1
1	1	0

we have 3 selected edges:
0-1
0-2
1-2

First, GET_ADJACENCY_LOWER and GET_ADJACENCY_UPPER don't always make sense. The 'from 0 to 2' and 'from 1 to 2' elements don't exist in the submatrix, so GET_ADJACENCY_UPPER will miss edges.

GET_ADJACENCY_LOWER would look like this:

0	0
1	0
1	1

GET_ADJACENCY_UPPER would look like this:

0	1
0	0
0	0

And this is GET_ADJACENCY_BOTH

0	1
1	0
1	1

... I'll have to think a bit.

szhorvat · 2023-07-29T07:31:31Z

For this application, there should be no igraph_get_adjacency_t type parameter. As you point out, it makes no sense.

GroteGnoom · 2023-07-29T08:51:21Z

I think that just solves everything 🙂
There could be performance improvements, including your suggestion.

I think this also shows the betweenness_subset problem in matrix form: neither the upper triangle or lower triangle necessarily has all the selected edges, and the whole matrix sometimes has an edge twice, sometimes once.

szhorvat added this to the 0.11 milestone Sep 11, 2022

szhorvat added the wishlist Feature request that has not been chosen for implementation yet; vote or comment to prioritize it! label Sep 11, 2022

szhorvat mentioned this issue Mar 15, 2023

Deprecate the name get_incidence() #2326

Closed

GroteGnoom self-assigned this Jul 22, 2023

GroteGnoom added a commit to GroteGnoom/igraph that referenced this issue Jul 29, 2023

preliminary work on get_adjacency_submatrix

f415baf

Fixes igraph#2196

GroteGnoom linked a pull request Jul 29, 2023 that will close this issue

Add get_adjacency_submatrix #2379

Draft

szhorvat mentioned this issue Oct 27, 2023

Complete the set of biadjacency() functions #2414

Open

3 tasks

ntamas modified the milestones: 1.0, 1.1 Mar 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adjacency matrix for only some vertices #2196

Adjacency matrix for only some vertices #2196

szhorvat commented Sep 11, 2022

GroteGnoom commented Jul 23, 2023

szhorvat commented Jul 23, 2023 •

edited

GroteGnoom commented Jul 23, 2023

szhorvat commented Jul 23, 2023

szhorvat commented Jul 23, 2023

GroteGnoom commented Jul 29, 2023

szhorvat commented Jul 29, 2023

GroteGnoom commented Jul 29, 2023 •

edited

Adjacency matrix for only some vertices #2196

Adjacency matrix for only some vertices #2196

Comments

szhorvat commented Sep 11, 2022

GroteGnoom commented Jul 23, 2023

szhorvat commented Jul 23, 2023 • edited

GroteGnoom commented Jul 23, 2023

szhorvat commented Jul 23, 2023

szhorvat commented Jul 23, 2023

GroteGnoom commented Jul 29, 2023

szhorvat commented Jul 29, 2023

GroteGnoom commented Jul 29, 2023 • edited

szhorvat commented Jul 23, 2023 •

edited

GroteGnoom commented Jul 29, 2023 •

edited