Feature request: symbolic random matrices #372

Comments
I have the same problem. I added a normal-distribution generator, `c_random`, to `keopscore/utils/code_gen_utils.py`, but I am not experienced enough with the C++ code-generation side to take this further, so I have no idea how to solve the problem properly.

```python
from ctypes import CDLL
import math

libc = CDLL("libc.so.6")


class c_random(c_variable):
    # Variant of c_variable (which represents a C++ variable, storing its
    # C++ name and type) whose "name" is a normally distributed value,
    # generated host-side with the Box-Muller transform.
    def __new__(cls, dtype, list_string_id=None):
        if isinstance(list_string_id, list):
            return list(c_random(dtype) for _ in list_string_id)
        else:
            return super(c_variable, cls).__new__(cls)

    def __init__(self, dtype, string_id=None):
        # Two uniform samples in [0, 0.99]; note that u2 == 0 makes
        # math.log fail, so this is not robust yet.
        u1, u2 = (libc.rand() % 100) / 100, (libc.rand() % 100) / 100
        self.dtype = dtype if dtype != "int" else "float"
        # Box-Muller transform: the resulting "variable" is a fixed numeric
        # literal baked into the generated C++ source.
        self.id = str(math.sqrt(-2 * math.log(u2)) * math.cos(2 * math.pi * u1))
```
I met the same problem when I wanted to sample 2D Gumbel noise with large dimensions (N, M), where N = M = 65536. I have found a simple workaround for those who want to generate a symbolic random uniform array:

```python
import torch
from pykeops.torch import LazyTensor


def uniform_lazy_tensor_2d(dim0, dim1, device) -> LazyTensor:
    """Initialize a 2D lazy tensor with uniform random values."""
    rand_x = LazyTensor(torch.rand((dim0, 1, 1), device=device))
    rand_y = LazyTensor(torch.rand((1, dim1, 1), device=device))
    # Hash-like mixing of the two 1D seeds, then extraction of the
    # fractional part (round(x - 0.5) equals floor(x) away from half-integers).
    rand_xy = (rand_x * 12.9898 + rand_y * 78.233).sin() * 43758.5453123
    rand_xy_floor = (rand_xy - 0.5).round()
    rand_xy_fract = rand_xy - rand_xy_floor
    rand_xy_clamp = rand_xy_fract.clamp(0, 1)
    return rand_xy_clamp
```

It composes two random 1D dense arrays using a hash-like computation commonly used in GPU shaders; I mimic the behavior of the classic `fract(sin(x * 12.9898 + y * 78.233) * 43758.5453)` one-liner. Test code:

```python
U = uniform_lazy_tensor_2d(65536, 65536, 'cuda')
mean = (U.sum(0).sum() / 65536**2).item()
variance = (((U - mean)**2).sum(0).sum() / 65536**2).item()
print(f"mean: {mean}, var: {variance}")  # mean should be close to 1/2 and var close to 1/12
```
Original issue:

Hi, it would be very useful to have large random matrices that can be stored symbolically, e.g. with a LazyTensor equivalent of `torch.rand()`.

My use case: I am working with a variant of the K-nearest-neighbours algorithm. I have $N$ nodes, for each of which I want to sample $K$ neighbours (without replacement). For node $i$, the probabilities of selecting each neighbour $j$ are given by an unnormalized probability vector $\vec{p_i}^{\,T}$. These vectors are saved in a symbolic matrix $P = \left[ \vec{p_1} \ \cdots \ \vec{p_N} \right]^T$.
To sample without replacement, I want to use the Gumbel top-$k$ trick, as in this paper. In short, this works as follows: for each $i$, sample a vector $\vec{q} \in [0,1]^N$ uniformly. Then take the indices of the $K$ largest elements of $\log(\vec{p_i}) - \log(-\log(\vec{q}))$.
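For context (writing $p_{ij}$ for the $j$-th entry of $\vec{p_i}$), this perturbation is exactly Gumbel noise on the log-probabilities: for $q_j \sim \mathcal{U}(0,1)$,

$$
g_j = -\log(-\log q_j) \sim \mathrm{Gumbel}(0,1),
$$

and taking the indices of the $K$ largest values of $\log p_{ij} + g_j$ yields $K$ distinct neighbours, distributed as sampling without replacement with probabilities proportional to $p_{ij}$.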
How I would do this in the dense case:
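A minimal dense sketch of this step (the exact snippet from the original post is not shown here, so the names `P`, `Q` and the shapes are illustrative assumptions):

```python
import torch

N, K = 10_000, 16                 # illustrative sizes
P = torch.rand(N, N)              # placeholder: unnormalized selection probabilities
Q = torch.rand(N, N)              # uniform samples q_ij in [0, 1)

# Gumbel-perturbed log-probabilities: log(p_ij) - log(-log(q_ij)).
scores = P.log() - (-Q.log()).log()

# Indices of the K largest perturbed scores in each row:
# K neighbours per node, sampled without replacement.
neighbours = scores.topk(K, dim=1).indices   # shape (N, K)
```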
Note, however, that $N$ is potentially huge (~10K-10M), so naturally I want to use symbolic matrices for both $P$ and $Q$.
How I would like to do this with symbolic matrices:
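A sketch of what this could look like; `LazyTensor.rand` stands for the wished-for symbolic equivalent of `torch.rand` (it does not exist in KeOps), and the Gaussian kernel used for `P_ij` is only an illustrative stand-in for the real probability matrix:

```python
import torch
from pykeops.torch import LazyTensor

N, K = 10**6, 16
x = torch.rand(N, 3)                          # illustrative node features
x_i = LazyTensor(x[:, None, :])               # (N, 1, 3) symbolic variable
x_j = LazyTensor(x[None, :, :])               # (1, N, 3) symbolic variable

# Illustrative symbolic unnormalized probabilities P_ij.
P_ij = (-((x_i - x_j) ** 2).sum(-1)).exp()

# Wished-for API (hypothetical, not implemented): symbolic N x N uniform matrix.
Q_ij = LazyTensor.rand(N, N)

# Negate the perturbed scores so that argKmin returns the K largest ones.
scores_ij = -(P_ij.log() - (-Q_ij.log()).log())
neighbours = scores_ij.argKmin(K, dim=1)      # shape (N, K)
```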
I am quite new to this repository, so it is possible that I am overlooking a feature / workaround. If that is the case, some pointers would be appreciated!