Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[node2vec_spark]Could somebody explain what this code is doing? #101

Open
Wang-Yu-Qing opened this issue Aug 28, 2020 · 1 comment
Open

Comments

@Wang-Yu-Qing
Copy link

In GraphOps.scala:

  def setupAlias(neighbours: Array[(Long, Double)]): (Array[Int], Array[Double]) = {
    val K = neighbours.length
    val J = Array.fill(K)(0)
    val q = Array.fill(K)(0.0)

    val smaller = new ArrayBuffer[Int]()
    val larger = new ArrayBuffer[Int]()

    val sum = neighbours.map(_._2).sum
    neighbours.zipWithIndex.foreach { case ((nodeId, weight), i) =>
      q(i) = K * weight / sum
      if (q(i) < 1.0) {
        smaller.append(i)
      } else {
        larger.append(i)
      }
    }

    while (smaller.nonEmpty && larger.nonEmpty) {
      val small = smaller.remove(smaller.length - 1)
      val large = larger.remove(larger.length - 1)

      J(small) = large
      q(large) = q(large) + q(small) - 1.0
      if (q(large) < 1.0) smaller.append(large)
      else larger.append(large)
    }

    (J, q)
  }

What does this function do? I know that the J, q is used for transition probability, but what is J and q exactly and how they are related with transition probability?

@Wang-Yu-Qing
Copy link
Author

Let me answer my own question. This is a sample method called "Alias Method", which has a O(1) time complexity for sample and O(n) time complexity for construct the alias table (a data structure needed for this sample method). More details: https://www.keithschwarz.com/darts-dice-coins/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant