
Profiling improvements #220

Merged: 11 commits into master, Dec 9, 2018

Conversation

@matatk (Owner) commented Dec 9, 2018

Develop more consistent performance checks; simplify node containment detection; improve performance by skipping non-element nodes.

(The performance improvements here apply specifically to the landmark-scanning process (i.e. each sweep of the document), and are significant. However, there is a lot of potential investigation, and possibly further performance improvement, for certain types of page that change a lot, as per #172.)

* Offer a range of a few sites.
* To make it more predictable, landmark scans are triggered by inserting
  an element (which also happens to be a landmark) into the page.
* The user can specify how many times this is to happen.
* Offer more sites to profile.
* Add a script that runs the manual test for injected landmarks both slowly and quickly, to see what happens when the guarding kicks in.
* Add better timeStamp coverage.
* Write delays in 10^n form so they are clearer to read.
* Add the guarding-testing profile command to package.json.
* Sort package.json.
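
The "10^n" delay notation mentioned above can be illustrated briefly (the variable names and values here are illustrative, not the script's actual delays):

```javascript
// Writing delays as powers of ten makes the magnitude easier to scan
// than a string of zeros. Names and values are hypothetical examples.
const settleDelay = 10 ** 3  // 1,000 ms
const scanPause = 10 ** 2    // 100 ms

console.log(settleDelay, scanPause)
```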
When running it, tell the profiling script how many runs to perform.
This has shown, though, that using the browser for performance testing
like this is not sufficiently reproducible. I considered just starting
the trace later, and also tried letting the page take longer to settle,
in case it perturbed the cadence of adding the landmarks later, but
neither seemed to help.
This adds another mode to the profiling script that does not load the
extension, but instead runs only the LandmarksFinder code directly on
the page, a variable number of times. Hopefully this will provide more
consistent results.
* Use stats-lite to provide the mean and standard deviation of run
  times.
* Don't create a new LandmarksFinder on each repetition.
This doesn't make any significant performance improvement (as per the
results below), but it does simplify the code. The results below are a
control, then tests for both Node.compareDocumentPosition() and
Node.contains(). As the latter results in simpler code and performance
is the same (as one might expect), it is adopted.
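
The equivalence of the two containment checks can be sketched with a toy node class standing in for real DOM nodes (this class and its names are illustrative only, not the extension's code; only the CONTAINED_BY relationship is modelled):

```javascript
// Toy stand-in for DOM nodes, modelling just the containment relationship.
const DOCUMENT_POSITION_CONTAINED_BY = 16  // bit value from the DOM spec

class ToyNode {
	constructor(name) {
		this.name = name
		this.parent = null
		this.childNodes = []
	}

	appendChild(child) {
		child.parent = this
		this.childNodes.push(child)
		return child
	}

	// Mirrors Node.contains(): true for the node itself and any descendant
	contains(other) {
		for (let node = other; node; node = node.parent) {
			if (node === this) return true
		}
		return false
	}

	// Mirrors only the CONTAINED_BY bit of Node.compareDocumentPosition()
	compareDocumentPosition(other) {
		return other !== this && this.contains(other)
			? DOCUMENT_POSITION_CONTAINED_BY : 0
	}
}

const main = new ToyNode('main')
const nav = main.appendChild(new ToyNode('nav'))
const aside = new ToyNode('aside')  // not in main's subtree

// The two checks agree; the first is simpler to read
console.log(main.contains(nav))  // true
console.log((main.compareDocumentPosition(nav)
	& DOCUMENT_POSITION_CONTAINED_BY) !== 0)  // true
console.log(main.contains(aside))  // false
```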

==> r0-control-a.json <==
{
  "abootstrap": {
    "mean": 94,
    "standardDeviation": 20
  },
  "amazon": {
    "mean": 9.6,
    "standardDeviation": 3.3
  },
  "ars": {
    "mean": 5.4,
    "standardDeviation": 2.5
  },
  "bbcnews": {
    "mean": 8.8,
    "standardDeviation": 3.5
  },
  "googledoc": {
    "mean": 1.3,
    "standardDeviation": 0.76
  }
}
==> r1-control-b.json <==
{
  "abootstrap": {
    "mean": 100,
    "standardDeviation": 20
  },
  "amazon": {
    "mean": 12,
    "standardDeviation": 3.6
  },
  "ars": {
    "mean": 5.2,
    "standardDeviation": 2.5
  },
  "bbcnews": {
    "mean": 8.6,
    "standardDeviation": 3.3
  },
  "googledoc": {
    "mean": 1.4,
    "standardDeviation": 1.9
  }
}
==> r2-contains-a.json <==
{
  "abootstrap": {
    "mean": 95,
    "standardDeviation": 20
  },
  "amazon": {
    "mean": 14,
    "standardDeviation": 3.3
  },
  "ars": {
    "mean": 5.2,
    "standardDeviation": 2.2
  },
  "bbcnews": {
    "mean": 9.1,
    "standardDeviation": 3.4
  },
  "googledoc": {
    "mean": 1.4,
    "standardDeviation": 0.73
  }
}
==> r3-contains-b.json <==
{
  "abootstrap": {
    "mean": 100,
    "standardDeviation": 22
  },
  "amazon": {
    "mean": 10,
    "standardDeviation": 3.3
  },
  "ars": {
    "mean": 5.3,
    "standardDeviation": 2.6
  },
  "bbcnews": {
    "mean": 8.7,
    "standardDeviation": 3.1
  },
  "googledoc": {
    "mean": 1.3,
    "standardDeviation": 0.68
  }
}
==> r4-compareDocumentPosition-a.json <==
{
  "abootstrap": {
    "mean": 100,
    "standardDeviation": 21
  },
  "amazon": {
    "mean": 11,
    "standardDeviation": 3.4
  },
  "ars": {
    "mean": 5.5,
    "standardDeviation": 2.2
  },
  "bbcnews": {
    "mean": 8.7,
    "standardDeviation": 3.1
  },
  "googledoc": {
    "mean": 1.4,
    "standardDeviation": 1.6
  }
}
==> r5-compareDocumentPosition-b.json <==
{
  "abootstrap": {
    "mean": 98,
    "standardDeviation": 20
  },
  "amazon": {
    "mean": 10,
    "standardDeviation": 3.6
  },
  "ars": {
    "mean": 4.9,
    "standardDeviation": 2.2
  },
  "bbcnews": {
    "mean": 8.6,
    "standardDeviation": 3.4
  },
  "googledoc": {
    "mean": 1.4,
    "standardDeviation": 1.6
  }
}
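
For reference, the per-site figures above are just a mean and standard deviation over the run times. A plain-JS sketch of the computation (the run-time data here is made up; to my understanding, stats-lite's mean() and stdev() behave the same way, with stdev() using the population form):

```javascript
// Plain-JS versions of the two statistics reported in the results above.
function mean(values) {
	return values.reduce((sum, v) => sum + v, 0) / values.length
}

function standardDeviation(values) {
	const m = mean(values)
	const sumSquares = values.reduce((sum, v) => sum + (v - m) ** 2, 0)
	return Math.sqrt(sumSquares / values.length)  // population form
}

const runTimes = [2, 4, 4, 4, 5, 5, 7, 9]  // example data, not real results
console.log(mean(runTimes))               // 5
console.log(standardDeviation(runTimes))  // 2
```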

Closes #125.
When scanning for landmarks, only consider nodes that are *element*
children of other nodes (previously all node types were visited, and
non-element nodes were discounted). This also, perhaps significantly,
replaces the forEach function calls with a for loop, which should be
more efficient.
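
The difference can be sketched with plain objects standing in for DOM nodes (nodeType 1 = element, 3 = text; the objects and names here are illustrative only):

```javascript
// Plain objects standing in for DOM nodes; illustrative only.
const ELEMENT_NODE = 1
const TEXT_NODE = 3

const parent = {
	childNodes: [  // every child, as Node.childNodes would give
		{ nodeType: TEXT_NODE, name: '#text' },
		{ nodeType: ELEMENT_NODE, name: 'nav' },
		{ nodeType: TEXT_NODE, name: '#text' },
		{ nodeType: ELEMENT_NODE, name: 'main' }
	]
}
// Element.children contains only the element children
parent.children = parent.childNodes.filter(n => n.nodeType === ELEMENT_NODE)

// Before: visit every node via forEach, then discount non-elements
const before = []
parent.childNodes.forEach(node => {
	if (node.nodeType === ELEMENT_NODE) before.push(node.name)
})

// After: a plain for loop over element children only
const after = []
for (let i = 0; i < parent.children.length; i++) {
	after.push(parent.children[i].name)
}

console.log(before)  // [ 'nav', 'main' ]
console.log(after)   // [ 'nav', 'main' ]
```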

==> r0-control.json <==
{
  "abootstrap": {
    "mean": 93,
    "standardDeviation": 19
  },
  "amazon": {
    "mean": 9.2,
    "standardDeviation": 3.3
  },
  "ars": {
    "mean": 5.7,
    "standardDeviation": 2
  },
  "bbcnews": {
    "mean": 7.9,
    "standardDeviation": 2.8
  },
  "googledoc": {
    "mean": 1.3,
    "standardDeviation": 1.8
  }
}
==> r1-test.json <==
{
  "abootstrap": {
    "mean": 70,
    "standardDeviation": 17
  },
  "amazon": {
    "mean": 7.8,
    "standardDeviation": 3.4
  },
  "ars": {
    "mean": 4.6,
    "standardDeviation": 2.1
  },
  "bbcnews": {
    "mean": 7,
    "standardDeviation": 3.2
  },
  "googledoc": {
    "mean": 1.4,
    "standardDeviation": 1.9
  }
}

Fixes #126.
@matatk matatk merged commit cb91659 into master Dec 9, 2018
@matatk matatk deleted the profiling-improvements branch December 9, 2018 16:09