Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading default DATABASE is extremely slow in WASM #26

Open
EndilWayfare opened this issue Oct 15, 2020 · 3 comments
Open

Loading default DATABASE is extremely slow in WASM #26

EndilWayfare opened this issue Oct 15, 2020 · 3 comments

Comments

@EndilWayfare
Copy link

For some reason, the initial deref of the lazy_static takes several seconds, locking up the page while the first phone number is either parsed or formatted. After that, it's really snappy. Using the Chrome Dev Tools profiler, it looks like the bincode deserializing only takes ~40ms and most of the time is spent in Database::from. The callgraph is dominated by closure calls. I can only guess that the WASM backend can't optimize away all the closures as aggressively as the x86 one can?
Screenshot_101420_063819_PM

Initializing the database in a web worker won't work, because it doesn't share memory with the main thread. The database is not Deserialize + Serialize, so you can't send it over manually either. I considered raw JS interop with one of the libphonenumber-based NPM packages out there, but there would be no way to convert the result to a phonenumber::PhoneNumber, since the struct is encapsulated such that the only way to make one is by parsing from a string. That leaves the most promising option as an unnecessary client type, something like

pub enum ClientPhoneNumber {
    PhoneNumber(phonenumber::PhoneNumber),
    DefinitelyAValidPhoneNumberIPromise(String),
}

and leaving the final say to the backend. This is certainly possible, but negates a lot of the benefit of having a unified client+server in full-stack Rust.

@EndilWayfare
Copy link
Author

Update: Ok, to be clear, that's the unoptimized dev build. Timing the release build is much less egregious, though still a noticeable hitch. I didn't think about this at first, since the x86 dev build is rust-iliciously fast.
Screenshot_101420_064533_PM
It's a little harder to tell what's going on, since the function names are gone. 0.5s is definitely better than 4.0s, but it still feels like there's room for improvement.

When I get a chance, I may fork and hack on Database::from to see if de-closuring has any effect. If successful, would an associated pull request be welcome? Is the current testing situation sufficient?

@yannleretaille
Copy link
Contributor

I had a look into this. The issue is that when the metadata gets initialized from the database, a syntax check on all regexps is performed (and theres a lot of them!). This is to prevent ugly runtime errors later on. The deserialization from the bincoded database itself is fast as you pointed out.
flamegraph-before

The fix is quite straightforward actually, as we do not need to perform the syntax check at runtime if we already validate the regexps during the build phase when we parse Googles XML using loader. As an added benefit, this also means that there will never be a error due to invalid regexps in Googles data as the crate will not compile in that case.
I already implemented it with good results on my machine and would expect the time to go down to less then 15ms in your wasm use case on your machine.
image

I will post the PRs to rust-phonenumber and regex-cache (which required a minor modification) later.

yannleretaille added a commit to yannleretaille/rust-phonenumber that referenced this issue Oct 15, 2020
…g Metadata from the Database

This has two benefits:
- it results in a ~97% reduction of initiliaztion time the first time
  a phone number is validated. This is crucial for client side
  applications.
- it prevents the crate from compiling in the unlikely case that Googles
  Metadata should ever contain invalid regexps (this was tested)
In the future it should be investigated if there is other data that could be
pre-validated during the build phase

fixes whisperfish#26
@EndilWayfare
Copy link
Author

Wow, @yannleretaille, you rock!

Gotta love compile-time guarantees. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants