Skip to content

Sync htmlparser changes from g3 #40261

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 36 commits into
base: main
Choose a base branch
from
Open

Sync htmlparser changes from g3 #40261

wants to merge 36 commits into from

Conversation

amaltas
Copy link
Contributor

@amaltas amaltas commented Mar 28, 2025

This is an automatically generated pull request.

PiperOrigin-RevId: 540325424
@amaltas amaltas force-pushed the validator_engine_sync branch from a403830 to c4d78c1 Compare March 28, 2025 13:18
banaag and others added 28 commits March 28, 2025 13:18
PiperOrigin-RevId: 543822678
PiperOrigin-RevId: 555248479
PiperOrigin-RevId: 555259013
PiperOrigin-RevId: 558192832
PiperOrigin-RevId: 582334672
PiperOrigin-RevId: 598876608
PiperOrigin-RevId: 604352258
PiperOrigin-RevId: 610804847
…e names starting with an equals sign ("=").

PiperOrigin-RevId: 610820873
…ad of ABSL's macro.

PiperOrigin-RevId: 610875973
…es with duplicate attribute names.

PiperOrigin-RevId: 614818595
PiperOrigin-RevId: 617936431
PiperOrigin-RevId: 639907691
PiperOrigin-RevId: 644159337
JsonDict and JsonArray form members to a vector of an incomplete type, which is
not allowed. libc++ has historically turned a blind eye here, but an incoming
patch will cause these types to break. This commit intends to get in front of
that libc++ change by moving all function definitions that reference `items_`
to after `JsonObject` is defined.

There were also a few other opportunities for reducing complexity that I snuck
into this CL, such as correcting use of forward references and ensuring
`types.h` and `types.cc` include what they use.

PiperOrigin-RevId: 653431283
PiperOrigin-RevId: 691568917
PiperOrigin-RevId: 704346605
PiperOrigin-RevId: 704347005
…re is a large amount to unescape at once, this can slow down parsing significantly.

Before:
Benchmark                     Time(ns)        CPU(ns)     Iterations
--------------------------------------------------------------------
BM_UnescapedEntities        27208779620     27204638467             10

After:
Benchmark                     Time(ns)        CPU(ns)     Iterations
--------------------------------------------------------------------
BM_UnescapedEntities          57966588       57949693             10
BM_UnescapedEntities          58013403       58018622             10
BM_UnescapedEntities          57770261       57745786             10
BM_UnescapedEntities          58072494       58062626             10
BM_UnescapedEntities          57877059       57865834             10
BM_UnescapedEntities          57689735       57705073             10
BM_UnescapedEntities          57678852       57673476             10
BM_UnescapedEntities          57786888       57773828             10
BM_UnescapedEntities          58067170       58058291             10
BM_UnescapedEntities          57742296       57723050             10
BM_UnescapedEntities          57841470       57833843             10
BM_UnescapedEntities          58208952       58196325             10
BM_UnescapedEntities_mean     57892931       57883870            120
BM_UnescapedEntities_stddev     163659         162941            120
PiperOrigin-RevId: 738018062
…vectors.

Before:

Benchmark         Time(ns)        CPU(ns)        Allocs     Iterations
----------------------------------------------------------------------
BM_Parse         158029720      158862020        888548              1   55.001MB peak-mem
BM_Parse         149980188      151992674        888548              1   55.001MB peak-mem
BM_Parse         146430455      147999431        888548              1   55.001MB peak-mem
BM_Parse         146249045      147933061        888548              1   55.001MB peak-mem
BM_Parse         149886603      148022291        888548              1   55.001MB peak-mem
BM_Parse         145681736      143954218        888551              1   55.001MB peak-mem
BM_Parse_mean    149376291      149793949        888548              6   55.001MB peak-mem
BM_Parse_stddev    4235058        4672414             1.12           6   0.000B peak-mem

After:

Benchmark         Time(ns)        CPU(ns)        Allocs     Iterations
----------------------------------------------------------------------
BM_Parse         149049168      147654141        521765              1   51.583MB peak-mem
BM_Parse         141454742      139966494        521765              1   51.583MB peak-mem
BM_Parse         141263070      139947895        521765              1   51.583MB peak-mem
BM_Parse         141052560      144135480        521765              1   51.583MB peak-mem
BM_Parse         139456331      139946514        521765              1   51.583MB peak-mem
BM_Parse         139357311      135976601        521765              1   51.583MB peak-mem
BM_Parse_mean    141938864      141271188        521765              6   51.583MB peak-mem
BM_Parse_stddev    3287867        3701058             0.000          6   0.000B peak-mem

PiperOrigin-RevId: 740157446
…e. A recent Clang regression (caused by go/llvm-pr/114684) incorrectly allows this invalid syntax.

PiperOrigin-RevId: 740231216
Before

Benchmark         Time(ns)        CPU(ns)        Allocs     Iterations
----------------------------------------------------------------------
BM_Parse         151069991      151982943        521765              1   51.583MB peak-mem
BM_Parse         144951470      144099227        521765              1   51.583MB peak-mem
BM_Parse         156225543      155978946        521765              1   51.583MB peak-mem
BM_Parse         184920674      184011830        521765              1   51.583MB peak-mem
BM_Parse         144872684      144154218        521765              1   51.583MB peak-mem
BM_Parse         140119083      140093764        521770              1   51.583MB peak-mem
BM_Parse_mean    153693241      153386821        521766              6   51.583MB peak-mem
BM_Parse_stddev   14871254       14690058             1.86           6   0.000B peak-mem

After

Benchmark         Time(ns)        CPU(ns)        Allocs     Iterations
----------------------------------------------------------------------
BM_Parse         153492099      152312987        440241              1   51.583MB peak-mem
BM_Parse         140035079      140298303        440241              1   51.583MB peak-mem
BM_Parse         138184381      136143580        440241              1   51.583MB peak-mem
BM_Parse         139805091      140279666        440241              1   51.583MB peak-mem
BM_Parse         136864430      136069852        440241              1   51.583MB peak-mem
BM_Parse         139082381      140114802        440241              1   51.583MB peak-mem
BM_Parse_mean    141243910      140869865        440241              6   51.583MB peak-mem
BM_Parse_stddev    5579163        5440099             0.000          6   0.000B peak-mem

PiperOrigin-RevId: 740376026
@amaltas amaltas changed the title Sync for validator cpp engine and cpp htmlparser Sync htmlparser changes from g3 Mar 28, 2025
@amaltas amaltas requested a review from banaag March 28, 2025 13:39
@amaltas amaltas self-assigned this Mar 28, 2025
amaltas added 2 commits March 28, 2025 10:17
These are failing since long time and there is no active development in validator codebase.
@amaltas amaltas requested review from erwinmombay and banaag and removed request for banaag March 28, 2025 17:27
@@ -289,19 +289,6 @@ jobs:
name: '⭐⭐⭐ Bundle Size ⭐⭐⭐'
command: node build-system/pr-check/bundle-size.js
- teardown_vm
validator_tests:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@banaag can you give an LGTM on this part, or do we need it back?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amaltas why are we turning off the validator tests? is this temporary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amaltas says it's just temporary, we can reactivate after submission of the PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amaltas instead of disabling all validator tests, can you try changing this line here

if (isOwnersFile(file) || file.startsWith('validator/js/webui/')) {

you can try adding file.startsWith('validator/cpp/htmlparser/'). This will at least only exclude the htmlparser changes from triggering the validator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this didn't work.

Copy link
Member

@erwinmombay erwinmombay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving just to test the CI

@@ -289,19 +289,6 @@ jobs:
name: '⭐⭐⭐ Bundle Size ⭐⭐⭐'
command: node build-system/pr-check/bundle-size.js
- teardown_vm
validator_tests:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amaltas instead of disabling all validator tests, can you try changing this line here

if (isOwnersFile(file) || file.startsWith('validator/js/webui/')) {

you can try adding file.startsWith('validator/cpp/htmlparser/'). This will at least only exclude the htmlparser changes from triggering the validator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants