Handle MT issues in setVertexArrivals, fix non-determ crash #396
Closed
dsengupta0628 wants to merge 250 commits into
Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
fix: Rename variable to avoid collision with C++ 20 keyword (requires)
Sync to latest OpenSTA
setTopInstance/deleteTopInstance back to being public
…eyword Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
Latest OpenSTA code (fixes hang in flow tests)
Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
Small update to power numbers in one test
Add -Wp,-D_GLIBCXX_ASSERTIONS
Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
Signed-off-by: Harsh Vardhan <openroad@chez-vardhan.com>
Latest OpenSTA
Latest OpenSTA to fix test issues
Latest STA code (performance fix)
Latest OpenSTA (fixed crash with thru exceptions).
Latest OpenSTA merge
Signed-off-by: Martin Povišer <povik@cutebit.org>
Signed-off-by: Martin Povišer <povik@cutebit.org>
Signed-off-by: Martin Povišer <povik@cutebit.org>
This was observed to cause crashes in write_timing_model. Signed-off-by: Martin Povišer <povik@cutebit.org>
…ull-rel_3.0 Pull 3.0 with multimode from parallaxsw
The tcl version in the bazel central registry supports MacOS and it can be used via MODULES.bazel instead of WORKSPACE. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>
On MacOS the MachineApple.cc must be used instead of MachineLinux.cc. I added a select statement. Signed-off-by: Friedrich Beckmann <friedrich.beckmann@tha.de>
Context PR: The-OpenROAD-Project/OpenROAD#9536 Signed-off-by: Henner Zeller <h.zeller@acm.org>
Before the multi-mode refactor the sort helpers wrapped stable_sort, switch to stable_sort again as the sorting influences which vertex is returned from Sta::worstSlack(). Signed-off-by: Martin Povišer <povik@cutebit.org>
Signed-off-by: Matt Liberty <mliberty@precisioninno.com>
…arallax-update Parallax update
…bison Use bison/flex starlark from //bazel package.
bazel/tcl: migrate from rules_hdl tcl to bazel BCR tcl version
bazel/macos: Changed bazel BUILD file to have MachineApple.cc for Macos build
…table_sort Use stable sort again
Use same location for StaConfig.hh output as in cmake.
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
… my fix Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
…ta_write_verilog_fix Fix write_verilog escape seq Issue 3826
Removed introductory section saying this is a fork and to file upstream. Signed-off-by: Matt Liberty <mliberty@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Issue 1
Rapidus hercules_idecode hits non-deterministic STA assertion · Issue #1537 · The-OpenROAD-Project-private/OpenROAD-flow-scripts
Non-deterministic segmentation fault in GRT
Issue 2
Rapidus hercules_is_int with M4D1 parasitics hits STA assertion · Issue #1526 · The-OpenROAD-Project-private/OpenROAD-flow-scripts
Non-deterministic STA assert in CTS (most-shallow stack same as above)
Both happen because setVertexArrivals() replaces vertex->paths_ with no synchronization, while the multi-threaded BFS lets threads that are visiting other vertices read that same path array through the CRPR clock path lookup (crprClkPath() → Path::vertexPath()). The two crashes are the same data race manifesting at different points in the call:
Issue 1 fails immediately in vertexPath.
Issue 2 uses a transiently valid pointer that becomes dangling before findCrpr dereferences it, producing garbage indices that run past the end of the genclk_src_paths_ vector.
To fix this, we defer deletion of old vertex path arrays until after visitParallel completes.
In setVertexArrivals, instead of immediately deleting the old path array (which frees memory that concurrent CRPR/latch readers may still hold pointers to), we save it to a per-search list and free it in deleteTagsPrev(), which is called only after all threads for a level have finished.
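A minimal sketch of the deferred-deletion idea, not the actual patch. `Path`, `Vertex` and `Search` here are simplified stand-ins for the OpenSTA classes, and `retired_paths_`, `retired_lock_`, `retiredCount()` and `tagIndexAt()` are hypothetical names introduced only for illustration. It shows the lifetime change only: the old array is parked on a list instead of being freed on the spot.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Simplified stand-in for an STA path record (not the real Path class).
struct Path {
  int tag_index = 0;
};

// Simplified stand-in for a graph vertex that owns its current path array.
struct Vertex {
  Path *paths_ = nullptr;
  std::size_t path_count_ = 0;
  Vertex() = default;
  Vertex(const Vertex &) = delete;
  Vertex &operator=(const Vertex &) = delete;
  ~Vertex() { delete[] paths_; }
};

// Bounds-checked read, in the spirit of the '_n < size()' assertion.
inline int tagIndexAt(const Path *paths, std::size_t count, std::size_t i) {
  assert(i < count);
  return paths[i].tag_index;
}

class Search {
public:
  ~Search() { deleteTagsPrev(); }

  // Install a new path array; park the old one instead of deleting it.
  void setVertexArrivals(Vertex &vertex, const std::vector<int> &tag_indices) {
    Path *fresh = new Path[tag_indices.size()];
    for (std::size_t i = 0; i < tag_indices.size(); i++)
      fresh[i].tag_index = tag_indices[i];
    Path *old = vertex.paths_;
    vertex.paths_ = fresh;
    vertex.path_count_ = tag_indices.size();
    if (old != nullptr) {
      // Hypothetical per-search list; the lock only protects the list itself.
      std::lock_guard<std::mutex> guard(retired_lock_);
      retired_paths_.push_back(old);
    }
  }

  // Assumed to run only after every thread of the level has finished.
  void deleteTagsPrev() {
    std::lock_guard<std::mutex> guard(retired_lock_);
    for (Path *paths : retired_paths_)
      delete[] paths;
    retired_paths_.clear();
  }

  std::size_t retiredCount() {
    std::lock_guard<std::mutex> guard(retired_lock_);
    return retired_paths_.size();
  }

private:
  std::mutex retired_lock_;
  std::vector<Path *> retired_paths_;
};
```

A pointer taken from `vertex.paths_` before a second `setVertexArrivals` call stays readable until `deleteTagsPrev()` runs. The sketch does not address synchronization of the `paths_` pointer itself; it only extends the lifetime of the old array.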
Why this fixes both crashes:
Issue 1 (Path::vertexPath ← pruneCrprArrivals):
Thread B was freeing Vclk->paths_ via deletePaths()/makePaths() while Thread A read Vclk->paths_[path_index]. The old array is now kept alive until deleteTagsPrev(), so Thread A's pointer is always valid within the parallel level.
Issue 2 ('_n < size()' ← findCrpr ← latchBorrowInfo):
Thread A obtained a const Path *src_clk_path via crprClkPath() pointing into Vclk->paths_. Thread B freed that array. Dereferencing the dangling pointer produced garbage rf/min_max indices, which overflowed the genclk_src_paths_ vector. With deferred deletion the returned pointer stays valid, so findCrpr reads correct data.
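The Thread A / Thread B interleaving can be sketched as a toy level, assuming simplified types. `RetiredPaths`, `runLevel` and the `rf_index` field are hypothetical names for illustration, not OpenSTA code. Thread A already holds a pointer into the old array, Thread B swaps the array in and retires the old one, and nothing is freed until both threads have joined.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Simplified stand-ins (not the real OpenSTA classes).
struct Path {
  int rf_index;
};

struct Vertex {
  Path *paths_ = nullptr;
};

// Hypothetical per-search list of arrays whose deletion is deferred.
struct RetiredPaths {
  std::mutex lock;
  std::vector<Path *> arrays;

  void retire(Path *old) {
    assert(old != nullptr);
    std::lock_guard<std::mutex> guard(lock);
    arrays.push_back(old);
  }

  // Plays the role of deleteTagsPrev(): call only after the level is done.
  void freeAll() {
    std::lock_guard<std::mutex> guard(lock);
    for (Path *paths : arrays)
      delete[] paths;
    arrays.clear();
  }
};

// Runs one simulated parallel level and returns what Thread A read.
int runLevel(Vertex &vclk, RetiredPaths &retired) {
  // Thread A's pointer, obtained before Thread B replaces the array.
  const Path *src_clk_path = &vclk.paths_[1];
  int seen = -1;

  std::thread thread_b([&] {
    Path *fresh = new Path[2]{{10}, {11}};
    Path *old = vclk.paths_;
    vclk.paths_ = fresh;
    retired.retire(old);  // deferred: the old array is not freed here
  });
  std::thread thread_a([&] {
    seen = src_clk_path->rf_index;  // old array is still alive
  });

  thread_a.join();
  thread_b.join();
  retired.freeAll();  // safe: no thread of this level is still running
  return seen;
}
```

With immediate deletion in place of `retire()`, Thread A's read could touch freed memory, depending on scheduling. That is consistent with both crashes being non-deterministic.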