Skip to content
This repository has been archived by the owner on Sep 27, 2019. It is now read-only.

SerializableTransactionTests.ReadOnlyTransactionTest failing intermittently #1427

Open
tli2 opened this issue Jun 25, 2018 · 8 comments
Open

Comments

@tli2
Copy link
Contributor

tli2 commented Jun 25, 2018

Several complaints about this failing from time to time on Travis. Might be a Mac issue, might be that the test is broken, might be that we broke something in CC.

Investigation would be appreciated.

@mbutrovich
Copy link
Contributor

mbutrovich commented Jun 25, 2018

Here's an example of how it fails, and on rerun it typically passes. Maybe some nondeterminism in how stuff is being scheduled:

        Start  46: serializable_transaction_test



46: Test command: /Users/travis/build/cmu-db/peloton/build/test/serializable_transaction_test "--gtest_color=yes" "--gtest_output=xml:/Users/travis/build/cmu-db/peloton/build/test/serializable_transaction_test.xml"

46: Environment variables: 

46:  LSAN_OPTIONS=suppressions=/Users/travis/build/cmu-db/peloton/test/leak_suppr.txt

46: Test timeout computed to be: 9.99988e+06

46: Running main() from gmock_main.cc

46: �[0;32m[==========] �[mRunning 6 tests from 1 test case.

46: �[0;32m[----------] �[mGlobal test environment set-up.

46: �[0;32m[----------] �[m6 tests from SerializableTransactionTests

46: �[0;32m[ RUN      ] �[mSerializableTransactionTests.TransactionTest

46: �[0;32m[       OK ] �[mSerializableTransactionTests.TransactionTest (36 ms)

46: �[0;32m[ RUN      ] �[mSerializableTransactionTests.ReadOnlyTransactionTest

46: 2018-06-25 20:34:00 [test/concurrency/testing_transaction_util.cpp:192:CreateTable] INFO  - the database_id is 16777216

46: 2018-06-25 20:34:00 [test/concurrency/serializable_transaction_test.cpp:330:ExecuteNext] INFO  - Txn 0 commits: Success

46: /Users/travis/build/cmu-db/peloton/test/concurrency/serializable_transaction_test.cpp:90: Failure

46: Value of: scheduler.schedules[0].results.size()

46:   Actual: 0

46: Expected: 10

46: �[0;31m[  FAILED  ] �[mSerializableTransactionTests.ReadOnlyTransactionTest (366 ms)

@tli2
Copy link
Contributor Author

tli2 commented Jun 26, 2018

Also worth noting, on both occasions observed, this happens on mac builds only

@mbutrovich
Copy link
Contributor

I’ll spend some time on it today.

@mbutrovich
Copy link
Contributor

Haven't been able to reproduce it locally all day :(

@lmwnshn
Copy link
Contributor

lmwnshn commented Jun 26, 2018

I just tried running it 10-20 times and got it to fail:

[build] ASAN_OPTIONS=detect_container_overflow=0 ./test/serializable_transaction_test
Running main() from gmock_main.cc
[==========] Running 6 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 6 tests from SerializableTransactionTests
[ RUN      ] SerializableTransactionTests.TransactionTest
[       OK ] SerializableTransactionTests.TransactionTest (28 ms)
[ RUN      ] SerializableTransactionTests.ReadOnlyTransactionTest
2018-06-26 13:46:19 [test/concurrency/testing_transaction_util.cpp:192:CreateTable] INFO  - the database_id is 16777216
2018-06-26 13:46:19 [test/concurrency/serializable_transaction_test.cpp:314:ExecuteNext] INFO  - Txn 0 commits: Success
/Users/wanshenl/Documents/pavlo/peloton/test/concurrency/serializable_transaction_test.cpp:90: Failure
Value of: scheduler.schedules[0].results.size()
  Actual: 0
Expected: 10
[  FAILED  ] SerializableTransactionTests.ReadOnlyTransactionTest (217 ms)

Same failure. Didn't have to rebuild between runs.

@mbutrovich
Copy link
Contributor

Can you get a TRACE level log of it to get a better idea what's happening? I still can't get it to reproduce.

@mbutrovich
Copy link
Contributor

mbutrovich commented Jun 27, 2018

@lmwnshn helpfully ran it 1000 times on this commit just before I did all the TransactionContext refactoring in the last couple of weeks. It still reproduced. My suspicion right now is that in rare cases the bootstrapping process (which seems slower in the last month) takes longer than we wait in the test (1 epoch) and it triggers an abort on the scan. We have 2 experiments to try today:

  1. increase the wait time in the test on master
  2. go back to a commit a month or so ago before we had all the bootstrapping changes. I need to identify the exact commit.

I should add that we've also seen it on the Ubuntu boxes, so this is no longer a Mac-exclusive issue.

@tli2 tli2 changed the title SerializableTransactionTest might be failing spuriously SerializableTransactionTest flaky Jun 27, 2018
@mbutrovich
Copy link
Contributor

The test we occasionally fail was modified in #1322. I'll have to spend more time looking at it tomorrow to understand why we seem to have a race.

@mbutrovich mbutrovich changed the title SerializableTransactionTest flaky SerializableTransactionTests.ReadOnlyTransactionTest failing intermittently Jun 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants