[fix] Fix data loss when ledger write and ledger delete execute concurrently #4462
Conversation
The patch is LGTM, but I have a concern about the intent.
Right now we only forbid writing data to a fenced ledger, but if a ledger that is not fenced is deleted, should we still allow writes to it?
If the ledger has never been fenced, only one client holds the ledger handle, so a write after deletion would mean there is a bug on the client side.
```java
@Override
public void ledgerDeleted(long ledgerId) {
    markIfConflictWritingOccurs(ledgerId);
```
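For illustration only, a minimal sketch of the pattern this callback enables (class and field names here are hypothetical, not the PR's code): the deletion callback records the ledger id so the write path can later reject it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remember ledgers deleted on this bookie so a later write
// attempt against the same ledger id can be failed instead of silently accepted.
class RecentlyDeletedLedgerTracker {
    private final Set<Long> deleted = ConcurrentHashMap.newKeySet();

    // Invoked from the deletion callback shown above.
    void markIfConflictWritingOccurs(long ledgerId) {
        deleted.add(ledgerId);
    }

    // Consulted on the write path before accepting an add request.
    boolean isRecentlyDeleted(long ledgerId) {
        return deleted.contains(ledgerId);
    }
}
```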
It depends on whether garbage collection has been triggered. If the GC has not been triggered, we should also fail the case, because the ledger metadata has already been deleted, correct?
- Before garbage collection: since the fenced state is still maintained by `LedgerHandleFactory.ledgers`, the BK server knows it is fenced, so the `fenced` flag of the result of `LedgerHandleFactory.getHandle(ledgerId)` will be set, and the BK client will get a `ledger fenced` error.
- After garbage collection: the ledger will be marked as `fenced & deleted`, so the BK client will also get a `ledger fenced` error.

So the current change is fine.
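On the client side, that `ledger fenced` error typically surfaces as a fenced exception from `addEntry`. A minimal sketch, assuming `lh` is a writable `LedgerHandle` and the standard BookKeeper client exception mapping:

```java
import java.nio.charset.StandardCharsets;
import org.apache.bookkeeper.client.BKException;
import org.apache.bookkeeper.client.LedgerHandle;

final class FencedWriteExample {
    // Attempt a write and treat the fenced error as the expected outcome.
    static void tryWrite(LedgerHandle lh) throws Exception {
        try {
            lh.addEntry("payload".getBytes(StandardCharsets.UTF_8));
        } catch (BKException.BKLedgerFencedException e) {
            // Expected once the bookie sees the ledger as fenced (or fenced & deleted after GC).
        }
    }
}
```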
If the BK client opens the ledger with `openLedgerNoRecovery`, the service won't mark the ledger as fenced. The `fenced` state is set when a BK client opens the ledger in recovery mode. Of course, this fix does fix the issue we encountered in Pulsar.
If the BK client opens the ledger with `openLedgerNoRecovery`, does that mean this client will only read data from the ledger and never close/delete it?
If yes, only the first writable ledger handle can delete the ledger, so we can conclude there is a bug in any client that opens the ledger with `openLedgerNoRecovery` and then deletes it. Please correct me if my understanding of the behavior is wrong.
You are correct! I checked all the use cases of `openLedgerNoRecovery` in BookKeeper and Pulsar; we never delete ledgers through an `openLedgerNoRecovery` LedgerHandle.
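For reference, a hedged sketch of the read-only usage pattern described above, using the standard client API (digest type and password are placeholders):

```java
import java.util.Enumeration;
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.LedgerEntry;
import org.apache.bookkeeper.client.LedgerHandle;

final class NoRecoveryReadExample {
    static void readAll(BookKeeper bk, long ledgerId, byte[] password) throws Exception {
        // openLedgerNoRecovery does not fence the ledger; the handle is only used for reading.
        LedgerHandle lh = bk.openLedgerNoRecovery(ledgerId, BookKeeper.DigestType.CRC32, password);
        try {
            long lac = lh.getLastAddConfirmed();
            if (lac >= 0) {
                Enumeration<LedgerEntry> entries = lh.readEntries(0, lac);
                while (entries.hasMoreElements()) {
                    entries.nextElement(); // process the entry
                }
            }
        } finally {
            lh.close(); // releases the client-side handle; the ledger itself is untouched
        }
    }
}
```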
```diff
@@ -24,7 +24,7 @@
 import java.io.IOException;

 interface HandleFactory {
-    LedgerDescriptor getHandle(long ledgerId, byte[] masterKey)
+    LedgerDescriptor getHandle(long ledgerId, byte[] masterKey, boolean journalReplay)
```
Would `readOnly` be better than `journalReplay`?
- Getting a read-only handle goes through `getReadOnlyHandle`.
- There is only one scenario that calls `getHandle` for reading: journal replay.
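To make the naming discussion concrete, here is one plausible way such a flag could be consumed; this is purely illustrative and not the PR's implementation (the stub types and exception are hypothetical):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the real LedgerDescriptor, for illustration only.
class LedgerDescriptorStub {
    final long ledgerId;
    LedgerDescriptorStub(long ledgerId) { this.ledgerId = ledgerId; }
}

// Illustrative sketch: journal replay is allowed through so fenced/deleted state
// can be rebuilt after a restart, while the normal write path is rejected.
class HandleFactorySketch {
    private final ConcurrentMap<Long, LedgerDescriptorStub> ledgers = new ConcurrentHashMap<>();
    private final ConcurrentMap<Long, Boolean> recentlyFencedAndDeletedLedgers = new ConcurrentHashMap<>();

    LedgerDescriptorStub getHandle(long ledgerId, byte[] masterKey, boolean journalReplay) {
        if (!journalReplay && recentlyFencedAndDeletedLedgers.containsKey(ledgerId)) {
            throw new IllegalStateException("ledger " + ledgerId + " was recently fenced and deleted");
        }
        return ledgers.computeIfAbsent(ledgerId, LedgerDescriptorStub::new);
    }
}
```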
Is there a race condition between `deleteLedger` and `getHandle` with the same ledgerId? cc @eolivelli @nicoloboschi
There is no race condition issue. See the details below. There are two thread-safe maps:
Scenario 1
Scenario 2
Scenario 3
Scenario 4
Scenario 5
Scenario 5 additional note: Since …
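As a generic illustration of the "two thread-safe maps" point (not the PR's code): per-key operations on a concurrent map are atomic, so a concurrent `deleteLedger` and `getHandle` for the same ledgerId simply serialize on the map entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Generic illustration: a concurrent getHandle/deleteLedger pair for the same
// ledgerId observes one consistent order because each map operation is atomic.
class ConcurrentHandleMapSketch {
    private final ConcurrentMap<Long, String> ledgers = new ConcurrentHashMap<>();

    String getHandle(long ledgerId) {
        // Creates the handle once, or returns the one already present.
        return ledgers.computeIfAbsent(ledgerId, id -> "handle-" + id);
    }

    void deleteLedger(long ledgerId) {
        // Atomic with respect to computeIfAbsent on the same key.
        ledgers.remove(ledgerId);
    }
}
```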
```java
    LOG.error("Should have thrown an exception");
    fail("Should have thrown an exception when trying to write");
} catch (Exception e) {
    e.printStackTrace();
```
Please use a logger and catch the expected exception.
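For instance, the catch could be narrowed to the expected exception and routed through the logger; an illustrative sketch, assuming the fenced error reaches the client as `BKException.BKLedgerFencedException` and that `lh`, `LOG`, and `fail` come from the surrounding test:

```java
// Illustrative only: catch the specific expected exception instead of printing the stack trace.
try {
    lh.addEntry("entry".getBytes(StandardCharsets.UTF_8));
    fail("Should have thrown an exception when trying to write");
} catch (BKException.BKLedgerFencedException e) {
    // Expected: the write against the deleted ledger is rejected as fenced.
    LOG.info("Write was rejected as expected", e);
}
```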
Fixed
* "delete" now, and other clients may call "write" continuously later. We mark these ledgers can not be written | ||
* anymore. And maintains the state for 7 days is safety. | ||
*/ | ||
private final Cache<Long, Boolean> recentlyFencedAndDeletedLedgers = CacheBuilder.newBuilder() |
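For reference, a self-contained sketch of this kind of time-bounded cache using Guava's `CacheBuilder` (the 7-day window mirrors the snippet above; the helper method names are mine):

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

final class RecentlyDeletedCacheSketch {
    // Entries expire 7 days after they are written, which bounds memory use.
    private final Cache<Long, Boolean> recentlyFencedAndDeletedLedgers = CacheBuilder.newBuilder()
            .expireAfterWrite(7, TimeUnit.DAYS)
            .build();

    void markFencedAndDeleted(long ledgerId) {
        recentlyFencedAndDeletedLedgers.put(ledgerId, Boolean.TRUE);
    }

    boolean isRecentlyFencedAndDeleted(long ledgerId) {
        return recentlyFencedAndDeletedLedgers.getIfPresent(ledgerId) != null;
    }
}
```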
This is in memory; what happens if you restart the bookie?
I think scenarios 1 and 2 below are fine, and scenario 3 below is not.

Regarding scenario 3, I think there are two possible solutions; which is your preference?
- client side: After a bookie channel is reconnected, let all ledgers related to this channel change their BK ensemble.
- server side: Let the GC task skip the fenced ledger until it has not been used for a long time.
Scenario 1: the journal log is still there
- Deleted the ledger.
- Restart BK.
- Replay journal logs.
- Reset the ledger as fenced.
- Since the ledger is fenced, the client will get a fenced error when it tries to write.
- Delete the ledger after a GC.
  - Move the ledger into `recentlyFencedAndDeletedLedgers`.
  - Since the ledger is in `recentlyFencedAndDeletedLedgers`, the client will get a fenced error when it tries to write.
Scenario 2: the journal log was removed & the client switched the ensemble because pending operations failed
- Deleted the ledger.
- Journal logs were removed.
- Restart BK.
- The client tries to switch the ensemble because the previous operations failed.
- Since the ledger metadata was removed, this operation will fail.
Scenario 3: the journal log was removed & the client did not switch the ensemble because there were no pending operations
- Deleted the ledger.
- Journal logs were removed.
- Restart BK.
- The client reconnects successfully.
- If the client tries to write, the write will succeed.
@eolivelli How about we split this into 2 PRs?
- Fix it in memory, which is what the current PR does. This PR can be cherry-picked into the previous branches, and it covers almost all the scenarios (the probability of brokers and BKs crashing at the same time is quite low).
- Fix it with persistent storage: persist the state into a file and add a config to define which file will be used (sketched below). This will be released with the next feature release.
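Purely as a sketch of what the second follow-up could look like (file format, class, and method names here are all hypothetical, not a committed design):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: persist the recently fenced & deleted ledger ids so the
// state survives a bookie restart; the path would come from a new config option.
final class DeletedLedgerStateStore {
    private final Path stateFile;

    DeletedLedgerStateStore(Path stateFile) {
        this.stateFile = stateFile;
    }

    void save(Set<Long> ledgerIds) throws IOException {
        String body = ledgerIds.stream().map(String::valueOf).collect(Collectors.joining("\n"));
        Files.writeString(stateFile, body, StandardCharsets.UTF_8);
    }

    Set<Long> load() throws IOException {
        if (!Files.exists(stateFile)) {
            return Set.of();
        }
        return Files.readAllLines(stateFile, StandardCharsets.UTF_8).stream()
                .filter(line -> !line.isBlank())
                .map(Long::parseLong)
                .collect(Collectors.toSet());
    }
}
```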
LGTM
There is a small nit about a printStackTrace; please remove it before committing the patch.
Then I expect to see the follow-up work:
Motivation
(Steps table: BK client 1 and BK client 2 operating concurrently on the same ledger.)

At step 4, the write should fail, but it succeeds. This leads users to assume the data has been written, even though it can not be read. You can reproduce the issue with `testWriteAfterDeleted`.
There is a scenario that leads Pulsar to lose messages:
1. `broker-2` created a ledger.
2. `broker-2`'s ZK session expired, which caused the topic it owned to be assigned to other brokers.
3. `broker-0` owned the topic again.
4. `broker-0` …
5. `broker-2` …
6. For `broker-2`, the ledger can not be closed because the ledger metadata has been deleted.

Changes
Once the ledger is fenced, it can not be written anymore.
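A rough outline of what a reproduction like `testWriteAfterDeleted` exercises; this is not the actual test, only a sketch against the standard client API (quorum sizes, digest type, and payloads are placeholders):

```java
import java.nio.charset.StandardCharsets;
import org.apache.bookkeeper.client.BKException;
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.LedgerHandle;

final class WriteAfterDeleteOutline {
    static void reproduce(BookKeeper bk, byte[] password) throws Exception {
        // Writer creates a ledger and adds an entry.
        LedgerHandle lh = bk.createLedger(1, 1, 1, BookKeeper.DigestType.CRC32, password);
        lh.addEntry("before delete".getBytes(StandardCharsets.UTF_8));

        // A second actor deletes the ledger while the writer still holds its handle.
        bk.deleteLedger(lh.getId());

        // Before the fix this write could still be accepted and the data would be unreadable;
        // after the fix the bookie rejects it and the client sees a fenced-style error.
        try {
            lh.addEntry("after delete".getBytes(StandardCharsets.UTF_8));
        } catch (BKException e) {
            // Expected after the fix.
        }
    }
}
```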