Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cherry-pick Brin-related commits #846

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

cherry-pick Brin-related commits #846

wants to merge 5 commits into from

Conversation

reshke
Copy link
Contributor

@reshke reshke commented Jan 7, 2025

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


That bit is unlogged and therefore it's wrong to consider it in WAL page
comparison.

Add a test that tickles the case, as branch testing technology allows.

This has been a problem ever since wal consistency checking was
introduced (commit a507b86 for pg10), so backpatch to all supported
branches.

Author: 王海洋 (Haiyang Wang) <[email protected]>
Reviewed-by: Kyotaro Horiguchi <[email protected]>
Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com
Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com
@my-ship-it my-ship-it added the cherry-pick cherry-pick upstream commts label Jan 7, 2025
my-ship-it
my-ship-it previously approved these changes Jan 7, 2025
@my-ship-it
Copy link
Contributor

Good start

soumyadeep2007 and others added 2 commits January 7, 2025 11:13
This commit restores specific heap range summarization (range !=
BRIN_ALL_BLOCKRANGES) for heap tables, which is an API used by
autovacuum. Keeping it disabled for AO/CO for now.

There is no reason to have it disabled, except perhaps we need to think
about MPPizing it properly - which is out of scope here.

Also, we want to be able to prevent regressions for heap tables with any
changes we make to BRIN code downstream.

While fixing the tests accordingly, the following improvements have been
made:

* Clearer comments in the tests, specially for summarization.
* Made most tables localized to one QE - it is extremely hard to reason
  about correctness otherwise (since we are dealing with ctids etc)

remove diff
We have introduced a new table AM API relation_get_block_sequences()
that helps unify code for block-based iteration for BRIN scan and
summarization, in a table AM agnostic manner.

At its heart is the critical observation that BRIN indexes deal with
chunks of the tid space, represented in units of heap blocks. The
resultant heap block boundaries are physical for heap tables, but are
logical for AO/CO tables (AO/CO tables may have holes in the tid space
due to how rownumber allocation is done). BRIN conceptually operates on
these logical boundaries for tid selection, and we fully exploit that.

A contiguous set of these logical heap blocks is what we term as
BlockSequences.  Two BlockSequences may not be contiguous, but are
contiguous within themselves.  The table AM API expects that we return a
set of such BlockSequences. Heap tables have only 1 block sequence,
whereas AO/CO tables have 1 block sequence per aoseg/aocsseg. See
BlockSequence for more details.

Once we have the list of BlockSequences, we introduce an outer loop for
iterating over these. Within the outer loop the existing upstream loop
is leveraged. This applies to both bringetbitmap() and brinsummarize().

Thus, we end up with more unified code.

The existing code was error-prone in the way it was changing the
iteration from within the loop etc. This also fixes issue: #14843

Note: Summarize tests are to be improved in a future commit.

Note: The uao/limit_indexscan_inits had to be updated since we no longer
do an extra segno=0 gp_fastsequence lookup in
relation_get_block_sequences(), even if segno=0 is absent, as we were
doing in brin_ao_tid_ranges() (Possibly we were doing so because
brin_ao_tid_ranges() was copy-pasted from appendonly_fetch_init()).

Co-authored-by: Ashwin Agrawal <[email protected]>
@reshke reshke force-pushed the brin branch 4 times, most recently from 6a0c092 to c4e3c79 Compare January 7, 2025 13:36
Add a dedicated section to test summarization for BRIN indexes on AO/CO
tables. Add more commentary and coverage.
This commit fixes partial range summarization for the final range in AO
segments, in a given AO/CO table.

(1) Earlier, we incorrectly used to call RelationGetNumberOfBlocks() in
summarize_range(). While that is okay for heap tables, it isn't for
AO/CO tables. What we need is a way to get at the BlockSequence in which
a given heap block lies. Once we do so, we can easily get the upper
bound on the number of logical heap blocks in that segment. We have
introduced the table AM API relation_get_block_sequence() to this end.

(2) We also make the decision to avoid final partial range summarization
in case the final partial range was extended underneath us by a
different insert session. This is done by checking if the number of
blocks in the segment is still the same, as what we determined at the
beginning of summarization, and bailing accordingly.

(3) The assert inside summarize_range() only holds true for heap tables:
heapBlk % state->bs_pagesPerRange == 0
It wasn't exposed earlier due since no tests had pages_per_range > 1.
Relax accordingly.

(4) Added FastSequenceGetNumHeapBlocks() to extract the logic to obtain
number of logical heap blocks from a given lastSequence value. Also
added AOSegment_PopulateBlockSequence() to populate a given
BlockSequence, provided with a segrelid and segno.

Note: relation_get_block_sequence() will be useful for specific range
summarization in the future.
@reshke
Copy link
Contributor Author

reshke commented Jan 8, 2025

its ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick cherry-pick upstream commts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants