Cherrypick FIXME enable 64bit bitmapset and update visimap etc #843
Merged
Remove the unnecessary extra call as we are already calling this function a couple of lines above.
Currently, a partition definition element can contain a mix of partition definitions and COLUMN ... ENCODING clauses. For example:

```sql
CREATE TABLE foo (a1 int, a2 char(5))
USING ao_row
PARTITION BY RANGE (a1)
(
    START (1) END (5) EVERY (1),              -- partition definition
    COLUMN a1 ENCODING (compresstype=zstd)    -- column encoding
);
```

Because of this, GpPartitionDefinition holds a single list of partDefElems that contains both partition definitions and column encodings. The FIXME asked whether the current grammar could be optimized to generate separate lists, eliminating the need to separate them later in transformGpPartitionDefinition(). So there is nothing to fix here, only a potential optimization. After looking into the details, it does not seem straightforward without modifying the grammar, which would be a breaking change. Nor is there much performance benefit to be had: the current code already loops over partDefElems to transform them, so keeping both kinds of elements in one list adds little extra cost, and we don't expect this list to contain enough elements to outweigh the cost of actually creating that many partition tables.

Reviewed-by: Alexandra Wang <[email protected]>
Reviewed-by: Huansong Fu <[email protected]>
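As a rough illustration of the separation that transformGpPartitionDefinition() performs today, a single pass over the mixed list suffices. This is an illustrative Python sketch, not the actual C code; the dict-based element shape and the function name are hypothetical stand-ins for the real parse nodes.

```python
def split_part_def_elems(part_def_elems):
    """Separate a mixed list of partition-definition elements into
    partition definitions and COLUMN ... ENCODING clauses.

    Each element is modeled as a dict with a 'kind' key; the real code
    walks parse nodes, so this only sketches the one-pass split.
    """
    part_defs, encodings = [], []
    for elem in part_def_elems:
        if elem.get("kind") == "column_encoding":
            encodings.append(elem)
        else:
            part_defs.append(elem)
    return part_defs, encodings
```

Since the transform already walks the list exactly once, a grammar change that produced two separate lists would save only this single pass.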
The FIXME is for the pg_basebackup --no-verify-checksums flag. This feature doesn't work and hence is not currently supported in GPDB, because there is no way to tell from the physical files alone whether a table is AO or heap. Some mechanism is needed to enable this feature; until that happens, there is no point keeping a FIXME for it in a test file. Adding breadcrumbs with a GPDB_12_MERGE_FEATURE_NOT_SUPPORTED comment so the isolation2 setup file can also be fixed once the feature is enabled. Removing the FIXME from the setup.sql and setup.out files.

Reviewed-by: Huansong Fu <[email protected]>
Currently, Greenplum doesn't support foreign key constraint validation. However, we only emit a warning saying Greenplum doesn't support it, without actually skipping the validation process. I noticed this issue when running the following query:

```sql
CREATE TABLE fk_test_reference (col2 text unique not null);
INSERT INTO fk_test_reference VALUES ('stuff');
CREATE UNLOGGED TABLE id_taptest_table (
    col1 bigint,
    col2 text not null default 'stuff',
    col3 timestamptz DEFAULT now(),
    col4 text,
    FOREIGN KEY (col2) REFERENCES fk_test_reference(col2))
PARTITION BY RANGE (col1);
CREATE TABLE id_taptest_table_p3000000000 (
    LIKE id_taptest_table INCLUDING DEFAULTS INCLUDING CONSTRAINTS
    INCLUDING STORAGE INCLUDING COMMENTS INCLUDING GENERATED INCLUDING INDEXES);
ALTER TABLE id_taptest_table ATTACH PARTITION id_taptest_table_p3000000000
    FOR VALUES FROM (3000000000) TO (3000000010);
```

There are two possible failures when attaching the newly created partition.

1. Greenplum emits an error message:

```
ERROR:  XX000: no pre-assigned OID for pg_constraint tuple "id_taptest_table_col2_fkey" (namespace:0 keyOid1:16472 keyOid2:102) (oid_dispatch.c:374) (seg2 127.0.0.1:7004 pid=195321) (oid_dispatch.c:374)
LOCATION:  GetNewOrPreassignedOid, oid_dispatch.c:374
```

2. Greenplum emits an error message:

```
ERROR:  0A000: function cannot execute on a QE slice because it accesses relation "public.id_taptest_table_p3000000000" (seg2 127.0.0.1:7004 pid=108395)
CONTEXT:  SQL statement "SELECT fk."col2" FROM ONLY "public"."id_taptest_table_p3000000000" fk LEFT OUTER JOIN ONLY "public"."fk_test_reference" pk ON ( pk."col2" OPERATOR(pg_catalog.=) fk."col2") WHERE pk."col2" IS NULL AND (fk."col2" IS NOT NULL)"
LOCATION:  querytree_safe_for_qe_walker, functions.c:238
```

Root cause analysis: when attaching a partition with foreign key references, Greenplum checks the constraint by running the following query both on the coordinator and on the segments:

```sql
SELECT fk."col2" FROM ONLY "public"."id_taptest_table_p3000000000" fk
LEFT OUTER JOIN ONLY "public"."fk_test_reference" pk
    ON (pk."col2" OPERATOR(pg_catalog.=) fk."col2")
WHERE pk."col2" IS NULL AND (fk."col2" IS NOT NULL)
```

If the check passes, the coordinator dispatches the real 'ALTER TABLE ... ATTACH PARTITION ...' command to the segments. When performing the 'ALTER TABLE' command, the coordinator needs to generate an OID for the pg_constraint tuple 'id_taptest_table_col2_fkey' to keep the catalog consistent across the cluster. The dispatched OID is cleaned up when that transaction aborts or commits.

The first failure occurs if we execute the 'ALTER TABLE' command directly: the OID dispatched for altering the table gets cleaned up after the constraint-checking query (executed in RI_Initial_Check) finishes. The second failure occurs if we execute the 'ALTER TABLE' command in a transaction block: the constraint-checking query succeeds on the coordinator, which then dispatches the real 'ALTER TABLE' command to the segments. Each segment also runs the constraint-checking query, but in QE mode (Gp_role == GP_ROLE_EXECUTE), where the query cannot be executed because it accesses a relation.

Since Greenplum doesn't support foreign key constraint validation anyway, the simplest fix is to skip the validation.

Fix #14279
Fixup commit d28d587847f0baf267fb1211be8a8028d49b2919
Note that when SharedSnapshotDump is called, the caller must already hold the SharedSnapshotLock, so this commit fixes a call site that did not satisfy this condition. Co-authored-by: wuchengwen <[email protected]>
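The locking contract can be sketched in miniature. This is an illustrative Python model, not the actual GPDB code; the names below are stand-ins for SharedSnapshotLock and SharedSnapshotDump, and the dump function asserts the precondition while the fixed call site acquires the lock first.

```python
import threading

shared_snapshot_lock = threading.Lock()  # stands in for SharedSnapshotLock

def shared_snapshot_dump():
    # Contract from the commit: the caller must hold the lock in advance.
    # (lock.locked() only checks that *someone* holds it, which is enough
    # for this sketch.)
    assert shared_snapshot_lock.locked(), "caller must hold SharedSnapshotLock"
    return "snapshot state"

def dump_shared_snapshot_safely():
    # The fixed call site: acquire the lock before calling the dump.
    with shared_snapshot_lock:
        return shared_snapshot_dump()
```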
In an extreme case, two gpfdist processes can occupy the same port (one over IPv4, one over IPv6). There are two causes. First, 'bind' is not mutually exclusive between the two protocols. Second, when listening on a port fails, gpfdist retries the same port with the other protocol (IPv4 or IPv6). This PR fixes the problem by changing how a failed listen attempt is handled.
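The revised behavior can be sketched protocol-agnostically. This is a hypothetical Python sketch, not gpfdist's C code; `try_bind` is an injected stand-in for the real bind/listen call, and the point is simply that a failure advances to the next port instead of retrying the same port over the other protocol.

```python
def choose_listen_port(ports, try_bind):
    """Return the first port that can be bound.

    try_bind(port) returns True on success. On failure we advance to the
    NEXT candidate port; the old behavior of retrying the same port with
    the other protocol (IPv4 vs IPv6) is what allowed two gpfdist
    processes to end up sharing one port.
    """
    for port in ports:
        if try_bind(port):
            return port
    raise RuntimeError("no free port available")
```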
Concurrent index builds are not supported in GPDB. The current implementation of this function is entirely heap-centric and not really AO-aware. Hence, delete the function implementation and raise an ERROR if it is called. When the CONCURRENT index build feature is eventually implemented, the interface can be implemented properly.
Yes, ideally these macros shouldn't be required: the table AM should serve and hide all AM-level details. That is far from reality, however, and not simple or easy to achieve given the current usage of these macros. We can move iteratively in that direction and eventually reach the goal; until then, just convert the FIXME into a code comment.
The implementation being questioned is not new; it has been there for GPDB 4, 5, 6, and 7. The longer-term goal is to improve it, but there is nothing to fix in the short term. Also, the FIXME in SyncRepGetSyncStandbys() can't be addressed, because the standby config is not created or populated for the coordinator standby; hence the code needs to live in SyncRepGetSyncStandbys(). Once a proper implementation exists for the coordinator and standby, this part can be tackled as well.
There is nothing special about bgwriter in GPDB compared to upstream. So, remove the FIXME and treat the bgwriter section the same as the checkpointer section or other parts of the sample config.
… belong to opfamily of distributed key (#14977)

If the opno of a clause does not belong to the opfamily of the distribution key and we use direct dispatch, we get wrong results:

```sql
create table bar(a varchar);
insert into bar values('a ');
explain select * from bar where a = 'a'::bpchar;
```

```
                                  QUERY PLAN
------------------------------------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..431.00 rows=1 width=8)
   ->  Seq Scan on bar  (cost=0.00..431.00 rows=1 width=8)
         Filter: ((a)::bpchar = 'a'::bpchar)
 Optimizer: Pivotal Optimizer (GPORCA)
(4 rows)
```

```sql
select * from bar where a = 'a'::bpchar;
 a
---
(0 rows)
```

So, if the opno of the clause does not belong to the opfamily of the distribution key, do not use direct dispatch, to avoid wrong results. There are some cases where we could use direct dispatch but no longer do after this commit:

```sql
create table t(a citext);
explain select * from t where a = 'a'::text;
```

texteq does not belong to the opfamily of the table's distribution key type, citext. From the implementations one can deduce texteq ==> citext_eq, so direct dispatch would be safe here, but Postgres has no such implication rule. For more details please refer to issue: https://github.com/greenplum-db/gpdb/issues/14887

fix direct_dispatch
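Conceptually, the planner-side check added here amounts to a membership test. The sketch below is an illustrative Python model, not the actual planner code; the operator names used as set members are hypothetical placeholders for operator OIDs.

```python
def can_direct_dispatch(clause_opno, distkey_opfamily_opnos):
    """Allow direct dispatch only when the clause's operator belongs to
    the opfamily of the distribution key. Otherwise the hash computed
    from the constant may route the query to a segment that the
    operator's actual semantics (e.g. bpchar vs varchar trailing-blank
    handling) would not have chosen, producing wrong results.
    """
    return clause_opno in distkey_opfamily_opnos
```

For the citext example above, `texteq` is not a member of the citext distribution key's opfamily, so direct dispatch is (conservatively) refused even though `texteq` implies `citext_eq`.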
This code seems unused, so just remove it to keep the codebase neat and clean. Co-authored-by: Junwang Zhao <[email protected]>
It was removed while merging PostgreSQL 12 because it hung back then. It has no issue now, so bring it back. The earlier, inconclusive analysis: https://greenplum.slack.com/archives/C74P7FZFG/p1586440484210500
The encoding clause is necessary to maintain catalog consistency for dropped columns in pg_attribute_encoding across major version upgrades. Without this, the column encoding will be set to the hardcoded server default in the new cluster. If the encoding was modified from the default in the old cluster, an error would occur if the column needed to be scanned for any reason. Resolves the issue `ERROR: decompression information missing` after a column is dropped from an AO/CO table and the table is run through pg_upgrade Authored-by: Brent Doil <[email protected]>
A QE will hang in `ExecTupleSplit()` while executing the following query:

```sql
select DQA_1(expr1) filter (subquery1), DQA_2(expr2) ...
```

If a tuple is filtered out by DQA_1's filter, `filter_out` is set to true. Because DQA_2 has no filter, `filter_out` is never reset to false on the next loop iteration, and the loop never terminates. The solution is simple: if the current Expr has no filter, set `filter_out` to false.
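The loop bug and its fix can be modeled in a few lines. This is an illustrative Python model of the loop in `ExecTupleSplit()`, not the actual C executor code; the function name and filter representation are hypothetical.

```python
def dqa_emit_flags(tuple_value, dqa_filters):
    """For each DQA, decide whether the tuple is emitted for it.

    dqa_filters is a list of predicates, with None meaning "no FILTER
    clause". Before the fix, filter_out kept its value from the previous
    DQA when the current DQA had no filter, so a tuple rejected by
    DQA_1's filter looked permanently filtered and the executor looped
    forever. Resetting filter_out for filter-less DQAs breaks the loop.
    """
    flags = []
    filter_out = False
    for filt in dqa_filters:
        if filt is not None:
            filter_out = not filt(tuple_value)
        else:
            filter_out = False  # the fix: no filter means the tuple passes
        flags.append(not filter_out)
    return flags
```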
The log directory can often be symlinked in the same way the pg_wal directory is. Today, even if BOTH the source and target servers have their log directories symlinked, pg_rewind will run into the error:

```
"log" is of different type in source and target
```

This is because when we use the libpq query to fetch the filemap from the source server, we consider the log directory a plain directory even if it is a symlink: pg_stat_file() is used in that query in libpqProcessFileList(), and pg_stat_file() returns isdir=t for symlinks to directories. This shortcoming is somewhat called out:

```
 * XXX: There is no backend function to get a symbolic link's target in
 * general, so if the admin has put any custom symbolic links in the data
 * directory, they won't be copied correctly.
```

We could fix the query and/or pg_stat_file(). However, we would also like to support deployments where only one of the primaries and/or standbys has the symlink. That is not hard to conceive, given that primaries and standbys can have drastically disparate log volume and/or log collection requirements. So, we decided to extend the log directory the same courtesy as was done for pg_wal (pg_xlog) in 0e42397.
Prevent potential array out of bounds access by validating the expression ID is within the valid range of distinct qualified aggregates before accessing the aggregate filter array.
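The guard amounts to a range check before indexing. This is an illustrative Python sketch, not the actual C code; the function and variable names are hypothetical.

```python
def get_dqa_filter(agg_filters, expr_id):
    """Return the filter for a distinct-qualified aggregate, validating
    expr_id against the array bounds before accessing it, as the fix does.
    """
    if not (0 <= expr_id < len(agg_filters)):
        raise IndexError(
            f"expression id {expr_id} out of range [0, {len(agg_filters)})")
    return agg_filters[expr_id]
```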
It includes the portion enabling the 64-bit bitmapset and other necessary changes related to visimap, plus updates to the cmockery unit tests.
my-ship-it approved these changes on Jan 6, 2025
avamingli approved these changes on Jan 6, 2025