Skip to content

Comments

HIVE-29217: Add configuration to choose materialization strategy for CTEs#6092

Merged
zabetak merged 6 commits intoapache:masterfrom
zabetak:cte-cbo-ast-procs
Feb 20, 2026
Merged

HIVE-29217: Add configuration to choose materialization strategy for CTEs#6092
zabetak merged 6 commits intoapache:masterfrom
zabetak:cte-cbo-ast-procs

Conversation

@zabetak
Copy link
Member

@zabetak zabetak commented Sep 22, 2025

Why are the changes needed?

Avoid interference between AST and CBO based materialization of common table expressions (CTEs). More details in HIVE-29217.

Does this PR introduce any user-facing change?

Yes but highly unlikely. Users may see plan/perf changes if they set/rely on the experimental property hive.optimize.cte.suggester.class and they expect both AST and CBO materialization to trigger sequentially one after the other.

How was this patch tested?

Existing plus new tests.

mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile_regex=.*cte.*
mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver

@sonarqubecloud
Copy link

sonarqubecloud bot commented Oct 7, 2025

Copy link
Member

@deniskuzZ deniskuzZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HiveConf default AST, but we test with CBO, is that intentional?

@zabetak
Copy link
Member Author

zabetak commented Feb 18, 2026

@deniskuzZ Many thanks for the review. I believe I addressed all your comments so please take another look once you get the chance.

HiveConf default AST, but we test with CBO, is that intentional?

Regarding the test coverage for hive.optimize.cte.suggester.type I kept the default to AST to avoid changes in behavior. In addition CBO is an experimental feature so not ready yet for being the default.

The AST is the default value and is tested as part of the main plan regression suite (TezTPCDS30TBCliConfig) as well as everywhere else where the config is not overridden.

The CBO is an experimental feature, and the main reason that we introduced a separate CTE suite (TPCDSCteCliConfig).

Copy link
Member

@deniskuzZ deniskuzZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

HiveTableScan(table=[[cte, cte_suggestion_2]], table:alias=[cte_suggestion_2])
HiveProject(d_date_sk=[$0])
HiveFilter(condition=[BETWEEN(false, CAST($2):TIMESTAMP(9), 1998-08-04 00:00:00:TIMESTAMP(9), 1998-09-03 00:00:00:TIMESTAMP(9))])
HiveFilter(condition=[AND(<=(1998-08-04 00:00:00, CAST($2):TIMESTAMP(9)), <=(CAST($2):TIMESTAMP(9), 1998-09-03 00:00:00))])
Copy link
Member

@deniskuzZ deniskuzZ Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly!

1. Avoid the need to maintain lots of duplicated content inside XML config files.
2. Enhance the focus of the test by clearly separating the important properties
The test case in this file explicitly targets CBO suggestions.
@sonarqubecloud
Copy link

@zabetak zabetak merged commit d073125 into apache:master Feb 20, 2026
4 checks passed
@zabetak zabetak deleted the cte-cbo-ast-procs branch February 20, 2026 18:17
@zabetak
Copy link
Member Author

zabetak commented Feb 20, 2026

Thanks for the review @deniskuzZ ! Much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants