Skip to content

[FLINK-39276][table] CREATE OR ALTER MATERIALIZED TABLE should respect nullability defined in schema#27794

Merged
snuyanzin merged 2 commits intoapache:masterfrom
snuyanzin:flink39276
Mar 20, 2026
Merged

[FLINK-39276][table] CREATE OR ALTER MATERIALIZED TABLE should respect nullability defined in schema#27794
snuyanzin merged 2 commits intoapache:masterfrom
snuyanzin:flink39276

Conversation

@snuyanzin
Copy link
Contributor

@snuyanzin snuyanzin commented Mar 20, 2026

What is the purpose of the change

The problem is that
if there is existing materialized table mt with schema

`a` BIGINT NOT NULL, `b` STRING, `c` INT, `d` STRING

and then we use CREATE OR ALTER like

CREATE OR ALTER MATERIALIZED TABLE mt (
`a` BIGINT NOT NULL, `b` STRING, `c` INT, `d` STRING, `a1` BIGINT NOT NULL) 
AS SELECT a, b, c, d, a as `a1` FROM t3

then it will update existing schema adding a1 BIGINT, i.e. it omits nullability.
It would make sense if there is no explicit schema mentioned
however in case of schema is present it should respect it

Brief change log

tests are added

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): ( no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): ( no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? ( no)
  • If yes, how is the feature documented? (not applicable)

Copy link
Contributor

@raminqaf raminqaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 20, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

public boolean hasSchemaDefinition() {
final SqlNodeList sqlNodeList = sqlCreateMaterializedTable.getColumnList();
return !sqlNodeList.getList().isEmpty()
&& sqlNodeList.getList().get(0) instanceof SqlRegularColumn;
Copy link
Contributor

@davidradl davidradl Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious about this new method. Are there cases when the materialized table does not have columns?
Is it possible for the first column to be a metadata column? I assume the computed column would need to refer to an already defined regular column.

I see the return of this turns into variable schemaDefinedInQuery - I am checking my understanding - does this mean that a as a1`` in your example. Would a better name be schemaDerivedInQuery, as the schema is actually defined in the columns associated with the table not the select (query)

CREATE OR ALTER MATERIALIZED TABLE mt (
a BIGINT NOT NULL, b STRING, c INT, d STRING, a1 BIGINT NOT NULL)
AS SELECT a, b, c, d, a as a1 FROM t3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not about columns for MT, this is about schema
If it is empty, it means no schema is explicitly specified

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a better name be schemaDerivedInQuery,

no, schema derived from query and user provided schema might be different
for instance user might slightly change a type (nullability or type widening)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as the schema is actually defined in the columns associated with the table not the select (query)

this I didn't get
in a query there might be multiple tables
schema is associated with materialized table and could also contain constraint, watermark, metadata column information which is not retrievable from query

Copy link
Contributor

@gustavodemorais gustavodemorais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only one tiny nit


assertThat(sqlAlterMaterializedTableAsQuery.getTableChanges())
.containsExactly(
// If NOT NULL defined in schema it should stay
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
// If NOT NULL defined in schema it should stay
// If NOT NULL is defined in schema, it should stay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, addressed

@snuyanzin
Copy link
Contributor Author

thanks everyone for taking a look, merging

@snuyanzin snuyanzin merged commit 37004bb into apache:master Mar 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants