Skip to content

Conversation

@Baunsgaard
Copy link
Contributor

Replace technote-space/get-diff-action with native GitHub Actions paths-ignore to prevent workflows from running on .md and .txt file changes. This saves CI resources by not allocating runners for documentation-only PRs.

Also removes references to non-existent unity/** directory in kernel_unitycatalog_test.yaml (Unity Catalog code is at kernel/unitycatalog/, already covered by kernel/**).

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Documentation-only changes (.md, .txt files) currently trigger all CI workflows, wasting compute resources on unnecessary test runs. Using native paths-ignore prevents workflows from starting entirely, saving CI time and removing the dependency on third-party actions.

How was this patch tested?

YAML syntax was validated locally. The workflows will be fully tested when this PR is pushed—GitHub Actions will verify the workflow files parse correctly and the path filtering behaves as expected on subsequent commits.

Does this PR introduce any user-facing changes?

No

Copy link
Contributor

@c27kwan c27kwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! The change seems sensible to me. Some questions

@@ -1,5 +1,23 @@
name: "Kernel Unity Catalog"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's better to call this "Java Kernel Unity Catalog" if we're making assumptions that we'll only run this for scala/Java files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe it is better to express this as a "disallow" list with paths-ignore like the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, however, I decided to make the minimal change without renaming the action itself. If we rename it then potential downstream metrics will not link back. (but I do not know if we have any downstream metrics)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that an ignores makes more sense here (so you don't forget to add new patterns for opt-in...) but we can do this as a followup and keep this PR to retain status quo

Comment on lines -23 to -28
**
.github/workflows/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't specify paths under push/pull_request, then is it every file by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, except the ones matching 'paths-ignore'.

Copy link
Contributor

@c27kwan c27kwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Copy link
Collaborator

@scottsand-db scottsand-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Collaborator

@allisonport-db allisonport-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think we could improve our logic to be more selective and avoid more unnecessary runs but this can be done later.

Comment on lines +3 to +11
on:
push:
paths-ignore:
- '**.md'
- '**.txt'
pull_request:
paths-ignore:
- '**.md'
- '**.txt'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add spark/** here maybe? we use latest released delta-spark here now so no need to run these. Fine to do this as a follow-up though to keep this scoped

@@ -1,5 +1,23 @@
name: "Kernel Unity Catalog"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that an ignores makes more sense here (so you don't forget to add new patterns for opt-in...) but we can do this as a followup and keep this PR to retain status quo

name: "Delta Kernel"

on: [push, pull_request]
on:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh -- we have no path diffing checks today for kernel?

we should run these if there are changes in:

  • kernel folder
  • storage // for delta-storage

Anything else? I think we can be much more selective here cc @allisonport-db

Replace technote-space/get-diff-action with native GitHub Actions
paths-ignore to prevent workflows from running on .md and .txt file
changes. This saves CI resources by not allocating runners for
documentation-only PRs.

Also removes references to non-existent unity/** directory in
kernel_unitycatalog_test.yaml (Unity Catalog code is at
kernel/unitycatalog/, already covered by kernel/**).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants