Add bigquery pipeline audit prompt #774
Open
ramyashreeshetty wants to merge 2 commits into github:staged from
Pull Request Checklist
npm start and verified that README.md is up to date.
Description
This prompt guides Copilot through a structured 6-section review (Cost Exposure, Dry Run Modes, Backfill/Loop Design, Query Safety, Safe Writes, and Observability) and produces a PASS/FAIL report with exact patch locations ordered by risk. Useful for data engineers who want to catch runaway BigQuery costs, prevent duplicate writes, and ensure pipeline failures are visible before shipping to production.
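As a rough illustration of the report shape described above, the sketch below marks each of the six sections PASS or FAIL and orders the patch list by risk. The `Finding` fields, the numeric `risk` score, and the sample findings are hypothetical and are not taken from the prompt file itself; only the six section names come from this description.

```python
from dataclasses import dataclass

# The six audit sections named in this PR description.
SECTIONS = (
    "Cost Exposure",
    "Dry Run Modes",
    "Backfill/Loop Design",
    "Query Safety",
    "Safe Writes",
    "Observability",
)


@dataclass
class Finding:
    section: str   # one of SECTIONS
    location: str  # hypothetical "file:function" patch location
    risk: int      # hypothetical score; higher means more urgent
    patch: str     # short description of the suggested fix


def build_report(findings):
    """Return (status per section, findings sorted by descending risk)."""
    failed = {f.section for f in findings}
    status = {s: ("FAIL" if s in failed else "PASS") for s in SECTIONS}
    patches = sorted(findings, key=lambda f: f.risk, reverse=True)
    return status, patches


# Made-up sample findings, for illustration only.
sample = [
    Finding("Safe Writes", "pipeline.py:write_results", 2, "make writes idempotent"),
    Finding("Cost Exposure", "pipeline.py:run_query", 3, "set a bytes-billed cap"),
]
status, patches = build_report(sample)
for section in SECTIONS:
    print(f"{section}: {status[section]}")
for p in patches:
    print(f"[risk {p.risk}] {p.location}: {p.patch}")
```

A section is marked FAIL here as soon as it has at least one finding; the actual prompt may apply different criteria.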
Standard code review did not catch these issues, and they have caused me problems in the past. This prompt helped me identify critical audit findings before deploying anything to production. It generates a structured report; what you act on is entirely up to you.
Usage Example:
Ran the bigquery-pipeline-audit prompt against run_backtest_simulation_v2.py, a Python script that runs BigQuery-backed backtest simulations. The audit identified 3 critical cost risks: a date-by-date loop generating up to 240 BigQuery jobs, missing maximum_bytes_billed limits exposing 6 TB of potential scans, and no idempotency on writes (re-runs create duplicate data). Estimated worst-case cost: ~$30 per backtest run, with risk of unlimited growth. The script failed 5 of 6 audit sections, and the prompt returned a prioritized patch list with exact file locations and function names.
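The worst-case figure above can be reproduced with simple arithmetic. The sketch below assumes an on-demand rate of $5 per TB, which is the rate that yields ~$30 for 6 TB; it is an assumption, not a quoted price, so check current BigQuery pricing. It also assumes, purely for illustration, that the 6 TB is spread evenly across the 240 jobs in order to derive a per-job byte cap of the kind one might pass as `maximum_bytes_billed`.

```python
TB = 10**12  # decimal terabyte, in bytes


def worst_case_cost_usd(total_scan_bytes, usd_per_tb):
    """Cost if every byte of the potential scan is billed."""
    return total_scan_bytes / TB * usd_per_tb


def per_job_byte_cap(total_scan_bytes, job_count):
    """Even split of the scan budget across jobs, rounded up."""
    return -(-total_scan_bytes // job_count)  # ceiling division


total_scan = 6 * TB        # 6 TB of potential scans, from the audit
jobs = 240                 # up to 240 jobs from the date-by-date loop
assumed_usd_per_tb = 5.0   # ASSUMPTION: not a quoted price

cost = worst_case_cost_usd(total_scan, assumed_usd_per_tb)
cap = per_job_byte_cap(total_scan, jobs)

print(f"worst-case cost per run: ${cost:.2f}")  # $30.00
print(f"per-job byte cap: {cap:,} bytes")       # 25,000,000,000 bytes
```

With a cap like this set on each job, a query that would scan more than its share fails instead of being billed, which bounds the cost of a single run.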
Type of Contribution
Additional Notes
By submitting this pull request, I confirm that my contribution abides by the Code of Conduct and will be licensed under the MIT License.