Member-only story

Dynamic Query Execution for Sharded Tables in BigQuery

BigQuery hacks

Maïssa
3 min readNov 25, 2024

Working with sharded tables in BigQuery can be repetitive and error-prone. Using EXECUTE IMMEDIATE, you can automate tasks efficiently. Here’s how it works.

Photo by charlesdeluvio — Unsplash

The Problem

Imagine you need to clean up rows from multiple sharded tables where a specific condition, like status = 'inactive', applies. Doing this table by table for a range of dates isn’t scalable and wastes time.

The Solution

With dynamic query execution, you can loop through all relevant tables, build queries on the fly, and execute them in one go. Here’s a real-world example where we remove invalid rows from daily sharded tables for a specific date range:

DECLARE start_date DATE DEFAULT DATE('2024-11-01');
DECLARE end_date DATE DEFAULT DATE('2024-11-10');
DECLARE query STRING;
-- Loop through each date in the range
FOR date_suffix IN (
SELECT FORMAT_DATE('%Y%m%d', day) AS date_suffix
FROM UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day
)
DO
-- Construct the delete query dynamically
SET query = FORMAT("""
DELETE FROM `project.dataset.events_%s`
WHERE status = 'inactive';
""", date_suffix.date_suffix);
-- Execute the query for the current table
EXECUTE IMMEDIATE query;
END FOR;

--

--

Maïssa
Maïssa

Written by Maïssa

Digital Analytics Professional. SQL, R. Data Clarity Advocate 🦦

No responses yet