A step in direction of flexibility and reusability utilizing parameters and variables utilizing a WITH clause
What are variables and why can they be helpful?
Variables are additionally known as parameters. They can be utilized in SQL for setting or declaring values.
Variables retailer a number of values you present them and might then be used anyplace in your question.
Normally set initially of your code, they are often helpful when:
- You don’t wish to change the identical worth in a number of locations as a result of your question has many strains of code
- You wish to make it simple to vary this worth with out affecting the code logic
- You wish to scale back the size and complexity of your queries
Variables can be utilized in different languages resembling Python, via your command line, or utilizing the BigQuery API. However on this article, we’ll deal with use it in a SQL question.
Now we have two methods of utilizing variables in BigQuery:
- Utilizing a WITH clause
- Utilizing BigQuery procedural language
On this article referred to as “Half 1”, we’ll solely cowl the WITH clause.
If you wish to know extra about procedural language (which is sort of a scripting language), you possibly can seek advice from the BigQuery documentation.
As a associated subject (however not fully related on this article), you may additionally be all for utilizing retailer procedures, which may additionally leverage procedural language.
We begin with a base information desk, which features a date, a rustic, a product identify, and the income linked with every order.
Experiment with a single worth (Handbook)
Within the first instance, we wish to get all merchandise with revenues equal to or larger than 250.
Within the “foremost question” part, we use a comma, after calling our desk (base desk and variable) within the FROM
clause, to have the ability to use the worth in our filtering clause with none be a part of.
However, I hear you, and you’re right: we might have prevented utilizing a WITH
clause and simply use the worth in our WHERE
clause.
However, as you possibly can see, it is a handy place to place this worth on the high of our question. Let’s take a look at a bit extra complicated instance, with multiple variable and querying two tables as an alternative of 1.
Experiment with a number of values (Handbook)
Now we have now a number of variables, that we write in an array (we instantly UNNEST
to get all values as particular person rows). We even have two tables, base_table
and base_table_2
that reuse the identical filtering variable.
On this case, it might be sooner so as to add, change or take away values inside ourWITH
clause which holds our variables.
One other tip, if you wish to mix a number of values, will be to make use of a format conbining ARRAY
and STRUCT
varieties.
The clause that holds our variable, is now having two fields, associated to the worth and the identify of a product. It may be used within the filter the identical approach as in our earlier instance.
Now that we’ve seen use guide values, let’s see make them dynamic, based mostly on a price computed from one other desk, or based mostly on information that may change when new information is ingested.
Experiment with a single worth (Dynamic)
Ranging from our base desk, we wish to discover all rows with a income per product that’s 3 times larger than the common product income.
Our WITH
clause will return 144.2, which is dynamically computed utilizing the common product income on the our base desk multiplied by 3. Observe that you should utilize any information set you need and that this worth is prone to change when performing this question with new information.
Experiment with an inventory (Handbook)
In the identical spirit, you too can use an inventory of things. Let’s say we wish to filter particular merchandise names in our base desk.
You would additionally write the identical question utilizing UNION ALL
within the WITH
clause. However I discover the arrays sooner to put in writing.
Experiment with an inventory (Dynamic)
We would like all merchandise that include the phrase “Google”. We are able to get all these values utilizing a LIKE
syntax in our variable assertion.
Experiment with dates (Handbook)
For dates, a guide addition would require changing the info sort to DATE
varieties (or TIMESTAMP
, or DATETIME
for use within the question).
Experiment with dates (Dynamic)
We wish to use the dates from one other information set (base_table_2
in our case) to get a dynamic vary to filter in our foremost SQL assertion.
A couple of concluding phrases
This text demonstrates the mechanics of utilizing variables in a WITH
clause utilizing a number of examples. We are going to talk about the BigQuery procedural language (additionally referred to as scripting syntax) in “Half 2”.
As we’ve seen, a WITH
clause will be very helpful, nevertheless it’s not at all times best: It provides extra strains of code, you might want to switch information varieties (like within the guide date instance), and BigQuery provides an easier syntax for declaring and setting variables (with a DECLARE
and SET
clause).
I hope this may assist, and let me know if you want to see extra examples!