
Run BigQuery SQL using the Python API Client | by Cristian Saavedra Desmoineaux | Aug, 2022


Source: image from unsplash.com

This year I'm actively working with Google Cloud (GCP), and I'll share how to set up the API Client with Python in WSL (Windows Subsystem for Linux) and Anaconda to execute BigQuery SQL queries from scratch.

If you need more information about installing WSL2 and Anaconda (Miniconda), my earlier post will help you.

In this post, we're going to see how to:

  1. Use WSL to create a new Anaconda environment.
  2. Install the Google API Client Library using pip
  3. Create a Google service account using the web console
  4. Generate Google application credentials using a JSON key file
  5. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable
  6. Add the USA Names public dataset to BigQuery and test a SQL query
  7. Set up Visual Studio Code to run on WSL
  8. Call a Python script to execute a SQL query
  9. Bonus: Troubleshooting
  10. Thanks
  11. Conclusion

I recommend you check my earlier post if you don't have Windows Subsystem for Linux (WSL) and Anaconda (Miniconda) yet.

To avoid mixing with existing libraries, we're going to set up a local project with the following steps:

  1. Open a WSL2 terminal
  2. Go to your personal folder and create a new folder gcp_projects
  3. Create an Anaconda environment with the name gcp_prj
  4. Check that the new environment was created
  5. Switch to the new environment using activate.

Use the following commands:

mkdir gcp_projects
cd gcp_projects/
conda create --name gcp_prj
conda info --envs
conda activate gcp_prj
Creating a new Anaconda environment. Image by the author.

We're going to install pip in the new Anaconda environment and then install the Google API Client library using the following commands:

sudo apt install python3-pip
pip install --upgrade google-cloud-bigquery
Installing pip. Image by the author.
Installing the Google API Client Library for Python. Image by the author.

The next step is to create a new project and service account by going to https://console.cloud.google.com/ > IAM & Admin > Service Accounts, or directly to https://console.cloud.google.com/iam-admin/serviceaccounts

Google Cloud Console, Service Accounts. Image by the author

Create a new project with the name MyBigQueryTest

Google Cloud Console, create a project. Image by the author

Create a new service account with the name sa_python_scripts and select the role it will use to access your project.

Creating a service account in the Google Cloud Console. Image by the author

For testing purposes, I'll use the Owner role without granting users access to the service account. Be aware that the Owner role is not recommended for a production environment.

Granting a role to the service account, Google Console. Image by the author

Go to Service Accounts, select the MyBigQueryTest project at https://console.cloud.google.com/iam-admin/serviceaccounts, and click the email created for the service account to see the details.

Service accounts are listed by email. Image edited by the author.

Create a new key and download it as a JSON file

Creating a new service account key. Image by the author

Save the file as mybigquerytest.json. I'm using the direct WSL path \\wsl$\Ubuntu\home\csaavedra\workspace_linux\gcp_projects

Saving to WSL using the \\wsl$ hidden Windows network share. Image by the author.
Checking the file in the WSL Ubuntu terminal. Image by the author

Set the GOOGLE_APPLICATION_CREDENTIALS variable permanently in your profile by editing the .bashrc file.

sudo nano ~/.bashrc
Editing your profile. Image by the author.

Add the export GOOGLE_APPLICATION_CREDENTIALS line at the end of the file:

export GOOGLE_APPLICATION_CREDENTIALS="/home/csaavedra/workspace_linux/gcp_projects/mybigquerytest.json"
Adding export GOOGLE_APPLICATION_CREDENTIALS to your profile. Image by the author.

Press Ctrl + X and select Y for Yes to save the modified buffer (file)

Confirm changes. Image by the author

and press the Enter key to confirm the file name and the changes.

Confirm the file name. Image by the author

I'm reloading the profile and checking the path with the echo and printenv commands for the variable GOOGLE_APPLICATION_CREDENTIALS

source ~/.bashrc
echo $GOOGLE_APPLICATION_CREDENTIALS
printenv GOOGLE_APPLICATION_CREDENTIALS
Reloading the profile and checking the Google application credentials path. Image by the author.

Go to the BigQuery console at https://console.cloud.google.com/bigquery and choose Add Data > Public Datasets, as in the following picture:

Adding public datasets to BigQuery. Image by the author

In the Marketplace, search for USA names and select the U.S. Social Security Administration dataset

Google Public Datasets Marketplace. Image by the author.

Add the dataset by clicking the blue button called VIEW DATASET:

Adding the USA Names dataset. Image by the author

The USA Names dataset was added, but you can't find it in the explorer projects tree yet.

USA Names added. Image by the author.

To add it, you need to search for USA names and select Expand search to all projects:

Searching USA Names. Image by the author.

Then you need to click the pin icon next to the project name bigquery-public-data, moving your mouse close to the three dots.

Pinning the BigQuery public project. Image by the author.

Now you can see the USA Names dataset in your tree of projects:

Image by the author.

The next step is to open an editor window and execute the following SQL query:

SELECT name, SUM(number) AS total_people
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = 'TX'
GROUP BY name, state
ORDER BY total_people DESC
LIMIT 20

We'll use this result to check that the Python script is working:

Query result of USA Names. Image by the author.

After downloading and installing Visual Studio Code from https://code.visualstudio.com/, go to Extensions and install:

  • Python extension
  • Remote – WSL extension

As in the following picture:

Visual Studio Code extensions. Image by the author.

Then click the green icon in the bottom left corner with the greater-than and less-than signs

Image by the author.

And choose New WSL Window to open a new instance of Visual Studio Code connected to WSL remotely.

Image by the author.

You can confirm the connection by the changed label in the bottom left corner, now showing the text WSL: Ubuntu

WSL: Ubuntu. Image by the author.

Now you're going to open the folder in WSL by going to File > Open Folder:

Visual Studio Code, Open Folder. Image by the author.

Select the path /home/csaavedra/workspace_linux/gcp_projects/ and press the OK button.

Open the remote folder. Image by the author.

We can find the JSON key file there. Now create a new file, by right-clicking, with the name bigquery-usanames-test.py

Creating a new remote file. Image by the author.
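The script itself is not reproduced in this extract of the post, so here is a minimal sketch of what bigquery-usanames-test.py could contain. It follows the standard google-cloud-bigquery client pattern for the query above; the credentials guard at the bottom is my addition, not part of the original post.

```python
import os

# Same query we tested in the BigQuery console.
QUERY = """
    SELECT name, SUM(number) AS total_people
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name, state
    ORDER BY total_people DESC
    LIMIT 20
"""

def run_query():
    # Imported here so the file can be opened and inspected even when the
    # library is not installed in the current interpreter.
    from google.cloud import bigquery

    # The client reads GOOGLE_APPLICATION_CREDENTIALS to locate the JSON key.
    client = bigquery.Client()
    for row in client.query(QUERY):  # iterating waits for the job to finish
        print(f"{row.name}: {row.total_people}")

# Only run when executed directly and credentials are configured.
if __name__ == "__main__" and os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"):
    run_query()
```

The top 20 names for Texas printed by the script should match the result we saw in the BigQuery console.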

No module named google

No module named google error. Image by the author.

If you get the error No module named google, check the installation of the Google Cloud library with the command:

pip install --upgrade google-cloud-bigquery
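A frequent cause of this error (an extra diagnostic of mine, not from the original post) is that pip installed the library into a different interpreter than the one VS Code is running. A quick check:

```python
import importlib.util
import sys

# Which interpreter is actually running? It should point inside the
# gcp_prj conda environment.
print(sys.executable)

try:
    found = importlib.util.find_spec("google.cloud.bigquery") is not None
except ModuleNotFoundError:
    # Even the google namespace package is missing from this interpreter.
    found = False

print("google-cloud-bigquery is importable" if found
      else "google-cloud-bigquery is NOT importable from this interpreter")
```

If the paths don't match, select the gcp_prj interpreter in VS Code or rerun the pip install from that environment.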

Could not automatically determine credentials

If you get the error: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started

The problem is related to exporting GOOGLE_APPLICATION_CREDENTIALS as an environment variable.

You can check os.environ in your code, or run printenv in the terminal, to see if the variable is set.

import os
# Print only the variable we care about; None means it is not set
print(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"))

If it is not set, export the variable again or reload the profile with source ~/.bashrc, then check with printenv:

export GOOGLE_APPLICATION_CREDENTIALS="/home/csaavedra/workspace_linux/gcp_projects/mybigquerytest.json"
source ~/.bashrc
echo $GOOGLE_APPLICATION_CREDENTIALS
printenv GOOGLE_APPLICATION_CREDENTIALS

Also, you can type exit in the Visual Studio Code terminal to force the session to reload with the environment variable.

To finish, I'm grateful to Chris von Csefalvay, who gave me the inspiration to keep writing.

Happy development!

Connecting Python to Google Cloud services (GCP) is easy using the API Client and a service account. After this first step, you can build complex automation, like consuming an Excel file with Python Pandas, loading and combining it in Google BigQuery, and refreshing a Tableau dashboard.
