What is this article about?
In this article, we will start exploring Google BigQuery.
You will learn how to create a Project, Schema and a query.
This article is mainly for BI Developers who want to expand their capabilities to handling Big Data.
What is BigQuery?
BigQuery is a full serverless service and Data Warehouse that can run extremely fast interactive and ad-hoc queries.
Data can be saved in Google Storage and be queried by BiqQuery, or alternatively, data can be imported to BigQuery and be queried there, too.
The difference between a Relational DB and Big Query
Getting Started
In order to start working in Google Big Query – you have to create a project (an environment).
Here is how to create your first google project:
** If you already have a project skip to the next section.
- Open your browser
- Enter https://cloud.google.com/bigquery
- Click on “Try BigQuery free”
- Enter your google credentials
- Check “I Agree to the terms…” and click “Agree to terms”
- If you got to the “Google cloud platform“page - you are in the right place.
This page holds all of the GCP(Google Cloud Platform) environment.
- Click on “Create Project”. Give it a name and click: “Create”.
- You should now enter the Google Console screen.
- You should now see your project name, on the Left bottom pane.
Creating a Data Set
In order to create a table, you first need to create a Data set (a Data base).
Here is how to create a data set and a table:
- Enter https://cloud.google.com/bigquery
- On the Left bottom pane, you should see your project name. Click on it.
- Now you will see an option to create a Dataset.
- Click on the “Create Dataset”
- Now, give your Dataset a name and choose its location.
- Click “Create Dataset”
- After you create a Data set, you can create tables inside of it.
Creating a Table
Now let’s create a Table inside your Data set:
- Click on your Data set from the Dataset list and you will see this window:
- Click on the Create Table button, and the following will appear:
- The Table Source can be one of the following:
- Empty Table – no data is loaded
- Google Cloud Storage – data is loaded from Google Storage
- Upload – upload a file from your computer and create a table from it
- Drive - data is loaded from Google Drive
- Google Cloud BigTable - data is loaded from Google BigTable
- Click “Upload” and browse for the “Financial sample.csv” file from here.
* * Most of the time we would use Google Cloud Storage, since this is where most businesses keep their data. But this time we just want to get some data from our local computer.
** If you have a bucket you can load lots of files to the same table using the wildcard character *. Read more about it here.
- Enter Table Name: “Financial_Example”
- Check “Auto Detect” Schema and input parameters
- Click “Create Table”
- On the left pane search for your table, click it, and click on “Query Table”
- Run the query after adding some relevant fields to the select statement.
- You should now view your table data
Advanced
- When creating a table, Destination section:
- Choose in which project your table should be created
- Then, you need to choose if the table will be native or external.
- External means that only metadata is saved in BigQuery - data is not saved. Queries are executed on Data in Google Storage.
- Native means that both metadata and data are saved in BigQuery. Queries are executed on Data in BigQuery.
- If your table is native - you can partition the table.
- We advise to create a corresponding folder for each Table. That will hold files that contain rows/objects with the same schema.