BigQuery
BigQuery is a completely serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data, with BI, machine learning and AI built in.
The BigQuery Wrapper allows you to read and write data from BigQuery within your Postgres database.
Preparation
Before you can query BigQuery, you need to enable the Wrappers extension and store your credentials in Postgres.
Enable Wrappers
Make sure the `wrappers` extension is installed on your database:

```sql
create extension if not exists wrappers with schema extensions;
```
Enable the BigQuery Wrapper
Enable the `bigquery_wrapper` FDW:

```sql
create foreign data wrapper bigquery_wrapper
  handler big_query_fdw_handler
  validator big_query_fdw_validator;
```
Store your credentials (optional)
By default, Postgres stores FDW credentials inside `pg_catalog.pg_foreign_server` in plain text. Anyone with access to this table will be able to view these credentials. Wrappers is designed to work with Vault, which provides an additional level of security for storing credentials. We recommend using Vault to store your credentials.

```sql
-- Save your BigQuery service account json in Vault and retrieve the `key_id`
insert into vault.secrets (name, secret)
values (
  'bigquery',
  '
  {
    "type": "service_account",
    "project_id": "your_gcp_project_id",
    "private_key_id": "your_private_key_id",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    ...
  }
  '
)
returning key_id;
```
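To double-check that the secret was saved, you can look it up by name. This sketch assumes the standard Vault `decrypted_secrets` view is available on your database:

```sql
-- confirm the secret exists and note its key_id
select key_id, name
from vault.decrypted_secrets
where name = 'bigquery';
```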
Connecting to BigQuery
We need to provide Postgres with the credentials to connect to BigQuery, and any additional options. We can do this using the `create server` command:

```sql
create server bigquery_server
  foreign data wrapper bigquery_wrapper
  options (
    sa_key_id '<key_ID>', -- The Key ID from above.
    project_id 'your_gcp_project_id',
    dataset_id 'your_gcp_dataset_id'
  );
```
Create a schema
We recommend creating a schema to hold all the foreign tables:
```sql
create schema if not exists bigquery;
```
Options
The following options are available when creating BigQuery foreign tables:
- `table` - Source table or view name in BigQuery, required
- `location` - Source table location (default: `'US'`)
- `timeout` - Query request timeout in milliseconds (default: `30000`)
- `rowid_column` - Primary key column name (required for data modification)
You can also use a subquery as the `table` option:

```sql
table '(select * except(props), to_json_string(props) as props from `my_project.my_dataset.my_table`)'
```

Note: when using a subquery, the fully qualified table name must be used.
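For example, a foreign table can be defined on top of such a subquery. This is a sketch; `my_project.my_dataset.my_table` and the column list are placeholders for your own schema:

```sql
create foreign table bigquery.my_table_flat (
  id bigint,
  props text  -- to_json_string() returns a STRING, mapped to text
)
  server bigquery_server
  options (
    table '(select * except(props), to_json_string(props) as props from `my_project.my_dataset.my_table`)'
  );
```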
Entities
Tables
The BigQuery Wrapper supports data reads and writes from BigQuery tables and views.
Operations
| Object | Select | Insert | Update | Delete | Truncate |
| ------ | ------ | ------ | ------ | ------ | -------- |
| Tables | ✅     | ✅     | ✅     | ✅     | ❌       |
Usage
```sql
create foreign table bigquery.my_bigquery_table (
  id bigint,
  name text,
  ts timestamp
)
  server bigquery_server
  options (
    table 'people',
    location 'EU'
  );
```
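Once the foreign table is defined, you can query it like a regular Postgres table:

```sql
select * from bigquery.my_bigquery_table;
```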
Notes
- Supports `where`, `order by` and `limit` clause pushdown
- The `rowid_column` option must be specified for data modification operations
- Data in the streaming buffer cannot be updated or deleted until the buffer is flushed (up to 90 minutes)
Query Pushdown Support
This FDW supports `where`, `order by` and `limit` clause pushdown.
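For example, with the `my_bigquery_table` foreign table from the Usage section above, the filtering, ordering and limiting in the query below can be executed by BigQuery itself, so only the matching rows travel over the network (a sketch):

```sql
-- where, order by and limit are pushed down to BigQuery
select id, name
from bigquery.my_bigquery_table
where name = 'Luke Skywalker'
order by ts desc
limit 10;
```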
Inserting Rows & the Streaming Buffer
This foreign data wrapper uses BigQuery's `insertAll` API method to create a `streamingBuffer` with an associated partition time. Within that partition time, the data cannot be updated, deleted, or fully exported. Only after that time has elapsed (up to 90 minutes, according to BigQuery's documentation) can you perform those operations.

If you attempt an `UPDATE` or `DELETE` statement on rows while they are in the streaming buffer, you will get an error like `UPDATE or DELETE statement over table datasetName.tableName would affect rows in the streaming buffer, which is not supported`.
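A minimal sketch of this failure mode, assuming the `bigquery.people` foreign table from the data modify example below (which sets `rowid_column`):

```sql
-- the inserted row lands in BigQuery's streaming buffer
insert into bigquery.people (id, name, ts)
values (5, 'Obi-Wan Kenobi', now());

-- running this immediately afterwards may raise the streaming buffer
-- error above; retry after the buffer is flushed (up to 90 minutes)
update bigquery.people
set name = 'Ben Kenobi'
where id = 5;
```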
Supported Data Types
| Postgres Type    | BigQuery Type |
| ---------------- | ------------- |
| boolean          | BOOL          |
| bigint           | INT64         |
| double precision | FLOAT64       |
| numeric          | NUMERIC       |
| text             | STRING        |
| varchar          | STRING        |
| date             | DATE          |
| timestamp        | DATETIME      |
| timestamp        | TIMESTAMP     |
| timestamptz      | TIMESTAMP     |
| jsonb            | JSON          |
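As an illustration of these mappings, here is a sketch of a foreign table covering several of them; the BigQuery table `type_demo` and its columns are hypothetical:

```sql
create foreign table bigquery.type_demo (
  flag boolean,            -- maps to BOOL
  id bigint,               -- maps to INT64
  score double precision,  -- maps to FLOAT64
  amount numeric,          -- maps to NUMERIC
  name text,               -- maps to STRING
  born date,               -- maps to DATE
  updated_at timestamptz,  -- maps to TIMESTAMP
  props jsonb              -- maps to JSON
)
  server bigquery_server
  options (
    table 'type_demo'      -- hypothetical source table
  );
```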
Limitations
This section describes important limitations and considerations when using this FDW:
- Large result sets may experience network latency during data transfer
- Data in the streaming buffer cannot be modified for up to 90 minutes
- Only supports specific data type mappings between Postgres and BigQuery
- Materialized views using foreign tables may fail during logical backups
Examples
Some examples of how to use BigQuery foreign tables.
Let's prepare the source table in BigQuery first:
```sql
-- Run below SQLs on BigQuery to create source table
create table your_project_id.your_dataset_id.people (
  id int64,
  name string,
  ts timestamp,
  props json
);

-- Add some test data
insert into your_project_id.your_dataset_id.people values
  (1, 'Luke Skywalker', current_timestamp(), parse_json('{"coordinates":[10,20],"id":1}')),
  (2, 'Leia Organa', current_timestamp(), null),
  (3, 'Han Solo', current_timestamp(), null);
```
Basic example
This example will create a "foreign table" inside your Postgres database called `people` and query its data:

```sql
create foreign table bigquery.people (
  id bigint,
  name text,
  ts timestamp,
  props jsonb
)
  server bigquery_server
  options (
    table 'people',
    location 'EU'
  );

select * from bigquery.people;
```
Data modify example
This example will modify data in a "foreign table" inside your Postgres database called `people`. Note that the `rowid_column` option is mandatory for data modification:

```sql
create foreign table bigquery.people (
  id bigint,
  name text,
  ts timestamp,
  props jsonb
)
  server bigquery_server
  options (
    table 'people',
    location 'EU',
    rowid_column 'id'
  );

-- insert new data
insert into bigquery.people(id, name, ts, props)
values (4, 'Yoda', '2023-01-01 12:34:56', '{"coordinates":[10,20],"id":1}'::jsonb);

-- update existing data
update bigquery.people
set name = 'Anakin Skywalker', props = '{"coordinates":[30,40],"id":42}'::jsonb
where id = 1;

-- delete data
delete from bigquery.people
where id = 2;
```
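You can read the data back through the foreign table to verify the changes:

```sql
-- read back from BigQuery to verify the modifications
select * from bigquery.people
order by id;
```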