Database

BigQuery


BigQuery is a completely serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data, with BI, machine learning and AI built in.

The BigQuery Wrapper allows you to read and write data from BigQuery within your Postgres database.

Preparation

Before you can query BigQuery, you need to enable the Wrappers extension and store your credentials in Postgres.

Enable Wrappers

Make sure the wrappers extension is installed on your database:


_10
create extension if not exists wrappers with schema extensions;

Enable the BigQuery Wrapper

Enable the bigquery_wrapper FDW:


_10
create foreign data wrapper bigquery_wrapper
_10
handler big_query_fdw_handler
_10
validator big_query_fdw_validator;

Store your credentials (optional)

By default, Postgres stores FDW credentials inside pg_catalog.pg_foreign_server in plain text. Anyone with access to this table will be able to view these credentials. Wrappers is designed to work with Vault, which provides an additional level of security for storing credentials. We recommend using Vault to store your credentials.


_15
-- Save your BigQuery service account json in Vault and retrieve the `key_id`
_15
insert into vault.secrets (name, secret)
_15
values (
_15
'bigquery',
_15
'
_15
{
_15
"type": "service_account",
_15
"project_id": "your_gcp_project_id",
_15
"private_key_id": "your_private_key_id",
_15
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
_15
...
_15
}
_15
'
_15
)
_15
returning key_id;

Connecting to BigQuery

We need to provide Postgres with the credentials to connect to BigQuery, and any additional options. We can do this using the create server command:


_10
create server bigquery_server
_10
foreign data wrapper bigquery_wrapper
_10
options (
_10
sa_key_id '<key_ID>', -- The Key ID from above.
_10
project_id 'your_gcp_project_id',
_10
dataset_id 'your_gcp_dataset_id'
_10
);

Create a schema

We recommend creating a schema to hold all the foreign tables:


_10
create schema if not exists bigquery;

Options

The following options are available when creating BigQuery foreign tables:

  • table - Source table or view name in BigQuery, required
  • location - Source table location (default: 'US')
  • timeout - Query request timeout in milliseconds (default: 30000)
  • rowid_column - Primary key column name (required for data modification)

You can also use a subquery as the table option:


_10
table '(select * except(props), to_json_string(props) as props from `my_project.my_dataset.my_table`)'

Note: When using subquery, full qualified table name must be used.

Entites

Tables

The BigQuery Wrapper supports data reads and writes from BigQuery tables and views.

Operations

ObjectSelectInsertUpdateDeleteTruncate
Tables

Usage


_10
create foreign table bigquery.my_bigquery_table (
_10
id bigint,
_10
name text,
_10
ts timestamp
_10
)
_10
server bigquery_server
_10
options (
_10
table 'people',
_10
location 'EU'
_10
);

Notes

  • Supports where, order by and limit clause pushdown
  • When using rowid_column, it must be specified for data modification operations
  • Data in the streaming buffer cannot be updated or deleted until the buffer is flushed (up to 90 minutes)

Query Pushdown Support

This FDW supports where, order by and limit clause pushdown.

Inserting Rows & the Streaming Buffer

This foreign data wrapper uses BigQuery’s insertAll API method to create a streamingBuffer with an associated partition time. Within that partition time, the data cannot be updated, deleted, or fully exported. Only after the time has elapsed (up to 90 minutes according to BigQuery’s documentation), can you perform operations.

If you attempt an UPDATE or DELETE statement on rows while in the streamingBuffer, you will get an error of UPDATE or DELETE statement over table datasetName - note that tableName would affect rows in the streaming buffer, which is not supported.

Supported Data Types

Postgres TypeBigQuery Type
booleanBOOL
bigintINT64
double precisionFLOAT64
numericNUMERIC
textSTRING
varcharSTRING
dateDATE
timestampDATETIME
timestampTIMESTAMP
timestamptzTIMESTAMP
jsonbJSON

Limitations

This section describes important limitations and considerations when using this FDW:

  • Large result sets may experience network latency during data transfer
  • Data in streaming buffer cannot be modified for up to 90 minutes
  • Only supports specific data type mappings between Postgres and BigQuery
  • Materialized views using foreign tables may fail during logical backups

Examples

Some examples on how to use BigQuery foreign tables.

Let's prepare the source table in BigQuery first:


_13
-- Run below SQLs on BigQuery to create source table
_13
create table your_project_id.your_dataset_id.people (
_13
id int64,
_13
name string,
_13
ts timestamp,
_13
props jsonb
_13
);
_13
_13
-- Add some test data
_13
insert into your_project_id.your_dataset_id.people values
_13
(1, 'Luke Skywalker', current_timestamp(), parse_json('{"coordinates":[10,20],"id":1}')),
_13
(2, 'Leia Organa', current_timestamp(), null),
_13
(3, 'Han Solo', current_timestamp(), null);

Basic example

This example will create a "foreign table" inside your Postgres database called people and query its data:


_13
create foreign table bigquery.people (
_13
id bigint,
_13
name text,
_13
ts timestamp,
_13
props jsonb
_13
)
_13
server bigquery_server
_13
options (
_13
table 'people',
_13
location 'EU'
_13
);
_13
_13
select * from bigquery.people;

Data modify example

This example will modify data in a "foreign table" inside your Postgres database called people, note that rowid_column option is mandatory:


_25
create foreign table bigquery.people (
_25
id bigint,
_25
name text,
_25
ts timestamp,
_25
props jsonb
_25
)
_25
server bigquery_server
_25
options (
_25
table 'people',
_25
location 'EU',
_25
rowid_column 'id'
_25
);
_25
_25
-- insert new data
_25
insert into bigquery.people(id, name, ts, props)
_25
values (4, 'Yoda', '2023-01-01 12:34:56', '{"coordinates":[10,20],"id":1}'::jsonb);
_25
_25
-- update existing data
_25
update bigquery.people
_25
set name = 'Anakin Skywalker', props = '{"coordinates":[30,40],"id":42}'::jsonb
_25
where id = 1;
_25
_25
-- delete data
_25
delete from bigquery.people
_25
where id = 2;