Query Cloud SQL through Big Query

This article demonstrates Cloud SQL federated queries for Big Query, a neat and simple to use feature.

Connecting to Cloud SQL

One of the challenges presented when using Cloud SQL on a private network (VPC) is providing access to users. There are several ways to accomplish this which include:

  • open the database port on the VPC Firewall (5432 for example for Postgres) and let users access the database using a command line or locally installed GUI tool (may not be allowed in your environment)
  • provide a web based interface deployed on your VPC such as PGAdmin deployed on a GCE instance or GKE pod (adds security and management overhead)
  • use the Cloud SQL proxy (requires additional software to be installed and configured)

In additional, all of the above solutions require direct IP connectivity to the instance which may not always be available. Furthermore each of these operations requires the user to present some form of authentication – in many cases the database user and password which then must be managed at an individual level.

Enter Cloud SQL federated queries for Big Query…

Big Query Federated Queries for Cloud SQL

Big Query allows you to query tables and views in Cloud SQL (currently MySQL and Postgres) using the Federated Queries feature. The queries could be authorized views in Big Query datasets for example.

This has the following advantages:

  • Allows users to authenticate and use the GCP console to query Cloud SQL
  • Does not require direct IP connectivity to the user or additional routes or firewall rules
  • Leverages Cloud IAM as the authorization mechanism – rather that unmanaged db user accounts and object level permissions
  • External queries can be executed against a read replica of the Cloud SQL instance to offload query IO from the master instance

Setting it up

Setting up big query federated queries for Cloud SQL is exceptionally straightforward, a summary of the steps are provided below:

Step 1. Enable a Public IP on the Cloud SQL instance

This sounds bad, but it isn’t really that bad. You need to enable a public interface for Big Query to be able to establish a connection to Cloud SQL, however this is not accessed through the actual public internet – rather it is accessed through the Google network using the back end of the front end if you will.

Furthermore, you configure an empty list of authorized networks which effectively shields the instance from the public network, this can be configured in Terraform as shown here:

This configuration change can be made to a running instance as well as during the initial provisioning of the instance.

As shown below you will get a warning dialog in the console saying that you have no authorized networks – this is by design.

Cloud SQL Public IP Enabled with No Authorized Networks

Step 2. Create a Big Query dataset which will be used to execute the queries to Cloud SQL

Connections to Cloud SQL are defined in a Big Query dataset, this can also be use to control access to Cloud SQL using authorized views controlled by IAM roles.

Step 3. Create a connection to Cloud SQL

To create a connection to Cloud SQL from Big Query you must first enable the BigQuery Connection API, this is done at a project level.

As this is a fairly recent feature there isn’t great coverage with either the bq tool or any of the Big Query client libraries to do this so we will need to use the console for now…

Under the Resources -> Add Data link in the left hand panel of the Big Query console UI, select Create Connection. You will see a side info panel with a form to enter connection details for your Cloud SQL instance.

In this example I will setup a connection to a Cloud SQL read replica instance I have created:

Creating a Big Query Connection to Cloud SQL

More information on the Big Query Connections API can be found at: https://cloud.google.com/bigquery/docs/reference/bigqueryconnection/rest

The following permissions are associated with connections in Big Query:

bigquery.connections.create
bigquery.connections.get
bigquery.connections.list
bigquery.connections.use
bigquery.connections.update
bigquery.connections.delete

These permissions are conveniently combined into the following predefined roles:

roles/bigquery.connectionAdmin    (BigQuery Connection Admin)         
roles/bigquery.connectionUser     (BigQuery Connection User)          

Step 4. Query away!

Now the connection to Cloud SQL can be accessed using the EXTERNAL_QUERY function in Big Query, as shown here:

Querying Cloud SQL from Big Query

Google Cloud SQL – Availability for PostgreSQL – Part II (Read Replicas)

In this post we will look at read replicas as an additional method to achieve multi zone availability for Cloud SQL, which gives us – in turn – the ability to offload (potentially expensive) IO operations such as user created backups or read operations without adding load to the master instance.

In the previous post in this series we looked at Regional availability for PostgreSQL HA using Cloud SQL:

Recall that this option was simple to implement and worked relatively seamlessly and transparently with respect to zonal failover.

Now let’s look at read replicas in Cloud SQL as an additional measure for availability.

Deploying Read Replica(s)

Deploying read replicas is slightly more involved than simple regional (high) availability, as you will need to define each replica or replicas as a separate Cloud SQL instance which is a slave to the primary instance (the master instance).

An example using Terraform is provided here, starting by creating the master instance:

Next you would specify one or more read replicas (typically in a zone other than the zone the master is in):

Note that several of the options supplied are omitted when creating a read replica database instance, such as the backup and maintenance options – as these operations cannot be performed on a read replica as we will see later.

Cloud SQL Instances – showing master and replica
Cloud SQL Master Instance

Voila! You have just set up a master instance (the primary instance your application and/or users will connect to) along with a read replica in a different zone which will be asynchronously updated as changes occur on the master instance.

Read Replicas in Action

Now that we have created a read replica, lets see it in action. After connecting to the read replica (like you would any other instance), attempt to access a table that has not been created on the master as shown here:

SELECT operation from the replica instance

Now create the table and insert some data on the master instance:

Create a table and insert a record on the master instance

Now try the select operation on the replica instance:

SELECT operation from the replica instance (after changes have been made on the master)

It works!

Some Points to Note about Cloud SQL Read Replicas

  • Users connect to a read replica as a normal database connection (as shown above)
  • Google managed backups (using the console or gcloud sql backups create .. ) can NOT be performed against replica instances
  • Read replicas can be used to offload IO intensive operations from the the master instance – such as user managed backup operations (e.g. pg_dump)
pg_dump operation against a replica instance
  • BE CAREFUL Despite their name, read replicas are NOT read only, updates can be made which will NOT propagate back to the master instance – you could get yourself in an awful mess if you allow users to perform INSERT, UPDATE, DELETE, CREATE or DROP operations against replica instances.

Promoting a Read Replica

If required a read replica can be promoted to a standalone Cloud SQL instance, which is another DR option. Keep in mind however as the read replica is updated in an asynchronous manner, promotion of a read replica may result in a loss of data (hopefully not much but a loss nonetheless). Your application RPO will dictate if this is acceptable or not.

Promotion of a read replica is reasonably straightforward as demonstrated here using the console:

Promoting a read replica using the console

You can also use the following gcloud command:

 gcloud sql instances promote-replica  <replica_instance_name>

Once you click on the Promote Replica button you will see the following warning:

Promoting a read replica using the console

This simply states that once you promote the replica instance your instance will become an independent instance with no further relationship with the master instance. Once accepted and the promotion process is complete, you can see that you now have two independent Cloud SQL instances (as advertised!):

Promoted Cloud SQL instance

Some of the options you would normally configure with a master instance would need to be configured on the promoted replica instance – such as high availability, maintenance and scheduled backups – but in the event of a zonal failure you would be back up and running with virtually no data loss!

Full source code for this article is available at: https://github.com/gamma-data/cloud-sql-postgres-availability-tutorial

Google Cloud SQL – Availability, Replication, Failover for PostgreSQL – Part I

In this multi part blog we will explore the features available in Google Cloud SQL for High Availability, Backup and Recovery, Replication and Failover and Security (at rest and in transit) for the PostgreSQL DBMS engine. Some of these features are relatively hot of the press and in Beta – which still makes them available for general use.

We will start by looking at the High Availability (HA) options available to you when using the PostgreSQL engine in Google Cloud SQL.

Most of you would be familiar with the concepts of High Availability, Redundancy, Fault Tolerance, etc but let’s start with a definition of HA anyway:

High availability (HA) is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.

Wikipedia

Higher than a normal period is quite subjective, typically this is quantified by a percentage represented by a number of “9s”, that is 99.99% (which would be quoted as “four nines”), this would allot you 52.60 minutes of downtime over a one-year period.

Essentially the number of 9’s required will drive your bias towards the options available to you for Cloud SQL HA.

We will start with Cloud SQL HA in its simplest form, Regional Availability.

Regional Availability

Knowing what we know about the Google Cloud Platform, regional availability means that our application or service (in this case Cloud SQL) should be resilient to a failure of any one zone in our region. In fact, as all GCP regions have at least 3 zones – two zones could fail, and our application would still be available.

Regional availability for Cloud SQL (which Google refers to as High Availability), creates a standby instance in addition to the primary instance and uses a regional Persistent Disk resource to store the database instance data, transaction log and other state files, which is synchronously replicated to a Persistent Disk resource local to the zones that the primary and standby instances are located in.

A shared IP address (like a Virtual IP) is used to serve traffic to the healthy (normally primary) Cloud SQL instance.

An overview of Cloud SQL HA is shown here:

Cloud SQL High Availability

Implementing High Availability for Cloud SQL

Implementing Regional Availability for Cloud SQL is dead simple, it is one argument:

availability_type = "REGIONAL"

Using the gcloud command line utility, this would be:

gcloud sql instances create postgresql-instance-1234 \
  --availability-type=REGIONAL \
  --database-version= POSTGRES_9_6

Using Terraform (with a complete set of options) it would look like:

resource "google_sql_database_instance" "postgres_ha" {
  provider = google-beta
  region = var.region
  project = var.project
  name = "postgresql-instance-${random_id.instance_suffix.hex}"
  database_version = "POSTGRES_9_6"
  settings {
   tier = var.tier
   disk_size = var.disk_size
   activation_policy = "ALWAYS"
   disk_autoresize = true
   disk_type = "PD_SSD"
   availability_type = "REGIONAL"
   backup_configuration {
     enabled = true
     start_time = "00:00"
   }
   ip_configuration  {
     ipv4_enabled = false
     private_network = google_compute_network.private_network.self_link
   }
   maintenance_window  {
     day = 7
     hour = 0
     update_track = "stable"
   }
  }
 } 

Once deployed you will notice a few different items in the console, first from the instance overview page you can see that the High Availability option is ENABLED for your instance.

Second, you will see a Failover button enabled on the detailed management view for this instance.

Failover

Failovers and failbacks can be initiated manually or automatically (should the primary be unresponsive). A manual failover can be invoked by executing the command:

gcloud sql instances failover postgresql-instance-1234

There is an --async option which will return immediately, invoking the failover operation asynchronously.

Failover can also be invoked from the Cloud Console using the Failover button shown previously. As an example I have created a connection to a regionally available Cloud SQL instance and started a command which runs a loop and prints out a counter:

Now using the gcloud command shown earlier, I have invoked a manual failover of the Cloud SQL instance.

Once the failover is initiated, the client connection is dropped (as the server is momentarily unavailable):

The connection can be immediately re-established afterwards, the state of the running query is lost – importantly no data is lost however. If your application clients had retry logic in their code and they weren’t executing a long running query, chances are no one would notice! Once reconnecting normal database activities can be resumed:

A quick check of the instance logs will show that the failover event has occured:

Now when you return to the instance page in the console you will see a Failback button, which indicates that your instance is being served by the standby:

Note that there may be a slight delay in the availability of this option as the replica is still being synched.

It is worth noting that nothing comes for free! When you run in REGIONAL or High Availability mode – you are effectively paying double the costs as compared to running in ZONAL mode. However availability and cost have always been trade-offs against one another – you get what you pay for…

More information can be found at: https://cloud.google.com/sql/docs/postgres/high-availability

Next up we will look at read replicas (and their ability to be promoted) as another high availability alternative in Cloud SQL.