Azure Static Web App Review

The Azure Static Web App feature is relatively new to the Azure estate and has recently become generally available, so I thought I would take it for a test drive and discuss my findings.

I am a proponent of the JAMStack architecture for front end applications and a user of CD enabled CDN services like Netlify, so this Azure feature was naturally appealing to me.

Azure SWAs allow you to serve static assets (like JavaScript) without an origin server, meaning you don’t need a web server, can streamline content distribution and web app performance, and can reduce the attack surface area of your application.

The major advantage to using SWAs is simplicity: no scaffolding or infra requirements, and the service is seamlessly integrated into your CI/CD processes (natively if you are using GitHub).

Deploying Static Web Apps in Azure

Setup is pretty simple; aside from a name and a resource group, you just need to supply:

  • a location (the Azure region used for serverless back end APIs via Azure Function Apps) – note that this is not necessarily the location where the static web app itself is served from
  • a GitHub or GitLab repo URL
  • the branch you wish to use to trigger production deployments (e.g. main)
  • the path to your app code within your repo (e.g. where your package.json file is located)
  • an output folder (e.g. dist) – this should not exist in your repo
  • a project or personal access token for your GitHub account (alternatively you can perform an interactive OAuth2.0 consent if using the portal)

An example is shown here:
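If you prefer to script the creation, the same inputs can be supplied using the Azure CLI; a rough sketch is shown below (resource names and the repo URL are placeholders, and the token is assumed to be available in an environment variable):

az staticwebapp create \
  --name my-static-web-app \
  --resource-group my-resource-group \
  --location "eastasia" \
  --source https://github.com/my-account/my-spa-repo \
  --branch main \
  --app-location "/" \
  --output-location "dist" \
  --token $GITHUB_PAT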

GitHub Actions

Using the consent provided (either using the OAuth flow or by providing a token), Azure Static Web Apps will automagically create the GitHub Actions workflow to deploy your application on a push or merge event to your repo. This includes providing scoped API credentials to Azure to allow access to the Static Web App resource using secrets in GitHub (which are created automagically as well). An example workflow is shown here:
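A trimmed sketch of such a generated workflow is shown below; the deployment token secret name is generated per app, and the app/output locations reflect the values supplied at creation time, so yours will differ:

name: Azure Static Web Apps CI/CD

on:
  push:
    branches:
      - main
  pull_request:
    types: [opened, synchronize, reopened, closed]
    branches:
      - main

jobs:
  build_and_deploy_job:
    runs-on: ubuntu-latest
    name: Build and Deploy Job
    steps:
      - uses: actions/checkout@v2
      - name: Build And Deploy
        uses: Azure/static-web-apps-deploy@v1
        with:
          azure_static_web_apps_api_token: ${{ secrets.AZURE_STATIC_WEB_APPS_API_TOKEN }}
          repo_token: ${{ secrets.GITHUB_TOKEN }}
          action: "upload"
          app_location: "/"
          output_location: "dist"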

Preview or Staging Releases

Similar to the functionality in analogous services like Netlify, you can configure preview releases of your application to be deployed from specified branches on pull request events.

Routes and Authorization

Routes (for SPAs) need to be provided to Azure by using a file named staticwebapp.config.json located in the application root of your repo (the same level as your package.json file). You can also specify response codes and whether the route requires authentication, as shown here:
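A minimal sketch of a staticwebapp.config.json is shown below (the routes and roles are illustrative only):

{
  "routes": [
    {
      "route": "/profile",
      "allowedRoles": ["authenticated"]
    },
    {
      "route": "/admin/*",
      "allowedRoles": ["administrator"]
    },
    {
      "route": "/legacy-home",
      "redirect": "/",
      "statusCode": 301
    }
  ],
  "navigationFallback": {
    "rewrite": "/index.html"
  }
}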

Pros

  • Globally distributed CDN
  • Increased security posture, reduced attack surface area
  • Simplified architecture and deployment
  • No App Service Plan required – cost reduction
  • Enables Continuous Deployment – incl preview/staging environments
  • TLS and DNS can be easily configured for your app

Cons

  • Serverless API locations are limited
  • Integration with other VCS/CI/CD systems like GitLab would need to be custom built (GitHub and Azure DevOps are integrated)

Overall, this is a good feature for deploying SPAs or PWAs in Azure.

Okta Admin Command Line Interface

Identity and Access Management is a critical component of any application or SaaS architecture. I’m currently doing a spike of the Okta solution for an application development project I am on. Okta is a comprehensive solution built on the open OAuth2 and OIDC protocols, as well as supporting more conventional identity federation approaches such as SAML.

Okta has a clean and easy to use web-based Admin interface which can be used to create applications, users, claims, identity providers and more.

During my spike, which was done in a crash and burn test Okta organisation, I had associated my user account with a Microsoft Identity Provider for SSO. I subsequently had issues accessing the Microsoft Account my user was associated with, and as a result I managed to lock myself (the super admin) out of the Okta Admin Console.

Fortunately, prior to doing this I had created an API token for my user. So, I went about looking at ways I could interact with Okta programmatically. My first inclination was to use a simple CLI for Okta to get me out of jail… but I found there wasn’t one that suited. There are, however, a wealth of SDKs for Okta across multiple front-end and back-end oriented programming languages (such as JavaScript, Golang, Python and more).

Being in lockdown and having some free time on my hands, I decided to create a simple open source command line tool which could be used to administer an Okta organisation. The result of this weekend lockdown is okta-admin.

okta-admin cli

For this project I used the Golang SDK for Okta, along with the Cobra and Viper Golang packages (used by docker, kubectl and other popular command line utilities). To provide a query interface to JSON response payloads I use GJSON.
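As an illustration of how these pieces fit together, here is a minimal sketch of a Cobra command that lists users in an Okta org using the Okta Golang SDK and pulls out each login with GJSON; the command names and structure in the actual okta-admin tool differ, and config handling via Viper is omitted for brevity:

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "os"

    "github.com/okta/okta-sdk-golang/v2/okta"
    "github.com/spf13/cobra"
    "github.com/tidwall/gjson"
)

func main() {
    rootCmd := &cobra.Command{Use: "okta-admin"}

    listUsersCmd := &cobra.Command{
        Use:   "list-users",
        Short: "List users in an Okta organisation",
        RunE: func(cmd *cobra.Command, args []string) error {
            // org URL and API token are read from the environment in this sketch
            ctx, client, err := okta.NewClient(
                context.Background(),
                okta.WithOrgUrl(os.Getenv("OKTA_ORG_URL")),
                okta.WithToken(os.Getenv("OKTA_API_TOKEN")),
            )
            if err != nil {
                return err
            }
            users, _, err := client.User.ListUsers(ctx, nil)
            if err != nil {
                return err
            }
            // marshal the response and use GJSON to extract the login for each user
            b, err := json.Marshal(users)
            if err != nil {
                return err
            }
            for _, login := range gjson.GetBytes(b, "#.profile.login").Array() {
                fmt.Println(login.String())
            }
            return nil
        },
    }

    rootCmd.AddCommand(listUsersCmd)
    if err := rootCmd.Execute(); err != nil {
        os.Exit(1)
    }
}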

Will keep adding to this so stay tuned…

Complete source code for this project is available at https://github.com/gammastudios/okta-admin

Enumerating all roles for a user in Snowflake

Snowflake allows roles to be assigned to other roles, so when a user is assigned to a role, they may inherit the ability to use countless other roles.

Challenge: recursively enumerate all roles for a given user

One solution would be to create a complex query on the "SNOWFLAKE"."ACCOUNT_USAGE"."GRANTS_TO_ROLES" object.

An easier solution is to use a stored procedure to recurse through grants for a given user and return an ARRAY of roles for that user.

This is a good programming exercise in tail call recursion (sort of) in JavaScript. Here is the code:
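A minimal sketch of such a procedure is shown below (it walks the grant hierarchy using SHOW GRANTS; the procedure name is illustrative):

CREATE OR REPLACE PROCEDURE get_all_roles_for_user(USER_NAME STRING)
RETURNS ARRAY
LANGUAGE JAVASCRIPT
AS
$$
  var allRoles = [];

  // recursively collect roles granted to a role
  function getRolesForRole(roleName) {
    var rs = snowflake.createStatement(
      { sqlText: "SHOW GRANTS TO ROLE " + roleName }
    ).execute();
    while (rs.next()) {
      if (rs.getColumnValue("privilege") === "USAGE" &&
          rs.getColumnValue("granted_on") === "ROLE") {
        var grantedRole = rs.getColumnValue("name");
        if (allRoles.indexOf(grantedRole) === -1) {
          allRoles.push(grantedRole);
          getRolesForRole(grantedRole);
        }
      }
    }
  }

  // roles granted directly to the user
  var userGrants = snowflake.createStatement(
    { sqlText: "SHOW GRANTS TO USER " + USER_NAME }
  ).execute();
  while (userGrants.next()) {
    var role = userGrants.getColumnValue("role");
    if (allRoles.indexOf(role) === -1) {
      allRoles.push(role);
      getRolesForRole(role);
    }
  }

  return allRoles;
$$;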

To call the stored proc, execute:
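For example, using the procedure from the sketch above (the user name is a placeholder):

CALL get_all_roles_for_user('JSMITH');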

One drawback of stored procedures in Snowflake is that they can only have scalar or array return types and cannot be used directly in a SQL query; however, you can use the table(result_scan(last_query_id())) trick to get around this, as shown below, where we pivot the ARRAY into a record set with the array elements as rows:
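A sketch of this pivot is shown below:

SELECT f.value::string AS role_name
FROM table(result_scan(last_query_id())) t,
     lateral flatten(input => t.$1) f;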

IMPORTANT

This query must be the next statement run immediately after the CALL statement and cannot be run again until you run another CALL statement.

More adventures with Snowflake soon!

EventArc: The state of eventing in Google Cloud

When defining event-driven architectures, it’s always good to keep up with how the landscape is changing. How do you connect microservices in your architecture? Is Pub/Sub the end-game for all events? To dive a bit deeper, let’s talk through the benefits of having a single orchestrator – or whether a choreographer is better.

Orchestration versus choreography refresher

My colleague @jeffreyaven did a recent post explaining this concept in simple terms, which is worth reviewing, see:

Should there really be a central orchestrator controlling all interactions between services…..or, should each service work independently and only interact through events?

  • Orchestration is usually viewed as a domain-wide central service that defines the flow and control of communication between services. In this paradigm, it becomes easier to change and ultimately monitor policies across your org.
  • Choreography has each service registering and emitting events as it needs to. It doesn’t direct or define the flow of communication; this method usually has a central broker passing messages between services, and allows services to be truly independent.

Enter Workflows, which is suited to centrally orchestrated services – not only Google Cloud services such as Cloud Functions and Cloud Run, but also external services.

How about choreography? Pub/Sub and Eventarc are both suited for this. We all know and love Pub/Sub, but how do I use Eventarc?

What is Eventarc?

Announced in October 2020, Eventarc was introduced as eventing functionality that enables you, the developer, to send events to Cloud Run from more than 60 Google Cloud sources.

But how does it work?

Eventing is done by reading those sweet sweet Audit Logs, from various sources, and sending them to Cloud Run services as events in the CloudEvents format. Quick primer on CloudEvents: it’s a specification for describing event data in a common way. The specification is now under the Cloud Native Computing Foundation. Hooray! Eventarc can also read events from Pub/Sub topics for custom applications. Here’s a diagram I graciously ripped from the Google Cloud Blog:

Eventarc

Why do I need Eventarc? I have the Pub/Sub

Good question. Eventarc provides an easier path to receive events not only from Pub/Sub topics but from a number of Google Cloud sources with its Audit Log and Pub/Sub integration. Actually, any service that has Audit Log integration can be an event source for Eventarc. Beyond easy integration, it provides consistency and structure to how events are generated, routed and consumed. Things like:

Triggers

They specify routing rules from event sources to event sinks. Listen for new object creation in GCS and route that event to a service in Cloud Run by creating an Audit Log trigger. Create triggers that also listen to Pub/Sub. Then list all triggers in one central place in Eventarc:

gcloud beta eventarc triggers list
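For example, a sketch of creating an Audit Log trigger that routes new GCS object events to a Cloud Run service is shown below (service and account names are placeholders, and the flags reflect the beta command at the time of writing; they have since evolved, e.g. --event-filters):

gcloud beta eventarc triggers create new-object-trigger \
  --destination-run-service=my-run-service \
  --destination-run-region=us-central1 \
  --matching-criteria="type=google.cloud.audit.log.v1.written" \
  --matching-criteria="serviceName=storage.googleapis.com" \
  --matching-criteria="methodName=storage.objects.create" \
  --service-account=my-trigger-sa@my-project.iam.gserviceaccount.com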

Consistency with eventing format and libraries

Using the CloudEvents-compliant specification allows event data to be described in a common way, moving towards the goal of consistency, accessibility and portability. It makes it easier for different languages to read the event and for the Google Events Libraries to parse fields.
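For reference, a CloudEvents-formatted event is just a small envelope of common attributes plus the payload, along the lines of the sketch below (attribute values are illustrative):

{
  "specversion": "1.0",
  "id": "1234-5678-9012",
  "type": "google.cloud.audit.log.v1.written",
  "source": "//cloudaudit.googleapis.com/projects/my-project/logs/activity",
  "time": "2021-04-01T12:34:56Z",
  "datacontenttype": "application/json",
  "data": {
    "protoPayload": {
      "serviceName": "storage.googleapis.com",
      "methodName": "storage.objects.create",
      "resourceName": "projects/_/buckets/my-bucket/objects/my-object"
    }
  }
}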

This means that the long-term vision of Eventarc is to be the hub of events, enabling a unified eventing story for Google Cloud and beyond.

Eventarc producers and consumers

In the future, you can expect to forego Audit Logs and read these events directly, and to send them out to even more sinks within GCP and to any HTTP target.


This article was inspired by https://cloud.google.com/blog/topics/developers-practitioners/eventarc-unified-eventing-experience-google-cloud. Thanks Mete Atamel!

Using the Azure CLI to Create an API using a Function App within API Management

Function Apps, Logic Apps and App Services can be used to expose APIs within Azure API Management which is an easy way to deploy serverless microservices. You can see this capability in the Azure portal below within API Management:

Add a new API using a Function App as a back end

Like most readers, I like to script everything, so I was initially frustrated when I couldn’t replicate this operation in the Azure CLI, the REST API, PowerShell, or any of the other SDKs or IaC tools. Others shared my frustration, as seen here.

I was nearly resigned to using click ops in the portal (arrrgh) before I worked out this workaround.

The Solution

There is a bit more prep work required to automate this process, but it is well worth it.

1. Create an OpenApi (or Swagger spec or WADL) specification document, as seen below (use the absolute URL for your Function App in the url parameter):
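A minimal OpenAPI 3.0 sketch is shown here; the title, path and Function App URL are placeholders for your own values:

{
  "openapi": "3.0.1",
  "info": {
    "title": "my-function-api",
    "version": "1.0"
  },
  "servers": [
    { "url": "https://my-function-app.azurewebsites.net/api" }
  ],
  "paths": {
    "/hello": {
      "get": {
        "summary": "Hello function",
        "operationId": "hello-get",
        "responses": {
          "200": { "description": "OK" }
        }
      }
    }
  }
}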

  2. Use the az apim api import command (not the az apim api create command), as shown here:
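A sketch of the import command is shown here (resource group, APIM instance and API names are placeholders):

az apim api import \
  --resource-group my-resource-group \
  --service-name my-apim-instance \
  --api-id my-function-api \
  --path my-function-api \
  --display-name "My Function API" \
  --specification-format OpenApiJson \
  --specification-path ./my-function-api.json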

3. Associate the API with a product (which is how you can rate limit APIs)
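For example, assuming an existing product (such as the built-in Unlimited product):

az apim product api add \
  --resource-group my-resource-group \
  --service-name my-apim-instance \
  --product-id unlimited \
  --api-id my-function-api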

That’s it! You can now access your function via the API gateway using the gateway url or via the developer portal as seen below:

Function App API in API Management in the Azure Portal
Function App API in the Dev Portal

Multi Cloud Diagramming with PlantUML

Following on from the recent post GCP Templates for C4 Diagrams using PlantUML, this post tackles a common challenge for cloud architects: producing diagrams for architectures spanning multiple cloud providers, particularly as you elevate to enterprise level diagrams.

In this post, with the magic of !includeurl we have brought PlantUML template libraries together for AWS, Azure and GCP icon sets, allowing us to produce multi cloud C4 diagrams using PlantUML like this one:

Multi Cloud Architecture Diagram using PlantUML

Creating a multi cloud diagram is simple, start by adding the following include statements after the @startuml label in a new PlantUML C4 diagram:
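For example (the GCP templates come from the GCP-C4-PlantUML library covered in the GCP Templates post; the Azure and AWS library URLs are indicative of the publicly available icon sets at the time of writing and may have moved or been re-versioned since):

!define GCPPuml https://raw.githubusercontent.com/gamma-data/GCP-C4-PlantUML/master/templates
!define AzurePuml https://raw.githubusercontent.com/RicardoNiepel/Azure-PlantUML/release/2-1/dist
!define AWSPuml https://raw.githubusercontent.com/awslabs/aws-icons-for-plantuml/master/dist

!includeurl GCPPuml/C4_Context.puml
!includeurl GCPPuml/C4_Container.puml
!includeurl GCPPuml/GCPC4Integration.puml
!includeurl GCPPuml/GCPCommon.puml
!includeurl AzurePuml/AzureCommon.puml
!includeurl AWSPuml/AWSCommon.puml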

Then add references to the required services from different providers…
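Something like the following (the GCP paths match the GCP-C4-PlantUML templates; the Azure paths are assumptions based on the Azure icon library layout and may differ between releases):

' GCP services
!includeurl GCPPuml/Compute/ComputeEngine.puml
!includeurl GCPPuml/Databases/CloudSQL.puml

' Azure services
!includeurl AzurePuml/Compute/AzureVirtualMachine.puml
!includeurl AzurePuml/Networking/AzureVPNGateway.puml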

Then include the predefined resources from your different cloud providers in your diagram as shown here (describing a client server application over a cloud to cloud VPN between Azure and GCP)…
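A trimmed sketch of such a diagram body is shown below; the aliases, labels and relationships are illustrative, the GCP macros come from the GCP templates, and the Azure macro names are assumptions based on the Azure icon library:

Boundary(azure, "Azure") {
  AzureVirtualMachine(az_vm, "Client Workstation", "Windows 10 VM")
  AzureVPNGateway(az_vngw, "Virtual Network Gateway", "Route-based VPN")
}

Boundary(gcp, "gcp-project") {
  ComputeEngine(gce, "Application Server", "Compute Engine")
  CloudSQL(csql, "Application Database", "Cloud SQL")
}

Rel(az_vm, az_vngw, "routes via")
Rel(az_vngw, gce, "cloud to cloud IPSEC VPN")
Rel(gce, csql, "reads/writes")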

Happy multi-cloud diagramming!

Full source code is available at:

https://github.com/gamma-data/plantuml-multi-cloud-diagrams

GCP Templates for C4 Diagrams using PlantUML

I am a believer in the mantra of “Everything-as-Code”, and this includes diagrams and other architectural artefacts. Enter PlantUML…

PlantUML

PlantUML is an open-source tool which allows users to create UML diagrams from an intuitive DSL (Domain Specific Language). PlantUML is built on top of Graphviz and enables software architects and designers to use code to create Sequence Diagrams, Use Case Diagrams, Class Diagrams, State and Activity Diagrams and much more.

C4 Diagrams

PlantUML can be extended to support the C4 model for visualising software architecture, which describes software in different layers including Context, Container, Component and Code diagrams.

GCP Architecture Diagramming using C4

PlantUML and C4 can be used to produce cloud architecture diagrams. There are official libraries available through PlantUML for Azure and AWS service icons; however, these do not exist for GCP yet. There are several open source libraries available, but I have made an attempt to simplify the implementation.

The code below can be used to generate a C4 diagram describing a GCP architecture including official GCP service icons:

@startuml
!define GCPPuml https://raw.githubusercontent.com/gamma-data/GCP-C4-PlantUML/master/templates

!includeurl GCPPuml/C4_Context.puml
!includeurl GCPPuml/C4_Component.puml
!includeurl GCPPuml/C4_Container.puml
!includeurl GCPPuml/GCPC4Integration.puml
!includeurl GCPPuml/GCPCommon.puml

!includeurl GCPPuml/Networking/CloudDNS.puml
!includeurl GCPPuml/Networking/CloudLoadBalancing.puml
!includeurl GCPPuml/Compute/ComputeEngine.puml
!includeurl GCPPuml/Storage/CloudStorage.puml
!includeurl GCPPuml/Databases/CloudSQL.puml

title Sample C4 Diagram with GCP Icons

Person(publisher, "Publisher")
System_Ext(device, "User")

Boundary(gcp,"gcp-project") {
  CloudDNS(dns, "Managed Zone", "Cloud DNS")
  CloudLoadBalancing(lb, "L7 Load Balancer", "Cloud Load Balancing")
  CloudStorage(bucket, "Static Content Bucket", "Cloud Storage")
  Boundary(region, "gcp-region") {
    Boundary(zonea, "zone a") {
      ComputeEngine(gcea, "Content Server", "Compute Engine")
      CloudSQL(csqla, "Dynamic Content", "Cloud SQL")
    }
    Boundary(zoneb, "zone b") {
      ComputeEngine(gceb, "Content Server", "Compute Engine")
      CloudSQL(csqlb, "Dynamic Content\n(Read Replica)", "Cloud SQL")
    }
  }
}

Rel(device, dns, "resolves name")
Rel(device, lb, "makes request")
Rel(lb, gcea, "routes request")
Rel(lb, gceb, "routes request")
Rel(gcea, bucket, "get static content")
Rel(gceb, bucket, "get static content")
Rel(gcea, csqla, "get dynamic content")
Rel(gceb, csqla, "get dynamic content")
Rel(csqla, csqlb, "replication")
Rel(publisher,bucket,"publish static content")

@enduml

The preceding code generates the diagram below:

Additional services can be added and used in your diagrams by adding them to your includes, such as:

!includeurl GCPPuml/DataAnalytics/BigQuery.puml
!includeurl GCPPuml/DataAnalytics/CloudDataflow.puml
!includeurl GCPPuml/AIandMachineLearning/AIHub.puml
!includeurl GCPPuml/AIandMachineLearning/CloudAutoML.puml
!includeurl GCPPuml/DeveloperTools/CloudBuild.puml
!includeurl GCPPuml/HybridandMultiCloud/Stackdriver.puml
!includeurl GCPPuml/InternetofThings/CloudIoTCore.puml
!includeurl GCPPuml/Migration/TransferAppliance.puml
!includeurl GCPPuml/Security/CloudIAM.puml
' and more…

The complete template library is available at:

https://github.com/gamma-data/GCP-C4-PlantUML

Automated GCS Object Scanning Using DLP with Notifications Using Slack

This is a follow up to a previous blog, Google Cloud Storage Object Notifications using Slack in which we used Slack to notify us of new objects being uploaded to GCS.

In this article we will take things a step further: uploading an object to a GCS bucket will trigger a DLP inspection of the object, and if any preconfigured info types (such as credit card numbers or API credentials) are present in the object, a Slack notification will be generated.

As DLP scans are “jobs”, meaning they run asynchronously, we will need to trigger scans and inspect results using two separate Cloud Functions (one for triggering a scan [gcs-dlp-scan-trigger] and one for inspecting the results of the scan [gcs-dlp-evaluate-results]) and a Cloud PubSub topic [dlp-scan-topic] which is used to hold the reference to the DLP job.

The process is described using the sequence diagram below:

The Code

The gcs-dlp-scan-trigger Cloud Function fires when a new object is created in a specified GCS bucket. This function configures the DLP scan to be executed, including the DLP info types (for instance CREDIT_CARD_NUMBER, EMAIL_ADDRESS, ETHNIC_GROUP, PHONE_NUMBER, etc) and the likelihood of that info type existing (for instance LIKELY). DLP scans determine the probability of an info type occurring in the data; they do not scan every object in its entirety as this would be too expensive.

The primary function executed in the gcs-dlp-scan-trigger Cloud Function is named inspect_gcs_file. This function configures and submits the DLP job, supplying a PubSub topic to which the DLP Job Name will be written; the code for inspect_gcs_file is shown here:
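A condensed sketch of this function is shown below; the project, topic, info types and threshold are illustrative, and the google-cloud-dlp client surface has changed slightly between library versions:

import os

from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()

PROJECT = os.environ.get("GCP_PROJECT", "my-project")
PUBSUB_TOPIC = f"projects/{PROJECT}/topics/dlp-scan-topic"
INFO_TYPES = ["CREDIT_CARD_NUMBER", "EMAIL_ADDRESS", "PHONE_NUMBER"]


def inspect_gcs_file(bucket, filename):
    """Configure and submit an asynchronous DLP inspection job for a GCS object."""
    inspect_job = {
        "inspect_config": {
            "info_types": [{"name": t} for t in INFO_TYPES],
            "min_likelihood": dlp_v2.Likelihood.LIKELY,
        },
        "storage_config": {
            "cloud_storage_options": {
                "file_set": {"url": f"gs://{bucket}/{filename}"}
            }
        },
        # publish the DLP job name to Pub/Sub when the scan completes
        "actions": [{"pub_sub": {"topic": PUBSUB_TOPIC}}],
    }
    parent = f"projects/{PROJECT}"
    response = dlp.create_dlp_job(parent=parent, inspect_job=inspect_job)
    print(f"Submitted DLP job: {response.name}")


def gcs_dlp_scan_trigger(event, context):
    """Cloud Function entry point, fired on GCS object finalize events."""
    inspect_gcs_file(event["bucket"], event["name"])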

At this stage the DLP job is created and running asynchronously. The next Cloud Function, gcs-dlp-evaluate-results, fires when a message is sent to the PubSub topic defined in the DLP job. gcs-dlp-evaluate-results reads the DLP Job Name from the PubSub topic, connects to the DLP service and queries the job status; when the job is complete, this function checks the results of the scan, and if the min_likelihood threshold is met for any of the specified info types, a Slack message is generated. The code for the main method in the gcs-dlp-evaluate-results function is shown here:
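A condensed sketch of this function is shown below (the send_slack_notification helper appears in the next snippet, and again names and thresholds are illustrative):

from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()


def gcs_dlp_evaluate_results(event, context):
    """Cloud Function entry point, fired when the DLP job name arrives on the Pub/Sub topic."""
    # the DLP job name is delivered as a Pub/Sub message attribute (DlpJobName)
    job_name = event["attributes"]["DlpJobName"]
    job = dlp.get_dlp_job(name=job_name)

    if job.state != dlp_v2.DlpJob.JobState.DONE:
        print(f"Job {job_name} is not complete yet, skipping")
        return

    findings = []
    for stat in job.inspect_details.result.info_type_stats:
        if stat.count > 0:
            findings.append(f"{stat.info_type.name}: {stat.count}")

    if findings:
        send_slack_notification(
            "Sensitive data detected in scanned object:\n" + "\n".join(findings)
        )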

Finally, a Slack webhook is used to send the message to a specified Slack channel in a workspace. This is done using the send_slack_notification function shown here:
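A sketch of this helper using the requests library is shown below; the webhook URL is assumed to be supplied via an environment variable:

import os

import requests

SLACK_WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]


def send_slack_notification(message):
    """Post a message to the configured Slack channel via an incoming webhook."""
    response = requests.post(SLACK_WEBHOOK_URL, json={"text": message}, timeout=10)
    response.raise_for_status()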

An example Slack message is shown here:

Slack Notification for Sensitive Data Detected in a Newly Created GCS Object

Full source code can be found at: https://github.com/gamma-data/automated-gcs-object-scanning-using-dlp-with-notifications-using-slack

Forseti Terraform Validator: Enforcing resource policy compliance in your CI pipeline

Terraform is a powerful tool for managing your Infrastructure as Code. Declare your resources once, define their variables per environment and sleep easy knowing your CI pipeline will take care of the rest.

But… one night you wake up in a sweat. The details are fuzzy but you were browsing your favourite cloud provider’s console – probably GCP 😉 – and thought you saw a bucket had been created outside of your allowed locations! Maybe it even had risky access controls.

You brush it off and try to fall back to sleep, but you can’t quite push the thought from your mind that somewhere in all that Terraform code, someone could be declaring resources in unapproved locations, and your CICD pipeline would do nothing to stop it. Oh, the regulatory implications.

Enter Terraform Validator by Forseti

Terraform Validator by Forseti allows you to declare your Policy as Code, check compliance of your Terraform plans against said Policy, and automatically fail violating plans in a CI step. All without setting up servers or agents.

You’re going to learn how to enforce policy on GCP resources like BigQuery, IAM, networks, MySQL, Google Kubernetes Engine (GKE) and more. If you’re particularly crafty, you may be able to go beyond GCP.

Forseti’s suite of solutions is GCP focused and allows a wide range of live config validation, monitoring and more using the Policy Library we’re going to set up. These additional capabilities require additional infrastructure, but we’re going one step at a time, starting with enforcing policy during deployment.

Getting Started

Let’s assume you already have an established CICD pipeline that uses Terraform, or that you are content to validate your Terraform plans locally for now. In that case, we need just two things:

  1. A Policy Library
  2. Terraform Validator

It’s that simple! No new servers, agents, firewall rules, extra service accounts or other nonsense. Just add Policy Library, the Validator tool and you can enforce policy on your Terraform deployments.

We’re going to tinker with some existing GCP-focused sample policies (aka Constraints) that Forseti makes available. These samples cover a wide range of resources and use cases, so it is easy to adjust what’s provided to define your own Constraints.

Policy Library

First let’s open up some of Forseti’s pre-defined constraints. We’ll copy them into our own Git repository and adjust to create policies that match our needs. Repeatable and configurable – that’s Policy as Code at work.

Concepts

In the world of Forseti, and in particular Terraform Validator, Policies are defined and understood via easy to read YAML files known as Constraints.

There is just enough information in a Constraint file to make clear its purpose and effect, and by tinkering lightly with a pre-written Constraint you can achieve a lot without looking too deeply into the inner workings. But there’s more happening than meets the eye.

Constraints are built on Templates – which are like forms with some extra bits waiting to be completed to make a Constraint. Except there’s a lot more hidden away that’s pretty cool if you want to understand it.

Think of a Template as a ‘Class’ in the OOP sense, and of a Constraint as an instantiated Template with all the key attributes defined.

E.g. A generic Template for policy on bucket locations and a Constraint to specify which locations are relevant in a given instance. Again, buckets and locations are just the basic example – the potential applications are far greater.

Now the real magic is that just like a ‘Class’, a Template contains logic that makes everything abstracted away in the Constraint possible. Templates contain inline Rego (ray-go), borrowed lovingly by Forseti from the Open Policy Agent (OPA) team.

Learn more about Rego and OPA here to understand the relationship to our Terraform Validator.

But let’s begin.

Set up your Policies

Create your Policy Library repository

Create your Policy Library repository by cloning https://github.com/forseti-security/policy-library into your own VCS.

This repo contains templates and sample constraints which will form the basis of your policies. So get it into your Git environment and clone it to local for the next step.

Customise sample constraints to fit your needs

As discussed in Concepts, Constraints are defined from Templates, which make use of the Rego policy language. Nice. So let’s take a sample Constraint, put it in our Policy Library and set the values to what we need. It’s that easy – no need to write new templates or learn Rego if your use case is covered.

In a new branch…

  1. Copy the sample Constraint storage_location.yaml to your Constraints folder.
    $ cp policy-library/samples/storage_location.yaml policy-library/policies/constraints/storage_location.yaml
  2. Replace the sample location (asia-southeast1) in storage_location.yaml with australia-southeast1.
    spec:
      severity: high
      match:
        target: ["organization/*"]
      parameters:
        mode: "allowlist"
        locations:
        - australia-southeast1
        exemptions: []
  3. Push back to your repo – not Forseti’s!
    $ git push https://github.com/<your-repository>/policy-library.git

Policy review

There you go – you’ve customised a sample Constraint. Now you have your own instance of version controlled Policy-as-Code and are ready to apply the power of OPA’s Rego policy language that lies within the parent Template. Impressively easy right?

That’s a pretty simple example. You can browse the rest of Forseti’s Policy Library to view other sample Constraints, Templates and the Rego logic that makes all of this work. These can be adjusted to cover all kinds of use cases across GCP resources.

I suggest working with and editing the sample Constraints before making any changes to Templates.

If you were to write Rego and Templates from scratch, you might even be able to enforce Policy as Code against non-GCP Terraform code.

Terraform Validator

Now, let’s set up the Terraform Validator tool and have it compare a sample piece of Terraform code against the Constraint we configured above. Keep in mind you’ll want to translate what’s done here into steps in your CICD pipeline.

Once the tool is in place, we really just run terraform plan and feed the output into Terraform Validator. The Validator compares it to our Constraints, runs all the abstracted logic we don’t need to worry about and returns 0 or 2 when done for pass / fail respectively. Easy.

So using Terraform if I try to make a bucket in australia-southeast1 it should pass, if I try to make one in the US it should fail. Let’s set up the tool, write some basic Terraform and see how we go.

Setup Terraform Validator

Check for the latest version of terraform-validator from the official terraform-validator GCS bucket.

Using the latest version is very important if you are on Terraform 0.12 or greater. This is the easy way – you can also pull the source from the Terraform Validator GitHub repo and build it yourself.

$ gsutil ls -r gs://terraform-validator/releases

Copy the latest version to the working dir

$ gsutil cp gs://terraform-validator/releases/2020-03-05/terraform-validator-linux-amd64 .

Make it executable

$ chmod 755 terraform-validator-linux-amd64

Ready to go!

Review your Terraform code

We’re going to make a ridiculously simple piece of Terraform that tries to create one bucket in our project to keep things simple.

main.tf

resource "google_storage_bucket" "tf-validator-demo-bucket" {  
  name          = "tf-validator-demo-bucket"
  location      = "US"
  force_destroy = true

  lifecycle_rule {
    condition {
      age = "3"
    }
    action {
      type = "Delete"
    }
  }
}

This is a pretty standard bit of Terraform for a GCS bucket, but made very simple with all the values defined directly in main.tf. Note the location of the bucket – it violates our Constraint that was set to the australia-southeast1 region.

Make the Terraform plan

Warm up Terraform and double check your Terraform code if there are any hiccups.

$ terraform init

Make the Terraform plan and store output to file.

$ terraform plan --out=terraform.tfplan

Convert the plan to JSON

$ terraform show -json ./terraform.tfplan > ./terraform.tfplan.json

Validate the non-compliant Terraform plan against your Constraints, for example

$ ./terraform-validator-linux-amd64 validate ./terraform.tfplan.json --policy-path=../repos/policy-library/

TA-DA!

Found Violations:

Constraint allow_some_storage_location on resource //storage.googleapis.com/tf-validator-demo-bucket: //storage.googleapis.com/tf-validator-demo-bucket is in a disallowed location.

Validate the compliant Terraform plan against your Constraints

Let’s see what happens if we repeat the above, changing the location of our GCS bucket to australia-southeast1.

$ ./terraform-validator-linux-amd64 validate ./terraform.tfplan.json --policy-path=../repos/policy-library/

Results in..

No violations found.

Success!!!

Now all that’s left to do for your Policy as Code CICD pipeline is to configure the rest of your Constraints and run this check before you go ahead and terraform apply. Be sure to make the apply step dependent on the outcome of the Validator.
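As a sketch, a minimal CI step might simply chain the commands used above so that apply only runs if validation exits cleanly (paths are illustrative):

#!/bin/bash
# fail the pipeline before apply if the plan violates any Constraint
set -e
terraform plan --out=terraform.tfplan
terraform show -json ./terraform.tfplan > ./terraform.tfplan.json
./terraform-validator-linux-amd64 validate ./terraform.tfplan.json --policy-path=./policy-library/
terraform apply terraform.tfplan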

Wrap Up

We’ve looked at how to apply Policy as Code to validate our Infrastructure as Code. Sounds pretty modern and DevOpsy, doesn’t it?

To recap, we learned about Constraints, which are fully defined instances of Policy as Code. They’re based on YAML Templates that refer to the OPA policy language Rego, but we didn’t have to learn it 🙂

We created our own version controlled Policy Library.

Using the above learning and some handy pre-existing samples, we wrote policies (Constraints) for GCP infrastructure, specifying a whitelist for locations in which GCS buckets could be deployed.

As mentioned there are dozens upon dozens of samples across BigQuery, IAM, networks, MySQL, Google Kubernetes Engine (GKE) and more to work with.

Of course, we stored these configured Constraints in our version-controlled Policy Library.

  • We looked at a simple set of Terraform code to define a GCS bucket, and stored the Terraform plan to a file before applying it.
  • We ran Forseti’s Terraform Validator against the Terraform plan file, and had the Validator compare the plan to our Policy Library.
  • We saw that the results matched our expectations! Compliance with the location specified in our Constraint passed the Validator’s checks, and non-compliance triggered a violation.

Awesome. And the best part is that all this required no special permissions, no infrastructure for servers or agents and no networking.

All of that comes with the full Forseti suite – inventory taking and config validation of already deployed resources. We might get to that next time.

References:

https://github.com/GoogleCloudPlatform/terraform-validator
https://github.com/forseti-security/policy-library
https://www.openpolicyagent.org/docs/latest/policy-language/
https://cloud.google.com/blog/products/identity-security/using-forseti-config-validator-with-terraform-validator
https://forsetisecurity.org/docs/latest/concepts/

Creating a Site to Site VPN Connection Between GCP and Azure with Google Private Access

This article demonstrates creating a site to site IPSEC VPN connection between a GCP VPC network and an Azure Virtual Network, enabling private RFC1918 network connectivity between virtual networks in both clouds. This is done using a single PowerShell script leveraging Azure PowerShell and gcloud commands in the Google SDK.

Additionally, we will use Azure Private DNS to enable private access between Azure hosts and GCP APIs (such as Cloud Storage or Big Query).

An overview of the solution is provided here:

Azure to GCP VPN Design

One note before starting – site to site VPN connections between GCP and Azure currently do not support dynamic routing using BGP, however creating some simple routes on either end of the connection will be enough to get going.

Let’s go through this step by step:

Step 1 : Authenticate to Azure

Azure’s account equivalent is a subscription; the following command from Azure PowerShell is used to authenticate a user to one or more subscriptions.

Connect-AzAccount

This command will open a browser window prompting you for Microsoft credentials, once authenticated you will be returned to the command line.

Step 2 : Create a Resource Group (Azure)

A resource group is roughly equivalent to a project in GCP. You will need to supply a Location (equivalent to a GCP region):

New-AzResourceGroup `
  -Name "azure-to-gcp" `
  -Location "Australia Southeast"

Step 3 : Create a Virtual Network with Subnets and Routes (Azure)

An Azure Virtual Network is the equivalent of a VPC network in GCP (or AWS), you must define subnets before creating a Virtual Network. In this example we will create two subnets, one Gateway subnet (which needs to be named accordingly) where the VPN gateway will reside, and one subnet named ‘default’ where we will host VMs which will connect to GCP services over the private VPN connection.

Before defining the default subnet we must create and attach a Route Table (the equivalent of a Route in GCP); this particular route will be used to route ‘private’ requests to services in GCP (such as Big Query).

# define route table and route to GCP private access
$azroutecfg = New-AzRouteConfig `
  -Name "google-private" `
  -AddressPrefix "199.36.153.4/30" `
  -NextHopType "VirtualNetworkGateway" 

$azrttbl = New-AzRouteTable `
  -ResourceGroupName "azure-to-gcp" `
  -Name "google-private" `
  -Location "Australia Southeast" `
  -Route $azroutecfg

# define gateway subnet
$gatewaySubnet = New-AzVirtualNetworkSubnetConfig  `
  -Name "GatewaySubnet" `
  -AddressPrefix "10.1.2.0/24"

# define default subnet
$defaultSubnet  = New-AzVirtualNetworkSubnetConfig `
  -Name "default" `
  -AddressPrefix "10.1.1.0/24" `
  -RouteTable $azrttbl

# create virtual network and subnets
$vnet = New-AzVirtualNetwork  `
  -Name "azure-to-gcp-vnet" `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -AddressPrefix "10.1.0.0/16" `
  -Subnet $gatewaySubnet,$defaultSubnet

Step 4 : Create Network Security Groups (Azure)

Network Security Groups in Azure are stateful firewalls, much like Firewall Rules in VPC networks in GCP. As in GCP, rules with lower priority numbers override rules with higher priority numbers.

In the example we will create several rules to allow inbound ICMP, TCP and UDP traffic from our Google VPC and RDP traffic from the Internet (which we will use to logon to a VM in Azure to test private connectivity between the two clouds):

# create network security group
$rule1 = New-AzNetworkSecurityRuleConfig `
  -Name rdp-rule `
  -Description "Allow RDP" `
  -Access Allow `
  -Protocol Tcp `
  -Direction Inbound `
  -Priority 100 `
  -SourceAddressPrefix Internet `
  -SourcePortRange * `
  -DestinationAddressPrefix * `
  -DestinationPortRange 3389

$rule2 = New-AzNetworkSecurityRuleConfig `
  -Name icmp-rule `
  -Description "Allow ICMP" `
  -Access Allow `
  -Protocol Icmp `
  -Direction Inbound `
  -Priority 101 `
  -SourceAddressPrefix * `
  -SourcePortRange * `
  -DestinationAddressPrefix * `
  -DestinationPortRange *

$rule3 = New-AzNetworkSecurityRuleConfig `
  -Name gcp-rule `
  -Description "Allow GCP" `
  -Access Allow `
  -Protocol Tcp `
  -Direction Inbound `
  -Priority 102 `
  -SourceAddressPrefix "10.2.0.0/16" `
  -SourcePortRange * `
  -DestinationAddressPrefix * `
  -DestinationPortRange *

$nsg = New-AzNetworkSecurityGroup `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -Name "nsg-vm" `
  -SecurityRules $rule1,$rule2,$rule3

Step 5 : Create Public IP Addresses (Azure)

We need to create two Public IP Addresses (the equivalent of External IPs in GCP) which will be used for our VPN gateway and for the VM we will create:

# create public IP address for VM
$vmpip = New-AzPublicIpAddress `
  -Name "vm-ip" `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -AllocationMethod Dynamic

# create public IP address for NW gateway 
$ngwpip = New-AzPublicIpAddress `
  -Name "ngw-ip" `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -AllocationMethod Dynamic

Step 6 : Create Virtual Network Gateway (Azure)

The Virtual Network Gateway is the Azure equivalent of a VPN Gateway, and will be used to create a VPN tunnel between Azure and a GCP VPN Gateway. This gateway will be placed in the Gateway subnet created previously and one of the Public IP addresses created in the previous step will be assigned to this gateway.

# create virtual network gateway
$ngwipconfig = New-AzVirtualNetworkGatewayIpConfig `
  -Name "ngw-ipconfig" `
  -SubnetId $gatewaySubnet.Id `
  -PublicIpAddressId $ngwpip.Id

# use the AsJob switch as this is a long running process
$job = New-AzVirtualNetworkGateway -Name "vnet-gateway" `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -IpConfigurations $ngwipconfig `
  -GatewayType "Vpn" `
  -VpnType "RouteBased" `
  -GatewaySku "VpnGw1" `
  -VpnGatewayGeneration "Generation1" `
  -AsJob

$vnetgw = Get-AzVirtualNetworkGateway `
  -Name "vnet-gateway" `
  -ResourceGroupName "azure-to-gcp"

Step 7 : Create a VPC Network and Subnetwork(s) (GCP)

A VPC network and subnet need to be created in GCP; the subnet defines the VPC address space. This address space must not overlap with the Azure Virtual Network CIDR. For all GCP steps it is assumed that the project is set for client config (e.g. gcloud config set project <>) so it does not need to be specified for each operation. Private Google access should be enabled on all subnets created.

# creating VPC network and subnets
gcloud compute networks create "azure-to-gcp-vpc" `
  --subnet-mode=custom `
  --bgp-routing-mode=regional

gcloud compute networks subnets create "aus-subnet" `
  --network  "azure-to-gcp-vpc" `
  --range "10.2.1.0/24" `
  --region "australia-southeast1" `
  --enable-private-ip-google-access

Step 8 : Create an External IP (GCP)

An external IP address will need to be created in GCP which will be used for the external facing interface of the VPN Gateway.

# create external IP
gcloud compute addresses create "ext-gw-ip" `
  --region "australia-southeast1"

$gcp_ipaddr_obj = gcloud compute addresses describe "ext-gw-ip" `
  --region "australia-southeast1" `
  --format json | ConvertFrom-Json

$gcp_ipaddr = $gcp_ipaddr_obj.address

Step 9 : Create Firewall Rules (GCP)

VPC firewall rules will need to be created in GCP to allow VPN traffic as well as SSH traffic from the internet (which allows you to SSH into VM instances using Cloud Shell).

# create VPN firewall rules
gcloud compute firewall-rules create "vpn-rule1" `
  --network "azure-to-gcp-vpc" `
  --allow tcp,udp,icmp `
  --source-ranges "10.1.0.0/16"

gcloud compute firewall-rules create "ssh-rule1" `
  --network "azure-to-gcp-vpc" `
  --allow tcp:22

Step 10 : Create VPN Gateway and Forwarding Rules (GCP)

Create a VPN Gateway and Forwarding Rules in GCP which will be used to create a tunnel between GCP and Azure.

# create cloud VPN 
gcloud compute target-vpn-gateways create "vpn-gw" `
  --network "azure-to-gcp-vpc" `
  --region "australia-southeast1" `
  --project "azure-to-gcp-project"

# create forwarding rule ESP
gcloud compute forwarding-rules create "fr-gw-name-esp" `
  --ip-protocol ESP `
  --address "ext-gw-ip" `
  --target-vpn-gateway "vpn-gw" `
  --region "australia-southeast1" `
  --project "azure-to-gcp-project"

# creating forwarding rule UDP500
gcloud compute forwarding-rules create "fr-gw-name-udp500" `
  --ip-protocol UDP `
  --ports 500 `
  --address "ext-gw-ip" `
  --target-vpn-gateway "vpn-gw" `
  --region "australia-southeast1" `
  --project "azure-to-gcp-project"

# creating forwarding rule UDP4500
gcloud compute forwarding-rules create "fr-gw-name-udp4500" `
  --ip-protocol UDP `
  --ports 4500 `
  --address "ext-gw-ip" `
  --target-vpn-gateway "vpn-gw" `
  --region "australia-southeast1" `
  --project "azure-to-gcp-project"

Step 11 : Create VPN Tunnel (GCP Side)

Now we will create the GCP side of our VPN tunnel using the Public IP Address of the Azure Virtual Network Gateway created in a previous step. As this example uses a route based VPN the traffic selector values need to be set at 0.0.0.0/0. A PSK (Pre Shared Key) needs to be supplied which will be the same key used when we configure a VPN Connection on the Azure side of the tunnel.

# get peer public IP address of Azure gateway
$azpubip = Get-AzPublicIpAddress `
  -Name "ngw-ip" `
  -ResourceGroupName "azure-to-gcp"

# create VPN tunnel 
gcloud compute vpn-tunnels create "vpn-tunnel-to-azure" `
  --peer-address $azpubip.IpAddress `
  --local-traffic-selector "0.0.0.0/0" `
  --remote-traffic-selector "0.0.0.0/0" `
  --ike-version 2 `
  --shared-secret <<Pre-Shared Key>> `
  --target-vpn-gateway "vpn-gw" `
  --region  "australia-southeast1" `
  --project "azure-to-gcp-project"

Step 12 : Create Static Routes (GCP Side)

As we are using static routing (as opposed to dynamic routing) we will need to define all of the specific routes on the GCP side. We will need to set up routes for both outgoing traffic to the Azure network as well as incoming traffic for the restricted Google API range (199.36.153.4/30).

# create static route (VPN)
gcloud compute routes create "route-to-azure" `
  --destination-range "10.1.0.0/16" `
  --next-hop-vpn-tunnel "vpn-tunnel-to-azure" `
  --network "azure-to-gcp-vpc" `
  --next-hop-vpn-tunnel-region "australia-southeast1" `
  --project "azure-to-gcp-project"

# create static route (Restricted APIs)
gcloud compute routes create apis `
  --network  "azure-to-gcp-vpc" `
  --destination-range "199.36.153.4/30" `
  --next-hop-gateway default-internet-gateway `
  --project "azure-to-gcp-project"

Step 13 : Create a Local Gateway (Azure)

A Local Gateway in Azure is an object that represents the remote gateway (GCP VPN gateway).

# create local gateway
$azlocalgw = New-AzLocalNetworkGateway `
  -Name "local-gateway" `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -GatewayIpAddress $gcp_ipaddr `
  -AddressPrefix "10.2.0.0/16"

Step 14 : Create a VPN Connection (Azure)

Now we can set up the Azure side of the VPN Connection, which is accomplished by associating the Azure Virtual Network Gateway with the Local Network Gateway. A PSK (Pre Shared Key) needs to be supplied which is the same key used for the GCP VPN Tunnel in step 11.

# create connection
$azvpnconn = New-AzVirtualNetworkGatewayConnection `
  -Name "vpn-connection" `
  -ResourceGroupName "azure-to-gcp" `
  -VirtualNetworkGateway1 $vnetgw `
  -LocalNetworkGateway2 $azlocalgw `
  -Location "Australia Southeast" `
  -ConnectionType IPsec `
  -SharedKey  << Pre-Shared Key >>  `
  -ConnectionProtocol "IKEv2"

VPN Tunnel Established!

At this stage we have created an end to end connection between the virtual networks in both clouds. You should see this reflected in the respective consoles in each provider.

GCP VPN Tunnel to an Azure Virtual Network
Azure VPN Connection to a GCP VPC Network

Congratulations! You have just set up a multi cloud environment using private networking. Now let’s set up Google Private Access for Azure hosts and create VMs on each side to test our setup.

Step 15 : Create a Private DNS Zone for googleapis.com (Azure)

We will now need to create a Private DNS zone in Azure for the googleapis.com domain which will host records to redirect Google API requests to the Restricted API range.

# create private DNS zone
New-AzPrivateDnsZone `
  -ResourceGroupName "azure-to-gcp" `
  -Name "googleapis.com"

# Add A Records   
$Records = @()
$Records += New-AzPrivateDnsRecordConfig `
  -IPv4Address 199.36.153.4
$Records += New-AzPrivateDnsRecordConfig `
  -IPv4Address 199.36.153.5
$Records += New-AzPrivateDnsRecordConfig `
  -IPv4Address 199.36.153.6
$Records += New-AzPrivateDnsRecordConfig `
  -IPv4Address 199.36.153.7

New-AzPrivateDnsRecordSet `
  -Name "restricted" `
  -RecordType A `
  -ResourceGroupName "azure-to-gcp" `
  -TTL 300 `
  -ZoneName "googleapis.com" `
  -PrivateDnsRecords $Records

# Add CNAME Records   
$Records = @()
$Records += New-AzPrivateDnsRecordConfig `
  -Cname "restricted.googleapis.com."

New-AzPrivateDnsRecordSet `
  -Name "*" `
  -RecordType CNAME `
  -ResourceGroupName "azure-to-gcp" `
  -TTL 300 `
  -ZoneName "googleapis.com" `
  -PrivateDnsRecords $Records

# Create VNet Link
New-AzPrivateDnsVirtualNetworkLink `
  -ResourceGroupName "azure-to-gcp" `
  -ZoneName "googleapis.com" `
  -Name "dns-zone-link" `
  -VirtualNetworkId $vnet.Id

Step 16 : Create a Virtual Machine (Azure)

We will create a VM in Azure which we can use to test the VPN tunnel as well as to test Private Google Access over our VPN tunnel.

# create VM
$az_vmlocaladminpwd = ConvertTo-SecureString << Password Param >> `
  -AsPlainText -Force
$Credential = New-Object System.Management.Automation.PSCredential  ("LocalAdminUser", $az_vmlocaladminpwd);

$nic = New-AzNetworkInterface `
  -Name "vm-nic" `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -SubnetId $defaultSubnet.Id `
  -NetworkSecurityGroupId $nsg.Id `
  -PublicIpAddressId $vmpip.Id `
  -EnableAcceleratedNetworking `
  -Force

$VirtualMachine = New-AzVMConfig `
  -VMName "windows-desktop" `
  -VMSize "Standard_D4_v3"

$VirtualMachine = Set-AzVMOperatingSystem `
  -VM $VirtualMachine `
  -Windows `
  -ComputerName  "windows-desktop" `
  -Credential $Credential `
  -ProvisionVMAgent `
  -EnableAutoUpdate

$VirtualMachine = Add-AzVMNetworkInterface `
  -VM $VirtualMachine `
  -Id $nic.Id

$VirtualMachine = Set-AzVMSourceImage `
  -VM $VirtualMachine `
  -PublisherName 'MicrosoftWindowsDesktop' `
  -Offer 'Windows-10' `
  -Skus 'rs5-pro' `
  -Version latest

New-AzVM `
  -ResourceGroupName "azure-to-gcp" `
  -Location "Australia Southeast" `
  -VM $VirtualMachine `
  -Verbose

Step 17 : Create a VM Instance (GCP)

We will create a Linux VM in GCP to test connectivity to hosts in Azure using the VPN tunnel we have established.

# create VM instance
gcloud compute instances create "gcp-instance" `
  --zone "australia-southeast1-b" `
  --machine-type "f1-micro" `
  --subnet "aus-subnet" `
  --network-tier PREMIUM `
  --maintenance-policy MIGRATE `
  --image=debian-9-stretch-v20200309 `
  --image-project=debian-cloud `
  --boot-disk-size 10GB `
  --boot-disk-type pd-standard `
  --boot-disk-device-name instance-1 `
  --reservation-affinity any

Test Connectivity

Now we are ready to test connectivity from both sides of the tunnel.

Azure to GCP

Establish a remote desktop (RDP) connection to the Azure VM created in Step 16. Ping the GCP VM instance using its private IP address.

Test Private IP Connectivity from Azure to GCP

GCP to Azure

Now SSH into the GCP Linux VM instance and ping the Azure host using its private IP address.

Test Private IP Connectivity from GCP to Azure

Test Private Google Access from Azure

Now that we have established bi-directional connectivity between the two clouds, we can test private access to Google APIs from our Azure host. Follow the steps below to test private access:

  1. RDP into the Azure VM
  2. Install the Google Cloud SDK ( https://cloud.google.com/sdk/)
  3. Perform an nslookup to ensure that calls to googleapis.com resolve to the restricted API range (e.g. nslookup storage.googleapis.com). You should see a response showing the A records from the googleapis.com Private DNS Zone created in step 15.
  4. Now test connectivity to Google APIs, for example to test access to Google Cloud Storage using gsutil, or test access to Big Query using the bq command

Congratulations! You are now a multi cloud ninja!