Scalable, Secure Application Load Balancing with VPC Native GKE and Istio

At the time of this writing, GCP does not have a generally available non-public-facing Layer 7 load balancer. While this is sure to change in the future, this article outlines a design pattern which has proven to provide scalable, extensible application load balancing services for multiple applications running in Kubernetes pods on GKE.

When you create a service of type LoadBalancer in GKE, Kubernetes hooks into the provider (GCP in this case) on your behalf to create a Google load balancer. While this load balancer can be specified as INTERNAL, there are two issues:

Issue #1:

The GCP load balancer created for you is a Layer 4 TCP load balancer.

Issue #2:

The normal behaviour is for Google to enumerate all of the node pools in your GKE cluster and “automagically” create corresponding GCE instance groups for each node pool in each zone the instances are deployed in. This means the entire surface area of your cluster is exposed to the external network – which may not be optimal for internal applications on a multi-tenanted cluster.

The Solution:

Using Istio deployed on GKE, the Istio Ingress Gateway, and an externally created internal load balancer, it is possible to get scalable HTTP load balancing along with all the normal ALB goodness (stickiness, path-based routing, host-based routing, health checks, TLS offload, etc.).

An abstract depiction of this architecture is shown here:

Istio Ingress Design Pattern for VPC Native GKE Clusters

This can be deployed with a combination of Terraform and kubectl. The steps to deploy at a high level are:

  1. Create a GKE cluster with at least two node pools: ingress-nodepool and service-nodepool. Ideally create these node pools as multi-zonal for availability. You could also create additional node pools, such as one for your Egress Gateway or an operations-nodepool to host Istio components.
  2. Deploy Istio.
  3. Deploy the Istio Ingress Gateway service on the ingress-nodepool using Service type NodePort.
  4. Create an associated Gateway configured with server certificates and private keys for TLS offload.
  5. Create a service in the service-nodepool.
  6. Reserve an unallocated static IP address from the node network range.
  7. Create an internal TCP load balancer:
    1. Specify the frontend as the IP address reserved in step 6.
    2. Specify the backend as the managed instance groups created during the node pool creation for the ingress-nodepool (ingress-nodepool-ig-a, ingress-nodepool-ig-b, ingress-nodepool-ig-c).
    3. Specify ports 80 and 443.
  8. Create a GCP Firewall Rule to allow traffic from authorized sources (network tags or CIDR ranges) to a target of the ingress-nodepool network tag.
  9. Create a Cloud DNS A Record for your managed zone as *.namespace.zone pointing to the IP Address assigned to the load balancer frontend in step 7.1.
  10. Create a firewall rule to allow GCP health check traffic to reach the ingress-nodepool network tag at a minimum – however, there is no harm in allowing health checks to reach all node pools.

The service should then be resolvable and routable from authorized internal networks (peered private VPCs or internal networks connected via VPN or Dedicated Interconnect) as:

https://<service>.<namespace>.<zone>/<endpoint>
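
For a quick end-to-end check once everything is deployed, a minimal Python sketch could be used from an authorized network (the host, endpoint and CA bundle names below are placeholders, not part of the design):

import requests

# Hypothetical service URL following the <service>.<namespace>.<zone> convention
URL = "https://myservice.mynamespace.example.internal/healthz"

# Point 'verify' at the CA bundle that signed the certificates presented
# by the Istio Ingress Gateway for TLS offload
response = requests.get(URL, verify="ca-bundle.pem", timeout=5)
print(response.status_code)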

The advantages of this design pattern are…

  1. The Ingress Gateway provides fully functional application load balancing services.
  2. Istio provides service discovery and routing using names and namespaces.
  3. The Ingress Gateway service and ingress gateway node pool can be scaled as required to meet demand.
  4. The Ingress Gateway is multi-zonal for greater availability.

AWS Professional and Specialty Exam Tips

Once you get beyond the Associate-level AWS certification exams into the Professional or Specialty track exams, the degree of difficulty rises significantly. As a veteran of the Certified Solutions Architect Professional and Big Data Specialty exams, I thought I would share my experiences, which I believe are applicable to all the certification streams and tracks in the AWS certification program.

First off, let me say that I am a self-professed certification addict, having sat more than thirty technical certification and re-certification exams over my thirty-plus-year career in technology. I would put the AWS Professional and Specialty exams right up there in terms of level of difficulty.

The AWS Professional and Specialty exams are specifically designed to be challenging. Although AWS has removed the prerequisites for these exams (much to my dismay…), you really need to be prepared for them, otherwise you are throwing your hard-earned money away.

There are very few – if any – “easy” questions. All of the questions are scenario based and require you to design a solution to meet multiple requirements. The question and/or the correct answer will invariably involve the use of multiple AWS services (not just one). You will be tested on your reading comprehension, time management and ability to cope under pressure as well as being tested on your knowledge of the AWS platform.

The following sections provide some general tips which will help you approach the exam and give you the best chance of success on the day. This is not a brain dump or a substitute for the hard work and dedication required to ensure success on your exam day.

Time Management

Needless to say, your ability to manage time is critical: on average you will have approximately 2-3 minutes to answer each question. Reading the questions and answers carefully may take up 1-2 minutes on its own. If the answer is not apparent to you, you are best to mark the question and come back to it at the end of the exam.

In many cases there may be subsequent questions and answer sets which jog your memory or help you deduce the correct answers to the questions you initially passed on. For instance, you may see references in later questions which put context around services you are not completely familiar with; this may enable you to answer flagged questions with more confidence.

Of course, you must answer all questions before completing the exam – there are no points for incomplete or unattempted answers.

Recommended Approach to each Question

Most of the questions on the Professional or Specialty certification exams fall into one of three categories:

  • Short-ish question, multiple long detailed answer options
  • Long-ish scenario question, multiple relatively short answer options
  • Long-ish question with multiple relatively long, detailed answers

The last scenario is thankfully less common. In all cases, however, it is important to read the last sentence of the question first; this will provide indicators to help you read through the question in its entirety – and all of the possible answers – with a clear understanding of what is “really” being asked. For instance, the operative phrase may be “highly available” or “most cost effective”.

Try to eliminate answers based on what you know; for instance, answers with erroneous instance families can be eliminated immediately. This will give you a much better statistical chance of success, even if you have to venture an educated guess in the end.

The Most Complicated Solution is Probably Not the Correct One

In many answer sets to questions on the Professional or Specialty exams you will see some ridiculously complicated solution approaches; these are most often incorrect answers, although they may contain enough loosely relevant terminology or services to appear reasonable.

Note the following statement direct from the AWS Certified Solutions Architect Professional Exam Blueprint:


“Distractors, or incorrect answers, are response options that an examinee with incomplete knowledge or skill would likely choose. However, they are generally plausible responses that fit in the content area defined by the test objective.”

AWS wants professionals who design and implement solutions which are simple, sustainable, highly available, scalable and cost-effective. One of the key Amazon Leadership Principles is “Invent and Simplify” – simplify is often the operative word.

Don’t spend time on dumps or practice exams (other than those from AWS)

The question pools for AWS exams are enormous, so the chances of you getting the same questions and answer sets as someone else are slim. Furthermore, non-AWS sources may not be trustworthy. There is no substitute for AWS white papers, how-tos, and real-life application of your learnings.

Don’t focus on Service Limits or Calculations

In my experience with AWS exams, they are not overly concerned with service limits, default values, formulas (e.g. the formula to calculate required partitions for a DynamoDB table) or syntax – so don’t waste time memorising them. You should, however, understand the seven-layer OSI model and be able to read and interpret CIDR notation.
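
If CIDR arithmetic is not yet second nature, Python’s built-in ipaddress module is a handy way to check your working while you study (a study aid only, not something available in the exam):

import ipaddress

# A /20 CIDR block, e.g. a VPC or subnet range
net = ipaddress.ip_network("10.0.0.0/20")
print(net.num_addresses)                      # 4096 addresses in the block
print(net.netmask)                            # 255.255.240.0
print(list(net.subnets(new_prefix=24))[:2])   # first two /24 subnets it contains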

Mainly, however, they want you to understand how services work together in an AWS solution to achieve an outcome for a customer.

Some Final Words of Advice

Always do what you think AWS would want you to do! 

It is worthwhile having a quick look at the AWS Leadership Principles (I have already referenced one of these in this article) as these are applied religiously in every aspect of the AWS business. In particular, you should pay specific attention to the principles around simplicity and frugality.

Good luck!

GCP Networking for AWS Professionals

GCP and AWS share many similarities: they both provide similar services, and both leverage containerization, virtualization and software-defined networking.

There are some significant differences when it comes to their respective implementations, however, and networking is a key example of this.

Before we compare and contrast the two different approaches to networking, it is worthwhile noting the genesis of the two major cloud providers.

Google was born to be global, Amazon became global

By no means am I suggesting that Amazon didn’t have designs on going global from its beginnings, but AWS was driven (entirely at the beginning) by the needs of the Amazon eCommerce business. Amazon started in the US before expanding into other regions (such as Europe and Australia). In some cases the expansion took decades (Amazon only entered Australia as a retailer in 2018).

Google, by contrast, was providing application, search and marketing services worldwide from its very beginning. GCP, which was used as the vector to deliver these services and applications, was architected around this global model – even though its actual data centre expansion may not have been as rapid as AWS’s (for example, GCP opened its Australia region five years after AWS).

Their respective networking implementations reflect how their respective companies evolved.

AWS is a leader in IaaS, GCP is a leader in PaaS

This is only an opinion and may be argued, but if you look at the chronology of the two platforms, consider this:

  • The first services released by AWS (simultaneously for all intents and purposes) were S3, SQS and EC2
  • The first service released by Google was AppEngine (a pure PaaS offering)

Google has since launched and matured its IaaS offerings, just as AWS has done with its PaaS offerings, but the two started from very different places.

With all of that said, here are the key differences when it comes to networking between the two major cloud providers:

GCP VPCs are Global by default, AWS VPCs are Regional only

This is the first fundamental difference between the two providers. Each GCP project is allocated one VPC network with Subnets in each of the 18 GCP Regions, whereas each AWS Account is allocated one Default VPC per AWS Region with a Subnet in each Availability Zone for that Region – that is, each account has 17 separate default VPCs, one in each of the 17 Regions (excluding GovCloud regions).

Default Global VPC Network in GCP

It is entirely possible to create VPCs in GCP which are Regional, but they are Global by default.

This global tenancy can be advantageous in many cases, but it can be limiting in others: for instance, there is a limit of 25 peering connections to any one VPC network in GCP, whereas the limit in AWS is 125.

GCP Subnets are Regional, AWS Subnets are Zonal

Subnets in GCP automatically span all Zones in a Region, whereas AWS VPC Subnets are assigned to Availability Zones within a Region. This means you are abstracted from some of the networking and zonal complexity, but you have less control over the specific network placement of instances and endpoints. You can infer from this design that Zones are replicated or synchronised within a Region, making them less of a direct consideration for High Availability (or at least less of your concern than they otherwise would be).

All GCP Firewall Rules are Stateful

AWS Security Groups are stateful firewall rules – meaning they maintain connection state for inbound connections – while AWS also has Network ACLs (NACLs), which are stateless firewall rules. GCP has no direct equivalent of NACLs; however, GCP Firewall Rules are more configurable than their AWS counterparts. For instance, GCP Firewall Rules can include Deny actions, which is not an option with AWS Security Group rules.
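
As a rough sketch of what a Deny rule can look like using the google-cloud-compute Python client library (the project, network, source range and tag names below are hypothetical, and this is illustrative rather than a definitive recipe):

from google.cloud import compute_v1

project_id = "my-project"                  # hypothetical project
network = "global/networks/default"        # hypothetical network

# A Deny rule blocking SSH from a specific range - something AWS Security
# Group rules cannot express, as they only support allow rules
firewall_rule = compute_v1.Firewall()
firewall_rule.name = "deny-ssh-from-branch"
firewall_rule.network = network
firewall_rule.direction = "INGRESS"
firewall_rule.priority = 900
firewall_rule.source_ranges = ["203.0.113.0/24"]
firewall_rule.target_tags = ["service-nodepool"]

denied = compute_v1.Denied()
denied.I_p_protocol = "tcp"
denied.ports = ["22"]
firewall_rule.denied = [denied]

client = compute_v1.FirewallsClient()
operation = client.insert(project=project_id, firewall_resource=firewall_rule)
operation.result()  # wait for the insert operation to complete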

Load Balancers in GCP are layer 4 (TCP/UDP) unless they are public facing

AWS Application Load Balancers can be deployed in private VPCs with no external IPs attached to them. GCP has Application Load Balancers (Layer 7 load balancers), but only for public-facing applications; internal-facing load balancers in GCP are Network Load Balancers. This presents some challenges with application-level load balancing functionality such as stickiness. There are potential workarounds, however, such as running NGINX (or an Istio Ingress Gateway) in GKE behind an internal TCP load balancer.

Firewall rules are at the Network Level not at the Instance or Service Level

There are simple firewall settings available at the instance level, but these are limited to allowing HTTP and HTTPS traffic to the instance and don’t allow you to specify sources. Detailed Firewall Rules are set at the GCP VPC Network level and are not attached or associated with instances as they are in AWS.

Hopefully this is helpful for AWS engineers and architects being exposed to GCP for the first time!

The Streaming Data Warehouse: Kappa Architecture and Data Warehousing re-imagined

The aspiration to extend data analysis (predictive, descriptive or otherwise) to streaming event data has been common across every enterprise-scale program I have been involved with. Often, however, this aspiration goes unrealised, as it tends to slide down the priority scale while we still grapple with legacy batch-oriented integration patterns and processes.

Event processing is not a new concept; real-time event and transaction processing has been a standard feature for security, digital and operations functions for some time. However, in the Data Warehousing, BI and Advanced Analytics worlds it is often spoken about but rarely implemented – except at tech companies, of course. In many cases personalization is still a batch-oriented process, e.g. train a model from a feature set built from historical data, generate recommendations in batch, serve these recommendations upon the next visit – wash, rinse, and repeat.

The Lambda architecture has existed for several years now as a data-processing pattern designed to incorporate both batch and stream-processing capabilities. Moreover, messaging platforms have existed for decades, from point-to-point messaging systems, to message-oriented middleware, to distributed pub-sub messaging systems such as Apache Kafka.

Additionally, open source streaming data processing frameworks and tools have proliferated in recent years with projects such as Storm, Samza, Flink and Spark Streaming becoming established solutions.

Kafka in particular, with its focus on durability, resiliency, availability and consistency, has graduated into a fully fledged data platform, not simply a transient messaging system. In many cases Kafka is serving as a back end for operational processes, such as applications implementing the CQRS (Command Query Responsibility Segregation) design pattern.

In other words, it is not the technology that holds us back, it’s our lack of imagination.

Enter Kappa Architecture where we no longer have to attempt to integrate streaming data with batch processes…everything is a stream. The ultimate embodiment of Kappa Architecture is the Streaming Data Warehouse.

In the Streaming Data Warehouse, tables are represented by topics. Topics represent either:

  • unbounded event or change streams; or
  • stateful representations of data (such as master, reference or summary data sets).

This approach makes possible the enrichment and/or summarisation of transaction or event data with master or reference data. Furthermore, many of the patterns used in data warehousing and master data management are inherent in Kafka, as you can represent both the current state of an object and the complete change history of that object (in other words, change data capture and the associated slowly changing dimensions, from one inbound stream).

Data is acquired from source systems either in real time or via a scheduled extract process; in either case the data is presented to Kafka as a stream.

The Kafka Avro Schema Registry provides a systematic contract with source systems, which also serves as a data dictionary for consumers and supports schema evolution with backward and forward compatibility. Data is retained on the Kafka platform for a designated period of time (days or weeks), where it is available for applications and processes to consume. These processes can include data summarisation or sliding-window operations for reporting or notification, as well as data integration or datamart-building processes which sink data to other systems – these could include relational or non-relational data stores.

Real-time applications can be built using the Kafka Streams API, and emerging tools such as KSQL can be used to provide a well-known interface for sampling streaming data or performing windowed processing operations on streams. Structured Streaming in Spark, or Spark Streaming in its original RDD/DStream implementation, can be used to prepare and enrich data for machine learning operations using Spark ML or Spark MLlib.
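
As a rough illustration of the kind of windowed processing described above (the topic name, event schema and broker address are hypothetical), a Spark Structured Streaming job in Python might look like this:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("streaming-dw-example").getOrCreate()

# Hypothetical schema for an order event stream
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the 'orders' topic as an unbounded stream (table-as-topic)
orders = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("o"))
    .select("o.*"))

# Sliding-window summarisation of the event stream
summary = (orders
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"))
    .sum("amount"))

# Sink the rolling summary to the console (swap for parquet on S3/GCS in practice)
query = (summary.writeStream
    .outputMode("update")
    .format("console")
    .start())
query.awaitTermination()

Note that running this would also require the Spark SQL Kafka connector package on the classpath (for example via the --packages option to spark-submit).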

In addition, data sinks can operate concurrently to sink datasets to S3 or Google Cloud Storage or both (multi-cloud – like real-time analytics – is something which is talked about more than it is implemented…).

In the Streaming Data Warehouse architecture Kafka is much more than a messaging platform: it is a distributed data platform which could easily replace major components of a legacy (or even a modern) data architecture.
It just takes a little imagination…

When Life Gives you Undrinkable Wine – Make Cognac…

Firstly, this is not another motivational talk or motherhood statement (there are plenty of those out there already), but a metaphor about resourcefulness.

We have all heard the adage “when life gives you lemons, make lemonade”. Allow me to present a slight twist (pardon the pun…) on this statement.

Cognac is a variety of brandy named after the town of Cognac, France. It is distilled from a white wine made from grapes grown in the area. The wine used to make Cognac is characterised as “virtually undrinkable”, yet in its distilled form Cognac is considered to be the world’s most refined spirit, with bottles of high-end Cognac fetching as much as $200,000.

The area surrounding the town of Cognac was a recognised wine-growing region dating back to the third century. However, when it became evident that the grapes produced around the town of Cognac itself were unsuitable for wine making, local producers developed the practice of double distillation in copper pot stills and ageing in oak barrels for at least two years. The product yielded was the spirit we now know as Cognac.

Fast forward fifteen centuries to Scotland in the 1800s, where John Walker was a humble shopkeeper. Local whisky producers would bring John different varieties of single malt whiskies, many of them below an acceptable standard. Instead of on-selling these varieties, he began blending them as made-to-order whiskies, blended to meet specific customer requirements. From this idea a product (and subsequently one of the most globally recognised brands) was created which would last over 200 years.

Interesting, but what does this have to do with technology you might ask?

Ingenuity and resourcefulness have existed for as long as humans have, but in the realm of technology – and in particular cloud computing and open source software – they have never been more imperative. We are continually faced with challenges on each assignment or project: technology not working as planned or as advertised, or dead ends when trying to solve problems. This is particularly evident now, as software and technology are moving at such an accelerated rate.

To be successful in this era you need creativity and lateral thinking. You need an ability not just to solve problems to which there are no documented solutions, but to create new products from existing ones which are not necessarily suitable for your specific objective – in many cases adding value in the process. That is not just making lemonade from lemons (which anyone could think of) but making Cognac from sub-standard grapes, or premium blended whisky from sub-standard single malts.

One of my favourite Amazon Leadership Principles is “Invent and Simplify”:

Leaders expect and require innovation and invention from their teams and always find ways to simplify. They are externally aware, look for new ideas from everywhere, and are not limited by “not invented here”. As we do new things, we accept that we may be misunderstood for long periods of time.

https://www.amazon.jobs/en/principles

The intent in this statement is not simply to “lift and shift” current on-premises systems and processes to the cloud, but to look for ways to streamline, rationalise and simplify processes in doing so. This mandates a combination of practicality and creativity.

The takeaway: when you are presented with a challenge, be creative and inventive – don’t just solve the problem, but look for ways to add value in the process.

Cheers!

Really Simple Terraform – S3 Object Notifications using Lambda and SES

Following on from the previous post in the Really Simple Terraform series, simple-lambda-ec2-scheduler – where we used Terraform to deploy a Lambda function, including packaging the Python function into a ZIP archive and creating all supporting objects (roles, policies, permissions, etc.) – in this post we take things a step further by using templating to update parameters in the Lambda function code before packaging and creating the Lambda function.

S3 event notifications can be published directly to an SNS topic to which you can add an email subscription; this is quite straightforward. However, the email notifications you get look something like this:

Email Notification sent via an SNS Topic Subscription

There is very little you can do about this.

However, if you take a slightly different approach – triggering a Lambda function which sends an email via SES – you have much more control over content and formatting. Using this approach you could get an email notification that looks like this:

Email Notification sent using Lambda and SES

Much easier on the eye!
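
To give a sense of what such a Lambda function involves, here is a minimal sketch using boto3 and SES (the sender and recipient addresses are placeholders – in this post they are substituted into the function via Terraform templating, as described below, and the actual function in the repository may differ in detail):

import boto3

# Placeholder values - substituted via Terraform templating in this post
SENDER = "notifications@example.com"
RECIPIENT = "me@example.com"

ses = boto3.client("ses")

def lambda_handler(event, context):
    # S3 publishes one or more records per notification event
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = record["s3"]["object"].get("size", 0)
        event_name = record["eventName"]

        body_html = (
            f"<html><body><h3>S3 Object Notification</h3>"
            f"<p><b>Event:</b> {event_name}<br/>"
            f"<b>Bucket:</b> {bucket}<br/>"
            f"<b>Key:</b> {key}<br/>"
            f"<b>Size:</b> {size} bytes</p></body></html>"
        )

        # Send a formatted HTML email via SES
        ses.send_email(
            Source=SENDER,
            Destination={"ToAddresses": [RECIPIENT]},
            Message={
                "Subject": {"Data": f"S3 Notification: {key}"},
                "Body": {"Html": {"Data": body_html}},
            },
        )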

Prerequisites

You will need verified AWS SES (Simple Email Service) email addresses for the sender and recipient addresses used for your object notification emails. This can be done via the console as shown here:

SES Email Address Verification

Note that SES is not available in every AWS region; pick one that is generally closest to your particular region (but it really doesn’t matter for this purpose).

Deployment

The Terraform module creates an IAM Role and associated policy for the Lambda function as shown here:

Variables in the module are substituted into the function code template, the rendered template file is then packaged as a ZIP archive to be uploaded as the Lambda function source as shown here:

As in the previous post, I will reiterate that although Terraform is technically not a build tool, it can be used for simple build operations such as this.

The Lambda function is deployed using the following code:

Finally the S3 object notification events are configured as shown here:

Use the following commands to run this example (I have created a default credentials profile, but you could supply your API credentials directly, use STS, etc):

cd simple-notifications-with-lambda-and-ses
terraform init
terraform apply

Full source code for this article can be found at: https://github.com/avensolutions/simple-notifications-with-lambda-and-ses

Really Simple Terraform – Infrastructure Automation using AWS Lambda

There are many blog posts and examples available for either scheduling infrastructure tasks – such as starting or stopping EC2 instances – or deploying a Lambda function using Terraform. However, I have found many of the other examples to be unnecessarily complicated, so I have put together a very simple example doing both.

The function itself could be easily adapted to take other actions including interacting with other AWS services using the boto3 library (the Python AWS SDK). The data payload could be modified to pass different data to the function as well.

The script only requires input variables for schedule_expression (a cron schedule, based upon GMT, for triggering the function – this could also be expressed as a rate, e.g. rate(5 minutes)) and environment (a value passed to the function on each invocation). In this example the input data is the value of the “Environment” key for an EC2 instance tag – a user-defined tag associating the instance with a particular environment (e.g. Dev, Test, Prod). The key could be changed as required; for instance, if you wanted to stop instances based upon their given name (or part thereof), you could change the tag key to “Name”.

When triggered, the function will stop all running EC2 instances with the given Environment tag.
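
For illustration, a minimal sketch of such a handler using boto3 might look like the following (the event payload shape and tag key follow the description above; the actual function in the repository may differ in detail):

import boto3

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # The environment value is passed in via the CloudWatch event input
    environment = event.get("Environment", "Dev")

    # Find running instances carrying the matching Environment tag
    response = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Environment", "Values": [environment]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )

    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]

    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)

    return {"stopped": instance_ids}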

The Terraform script creates:

  • an IAM Role and associated policy for the Lambda Function
  • the Lambda function
  • a Cloudwatch event rule and trigger

The IAM role and policies required for the Lambda function are deployed as shown here:

The function source code is packaged into a ZIP archive and deployed using Terraform as follows:

Admittedly Terraform is an infrastructure automation tool and not a build/packaging tool (such as Jenkins, etc), but in this case the packaging only involves zipping up the function source code, so Terraform can be used as a ‘one stop shop’ to keep things simple.

The Cloudwatch schedule trigger is deployed as follows:

Use the following commands to run this example (I have created a default credentials profile, but you could supply your API credentials directly, use STS, etc):

cd simple-lambda-ec2-scheduler
terraform init
terraform apply
Terraform output

Full source code is available at: https://github.com/avensolutions/simple-lambda-ec2-scheduler

Stay tuned for more simple Terraform deployment recipes in coming posts…

Multi Stage ETL Framework using Spark SQL

Most traditional data warehouse or datamart ETL routines consist of multi-stage SQL transformations, often a series of CTAS (CREATE TABLE AS SELECT) statements usually creating transient or temporary tables – such as volatile tables in Teradata or Common Table Expressions (CTEs).

The initial challenge when moving from a SQL/MPP-based ETL framework platformed on Oracle, Teradata, SQL Server, etc. to a Spark-based ETL framework is what to do with this…

Multi Stage SQL Based ETL

One approach is to use the lightweight, configuration-driven, multi-stage Spark SQL based ETL framework described in this post.

This framework is driven by a YAML configuration document. YAML was preferred over JSON as a document format as it allows for multi-line statements (SQL statements) as well as comments – which are very useful, as SQL can sometimes be undecipherable even to the person who wrote it.

The YAML config document has three main sections: sources, transforms and targets.

The sources section is used to configure the input data source(s), including optional column and row filters. In this case the data sources are tables available in the Spark catalog (for instance the AWS Glue Catalog or a Hive Metastore); this could easily be extended to read from other data sources using the Spark DataFrameReader API.

The transforms section contains the multiple SQL statements to be run in sequence where each statement creates a temporary view using objects created by preceding statements.

Finally the targets section writes out the final object or objects to a specified destination (S3, HDFS, etc).

The process_sql_statements.py script used to execute the framework is very simple (around 30 lines of code, not including comments). It loads the sources into Spark DataFrames, creates temporary views to reference these datasets in the transforms section, then sequentially executes the SQL statements in the list of transforms. Lastly, the script writes out the final view or views to the desired destination – in this case parquet files stored in S3 were used as the target.
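
As an illustration of the approach, a stripped-down version of such a driver might look like the following (the YAML keys shown here are assumptions based on the description above – see the repository for the actual implementation):

import sys
import yaml
from pyspark.sql import SparkSession

def main(config_path):
    with open(config_path) as f:
        config = yaml.safe_load(f)

    spark = SparkSession.builder.appName("sql-etl").enableHiveSupport().getOrCreate()

    # sources: register each catalog table as a temporary view,
    # applying optional column and row filters
    for source in config["sources"]:
        df = spark.table(source["table"])
        if "columns" in source:
            df = df.select(*source["columns"])
        if "filter" in source:
            df = df.filter(source["filter"])
        df.createOrReplaceTempView(source["view"])

    # transforms: run each SQL statement in order, each creating a view
    # that subsequent statements can reference
    for transform in config["transforms"]:
        spark.sql(transform["sql"]).createOrReplaceTempView(transform["view"])

    # targets: write the final view(s) out, e.g. as parquet to S3
    for target in config["targets"]:
        spark.table(target["view"]).write.mode("overwrite").parquet(target["path"])

if __name__ == "__main__":
    main(sys.argv[1])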

You could implement an object naming convention, such as prefixing object names with sv_, iv_ and fv_ (for source view, intermediate view and final view respectively), if this helps you differentiate between the different objects.

To use this framework you would simply use spark-submit as follows:

spark-submit process_sql_statements.py config.yml

Full source code can be found at: https://github.com/avensolutions/spark-sql-etl-framework

The Cost of Future Change: What we should really be focused on (but no one is…)

In my thirty-year career in technology I can’t think of a more profound period of change. It is not as if change never existed previously – it did. Innovation and Moore’s law have been constant features in technology. However, for decades we experienced a long period of relative stability; that is, the rate of change was somewhat fixed. This was largely due to the fact that change, for the most part, was dictated almost entirely by a handful of large companies (such as Oracle, Microsoft, IBM, SAP, etc.). We, the technology community, were exclusively at the mercy of the product management teams at these companies dreaming up the next new feature, and we were subject to the tech giants’ product release cycles as to when we could use that feature. The rate of change was therefore artificially suppressed to a degree.

Then along came the open source software revolution. I can recall being fortunate enough to be in a meeting with Steve Ballmer along with a small group of Microsoft partners in Sydney, Australia around 2002. I can remember him as a larger than life figure with a booming voice and an intense presence. I can vividly remember him going on a long diatribe about Open Source Software, primarily focused around Linux. To put some historical context around it, Linux was a fringe technology at best at that point in time – nobody was taking it seriously in the enterprise. The Apache Software Foundation wasn’t a blip on the radar at the time, Software-as-a-Service was in its infancy, and Platform-as-a-Service and cloud were non-factors – so what was he worried about?

From the fear instilled in his rant, you would have thought this was going to be the end of humanity as we knew it. Well, it was the beginning of the end – the end of the halcyon days of the software industry as we knew it, the early beginnings of the end of the monopoly on change held by a select few software vendors, and the beginning of the democratisation of change.

Fast forward fifteen years, and now everyone is a (potential) software publisher. Real-world problems are solved in real companies, and in many cases software products are created, published and made (freely) available to other companies. We have seen countless examples of this pattern: Hadoop at Yahoo!, Kafka at LinkedIn, Airflow at Airbnb, to name a few. Then there are companies created with scant capital or investment, driven by a small group of smart people focused on solving a problem or streamlining a process that they have encountered in the real world. Many of these companies have grown to be globally recognised names, such as Atlassian or HashiCorp.

The rate of change is no longer controlled by a privileged few – in fact we have an explosion of change. No longer do we have a handful of technology options to achieve a desired outcome or solve a particular problem; we have a constellation of options.

Now the bad news: the force multiplier in change created by open source communities cuts both ways. When the community is behind a project it gets hyper-charged; conversely, when the community drops off, functionality and stability drop off just as quickly.

This means there are components, projects, modules, services we are using today that won’t be around in the same format in 2-3 years’ time. There are products that don’t exist today that will be integral (and potentially critical) to our operations in 1-2 years’ time.

This brings me to my final point – the cost of future change. As leaders and custodians of technology in this period, we should factor in not only current-day run costs or the cost of change we know about (the cost of current change), but also the hidden cost of future change. We need to assume that whatever we are doing now is not what we will be doing in a year’s time – we will need to accommodate future change.

We need to be thinking about the cost of future change and what we can do to minimise this. This means not just swapping out one system for another, but thinking about how you will change this system or component again in the near future. We need to take the opportunity that is in front of us to replace monoliths with modular systems, move from tightly coupled to loosely coupled components, from proprietary systems to open systems, from rigid architectures to extensible architectures.

When you’re planning a change today, consider the cost of future change…