Using the Cloud to build multi-region architecture

There are many documents and recommendations on how to build an architecture in the cloud that assume the customer is located in a specific region. But what if we are an organization that provides information or services to customers all over the globe?

Examples of such services: e-Commerce sites, streaming video services, news sites, gaming platforms, IoT services, etc.

In this article, I will try to map some of the considerations for choosing global services, which will enable us to build a common architecture for customers all over the globe.

Background

When designing a multi-region architecture, we need to take several aspects into consideration: deployment to multiple regions; the ability to handle failure (or connectivity issues) in a specific region; the ability to replicate data between remote geographic areas; the ability to write/update data within a specific time interval across multiple geographic regions; and the ability to deploy a new application build or control the scale of an application in a simple manner across multiple remote geographic areas (such as gradual application upgrades).

In certain scenarios (such as streaming media), we may wish to synchronize the same content to different areas of the world. In other scenarios (such as e-Commerce sites, news sites, etc.), we may wish to build a similar architecture in different regions, while storing the data itself (such as product catalog, customer preferences, language, etc.) in the same geographic region as the customer.

There are several common reasons for building a multi-region architecture:

Network latency between the customer and the cloud service:

Storing data close to the customer improves the customer experience. An example of such a scenario is streaming media, where we wish to sync the same content to multiple geographic regions.

Disaster recovery:

Active-Active Site - An expensive solution, but it enables quick recovery from a disaster, assuming we sync data in real time (or near real time) between multiple geographic regions
Active-Passive Site - Enables us to recover from a disaster, but requires a data synchronization mechanism, a manual update of DNS records between sites, and a manual promotion of the database from the replica role to the master role

Regulation or customer related requirements:

The need to store customers’ data in a specific geographic region, according to regulatory requirements (such as GDPR). An example of such a scenario is an e-Commerce site, where we build a similar architecture in different geographic regions, but store customer-related data (such as product catalog, customer preferences, etc.) in the same geographic region as the customer.
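
The data residency requirement above can be sketched as a simple lookup from the customer's country to the region that stores their data. This is a minimal illustration only: the country-to-region mapping, the region names and the fallback region are all hypothetical, and a real system would derive them from legal review rather than from a hard-coded table.

```python
from dataclasses import dataclass

# Hypothetical mapping from customer country (ISO 3166-1 alpha-2 code) to the
# region that should hold that customer's data. Region names are illustrative.
RESIDENCY_RULES = {
    "DE": "eu-central",
    "FR": "eu-west",
    "US": "us-east",
    "JP": "ap-northeast",
}

# Assumption: customers from unlisted countries fall back to one default region.
DEFAULT_REGION = "us-east"


@dataclass(frozen=True)
class Customer:
    customer_id: str
    country: str


def home_region(customer: Customer) -> str:
    """Return the region where this customer's data should be stored."""
    return RESIDENCY_RULES.get(customer.country.upper(), DEFAULT_REGION)


if __name__ == "__main__":
    print(home_region(Customer("c-1", "de")))
    print(home_region(Customer("c-2", "BR")))
```

Keeping the rule in one function means the same similar architecture can be deployed in every region, while only the storage location of customer-related data changes per customer.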

Example of multi-region architecture

Network related aspects

When designing multi-region architecture, the first aspect we wish to review, from the customer’s point of view (browser, mobile device, IoT device, etc.), is the network aspect.

DNS services - Using these services, the customer accesses our infrastructure (or application) from the Internet.

Below are common DNS services for multi-region architecture:

Amazon Route 53 - Globally distributed service, which enables us to configure resource name resolution rules based on geo-location
Azure Traffic Manager - Globally distributed service, which enables us to redirect traffic based on DNS requests
Google Cloud DNS - Globally distributed service, which enables us to configure resource name resolution rules
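
To make the idea of DNS-level routing more concrete, the decision these services take on our behalf can be thought of as "answer with the closest region that is currently healthy". The sketch below is not the API of any of the services listed above; the function name, the latency figures and the health flags are assumptions made for illustration.

```python
from typing import Dict, Optional


def pick_region(latency_ms: Dict[str, float], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None if none is healthy."""
    # Regions without a health entry are treated as unhealthy.
    candidates = [region for region in latency_ms if healthy.get(region, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda region: latency_ms[region])


if __name__ == "__main__":
    # Hypothetical latencies measured from one customer's location.
    latencies = {"eu-west": 30.0, "us-east": 95.0, "ap-south": 180.0}
    all_up = {"eu-west": True, "us-east": True, "ap-south": True}
    print(pick_region(latencies, all_up))

    # If the closest region fails its health check, traffic moves to the next one.
    eu_down = {"eu-west": False, "us-east": True, "ap-south": True}
    print(pick_region(latencies, eu_down))
```

The same rule covers both the latency use case and the failover use case: a failed region simply drops out of the candidate list.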


CDN (Content Delivery Network) services - Globally distributed network infrastructure, which gives our customers fast access to resources (using caching).

Below are common CDN services:

Amazon CloudFront — Globally distributed CDN infrastructure, based on Edge Locations
Azure CDN — Globally distributed CDN infrastructure (in certain countries, based on Akamai CDN infrastructure)
Google Cloud CDN — Globally distributed CDN infrastructure, based on Google global network infrastructure
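
The caching behavior that makes a CDN fast can be sketched as a small time-to-live (TTL) cache sitting in front of the origin. This is a simplified model, not the behavior of any specific CDN listed above; the class name, the `fetch_from_origin` callback and the TTL value are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """A toy edge location: serves cached copies until their TTL expires."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}  # path -> (expiry, body)
        self.origin_hits = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: no round trip to the origin
        # Cache miss or expired entry: go back to the (possibly remote) origin.
        body = self._fetch(path)
        self.origin_hits += 1
        self._store[path] = (now + self._ttl, body)
        return body


if __name__ == "__main__":
    cache = EdgeCache(lambda path: f"content of {path}".encode(), ttl_seconds=60)
    cache.get("/index.html")
    cache.get("/index.html")
    print(cache.origin_hits)  # only the first request reached the origin
```

Only the first request for a resource (and the first one after expiry) pays the cost of reaching the origin region; every other request is answered close to the customer.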


Defense against distributed denial of service (DDoS) and application (Layer 7) attacks - Since we are designing an infrastructure that is exposed to the Internet, we wish to protect it with a service that can integrate with our other services (such as DNS, Load Balancing, etc.)

Below are common protection services:

AWS Shield - Globally distributed DDoS protection service, which integrates with AWS WAF (Web Application Firewall) for application-layer protection
Azure Front Door - Globally distributed DDoS protection service, with WAF (Web Application Firewall) capabilities
Google Cloud Armor - Globally distributed DDoS protection service, with WAF (Web Application Firewall) capabilities


Load-Balancing services — Services that enable us to distribute the network load between different data centers or different geographic regions.

Below are common load-balancing services:

Amazon Application Load Balancer - Layer 7 (application) load balancing service. Although this is a regional service, we can build a global infrastructure by using Amazon Route 53 to route DNS queries to our regional ALBs
AWS Global Accelerator - Global service, which enables us to accelerate network traffic to our application through static anycast IP addresses. It works at the TCP/UDP level, so HTTP/HTTPS traffic is supported as well
Azure Front Door - Global load-balancing service, supporting HTTP/HTTPS traffic
Google Cloud Load Balancing - Global load-balancing service, which uses a single global public IP address and supports HTTP/HTTPS, TCP/SSL and UDP traffic
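
As a rough mental model of how load is distributed between regions, the sketch below picks a region for each request in proportion to a configured weight, and skips regions that are not healthy. It is not how any of the services above are implemented; the function name, region names and weights are hypothetical.

```python
import random
from typing import Dict, Set


def choose_region(weights: Dict[str, int], healthy: Set[str], rng: random.Random) -> str:
    """Pick one healthy region at random, in proportion to its weight."""
    live = {region: w for region, w in weights.items() if region in healthy and w > 0}
    if not live:
        raise RuntimeError("no healthy region available")
    regions = list(live)
    return rng.choices(regions, weights=[live[r] for r in regions], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(7)  # seeded only to make the demo repeatable
    weights = {"us-east": 3, "eu-west": 1}  # hypothetical 3:1 split
    picks = [choose_region(weights, {"us-east", "eu-west"}, rng) for _ in range(10_000)]
    print(picks.count("us-east") / len(picks))  # close to three quarters
```

Changing the weights gradually is also one way to shift traffic toward a newly deployed region without an abrupt cut-over.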


File storage related aspects

As in any other system, we will most likely want to share static content (files) across multiple regions, whether it is the origin from which we serve content using CDN services, configuration files, backups, etc.

Object storage services - This type of service enables us to store files for read and update.

Below are common object storage services:

Amazon S3 - Managed object storage service. Although this is a regional service, we can replicate files between S3 buckets located in remote regions, using the Cross-Region Replication feature
Azure Blob Storage - Managed object storage service. Although this is a regional service, we can replicate files between blob storage accounts located in remote regions, using the Geo-Redundant Storage or Geo-Zone-Redundant Storage features
Google Cloud Storage - Globally managed object storage service, allowing us to store and replicate files automatically between remote regions

Database storage related aspects

Almost every system that exists today contains various types of databases for storing and querying data.

Relational Databases - Databases for working with structured data and a clearly defined schema.

Common relational database services:

Amazon Aurora - Managed database service, based on the MySQL or PostgreSQL engine. Using a feature called Amazon Aurora Global Database, we can build a global database that spans remote regions
Azure SQL Database - Managed database service, based on the MS-SQL engine. Although this is a regional service, we can build an asynchronous data replication process between remote regions, using the Active Geo-Replication and Automatic Asynchronous Replication features
Google Cloud Spanner - Globally managed database service, which enables us to replicate data (read/write mode) between remote regions

NoSQL / Non-Relational Databases - Databases for storing large amounts of unstructured data.

Common NoSQL database services:

Amazon DynamoDB - Managed NoSQL database service. Using a feature called Global Tables, we can build a data replication process (read/write mode) between remote regions
Azure Cosmos DB - Globally managed NoSQL database service, which enables us to replicate data (read/write mode) between remote regions
Google Cloud Bigtable - Globally managed NoSQL database service, which enables us to replicate data (read/write mode) between remote regions
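
When several regions accept writes (read/write mode), the same item can be updated in two places before replication catches up. One common way to reconcile such conflicts is "last writer wins", sketched below. This is a simplified illustration under assumed names and timestamps; it is not the internal implementation of any of the services above, and each service documents its own conflict resolution behavior.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # write time reported by the region that accepted the write
    region: str       # used only to break exact timestamp ties deterministically


def merge(local: Dict[str, Version], remote: Dict[str, Version]) -> Dict[str, Version]:
    """Merge a remote replica into a local one, keeping the newest write per key."""
    merged = dict(local)
    for key, incoming in remote.items():
        current = merged.get(key)
        if current is None or (incoming.timestamp, incoming.region) > (
            current.timestamp,
            current.region,
        ):
            merged[key] = incoming
    return merged


if __name__ == "__main__":
    eu = {"cart:42": Version("2 items", 100.0, "eu-west")}
    us = {
        "cart:42": Version("3 items", 105.0, "us-east"),
        "cart:7": Version("1 item", 90.0, "us-east"),
    }
    # Both replicas converge to the same state, whichever direction we merge.
    print(merge(eu, us) == merge(us, eu))
```

The trade-off is that the older of two concurrent writes is silently discarded, which is why data integrity and consistency need to be considered as part of the design.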

Cost aspects

When designing multi-region architecture, we need to consider cost aspects, such as:

Service cost - In many of the scenarios mentioned in this article, a global solution requires a more expensive premium tier or license
Egress (outbound) traffic cost - Cross-region replication and inter-region traffic have their own cost model with each cloud provider
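
To get a feeling for the replication part of the bill, a back-of-the-envelope estimate can help. The sketch below multiplies the daily volume of replicated data by the number of replica regions and by a per-GB transfer price. All numbers are made up for illustration; actual prices vary by provider, region pair and volume tier, and should be taken from the provider's current price list.

```python
def monthly_replication_cost(
    gb_written_per_day: float,
    replica_regions: int,
    price_per_gb: float,
    days: int = 30,
) -> float:
    """Estimate the monthly inter-region transfer cost of replicating new data."""
    # Every GB written is copied once to each replica region.
    return gb_written_per_day * replica_regions * price_per_gb * days


if __name__ == "__main__":
    # Hypothetical inputs: 50 GB/day, 2 replica regions, 0.02 USD per GB.
    cost = monthly_replication_cost(50, 2, 0.02)
    print(f"Estimated monthly replication cost: {cost:.2f} USD")
```

The estimate grows linearly with each additional replica region, which is worth keeping in mind before replicating everything everywhere.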

Operational aspects

When designing multi-region architecture, we need to consider operational aspects, such as:

DevOps, application deployment and upgrades - Ability to perform gradual application deployment or upgrades over multiple remote regions
Source / Configuration registry - The need to build a central configuration / container registry for storing configuration, container images and any other type of data that must be synced between multiple remote regions around the globe
Monitoring - The requirement for constant monitoring of multiple services (from availability, through resource usage, to scale) over remote regions
Data integrity / consistency - The ability to make sure data is synced and stored consistently between multiple remote regions
Availability - The ability to monitor service availability over multiple remote regions
Disaster recovery - The ability to conduct disaster recovery drills between remote regions, including failover and redirection of customer traffic between regions
Security and Governance - The ability to enforce access policies and security configurations over multiple regions
Regulation compliance - The ability to maintain global infrastructure, while complying with local regulation and privacy laws in certain parts of the world (such as the GDPR)
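
The gradual deployment mentioned in the list above can be sketched as a rollout in waves: deploy to a small group of regions, check their health, and only then continue. The function below is a minimal illustration; the `deploy` and `is_healthy` callbacks, the wave layout and the region names are assumptions, standing in for whatever deployment and monitoring tooling is actually in use.

```python
from typing import Callable, List, Sequence


def rollout(
    waves: Sequence[Sequence[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy wave by wave; stop before the next wave if any region is unhealthy.

    Returns the regions that received the new build, in deployment order.
    """
    deployed: List[str] = []
    for wave in waves:
        for region in wave:
            deploy(region)
            deployed.append(region)
        # Gate: do not continue to the next wave unless this one looks healthy.
        if not all(is_healthy(region) for region in wave):
            break
    return deployed


if __name__ == "__main__":
    # Hypothetical plan: one canary region first, then the rest in two waves.
    plan = [["eu-west"], ["us-east", "us-west"], ["ap-south"]]
    done = rollout(plan, deploy=lambda r: print("deploying to", r), is_healthy=lambda r: True)
    print(done)
```

Starting with a single region limits the impact of a bad build to a small part of the customer base, at the price of a slower global rollout.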

Summary

In this article, I have reviewed many of the aspects and implications of designing and building a multi-region architecture, which enables organizations to scale and to provide better service to customers based on where they are located.

It is important to remember that there is no one-size-fits-all architecture that suits all organizations and all types of systems. For each scenario, we need to make the proper adjustments and choose the most appropriate service (whether it is a managed service or not). The list of services mentioned in this article will allow you to review your alternatives.

We need to take into consideration that a multi-region (or global) architecture is just a means and not the goal itself. Building and designing global infrastructure is expensive and requires considerations beyond the technical side: new monitoring services, different monitoring capabilities, effects on the development process, etc.
