Jan 11, 2021

Designing Cloud Solutions Doesn’t Have to be a Daunting Task

Part 3 — Creating a Reference Cloud Solution Architecture based on the Architecture Framework

Carol B. Hernandez, Ph.D., Cloud Solutions Architect, IBM

Doug Eppard, Cloud Integration Architect, IBM

Kudos, you’ve almost crossed the finish line! Now that you understand the Architecture Framework and how to use it to create Cloud Solutions, this blog applies those concepts to create a cloud reference solution. You’ll find it is not a daunting task!

This is part 3 of a 3-part blog series which discusses:

1. Introduction to the Architecture Framework

2. How to use the Architecture Framework to create fit-for-purpose Cloud Solutions

3. Creating a Reference Cloud Solution Architecture based on the Architecture Framework

In Part 1 and Part 2 of this blog series, we introduced the Architecture Framework and a structured approach to designing Cloud Architecture solutions with it. In this blog we will show you, step by step, how to create a solution that addresses a specific set of requirements using the approach described in Part 2. We will describe the requirements, solution components, high-level architecture diagram, and architecture decisions that support the Bill of Materials (BOM) for the cloud solution.

Common Themes for Workload Migration to Cloud

Looking at how enterprises move their workloads and business capabilities to the cloud, we see the following common trends:

1. Hybrid Cloud

Hybrid cloud and multi-cloud deployments are becoming the norm as enterprise customers look to consume cloud IaaS, PaaS and SaaS services and source those services from a variety of cloud providers.

2. Non-Functional Requirements (NFRs)

Cloud solutions have to meet the same resiliency, security, and regulatory compliance requirements as enterprise workloads deployed on-premises.

3. Interoperability

Enterprise business processes do not exist in isolation; there are often significant interoperability requirements across business processes and functions, regardless of where they are deployed.

Table 1 summarizes the high-level containerized workload requirements which will be used to create the Reference Solution in this blog.

Applying the 5-Step Approach to Cloud Architecture Design

In Part 2 of this blog series, we introduced five steps to simplify Cloud Architecture Design:

1. Create a Solution Architecture Heatmap that highlights the Aspects and Domains relevant to the solution requirements.

2. Identify Component Options for each domain in the Solution Architecture heatmap.

3. Use Decision Tools to select the best-fit components for each domain based on requirements and constraints.

4. Document Architecture Decisions for the components associated with each domain.

5. Create a Reference Architecture Diagram that illustrates how the main components of the solution come together.

The next sections illustrate each of the steps to create a solution that addresses the common enterprise requirements listed in Table 1.

Step 1: Create Solution Architecture Heatmap

The first step is to review requirements to identify the aspects and domains that will apply to the solution we are creating. For this example, we will target requirements for selected domains of the following aspects: Compute, Network, Storage, Security, Resiliency, and Service Management.

Figure 1 shows the Solution Architecture heatmap that identifies the domains that address the requirements from Table 1.

Figure 1: Solution Architecture Heatmap for Containerized Workload Hybrid-Cloud Solution
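If you want to keep the heatmap alongside the rest of the design, one lightweight option is to record it as data. The sketch below is only an illustration: the aspect and domain names come from the text of this blog, while the `HEATMAP` structure, the grouping of domains under aspects, and the `in_scope` helper are assumptions and not part of the Architecture Framework or a copy of Figure 1.

```python
# Aspects mapped to the domains discussed in this blog.
# Assumption: the grouping is reconstructed from the text, not copied from Figure 1.
HEATMAP = {
    "Compute": ["Containers", "Container Registry"],
    "Storage": ["Persistent Storage"],
    "Network": [
        "Enterprise Connectivity",
        "Application Load Balancing",
        "Global Load Balancing",
        "Network Segmentation",
    ],
    "Security": [
        "Network: Edge Protection IPS/IDS",
        "Data Encryption",
        "Security Operation",
        "Identity and Access Management",
    ],
    "Resiliency": ["High Availability"],
    "Service Management": ["Monitoring", "Logging"],
}


def in_scope(aspect: str, domain: str) -> bool:
    """Return True if the domain is highlighted for the given aspect."""
    return domain in HEATMAP.get(aspect, [])


print(in_scope("Network", "Network Segmentation"))  # True
print(in_scope("Compute", "Virtual Servers"))        # False
```

Keeping the heatmap in this form makes it easy to check later that every highlighted domain ends up with a documented decision.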

Step 2: Identify Component Options

The next step is to identify the available Components (technology or product choices) for each aspect and domain relevant to the solution on a specific target cloud deployment or cloud service provider. In this example, the target deployment is hybrid cloud and the cloud service provider is IBM Cloud.

Compute and Storage Aspects Components

Let’s start with the component options for the Containers domain. From requirements R1 & R3, we need a container platform that can be used both on-premises and on the target public cloud provider (IBM Cloud). Looking at the IBM Cloud Containers catalog, we find the following options:

1. Kubernetes Service

2. Red Hat OpenShift on IBM Cloud (Classic or VPC Infrastructure)

3. Satellite

Note: Satellite provides an option to consume Red Hat OpenShift on IBM Cloud clusters and other IBM Cloud Platform as a Service (PaaS) services on any cloud environment: enterprise on-premises data centers, public cloud providers, or at the edge. At the time of this writing, Satellite is in Beta and will not be considered as an option for this use case.

Container platforms use persistent storage for stateful applications. The storage component options for Kubernetes Service and OpenShift on IBM Cloud are as follows:

1. File Storage

2. Block Storage (Classic)

3. Block Storage (VPC)

4. Object Storage

5. Portworx

The Container Registry Component Options on IBM Cloud are:

1. OpenShift Container Registry (OCR)

2. IBM Cloud Container Registry (ICCR)

Networking Aspect Components

The Solution Architecture heatmap indicates we need solution components for the Enterprise Connectivity, Load Balancing, and Network Segmentation domains. The available components for each of these domains on the IBM Cloud are identified next.

Enterprise Connectivity Domain

The component options to connect enterprise private networks to IBM Cloud Virtual Private Cloud (VPC) networks in a region or to connect VPC networks in multiple IBM Cloud regions are:

1. Direct Link 2.0

2. VPN Gateway for VPC

3. Transit Gateway

Application Load Balancing Domain

The component options to distribute traffic among VPC Virtual Servers or applications deployed on containers are:

1. VPC Application Load Balancer

2. OpenShift Route

3. Ingress Controller

Global Load Balancing Domain

The component options to distribute traffic across IBM Cloud regions are:

1. Cloud Internet Services (CIS)

2. DNS Services

3. On-premises Global Load Balancer (GLB)

Network Segmentation Domain

The component options to segregate network traffic and control access to VPC Virtual Servers and containers are:

1. Virtual Private Cloud (VPC)

2. VPC Security Groups

3. VPC Access Control Lists (ACLs)

4. VPC Subnets

5. Kubernetes Network Policies

Table 2 summarizes the component options for the Networking domains in the solution scope.

Security Aspect Components

The available components on the IBM Cloud for each of the security domains within the scope of the solution are identified next.

Network: Edge Protection IPS/IDS Domain

The component options to provide Intrusion Prevention and Intrusion Detection are:

1. F5 BIG-IP VE

2. Palo Alto Next-Gen Virtual Firewall

3. Check Point CloudGuard

Data Encryption and Security Operation Domains

Data Encryption is provided as follows:

1. Storage encryption capabilities available for the selected storage components.

2. Container cluster data encryption options for the selected container platform. Both Kubernetes Service and Red Hat OpenShift on IBM Cloud clusters support encryption of cluster data with Key Management Services (KMS).

There are two Key Management Services available on IBM Cloud that can be used with the available storage options:

1. Key Protect

2. Hyper Protect Crypto Services

Identity and Access Management Domain

The component options to control access to Containers on IBM Cloud are:

1. IBM Cloud IAM Roles

2. Kubernetes RBAC Roles

3. Security Context Constraints

Table 3 summarizes the component options for the Security domains in the solution scope.

Resiliency Aspect Components

The component options on IBM Cloud to meet the High Availability requirements are:

1. Single-zone Kubernetes cluster in 2 regions

2. Multi-zone Kubernetes cluster in 2 regions

3. Multiple single-zone Kubernetes Clusters in one region

Service Management Aspect Components

The component options for Monitoring Containers on IBM Cloud are:

1. Prometheus Cluster Monitoring

2. Monitoring with Sysdig

The Logging component options for Containers on IBM Cloud are:

1. OpenShift Cluster Logging Operator

2. Log Analysis with LogDNA

Step 3: Use Decision Tools to Select Best-Fit Components

The third step is to use Decision Tools to evaluate the component options for the applicable aspects and domains against solution requirements and known constraints.

This section will illustrate available information and comparison tables that can help you select the best-fit component to meet your requirements.

Compute

The IBM Cloud docs provide a table that compares OpenShift and Community Kubernetes clusters, which can help you decide which Container Service to use. Given requirements R1 & R2, we will choose the Red Hat OpenShift on IBM Cloud Container Platform for the Solution.

OpenShift clusters on IBM Cloud can be deployed on Classic or VPC infrastructure. Based on the networking requirement R4, we are going to use OpenShift Clusters on VPC Infrastructure, which allows you to disable the cluster’s public service endpoint so the cluster can only be accessed through the private network.

OpenShift Clusters can be created as single-zone clusters or multi-zone clusters. Planning Your Cluster for High Availability describes these two options. A multi-zone cluster deployed in three zones has a 99.99% SLA, which meets Requirement R10.
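To make the 99.99% figure concrete, it helps to translate it into a downtime budget. The short calculation below is a sketch, not an SLA definition: the function name is illustrative, and it assumes a 365-day year and that the percentage applies to total elapsed time.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Maximum downtime per year, in minutes, permitted by an availability percentage."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return (100 - sla_percent) / 100 * MINUTES_PER_YEAR


print(round(allowed_downtime_minutes(99.99), 2))  # 52.56
```

In other words, 99.99% availability leaves a budget of a little under an hour of downtime per year.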

OpenShift clusters on IBM Cloud are set up by default with the OpenShift Internal registry and integrated with the private IBM Cloud Container Registry. The IBM Cloud Container Registry is a private registry service that can manage images for multiple clusters and performs automatic vulnerability scanning of the images.

Storage

Since the OpenShift Container Platform was chosen for the containers domain, we need to consider the persistent storage requirements for the OpenShift Container Internal Registry, Monitoring and Logging Services, and stateful applications deployed in the cluster.

OpenShift on VPC clusters automatically back up the internal registry to IBM Cloud Object Storage.

The type of persistent storage chosen for an application depends on the application characteristics, the structure of the data, and the data usage pattern, among other things. Applications deployed on OpenShift clusters could also use available Database Services on IBM Cloud to persist their data.

Portworx Storage requires the OpenShift cluster to have public network connectivity. This goes against requirement R4, which means it cannot be used for our solution. VPC Block Storage will be used along with Cloud Object Storage to meet the persistent storage requirements of the applications and OpenShift Internal Registry.
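The elimination reasoning above can be sketched as a simple filter over the storage options. This is an illustration only: the `StorageOption` fields and the `best_fit` helper are hypothetical, and the attribute values are assumptions made for the example, except for the Portworx public-network requirement, which is the constraint stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    name: str
    infrastructure: str        # "classic", "vpc", or "any"
    needs_public_network: bool


OPTIONS = [
    StorageOption("File Storage", "classic", False),            # assumption for illustration
    StorageOption("Block Storage (Classic)", "classic", False),
    StorageOption("Block Storage (VPC)", "vpc", False),
    StorageOption("Object Storage", "any", False),               # assumption for illustration
    StorageOption("Portworx", "any", True),                      # constraint stated in the text
]


def best_fit(options, infrastructure, private_only):
    """Keep options that match the infrastructure and respect the private-only constraint."""
    return [
        o.name
        for o in options
        if o.infrastructure in (infrastructure, "any")
        and not (private_only and o.needs_public_network)
    ]


print(best_fit(OPTIONS, "vpc", private_only=True))
# ['Block Storage (VPC)', 'Object Storage']
```

The same pattern works for any domain: describe each option with the attributes your requirements care about, then filter.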

The Monitoring and Logging services components requirements will be discussed in the Service Management section.

Networking

Applications deployed on the OpenShift cluster may need to access applications or data on-premises. Table 4 contrasts the available options.

Table 4. Enterprise Connectivity Options

Based on this information, and assuming there are no low-latency requirements, we will choose the VPN Gateway for VPC for on-premises connectivity.

To meet requirements R4 and R5, the OpenShift cluster will be created within a Virtual Private Cloud (VPC) network. Based on requirement R11, the application will be deployed in more than one region. A Transit Gateway will be needed to connect VPCs in different IBM Cloud Regions.

VPC Security Groups, Subnets, Access Control Lists, and Kubernetes Network Policies can be used to meet requirement R6. A comparison of these options can be found at this link.
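Of the segmentation options listed, Kubernetes Network Policies are the one defined inside the cluster itself. As a minimal sketch, the snippet below builds a default-deny ingress policy as a Python dictionary and prints it as JSON, which can be applied the same way as a YAML manifest. The namespace name `payments` and the helper name are hypothetical, and a default-deny policy is a common starting point rather than a statement of what requirement R6 demands.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Build a NetworkPolicy manifest that blocks all ingress to pods in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            # An empty selector matches every pod in the namespace.
            "podSelector": {},
            # Ingress is listed as a policy type but no ingress rules are given,
            # so no inbound traffic is allowed until further policies permit it.
            "policyTypes": ["Ingress"],
        },
    }


# "payments" is a hypothetical namespace used only for this example.
print(json.dumps(default_deny_ingress("payments"), indent=2))
```

Additional policies can then allow only the traffic each application needs, on top of the VPC-level controls.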

The article Choosing Among Load Balancing Solutions discusses options for exposing applications deployed on OpenShift clusters. To meet requirement R5, the OpenShift cluster is created with a private service endpoint only, and applications are exposed to the private network using a Private OpenShift Router and Private VPC Load Balancer.

A private Global Load Balancer is needed to distribute application traffic across two regions. IBM Cloud Internet Services (CIS) provides global load balancing on the public network only, so it cannot be used in the solution. For our scenario, we will use a private network Global Load Balancer already deployed on-premises.

Application users will connect to the applications exposed on the private network through the customer enterprise network.

Security

The component choice for the Network Security domain requirements, R5 in particular, is driven by the consistency of tools across IT environments and by available skills. To a large degree, it depends on the customer’s established tool standards. For this scenario we will assume F5 BIG-IP VE is the preferred choice.

For Data Encryption, the components chosen for persistent storage (VPC Block Storage and Cloud Object Storage) support encryption at rest and in transit with customer-provided keys and can be integrated with IBM Cloud Key Management Services (KMS). The selected container platform, OpenShift on VPC, supports encryption of cluster data with IBM Cloud KMS.

The choice of KMS depends on the workload’s single-tenancy and security requirements. Table 5 describes the two options available on IBM Cloud.

For this solution we will choose Key Protect, a multi-tenant service that supports Bring-Your-Own-Key and uses FIPS 140-2 Level 3 certified HSMs.

For Identity and Access Management, all the available components are used to assign different access levels to OpenShift cluster resources. IBM Cloud Identity and Access Management (IAM) platform and service access roles can be used to set access policies to authorize users to work with cluster infrastructure resources or just deploy containers. Kubernetes RBAC roles can be used along with IAM service roles to isolate resources in the cluster. Security context constraints (SCCs) are used to control access to cluster resources at the pod level. For more information see Assigning Cluster Access.
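To illustrate the Kubernetes RBAC layer of this model, here is a minimal sketch of a namespaced Role that grants read-only access to pods, built as a Python dictionary and printed as JSON. The role name `pod-reader`, the namespace `payments`, and the helper function are hypothetical. The sketch does not show how IBM Cloud IAM roles relate to RBAC roles; see Assigning Cluster Access for that.

```python
import json


def read_only_pod_role(namespace: str) -> dict:
    """Build an RBAC Role manifest that allows reading pods in one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "pod-reader", "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],            # "" is the core API group, where pods live
                "resources": ["pods"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


# "payments" is a hypothetical namespace used only for this example.
print(json.dumps(read_only_pod_role("payments"), indent=2))
```

A Role like this takes effect only once it is bound to a user or group with a RoleBinding.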

Service Management

The Monitoring and Logging component options for Red Hat OpenShift on IBM Cloud are discussed in the Options for Monitoring and Options for Logging articles.

The IBM Cloud Monitoring with Sysdig and Log Analysis with LogDNA services are the preferred choices for this solution since these are highly available and scalable services that can be integrated with other cloud resources and services that are part of the solution.

Resiliency

Table 6 contrasts the different options to achieve High Availability. Multi-zone clusters across two regions are the best match for requirements R10 and R11.
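As a rough back-of-the-envelope check on why a second region helps, you can estimate the availability of several deployments running in parallel with the formula 1 - (1 - a)^n. This sketch rests on two simplifying assumptions that are not SLA terms: failures in each region are independent, and traffic fails over instantly when one region is down. The function name is illustrative.

```python
def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent deployments where any one can serve traffic."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a fraction between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The service is down only when every copy is down at the same time.
    return 1.0 - (1.0 - single) ** copies


one_region = parallel_availability(0.9999, 1)
two_regions = parallel_availability(0.9999, 2)
print(f"one region:  {one_region:.6%}")
print(f"two regions: {two_regions:.8%}")
```

Real-world numbers will be lower because of shared dependencies such as the global load balancer and the network path from on-premises, so treat the result as an upper bound.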

Step 4: Document Architecture Decisions

The components chosen for each of the domains from the analysis in the preceding step along with the rationale are captured in the architecture decisions. Tables 7–10 contain all the Architecture Decisions for deploying a Containerized Workload Hybrid-Cloud Solution on IBM Cloud. The tables cover the component decisions made to meet the requirements mapped to relevant domains in the Compute, Storage, Networking, Security, Resiliency, and Service Management aspects.
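If you prefer to keep architecture decisions in a machine-readable form next to the tables, a small record type is enough. The sketch below is one possible shape, not a prescribed format: the `ArchitectureDecision` class and its field names are assumptions, and the example entry restates the storage decision made in Step 3.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArchitectureDecision:
    domain: str
    decision: str
    alternatives: List[str]
    requirements: List[str]
    rationale: str

    def summary(self) -> str:
        """One-line summary suitable for a decision log."""
        reqs = ", ".join(self.requirements) or "n/a"
        return f"[{self.domain}] {self.decision} (requirements: {reqs})"


storage = ArchitectureDecision(
    domain="Persistent Storage",
    decision="VPC Block Storage with Cloud Object Storage",
    alternatives=["File Storage", "Block Storage (Classic)", "Portworx"],
    requirements=["R4"],
    rationale="Portworx requires public network connectivity, which conflicts with R4.",
)

print(storage.summary())
# [Persistent Storage] VPC Block Storage with Cloud Object Storage (requirements: R4)
```

Recording the alternatives and the rationale together is what lets a later reader see why an option was rejected, not just what was chosen.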

Compute and Storage

Networking

Security

Resiliency and Service Management

Step 5: Create Reference Architecture Diagram

The last step is to create a Reference Architecture Diagram that illustrates the connections and chosen components for the solution.

The solution uses Red Hat OpenShift on IBM Cloud as the container platform to provide portability, a seamless user experience, and a consistent set of management tools across on-premises and multiple public cloud service providers. Applications are deployed in OpenShift on VPC multi-zone clusters in two IBM Cloud regions to protect against data center and zone failures as well as region outages.

OpenShift clusters are set up by default with the internal OpenShift Container Registry, which provides local image caching for faster builds. The internal OpenShift Container Registry is integrated with the private IBM Cloud Container Registry to enable image-sharing across multiple clusters and provide automatic vulnerability scanning of the images.

OpenShift cluster data and application data stored in OpenShift persistent storage are encrypted with customer-provided keys using the Key Protect service.

The solution meets network isolation and security requirements that are typical for regulated workloads. It provides connectivity to a private enterprise network through a secure VPN connection. OpenShift clusters are configured with a private service endpoint so they are accessed only through the private network. Applications deployed on the OpenShift clusters are exposed through a Private OpenShift Router and access all Cloud Services using private network service endpoints. Application users connect to the applications through the customer enterprise network using a VPN connection.

Figure 2 shows the Reference Architecture Diagram for the example Containerized Workload Hybrid-Cloud Solution. Draw.io was used to create this diagram; however, you can use your tool of choice. Draw.io comes with a variety of shapes and icons for IBM, AWS, Google Cloud, and Azure, which can be added to your shapes palette.

Figure 2: Containerized Workload Hybrid-Cloud Solution Architecture

Table 11 contains the list of solution components.

Table 11. Containerized Workload Hybrid-Cloud Solution Components
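A quick way to sanity-check a component list is to confirm that every domain in scope has at least one selected component. The sketch below assembles the selections from the decisions described in Step 3 of this blog. It is not a reproduction of Table 11, and the `SELECTED` structure and `missing_domains` helper are hypothetical names.

```python
# Selected components per domain, assembled from the decisions described in the text.
SELECTED = {
    "Containers": ["Red Hat OpenShift on IBM Cloud (VPC Infrastructure)"],
    "Container Registry": ["OpenShift Container Registry", "IBM Cloud Container Registry"],
    "Persistent Storage": ["Block Storage (VPC)", "Object Storage"],
    "Enterprise Connectivity": ["VPN Gateway for VPC", "Transit Gateway"],
    "Application Load Balancing": ["Private OpenShift Router", "Private VPC Load Balancer"],
    "Global Load Balancing": ["On-premises Global Load Balancer (GLB)"],
    "Network: Edge Protection IPS/IDS": ["F5 BIG-IP VE"],
    "Data Encryption": ["Key Protect"],
    "High Availability": ["Multi-zone clusters in two regions"],
    "Monitoring": ["Monitoring with Sysdig"],
    "Logging": ["Log Analysis with LogDNA"],
}


def missing_domains(required, selected):
    """Return the required domains that have no selected component."""
    return [domain for domain in required if not selected.get(domain)]


required = list(SELECTED) + ["Network Segmentation"]
print(missing_domains(required, SELECTED))
# ['Network Segmentation']
```

Here the check flags Network Segmentation only because the example deliberately leaves it out of `SELECTED`; in the solution itself that domain is covered by VPC Security Groups, Subnets, Access Control Lists, and Kubernetes Network Policies.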

This completes the example reference solution, created using the five steps covered in Part 2, and concludes this blog series.

Congratulations! You made it to the finish line!

You should be able to apply the approach outlined in this blog series to create cloud solutions for any target deployment model using technologies from any cloud provider.