A Sample Architecture

As you can imagine, the architecture of a full stack pool environment is pretty straightforward. In fact, on the surface, it doesn’t look much different from any classic application architecture. Figure 3-9 provides an example of a fully pooled architecture deployed on AWS.

Figure 3-9. A full stack pooled architecture

Here you’ll see I’ve included many of the same constructs that were part of our full stack silo environment. There’s a VPC for the network of our environment, and it includes two Availability Zones for high availability. Within the VPC there are separate private and public subnets that separate the external and internal view of our resources. And, finally, within the private subnet you’ll see the placeholders for the various microservices that deliver the server-side functionality of our application. These services have storage that is deployed in a pooled model, and their compute is scaled horizontally using an auto-scaling group. At the top, of course, we also illustrate that this environment is being consumed by multiple tenants.

Now, looking at this level of detail, you’d be hard-pressed to find anything distinctly multi-tenant about this architecture. In reality, this could be the architecture of almost any flavor of application. Multi-tenancy doesn’t really show up in a full stack pooled model as some concrete construct. The multi-tenancy of a pooled model is only seen if you look inside the run-time activity happening within this environment. Every request that is sent through this architecture is accompanied by tenant context. The infrastructure and the services must acquire and apply this context as part of every request that moves through this environment.

Imagine, for example, a scenario where Tenant 1 makes a request to fetch an item from storage. To process that request, your multi-tenant services will need to extract the tenant context and use it to determine which items within the pooled storage are associated with Tenant 1. As I move through the upcoming chapters, you’ll see how this context ends up having a profound influence on the implementation and deployment of these services. For now, though, the key here is to understand that a full stack pooled model relies more on its run-time ability to share resources and apply tenant context where needed.
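To make this concrete, here is a minimal sketch of tenant-scoped data access in a pooled store. It assumes an in-memory item list and a tenant identifier carried in an already-validated token’s claims; all of the names here are illustrative, not part of the architecture in Figure 3-9:

```python
# Sketch: scoping a pooled-storage query by tenant context.
# The claim key, function names, and ITEMS table are illustrative
# assumptions, not actual services from the architecture above.

ITEMS = [
    {"tenant_id": "tenant-1", "item_id": "i-100", "name": "Widget"},
    {"tenant_id": "tenant-2", "item_id": "i-200", "name": "Gadget"},
]

def extract_tenant_id(request_claims: dict) -> str:
    """Pull the tenant identifier out of the (already validated) token claims."""
    return request_claims["tenant_id"]

def fetch_items(request_claims: dict) -> list:
    """Return only the items that belong to the requesting tenant."""
    tenant_id = extract_tenant_id(request_claims)
    return [item for item in ITEMS if item["tenant_id"] == tenant_id]
```

The essential point is that the tenant identifier never comes from the caller’s query parameters; it is extracted from the request’s authenticated context and applied to every storage operation.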

This architecture represents just one flavor of a full stack pooled model. Each technology stack (containers, serverless, relational storage, NoSQL storage, queues) can influence the footprint of the full stack pooled environment. The spirit of full stack pool remains the same across most of these experiences. Whether you’re in a Kubernetes cluster or a VPC, the basic idea here is that the resources in that environment will be pooled and will need to scale based on the collective load of all tenants.

A Hybrid Full Stack Deployment Model

So far, I’ve mostly presented full stack silo and full stack pool deployment models as two separate approaches to the full stack problem. It’s fair to think of these two models as addressing a somewhat opposing set of needs and almost view them as being mutually exclusive. However, if you step back and overlay market and business realities on this problem, you’ll see how some organizations may see value in supporting both of these models.

Figure 3-10 provides a view of a sample hybrid full stack deployment model. Here we have the same concepts we covered with full stack silo and pool deployment models sitting side-by-side.

Figure 3-10. A hybrid deployment model

So, why both models? What would motivate adopting this approach? Well, imagine you’ve built your SaaS business and you started out offering all customers a full stack pooled model (shown on the left here). Then, somewhere along the way, you ran into a customer that was uncomfortable running in a pooled model. They may have noisy neighbor concerns. They may be worried about compliance issues. Now, you’re not necessarily going to cave to every customer that pushes back this way; that would undermine much of what you’re trying to achieve as a SaaS business. Instead, you’re going to make efforts to help customers understand the security and isolation strategies you’ve adopted to address their needs. This is always part of the job of selling a SaaS solution. At the same time, there may be rare conditions when you might be open to offering a customer their own full stack siloed environment. This could be driven by a strategic opportunity, or it may be that some customer is willing to write a large check that could justify offering a full stack silo.

In Figure 3-10, you can see how the hybrid full stack deployment model lets you create a blended approach to this problem. On the left-hand side of this diagram is an instance of a full stack pooled environment. This environment supports the bulk of your customers, and we label these tenants, in this example, as belonging to your basic tier experience.

Now, for the tenants that demanded a more siloed experience, I have created a new premium tier that allows tenants to have a full stack silo environment. Here we have two full stack siloed tenants that are running their own stacks. The assumption here (for this example) is that these tenants are connected to a premium tier strategy that has a separate pricing model.

For this model to be viable, you must apply constraints to the number of tenants that are allowed to operate in a full stack silo model. If the ratio of siloed tenants becomes too high, this can undermine your entire SaaS experience.
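As a rough sketch of how such a constraint might be enforced during onboarding, consider the check below. The 10% threshold and the function name are assumptions chosen purely for illustration, not a recommended ratio:

```python
# Sketch: capping the ratio of full stack siloed tenants.
# MAX_SILO_RATIO is an illustrative assumption, not guidance.

MAX_SILO_RATIO = 0.10  # e.g., at most 10% of tenants may be siloed

def can_onboard_siloed_tenant(siloed_count: int, total_count: int) -> bool:
    """Check whether adding one more siloed tenant stays within the cap."""
    return (siloed_count + 1) / (total_count + 1) <= MAX_SILO_RATIO
```

A gate like this, wired into your onboarding flow, keeps one-off silo deals from quietly eroding the economics of the pooled environment.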

The Mixed-Mode Deployment Model

To this point, I’ve focused heavily on the full stack models. While it’s tempting to view multi-tenant deployments through these more coarse-grained models, the reality is that many systems rely on a much more fine-grained approach to multi-tenancy, making silo and pool choices across the entire surface of their SaaS environment. This is where what I refer to as a mixed mode deployment model comes into play.

With mixed mode deployments, you’re not dealing with the heavy absolutes that come with full stack models. Instead, mixed mode allows us to look at the workloads within our SaaS environment and determine how each of the different services and resources within your solution should be deployed to meet the specific requirements of a given use case.

Let’s take a simple example. Imagine I have two services in my e-commerce solution. I have an order service with challenging throughput requirements that are prone to noisy neighbor problems. This same service also stores data that is going to grow significantly and has strict compliance requirements that are hard to support in a pooled model. I also have a ratings service that is used to manage product ratings. It doesn’t face any significant throughput challenges and can easily scale to handle the needs of tenants, even when a single tenant might be putting a disproportionate load on the service. Its storage is also relatively small and contains data that isn’t part of the system’s compliance profile.

In this scenario, I can step back and consider these specific parameters to arrive at a deployment strategy that best serves the needs of these services. Here, I might choose to make both the compute and the storage of my order service siloed and the compute and storage of my rating service pooled. There might even be cases where the individual layers of a service could have separate silo/pool strategies. This is the basic point I was making when I was first introducing the notion of silo and pool at the outset of this chapter.
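One way to capture these per-service decisions is a simple deployment descriptor. The sketch below mirrors the order/ratings example above; the descriptor structure itself is an illustrative assumption, not a prescribed format:

```python
# Sketch: a per-service silo/pool descriptor for a mixed mode model.
# The service names mirror the e-commerce example in the text; the
# ServiceDeployment structure is an illustrative assumption.

from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceDeployment:
    service: str
    compute: str  # "silo" or "pool"
    storage: str  # "silo" or "pool"

DEPLOYMENT_PLAN = [
    ServiceDeployment("order", compute="silo", storage="silo"),
    ServiceDeployment("ratings", compute="pool", storage="pool"),
]
```

Making these choices explicit in configuration, rather than burying them in infrastructure scripts, also leaves room for a service’s compute and storage layers to carry different silo/pool strategies.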

Equipped with this more granular approach to silo and pool strategies, you can now imagine how this might yield much more diverse deployment models. Consider a scenario where you might use this strategy in combination with a tiering model to define your multi-tenant deployment footprint.

The image in Figure 3-11 provides a conceptual view of how you might employ a mixed mode deployment model in your SaaS environment. I’ve shown a variety of different deployment experiences spanning the basic and premium tier tenants.

Figure 3-11. A mixed mode deployment model

On the left-hand side of this image, we have our basic tier services. These services cover all the functionality that is needed for our SaaS environment. However, you’ll note that they are deployed in different silo/pool configurations. Service 1, for example, has siloed compute and pooled storage. Meanwhile, Service 2 has siloed compute and siloed storage. Services 3-6 are all pooled compute and pooled storage. The idea here is that I’ve looked across the needs of my pooled tenants and identified, on a service-by-service basis, which silo/pool strategy will best fit the needs of that service. The optimizations that have been introduced here were created as baseline strategies that were core to the experience of any tenant using the system.

Tiers come into play when you look at what I’ve done with the premium tier tenants. Here, you’ll notice that Services 5 and 6 are deployed in the basic tier and are also deployed separately for a premium tier tenant. For these services, the business determined that offering them in a dedicated model would represent value that could distinguish the experience of the system’s premium tier. So, for each premium tier tenant, we’ll create new deployments of Services 5 and 6 to support the tiering requirements of our tenants. In this particular example, Tenant 3 is a premium tier tenant that consumes a mix of the services on the left and these dedicated instances of Services 5 and 6 on the right.
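A routing layer for this model might resolve each request to a dedicated or pooled instance of a service. The endpoint URLs, tenant identifiers, and service names below are hypothetical placeholders, not values from the figure:

```python
# Sketch: resolving a tenant's request to a dedicated deployment when
# one exists, otherwise to the shared pooled deployment. All endpoints
# and identifiers here are illustrative assumptions.

DEDICATED_ENDPOINTS = {
    ("tenant-3", "service-5"): "https://tenant-3.example.com/service-5",
    ("tenant-3", "service-6"): "https://tenant-3.example.com/service-6",
}

POOLED_ENDPOINTS = {
    "service-5": "https://pool.example.com/service-5",
    "service-6": "https://pool.example.com/service-6",
}

def resolve_endpoint(tenant_id: str, service: str) -> str:
    """Prefer a tenant-dedicated deployment; fall back to the pooled one."""
    return DEDICATED_ENDPOINTS.get((tenant_id, service),
                                   POOLED_ENDPOINTS[service])
```

The key design point is that callers stay unaware of the tiering strategy; the routing layer applies tenant context to pick the right deployment.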

Approaching your deployment model in this more granular fashion provides a much higher degree of flexibility to you as the architect and to the business. By supporting silo and pool models at all layers, you have the option to compose the right blend of experiences to meet the tenant, operational, and other factors that might emerge throughout the life of your solution. If you have a pooled microservice with performance issues that are creating noisy neighbor challenges, you could silo the compute and/or storage of the service to address this problem. If your business wants to offer some parts of your system in a dedicated model to enable new tiering strategies, you are better positioned to make this shift.

This mixed mode deployment model, in my opinion, often represents a compelling option for many multi-tenant builders. It allows them to move away from having to approach problems purely through the lens of full stack solutions that don’t always align with the needs of the business. Yes, there will always be solutions that use the full stack model. For some SaaS providers, this will be the only way to meet the demands of their market/customers. However, there are also cases where you can use the strengths of the mixed mode deployment model to address this need without moving everything into a full stack silo. If you can just move specific services into the silo and keep some lower profile services in the pool, that could still represent a solid win for the business.

The Pod Deployment Model

So far, I’ve mostly looked at deployment models through the lens of how you can represent the application of siloed and pooled concepts. We explored coarse- and fine-grained ways to apply the silo and pool model across your SaaS environment. To be complete, I also need to step out of the silo/pool focus and think about how an application might need to support a variation of deployment that might be shaped more by where it needs to land, how it deals with environmental constraints, and how it might need to morph to support the scale and reach of your SaaS business. This is where the pod deployment model comes into the picture.

When I talk about pods here, I’m talking about how you might group a collection of tenants into some unit of deployment. The idea here is that I may have some technical, operational, compliance, scale, or business motivation that pushes me toward a model where I put tenants into individual pods and these pods become a unit of deployment, management, and operation for my SaaS business. Figure 3-12 provides a conceptual view of a pod deployment.

Figure 3-12. A pod deployment model

In this pod deployment model, you’ll notice that we have the same centralized control plane on the left-hand side of the diagram. Now, however, on the right-hand side, I have included individual pods that represent self-contained environments supporting the workload of one or more tenants. In this example, I have Tenants 1-3 in Pod 1 and Tenants 4-6 in Pod 2.

These separate pods bring a degree of complexity to a SaaS environment, requiring your control plane to build in the mechanisms that support this distribution model. How tenants are onboarded, for example, must consider which pod a given tenant will land in. Your management and operations must also become pod aware, providing insights into the health and activity of each pod.
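A pod-aware onboarding flow might look something like this minimal placement sketch, where the pod inventory, capacity figures, and first-fit policy are purely illustrative assumptions:

```python
# Sketch: pod-aware tenant placement during onboarding. The pod
# capacities and the first-fit placement policy are illustrative
# assumptions, not a recommended strategy.

PODS = {
    "pod-1": {"capacity": 3, "tenants": ["tenant-1", "tenant-2", "tenant-3"]},
    "pod-2": {"capacity": 3, "tenants": ["tenant-4", "tenant-5"]},
}

def place_tenant(tenant_id: str) -> str:
    """Assign a new tenant to the first pod with spare capacity."""
    for pod_id, pod in PODS.items():
        if len(pod["tenants"]) < pod["capacity"]:
            pod["tenants"].append(tenant_id)
            return pod_id
    raise RuntimeError("No pod has capacity; a new pod must be provisioned")
```

Real placement policies might weigh geography, compliance, or tenant profile rather than simple capacity, but the control plane owning the decision is the constant.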

There are a number of factors that could drive the adoption of a pod-based delivery model. Imagine, for example, having a full stack pooled model running in the cloud that, at a certain number of tenants, begins to exceed infrastructure limits of specific services. In this scenario, your only option might be to create separate cloud accounts that host different groups of tenants to work around these constraints. This could also be driven by a need for deploying a SaaS product into multiple geographies where the requirements of that geography or performance considerations could tip you toward a pod-based deployment model where different geographies might be running different pods.

Some teams may also use pods as an isolation strategy where there’s an effort to reduce cross-tenant impacts. This can be motivated by a need for greater protections from noisy neighbor conditions. Or, it might play a role in the security and availability story of a SaaS provider.

If you choose to adopt a pod model, you’ll want to consider how this will influence the agility of your business. Adopting a pod model means committing to absorbing the extra complexity and automation that allows you to support and manage pods without having any one-off mechanisms for individual pods. To scale successfully, the configuration and deployment of these pods must all be automated through your control plane. If some change is required to pods, that change is applied universally to all pods. This is the mirror of the mindset I outlined with full stack silo environments. The pod cannot be viewed as an opportunity to enable targeted customization for individual tenants.

One dynamic that comes with pods is the idea of placing tenants into pods and potentially viewing membership within a pod as something that can be shifted during the life of a tenant. Some organizations may have distinct pod configurations that are optimized around the profile of a tenant. So, if a tenant’s profile changes and their sizing or consumption patterns are no longer aligned with those of a given pod, you could consider moving that tenant to another pod. However, this would come with some heavy lifting to get the entire footprint of the tenant transferred to another pod. Certainly this would not be a daily exercise, but it is something that some SaaS teams support, especially those that have pods that are tuned to a specific experience.

While pods have a clear place in the deployment model discussion, it’s important not to see pods as a shortcut for dealing with multi-tenant challenges. Yes, the pod model can simplify some aspects of scale, deployment, and isolation. At the same time, pods also add complexity and inefficiencies that can undermine the broader value proposition of SaaS. You may not, for example, be able to maximize the alignment between tenant consumption and infrastructure resources in this model. Instead, you may end up with more instances of idle or overprovisioned resources distributed across the collection of pods that your system supports. Imagine an environment where you had 20 pods. This could have a significant impact on the overall infrastructure cost profile and margins of your SaaS business.

Conclusion

This chapter focused on identifying the range of SaaS deployment models that architects must consider when designing a multi-tenant architecture. While some of these models have very different footprints, they all fit within the definition of what it means to be SaaS. This aligns with the fundamental mindset I outlined in Chapter 1, identifying SaaS as a business model that can be realized through multiple architecture models. Here, you should see that, even though I outlined multiple deployment models, they all share the idea of having a single control plane that enables each environment and its tenants to be deployed, managed, operated, onboarded, and billed through a unified experience. Full stack silo, full stack pool, mixed mode: they all conform to the notion of having all tenants running the same version of a solution and being operated through a single pane of glass.

From looking at these deployment models, it should be clear that there are a number of factors that might push you toward one model or another. Legacy, domain, compliance, scale, cost efficiency, and a host of other business and technical parameters are used to find the deployment model (or combination of deployment models) that best align with the needs of your team and business. It’s important to note that the models I covered here represent the core themes while still allowing for the fact that you might adopt some variation of one of these models based on the needs of your organization. As you saw with the hybrid full stack model, it’s also possible that your tiering or other considerations might have you supporting multiple models based on the profile of your tenants.

Now that you have a better sense of these foundational models, we can start to dig into the more detailed aspects of building a multi-tenant SaaS solution. I’ll start covering the under-the-hood moving parts of the application and control planes, highlighting the services and code that are needed to bring these concepts to life. The first step in that process is to look at multi-tenant identity and onboarding. Identity and onboarding often represent the starting point of any SaaS architecture discussion. They lay the foundation for how we associate tenancy with users and how that tenancy flows through the moving parts of your multi-tenant architecture. As part of looking at identity, I’ll also explore tenant onboarding, which is directly connected to this identity concept. As each new tenant is onboarded to a system, you must consider how that tenant will be configured and connected to its corresponding identity. Starting here will allow us to explore the path to a SaaS architecture from the outside in.

About the Author

Tod Golding is a cloud applications architect who has spent the last seven years immersed in cloud-optimized application design and architecture. As a global SaaS lead within AWS, Tod has been a SaaS technology thought leader, publishing and providing SaaS best practices guidance through a broad set of channels (speaking, writing, and working directly with a wide range of SaaS companies). Tod has over 20 years of experience as an architect and developer, including time at both startups and tech giants (AWS, eBay, Microsoft). In addition to speaking at technical conferences, Tod also authored Professional .NET Generics, was coauthor on another book, and was a columnist for Better Software magazine.

What is cloud computing?

The best way to understand the cloud is the electricity supply analogy. To get electricity in your house, you just flip a switch: bulbs light your home and your appliances run. You pay only for the electricity you use, and when you switch your appliances off, you pay nothing. Now imagine that to power a couple of appliances you had to set up an entire powerhouse. That would be costly, as it involves maintaining the turbine and generator and building the whole infrastructure. Utility companies make your job easier by supplying the quantity of electricity you need. They maintain the entire generation infrastructure and keep costs down by distributing electricity to millions of houses, benefiting from mass utilization.

Cloud computing works the same way: you pay for IT infrastructure such as compute and storage in a pay-as-you-go model. Public clouds like AWS do the heavy lifting of maintaining the IT infrastructure and provide you access over the internet on a pay-as-you-go basis. They are revolutionizing the IT infrastructure industry; traditionally, you had to maintain your servers yourself on-premises to run your business, but now you can offload that to the public cloud and focus on your core business. For example, Capital One’s core business is banking, not running large data centers. Before going deeper into cloud computing, let’s analyze some of the key characteristics of the public cloud.
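To put the pay-as-you-go idea into numbers, here is a tiny cost comparison. The hourly rate and usage hours are made-up illustrative figures, not actual cloud pricing:

```python
# Sketch: pay-as-you-go versus an always-on owned server.
# HOURLY_RATE and the workload hours are illustrative assumptions,
# not real pricing for any cloud provider.

HOURLY_RATE = 0.10     # assumed cost per server-hour
HOURS_IN_MONTH = 730

def pay_as_you_go_cost(hours_used: float) -> float:
    """You pay only for the hours the resource actually runs."""
    return hours_used * HOURLY_RATE

def always_on_cost() -> float:
    """An owned server accrues cost whether it is busy or idle."""
    return HOURS_IN_MONTH * HOURLY_RATE
```

A workload that runs roughly 8 hours per business day (about 176 hours a month) costs a fraction of keeping an equivalent server powered on all month, which is the electricity-meter dynamic in miniature.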

Cloud elasticity

One important characteristic of public cloud providers such as AWS is the ability to quickly and frictionlessly provision resources. These resources could be a single instance of a database or a thousand copies of the same server used to handle your web traffic, and they can be provisioned within minutes. Contrast that with how the same operation might play out in a traditional on-premises environment. Let’s use an example. You need to set up a cluster of computers to host your latest service. Your next actions probably look something like this:

  1. You visit the data center and realize that the current capacity is insufficient to host this new service.
  2. You map out a new infrastructure architecture.
  3. You size the machines based on the expected load, adding a few extra terabytes of storage and gigabytes of memory to ensure that the service isn’t overwhelmed.
  4. You submit the architecture for approval to the appropriate parties.
  5. You wait. Most likely for months.

It is not uncommon, once you finally get the approvals, to realize that the market opportunity for this service is gone, or that demand has grown so much that the capacity you initially planned will no longer suffice. It is hard to overstate how important the ability to deliver a solution quickly becomes when you use cloud technologies to enable it. Imagine what happens if, after getting everything set up in the data center and after months of approvals, you tell the business sponsor that you made a mistake: you ordered a 64 GB RAM server instead of a 128 GB one, so you won’t have enough capacity to handle the expected load, and getting the right server will take a few more months. Meanwhile, the market is moving fast, and your user workload increases 5x by the time you get the server. That’s good news for the business, but because you cannot scale your server quickly, the user experience will ultimately be compromised, and users will switch to other options.

None of this is a problem in a cloud environment, because instead of needing months to provision your servers, they can be provisioned in minutes. Correcting the size of a server may be as simple as shutting it down for a few minutes, changing a drop-down box value, and restarting it. You can even go serverless and let the cloud handle the scaling while you focus on your business problems. Hopefully this example drives home the power of the cloud. The cloud dramatically improves time to market, and being able to deliver quickly may not just mean getting there first; it may be the difference between getting there and not getting there at all.

Another powerful characteristic of a cloud computing environment is the ability to quickly shut down resources and, significantly, not be charged while they are down. In our continuing on-premises example, suppose we shut down one of our servers. Do you think we could call the company that sold us the server and politely ask them to stop charging us because we shut it down? That would be a very short conversation. They would probably say, “You bought the server; you can do whatever you want with it, including using it as a paperweight.” Once the server is purchased, it is a sunk cost for the duration of its useful life. In contrast, whenever we shut down a server in a cloud environment, the cloud provider can quickly detect that and put the server back into the pool of available capacity for other cloud customers to use.
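The elasticity described above can be reduced to a simple sizing rule: run only as many servers as the current load requires, including zero. The sketch below is a toy model of that rule, not a real auto-scaling policy, and its numbers are illustrative assumptions:

```python
# Sketch: the elasticity idea in miniature. Scale the fleet to the
# load; scaling to zero means the charges stop. Thresholds and
# capacities here are illustrative assumptions.

def desired_servers(current_load: int, capacity_per_server: int) -> int:
    """Return the smallest fleet size that covers the current load."""
    if current_load == 0:
        return 0  # idle fleets can be shut down entirely in the cloud
    return -(-current_load // capacity_per_server)  # ceiling division
```

An on-premises data center cannot follow this rule; its fleet size is fixed at purchase time regardless of what `current_load` does.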

Private versus public clouds

A private cloud is a service dedicated to a single customer. It is like your on-premises data center, accessible to one large enterprise; in practice, “private cloud” has become a fancy name for a data center managed by a trusted third party, and much of the elasticity benefit withers away. The concept gained momentum as a way to address security concerns: initially, enterprises were skeptical about the security of the public cloud, which is multi-tenant. But having your own infrastructure diminishes the value of the cloud, because you pay for resources even when you are not running them.

Let’s use an analogy to understand the private cloud further. The gig economy has great momentum. Everywhere you look, people are finding employment as contract workers: driving for Uber, setting up Airbnbs, doing contract work on Upwork. One reason contract work is getting more popular is that it enables consumers to contract services they might otherwise not be able to afford. Could you imagine how expensive it would be to have a private chauffeur? But with Uber or Lyft, you have nearly the same thing: a driver who can be at your beck and call within a few minutes of you summoning them. A similar economy of scale happens with the public cloud. You have access to infrastructure and services that would cost millions of dollars to buy on your own; instead, you can access the same resources for a small fraction of the cost.

Even though AWS, Azure, GCP, and the other popular cloud providers are considered mostly public clouds, there are actions you can take to make them more private. For example, AWS offers Amazon EC2 Dedicated Instances, which ensure that you will be the only user of a given physical server. Further, AWS offers AWS Outposts, where you can order a server rack and host workloads on your premises using the AWS control plane. Dedicated Instance and Outposts costs are significantly higher than on-demand EC2 instances, whose underlying hardware may be shared with other AWS users. As mentioned earlier in the chapter, you will never know the difference, thanks to virtualization and hypervisor technology. One common use case for choosing dedicated instances is government regulations and compliance policies that require certain sensitive data not to sit on the same physical server as other cloud users’ data.

Private clouds are expensive to run and maintain. For that reason, most of the resources and services offered by the major cloud providers reside in public clouds. But using a private cloud does not mean it cannot be set up insecurely, and conversely, if you run your workloads and applications on a public cloud following security best practices, you can sleep well at night knowing that state-of-the-art technologies secure your sensitive data. Additionally, most major cloud providers’ clients use public cloud configurations, but there are a few exceptions even here. For example, the United States government intelligence agencies are a big AWS customer. As you can imagine, they have deep pockets and are not afraid to spend, and in many cases AWS will set up its infrastructure and services on the agency’s premises. You can find out more about this at https://aws.amazon.com/federal/us-intelligence-community/.

Now that we have a better understanding of cloud computing in general, let’s get more granular and learn how AWS does cloud computing.

Cloud virtualization

Virtualization is running multiple virtual instances on top of a physical computer system, using an abstraction layer that sits on top of the actual hardware. More commonly, virtualization refers to the practice of running multiple operating systems on a single computer at the same time. Applications running on virtual machines are oblivious to the fact that they are not running on a dedicated machine and are unaware that they share resources with other applications on the same physical machine.

A hypervisor is a computing layer that enables multiple operating systems to execute on the same physical compute resource. The operating systems running on top of a hypervisor are Virtual Machines (VMs): components that can emulate a complete computing environment using only software, as if they were running on bare metal. Hypervisors, also known as Virtual Machine Monitors (VMMs), manage these VMs as they run side by side. A hypervisor creates a logical separation between VMs, providing each of them with a slice of the available compute, memory, and storage resources. This ensures that VMs do not clash or interfere with each other: if one VM crashes and goes down, it will not take the other VMs down with it, and if there is an intrusion in one VM, it is fully isolated from the rest.

Definition of the cloud

Let’s now attempt to define cloud computing. The cloud computing model offers computing services such as compute, storage, databases, networking, software, machine learning, and analytics over the internet and on demand. You generally pay only for the time and services you use. Most cloud providers can provide massive scalability for many of their services and make it easy to scale services up and down. As much as we have tried to nail it down, this is still a pretty broad definition. For example, we specify that the cloud can offer software, and that’s a pretty general term. Does the term software in our definition include the following?

  • Video Conferencing
  • Virtual desktops
  • Email services
  • Contact Center
  • Document Management

These are just a few examples of what may or may not be included as available services in a cloud environment. When it comes to AWS and the other major cloud providers, the answer is yes. When AWS started, it only offered a few core services, such as compute (Amazon EC2) and basic storage (Amazon S3). As of 2022, AWS has continually expanded its services to support virtually any cloud workload. It now has more than 200 fully featured services for compute, storage, databases, networking, analytics, machine learning, artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development and deployment. As a fun fact, as of 2021, Amazon Elastic Compute Cloud (EC2) alone offers over 475 types of compute instances. For the individual examples given here, AWS offers the following:

  • Video conferencing – Amazon Chime
  • Virtual desktops – Amazon WorkSpaces
  • Email services – Amazon WorkMail
  • Contact Center – Amazon Connect
  • Document Management – Amazon WorkDocs

As we will see throughout the book, this is just a sample of AWS’s many service offerings. Additionally, since its launch, AWS’s services and features have grown exponentially every year, as shown in the following figure:

Figure 1.1 – AWS – number of features

There is no doubt that the number of offerings will continue to grow at a similar rate for the foreseeable future. AWS is the cloud market leader in part because it offers so much functionality, and it is innovating fast, especially in newer areas such as machine learning and artificial intelligence, the Internet of Things, serverless computing, blockchain, and even quantum computing. You have probably heard cloud terms used in different contexts, including the public and private clouds. Let’s learn more about them.