Azure Arc: Unified Management for Hybrid and Multi-Cloud Environments

I’ve been deep in the trenches of multi-cloud tooling for years now, exploring everything from Kubernetes to Terraform, and all the glue that holds modern infrastructures together. Recently, I took a deep dive into Azure Arc, Microsoft’s hybrid and multi-cloud management solution, to see what it brings to the table. What follows is a breakdown of Azure Arc, framed through the lens of someone who’s seen the evolution of these tools over time.

The Core Idea Behind Azure Arc

At its core, Azure Arc is Microsoft’s answer to the complexities of hybrid and multi-cloud management. The idea is simple yet powerful: provide a unified management experience for resources, regardless of where they live. Whether you’re running workloads on-premises, across different cloud providers, or out on the edge, Azure Arc aims to bring them all under a single, cohesive management umbrella.

What Azure Arc Really Does

Azure Arc extends Azure’s management capabilities beyond its own boundaries. Think of it as a bridge that connects your existing infrastructure to Azure’s powerful tools and services. Once your resources are “Arc-enabled,” you can manage them just like you would any native Azure resource. This means applying policies, leveraging Azure’s security features, and using monitoring tools – all from within the Azure portal.

The beauty of Azure Arc is that it doesn’t discriminate based on where your resources are. Whether it’s a Linux server running in your own datacenter, a Kubernetes cluster on Google Cloud, or even a SQL database on AWS, Azure Arc brings it all together. This isn’t just about management, though. Azure Arc also allows you to deploy Azure data services, like Azure SQL Managed Instance and PostgreSQL Hyperscale, directly into your non-Azure environments.

Azure Arc for Kubernetes

Azure Arc’s support for Kubernetes is where things get particularly interesting. If you’re managing Kubernetes clusters across different environments—whether it’s AKS, EKS on AWS, GKE on Google Cloud, or even an on-premises setup—Azure Arc brings these disparate clusters into the fold under Azure’s management.

With Azure Arc, you can attach your Kubernetes clusters to Azure, enabling you to deploy applications consistently across all your clusters using GitOps, apply consistent security and governance policies, and even monitor and manage them from the Azure portal. This is incredibly powerful in multi-cloud environments where you might have clusters spread across different platforms but need a unified approach to management and operations.
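To make that concrete, here’s a rough sketch of what attaching a cluster and wiring up GitOps can look like with the Azure CLI. The resource group, cluster name, and repository URL are placeholders of my own, and the exact flags vary across CLI extension versions, so treat this as illustrative rather than copy-paste ready:

```shell
# Attach the cluster in the current kubeconfig context to Azure Arc.
az connectedk8s connect \
  --name my-gke-cluster \
  --resource-group arc-demo-rg

# Add a GitOps configuration so the cluster continuously syncs
# manifests from a Git repository (the single source of truth).
az k8s-configuration create \
  --name cluster-config \
  --cluster-name my-gke-cluster \
  --resource-group arc-demo-rg \
  --cluster-type connectedClusters \
  --scope cluster \
  --repository-url https://github.com/example/cluster-manifests
```

Once connected, the cluster shows up in the Azure portal alongside AKS clusters, and policy and monitoring can be applied to it like any other Azure resource.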

The integration with AKS is seamless, of course, but the real power lies in Azure Arc’s ability to connect with other cloud providers’ Kubernetes offerings. Whether you’re dealing with AWS’s EKS, GCP’s GKE, or a custom Kubernetes setup, Azure Arc enables a level of control and consistency that can be a game-changer in complex, hybrid environments.

Defining Azure Arc

Azure Arc is, in essence, a set of technologies that extend Azure’s control plane to wherever your resources reside. Here’s what it means in practice:

  • Unified Server Management: You can connect and manage your Windows and Linux servers across on-premises, edge, and multi-cloud environments from a single pane of glass within Azure.
  • Kubernetes Cluster Integration: Azure Arc allows you to attach and manage Kubernetes clusters from anywhere. This means consistent management, monitoring, and governance across your entire Kubernetes estate, regardless of where those clusters are running.
  • Data Services Anywhere: With Azure Arc, you can run Azure’s data services, like Azure SQL Managed Instance and PostgreSQL Hyperscale, in any environment. This gives you the flexibility to use Azure’s data capabilities wherever you need them most.
  • Consistent Governance and Security: Perhaps one of the biggest wins here is the ability to enforce compliance and governance policies consistently across all your resources, no matter where they’re deployed.
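For the server management point in this list, onboarding works through the Connected Machine agent. As a hedged sketch (the resource group, location, and IDs below are placeholders, and install steps differ slightly between Linux and Windows), connecting a Linux server looks roughly like this:

```shell
# Download and run the Connected Machine agent install script.
wget https://aka.ms/azcmagent -O ~/install_linux_azcmagent.sh
bash ~/install_linux_azcmagent.sh

# Connect the machine to Azure Arc; this registers it as an
# Azure resource that policies and monitoring can target.
azcmagent connect \
  --resource-group "arc-servers-rg" \
  --tenant-id "<tenant-id>" \
  --location "westus2" \
  --subscription-id "<subscription-id>"
```

After the connect step, the machine appears in the portal as a resource in that resource group, ready for tagging, policy, and monitoring.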

In short, Azure Arc is Microsoft’s play to bring coherence and control to the sprawling, often chaotic, world of hybrid and multi-cloud environments. It’s a tool that’s not just about visibility but about giving you the power to manage, secure, and optimize your entire infrastructure from a single point of control. And in a world where resources are scattered across different platforms and locations, that’s a game-changer.

Further Reading and Viewing on Azure Arc

If you’re interested in diving deeper into Azure Arc and particularly how it integrates with Kubernetes across various environments, there are a few must-see resources:

  1. Azure Arc-enabled Kubernetes Extensibility Model | Azure Friday – In this video, Scott Hanselman and Lior Kamrat discuss how Azure Arc enables Kubernetes clusters outside of Azure to be managed and governed like native Azure resources, complete with demos.
  2. Azure Arc-enabled Kubernetes with GitOps | Azure Friday – This session delves into how Azure Arc uses GitOps to manage Kubernetes clusters across various environments, showcasing how to maintain a single source of truth through GitHub repositories.
  3. Building Modern Hybrid Applications with Azure Arc and Azure Stack | Azure Friday – Thomas Maurer joins Scott Hanselman to demonstrate how to build and manage hybrid applications across multiple environments using Azure Arc, including a demo of Kubernetes integration.

These resources provide a comprehensive look at how Azure Arc can be integrated into your multi-cloud strategy, particularly if Kubernetes is part of your infrastructure mix.

Meetup Video: “Does the Cloud Kill Open Source?”

🆕 Had a great time at the last Seattle Scalability Meetup, and I’ve just finished processing and fixing up the talk video from it. I feel like I’ve finally gotten the streaming and post-stream process down to where I can make the videos available almost immediately afterwards.

Here @rseroter gives us a full review of various business models, open source licenses, and a solid situational report on cloud providers and open source.

Join the meetup group here: https://www.meetup.com/Seattle-Scalability-Meetup/

At the next meetup, on April 23rd, we’ve got Dr. Ryan Zhang coming in to talk about serverless options. More details and additional topic content will be coming soon.

Then in May, on the 28th, Guinevere (@guincodes) is going to present “The Pull Request That Wouldn’t Merge”. More details and additional topic content will be coming soon.

Here are some of the talks I streamed recently. Note: I didn’t have the gear set up all that well just yet, but the content is there!

Terraform “Invalid JWT Signature.”

I ran into this issue recently: the “Invalid JWT Signature.” error while running some Terraform. It appeared whenever I was setting up a bucket in Google Cloud Platform to use as a backend for storing Terraform’s state. Here’s the exact error from the console.

terraform-jwt-invalid.png

My first quick searches uncovered some GitHub issues that looked curiously familiar. The first, “Invalid JWT Token when using Service Account JSON #3100”, was closed without any particular resolution; further searching didn’t help much, though I’d be curious what the fix actually was. The second, “Creating GCP project in terraform #13109”, sounded much more on point compared to my issue. It appeared closer, but the takeaway seemed to be that I should just start from scratch, since this setup had already worked on one machine and simply didn’t work on the machine I’d shifted to. (Grumble grumble, what’d I miss?)
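For context, the configuration that triggers this is just the standard Terraform GCS backend block. Here’s a minimal sketch of one; the bucket, prefix, and credentials file names are placeholders of mine, not the actual values from my setup:

```shell
# Write a minimal Terraform GCS backend configuration; the bucket,
# prefix, and credentials path here are placeholder values.
cat > backend.tf <<'EOF'
terraform {
  backend "gcs" {
    bucket      = "example-terraform-state"
    prefix      = "terraform/state"
    credentials = "account.json"
  }
}
EOF
# Then initialize the backend with: terraform init
```

The “Invalid JWT Signature.” error surfaces during that init/plan phase, when the Google provider tries to authenticate with the service account key.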

The Solution(s)?

In the end the message is this: if you work on multiple machines with multiple cloud accounts, you might get the keys mixed up. In this particular case I reset my NIC (you can also just reboot, which on Windows is easier anyway), and then everything just started working – likely because JWT signatures are time-sensitive, and a reset or reboot can re-sync a skewed system clock. In some cases, however, the JSON file with the GCP service account keys needs to be regenerated, because the old key was rolled or otherwise invalidated.
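When the key itself is the problem, regenerating it is straightforward. A hedged sketch, with placeholder service account and project names, assuming the gcloud CLI is authenticated with rights to manage service account keys:

```shell
# Create a fresh key for the service account; the old key may have
# been rolled or revoked, which produces the "Invalid JWT Signature." error.
gcloud iam service-accounts keys create ~/gcp-account.json \
  --iam-account terraform@example-project.iam.gserviceaccount.com

# Point Terraform's Google provider at the new key.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/gcp-account.json"
```

With the new key in place, a fresh `terraform init` should authenticate cleanly.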


Let’s Really Discuss Lock In

For too long, lock-in has carried an almost entirely negative connotation, even though it appears in both positive and negative situations. The fact is that there’s a much more nuanced and balanced range of benefits and disadvantages to lock-in. Often it’s referred to as this or that dependency, but either way a dependency is frequently just another form of lock-in. Weighing those trade-offs and finding the right balance for your projects can make lock-in a positive game changer, or simply a foundation on which to work and operate. Sometimes lock-in even provides a way to remove lock-in, by opening up choices to other things that in turn bring their own variants of lock-in.

Concrete Lock-in Examples

The JavaScript Lock-In

Take the language we choose to build an application in. JavaScript is a great example: it has become the singular language of the web, at least on the client side. Long ago, this was a form of lock-in that browser makers (and standards bodies) chose, one that dictated how and in which direction the web – at least web pages – would progress.

JavaScript has since become a prominent language on the server side too, thanks to Node.js. It has even moved in as a first-class language in serverless technology like AWS’s Lambda. JavaScript is a perfect example of a language that was initially a source of specific lock-in – it was required on the client – but eventually expanded to allow programming in a number of other environments, reducing JavaScript’s lock-in while displacing it, through abstractions, into other spaces such as the server side and serverless functions.

The .NET Windows SQL Server Lock In

JavaScript is merely one example, and a relatively positive one that expands one’s options more than it limits one’s efforts. But let’s say the decision is made to build a high-speed trading platform and to choose SQL Server, .NET C#, and Windows Server. This is a technology combination that has notoriously illustrated in the past* how lock-in can be extremely dangerous.

Say this application was built out with that set of technology platforms. It used stored procedures in SQL Server, locking the application into that specific database; it used proprietary Windows-specific libraries in the .NET C# code; and on Windows it used IIS-specific features to make the application faster. When it was first built it seemed plenty fast and scaled just right according to the demand at the time.

Fast forward to today. The application now has a sharded database – sharded when it hit a mere 8 terabytes – loaded on two super-pumped-up (at least for today) servers with many cores, many CPUs, GPUs, and all that jazz. They came in around $240k each! The application is tightly coupled to a middle tier, which is in turn somewhat tightly coupled to those famous stored procedures, and the application of course gets its turbo capability from those IIS servers.

But today it’s slow. Looking at benchmarks and query times, the database is having a hard time keeping up as is, and the application has outages on a routine basis for a whole variety of reasons. Sometimes tracing and debugging solve the problems quickly; other times the servers just oversubscribe resources and sit thrashing.

Where does this application go? How does one resolve the database loading issues? They’ve already sunk half a million on servers that are pegged out already; horizontal scaling isn’t an option; they’re tightly coupled to Windows Servers running IIS, which removes the possibility of effectively scaling out the application servers via container technologies; and there are other issues besides. Without recourse, this is the type of lock-in that will kill the company if something isn’t changed in a massive way very soon.

To add to that: this is the description of an actual company that is now defunct. I phrased it as existing today only to make the point. The hard reality is the company went under, almost entirely because of the costs of maintaining an unsustainable architecture that caused an exorbitant lock-in to very specific tools – largely because the company drank the Kool-Aid and used the tools exactly as suggested. They developed the product into a corner. That mistake was so expensive that it decimated the finances of the company. Not a good scenario, not a happy outcome, and something to be avoided in every way! This is truly the epitome of negative lock-in.

Of course there’s this kind of distinctive lock-in we have to steer clear of, but there’s also the lock-in associated with languages and other technology capabilities that will help your company move forward faster, more easily, and with increasing capabilities. Those are the choices – the ties to technology and capabilities – that decision makers can really leverage with fewer negative consequences.

The “Lock In” That Enables

One common statement is “the right tool for the job.” That, of course, is for an ideal world where ideal decisions can be made all the time. That world doesn’t exist, and we have to strive for balance between decisions that will wreck the ship and decisions that will give us clear waters ahead.

For databases, we need to choose the right databases for where we want to go, not merely for where we are today. Not to gold-plate the solution, but to have intent and a clear focus on what we want our future technology to hold for us. If we intend to expand our data and want to maintain the ability to query it effectively – let’s take the massive SQL Server example – what could we have done to prevent that choice from becoming a debilitating decision?

A solution that could have effectively come into play: rather than sharding the relational database, export or split the data in a more horizontal way and put it into a distributed database store, then build the application so that this system could be used instead of being limited by the relational database. As the queries were built out and the tight coupling to SQL Server removed, the new distributed database could easily add nodes to compensate for the ever-growing size of the data stored. The options are numerous; all are a form of lock-in, but not the kind that eventually killed this company, which had detrimentally limited and locked itself into a relational database.

At the application tier, another decision could have been made to remove the ties to IIS and start figuring out a way to containerize the application. One way, years ago, would have been to move away from .NET, but let’s say that wasn’t really an option for other reasons. Containerization could have been mimicked by shifting to a self-contained web server on Windows, allowing the .NET application to run under a singular service that spins off instances of the application as needed. This would decouple the application from IIS and enable spreading the load more quickly across a set number of machines, and eventually, once .NET Core was released, it would offer the ability to actually containerize and shift entirely off Windows Server to a more cost-efficient solution under Linux.

These are just some ideas; the solutions would of course vary and obviously provide different results. Above all, there are pathways away from negative lock-in and toward the positive lock-in that enables. Realize there’s a balance, and find the choices that leverage lock-in positively.

Nuanced Pedantic Notes:

  • Note I didn’t say all examples – just that this combo has left more than a few companies out on a limb over the years. There are of course other technologies that have put companies (people, actually) in awkward situations too; I’m just using this combo as an example. For instance, probably some of the most notorious lock-in comes from the legal ramifications of using Oracle products and being tied into their sales agreements. On the opposite end of the spectrum, Stack Overflow is a great example of how choosing .NET and scaling with it, SQL Server, and related technologies can work just fine.

Kubernetes Networking Explained & The Other Projects

Still stumbling through determining what Kubernetes does for networking? Here’s a good piece written up by Mark Betz, titled “Understanding Kubernetes Networking: Pods”. Just reading Mark’s latest on Kubernetes is great, but definitely take a look at his other writing too, it’s a steady stream of really solid material that is insightful, helpful, and well thought out. Good job Mark.

I’ve got two more blog entries coming on getting Kubernetes deployed and what you get with default Terraform configuration setups in Azure and AWS (the Google Cloud Platform write-up is posted here). Once those are complete, that’s a wrap for that series. Then I’m going to shift gears again and start working on a number of elements around application and services (ala microservices) development.

Always staying nimble means jumping around to the specific thing that needs doing! This actually feels more like a return to familiar territory. After all, the vast majority of my work over the last many years has been writing code against various environments to ensure reliable data access and available services for customers – customers being web front-end devs, nurses in hospitals, GIS workers resolving mapping conflicts, veterinarians, video-watching patrons on the internet, or any host of people using the software I’ve built.

In light of that, here’s a few extra thoughts and tidbits about what’s in the works next.

  • Getting the Data Diluvium Project running, a core product implemented, and usable live out there on the wild web.
  • Getting blue-land-app (a Go service), blue-world-noding (a Node.js service), and blue-world-making (the infrastructure the two run on) all working and into usable states for prospective tutorials, sample usage, and speaking material for presentations. They’re going to be, in the end, solid examples of how to get up and running with those particular stacks plus Kubernetes – a kind of from-zero-to-launch set of examples.

Other Links of Note: