Cloud Out Loud Podcast
Open Source Reality Check
The dream of open source is freedom, speed, and shared progress—but the reality gets messy when it meets cloud-scale business and security. We explore how Docker kickstarted containers while Google’s Kubernetes turned them into an operational standard, and why that split shaped everything from engineering culture to company strategy. From there, we compare the cloud giants’ philosophies: Google’s foundation-first approach, Microsoft’s transformation from anti-OSS to stewarding GitHub and popularizing VS Code, and Amazon’s more transactional stance that sparked high-profile forks.
The heart of the story is tension between ideals and incentives. Elastic’s licensing shift to block AWS’s managed service and Amazon’s OpenSearch fork set off years of license churn across databases, with Redis and others experimenting with “source-available” models. That turbulence pushed developers and CFOs into new due diligence: reading licenses, evaluating governance, and planning for change. It’s not just legal; it’s operational risk. We unpack what to look for in a healthy project and how to avoid license whiplash when a dependency changes course.
Security adds another layer. The XZ Utils backdoor revealed how small packages can enable state-level infiltration, while malicious NPM uploads showed how easy it is to sneak malware into developer workflows. We revisit the infamous LeftPad collapse to explain dependency fragility and why reproducible builds, version pinning, artifact mirrors, and SCA tools are essential. Our playbook focuses on practical defenses—signed releases, SBOMs, automated alerts, and least-privilege build pipelines—so teams can keep the benefits of open source without gambling their stack.
We close with a preview: AI is retracing open source’s path, from community energy to license debates and platform power. If you build in the cloud, this conversation offers grounded lessons on choosing, securing, and sustaining the code you don’t control. Enjoy the episode, then subscribe, share with a teammate, and leave a review with your biggest open source win—or worst dependency scare.
What is Open Source?
Docker
Kubernetes
Cloud Native Computing Foundation
Open Source Initiative Approved Licenses
Google's Open Source Projects
AWS Open Source Projects
The Hackers Book
Microsoft Visual Studio
XZ Utils Backdoor
Left Pad Incident
Welcome to Cloud Out Loud Podcast with your hosts Jon Gallagher and Logan Gallagher. Join these two skeptical enthusiasts (or are they enthusiastic skeptics?) as they talk to each other about the cloud, out loud. These two gents are determined to stay focused on being lazy and cheap as they evaluate what's going on in the cloud, how it affects their projects and company cultures, and sometimes how it affects the world outside of computing infrastructure. Please remember that the opinions expressed here are solely those of the participants and not those of any cloud provider, software vendor, or any other entity. As with everything in the software industry, your mileage may vary.
Jon Gallagher:Welcome back, everybody. We're still on the subject of open source. Last time we talked about the history of open source in the context of containers: Docker specifically, and Kubernetes, the coordination system for containers. We talked about the evolution of containers and their coordination. We also talked about the fact that there were two approaches here, one of which was successful and one of which was not as successful. The unsuccessful approach was that the Docker folks never ended up being able to create the business that they wanted. They came up with this great innovation of packaging software and running it, and created a system that became dominant in the container area, but never quite worked out how to build a business out of it. They had thought, we'll release the container format as open source and we'll sell the ancillary software around it. Maybe they knew, maybe they didn't, that Google had been using containers for a very, very long time, had run into many of the problems of using containers, and had software that addressed those problems. So Docker releases the container format. Google follows that up with Kubernetes, the system that coordinates containers, and Google took the approach of: our business is not selling container coordination software or container management software. Our business is all the things that Google does. So they gave it to a foundation to support. That's been very, very successful. At one point in time, I think 50% of the pull requests for Kubernetes were from Google engineers. That ratio has dropped, so the community has picked that up and moved forward. And at the same time, the Docker container format has evolved into a more open source system, with containerd.
So we talked about how open source worked and how the community worked together on things. In this episode we're going to talk more about the dark side, the kind of things that run into problems in open source. Now, there's a definition of open source. There are three things that make something open source. Logan, what are those three things?
Logan Gallagher:Yep. Many, many licenses have been created to define software as open source, but the group of licenses we would define as truly open source allow anyone to download and use that software, to modify that software, and to freely distribute that software. And I think that third component is where there have been some challenges for open source projects, when they realize that if you release your software for free and let anyone use it, modify it, and distribute it, someone might go ahead and do that. And someone might go ahead and build a business out of doing that, potentially to your chagrin. I think you could call this episode "open source gets a reality check." We're going to talk about some examples of times when open source projects ran into conflict with the cloud providers, and when users of open source ran into conflicts with open source projects and their maintainers. But we can learn a lot of lessons from these growing pains that the open source projects and the open source community encountered, especially in the last 10 years.
Jon Gallagher:Absolutely. So again, setting some context here, we talked a little bit about the cloud providers and the boost that the cloud providers themselves gave open source. The three cloud providers had very different approaches to open source. They may be coalescing towards the same thing, but if you looked at it as a spectrum, Google was very much supportive of open source, as we saw with Kubernetes, and we mentioned before that they supported Linux and the work within the Linux kernel that facilitated containers. There have been a lot of products that Google has released itself. And as we mentioned, Google's thought processes and papers have been the foundation of many open source projects. On the opposite end of the scale is Microsoft. Now I mentioned the book Hackers, and we're going to put the links in here. There's a part of Hackers which talks about Bill Gates' experience at the very beginning of the computer revolution, where he was infuriated by the people who were, in his eyes, stealing the BASIC interpreter. And legally, absolutely, they were stealing it. The Microsoft corporation that existed at that time was selling a BASIC interpreter, and people were just making copies and giving them to their friends. Bill Gates saw that money not ending up in his pocket, and had a very negative reaction to that. So Bill Gates famously was anti-open source, in the traditional sense that Logan just talked about: free to download, free to update, free to distribute. You have a company that was founded on that stance, led by someone who's essentially anti-open source, and I don't mean that as a slur, I just mean that as a business approach. Then Microsoft gets led by Steve Ballmer.
That era was Microsoft focusing more on the enterprise, more on making tools that they could sell to businesses to run their operations. And then there's the newest era that we're in, where Microsoft is led by a gentleman by the name of Satya Nadella, and it is a very different Microsoft now. So Microsoft as a cloud provider initially sat on the opposite end of the spectrum from Google, but then they came to open source, we believe, through their experience in the cloud and the demands of the customer, and also through their experience of using cloud technologies.
Logan Gallagher:Absolutely. I mean, today I can say I personally use a free and open source code editor from Microsoft, VS Code. It's become practically an industry standard. We'll talk a little more next time about AI, but ironically, it seems like every single AI assistant code editor is built on top of VS Code. I do sort of wonder how Microsoft feels about that, since they're giving away VS Code for free, and anyone can, again, update and distribute their version of VS Code, maybe with a large language model fit in there.
Jon Gallagher:And this is Microsoft giving up hundreds of dollars per copy for Visual Studio.
Logan Gallagher:And it is definitely worth noting that Microsoft acquired the website GitHub. GitHub is the biggest online repository of source code in the world and of open source projects especially. The majority of open source projects are hosted on GitHub where people can go and view the code and download the code. And this being owned by Microsoft, being a part of the Microsoft ecosystem, I think has had an impact on Microsoft as an organization.
Jon Gallagher:So from our perspective, Microsoft, by creating Azure and opening itself up to providing cloud services, had an enormous learning curve. I used to argue that there were only three companies in the world that could do cloud at scale: Amazon, Google, and Facebook. Microsoft wasn't part of that. Microsoft never really had data center expertise. Their slice of the marketplace was the small and medium enterprise, and then maybe office automation in larger enterprises. But they weren't doing the back-end services. They weren't doing the multi-hundred-gigabyte or terabyte databases, for example. It's not something SQL Server was initially built to do. So they had to scale up the experience of running data centers and the experience of running infrastructure at cloud scale. The other thing they needed to do was understand who was using the cloud and what the demands of those folks were. And the demands weren't for things like Windows. The demands were for an inexpensive operating system that can be turned on and turned off as needed, because ideally the cloud scales up as you need it and scales down as you don't. And that's a very different set of operating circumstances than Windows was designed for. So Microsoft had an enormous learning curve, and as part of that learning curve, got exposed to and started to adopt open source systems themselves. Microsoft a couple of years ago was proudly advertising that they're one of the largest providers of the Linux operating system through Azure. So the growth and change that happened as Microsoft adapted to these things also drove things like releasing VS Code as open source, ostensibly leaving behind hundreds of dollars per copy of Visual Studio, but occupying a mind space that allows them to play in the markets of machine learning and the cloud. So Google on one side and Microsoft on the other.
Both of them are examples of growth and change and contributing to the open source ecosystem. In the middle, we have Amazon. And we have at least one cautionary tale of Amazon and open source. Logan, you want to set the stage for us?
Logan Gallagher:Absolutely. So Amazon has always embraced open source, but when we were talking about this before recording, we were trying to think of when Amazon itself has put out open source projects in the same vein as Google developing and releasing Kubernetes. And the open source projects that Amazon has released have largely been projects that you would use in order to deploy software to Amazon. One of their tools, the CDK, the Cloud Development Kit, gives you an infrastructure-as-code tool for deploying your servers, your databases, and your load balancers to AWS. But another example of an open source project they're involved in is OpenSearch. And that I think is an illustrative story, because OpenSearch is a fork of another open source project, Elasticsearch, from the company Elastic. Elasticsearch is a full-text search database. If you need to put in a search query for some document that you have stored in your database, and you're searching for a sentence fragment, a particular string, or a particular pattern, you can easily ask: which of the application logs that I have stored in the database contain this particular log warning message? Elasticsearch makes it very easy to do that. So it has become popular with companies that are building search engines for their e-commerce websites or search engines over documents, and especially popular for storing application and system logs, the log messages that get generated by your app and by your operating system. You can ingest all of those and send them over to Elasticsearch for searching. Amazon released a managed service version, the Amazon Elasticsearch Service, and it became a very popular service on AWS. People would create a database that was running the Elasticsearch software, but that cloud customer was paying Amazon.
No money was landing in Elastic's pockets. And I think Elastic felt pretty wronged by getting cut out of this lucrative business that Amazon was generating. Elastic felt like Amazon was taking advantage of them, using their free software and turning it into a product to sell to Amazon's customers. Elastic was also trying to sell their own managed service version of Elasticsearch, so they were in direct competition with Amazon's managed service. So they changed the license on Elasticsearch and made it more restrictive, so that someone like Amazon could not just take that software and offer it as their own product. This was in the late 2010s, early 2020s. Amazon then retaliated and forked it, which is to say, took a copy of the open source version of that code and said, we're taking this copy, which we will now manage, modify, and distribute, and we're releasing it under a separate open source license. That fork of Elasticsearch became OpenSearch. This was Amazon's way of getting into open source: forking an existing open source product that was converting to a more restrictive license. And this began a trend of open source database projects converting to more restrictive licenses to prevent the cloud providers from, in the view of these companies, taking advantage of them. So Redis did something similar. Redis is an in-memory caching database. If you needed to quickly retrieve data and you didn't want to run a longer query against a relational database, you could quickly retrieve the data out of a Redis cache. Amazon was using Redis for their ElastiCache service. So again, an open source fork of Redis was created, called Valkey, and Amazon was involved in that fork as well.
So we saw this kind of thrashing around, particularly among the open source database projects, trying to find a path forward when the cloud providers were using their software for free and turning it into businesses and products in a way that the open source projects were uncomfortable with. Many of them were also trying to offer their own managed service versions of their databases. Today, Elastic and Redis, amongst others, have reverted to a more open license. And that does cause some confusion. We're seeing them go from open to restrictive to back to mostly open, and developers using this software have to figure out: under the terms of this new license, how can I use this software? Nobody wants to fall in violation of a license. A lot of other database software is available under an open-ish license that has some caveats and restrictions around how you're allowed to modify or distribute the software, so it would not be truly open source software. Databases like MongoDB and CockroachDB, amongst others, have created licenses to distribute their software that are sort of open but don't strictly fall under the definition of open source. And now we as developers have to be more careful when we're using this software, to make sure we understand: are we using it in a way that is compliant with that software's license? So it's caused a lot of confusion.
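Editor's sketch: the license due diligence Logan describes can be partly automated. The following is a minimal illustration, not a real compliance tool. The package names and the inventory are made up for this example; a real inventory would come from your package manager or an SBOM. The license strings are SPDX identifiers, and SSPL-1.0 stands in for a source-available license that is not OSI-approved.

```python
# Hypothetical allowlist: SPDX identifiers of OSI-approved licenses
# that an imaginary team has agreed to accept without further review.
APPROVED = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Hypothetical dependency inventory: package name -> declared SPDX license.
# These package names are invented for illustration only.
DEPENDENCIES = {
    "web-framework": "MIT",
    "search-engine": "SSPL-1.0",  # source-available, not OSI-approved
    "cache-client": "Apache-2.0",
}


def find_unapproved(deps: dict[str, str], approved: set[str]) -> dict[str, str]:
    """Return the dependencies whose declared license is not on the allowlist."""
    return {name: lic for name, lic in deps.items() if lic not in approved}


flagged = find_unapproved(DEPENDENCIES, APPROVED)
for name, lic in sorted(flagged.items()):
    print(f"review needed: {name} is licensed under {lic}")
```

Running a check like this on every dependency upgrade, rather than once at adoption time, is one way to catch a license that changes between versions.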
Jon Gallagher:This is the sort of thing that turns a CFO's hair gray. You get approached by your technical team. Managers or directors, or even the chief technical officer, come to you and say, well, we want to standardize on this particular technology. We need you to make sure that we are not violating any license. How do you know, and in particular, how do you know this will be the license on an ongoing basis? Do you assume that the license at time X, when you develop and deploy the software, applies? That may be reasonable, but as you go on, as a new version of the software is released, and maybe the license changes, how do you wrestle that to the ground? So it's one of the many things that was operating against the acceptance of open source. And it's why you have so many people who immediately dive into, well, what license is it released under? Is it Apache? Is it MIT? Is it GNU? We are in no way experts on this. That's again why we hang out with gray-haired CFOs. What we are arguing is that as technology people, we want to be engineers. We want to pick the best tool for the problem that we're trying to solve. And one of the things that I would argue Microsoft learned in deploying the Azure cloud is that they are more involved with the engineers, and they realize that there's an opportunity to supply the engineers with the tools. You know, it's the old story about the California gold rush. Maybe a few miners made money off the gold rush, but the people who made real money were the ones who sold tools to the miners. Again, that's why NVIDIA is such an incredibly valuable company. So what we want to do is set the context of where open source fits in the cloud environment, which is the reason for our podcast, and how you can make choices on that.
Now, the vast majority of the time, if you're going to be using open source software, you'll be making choices that are essentially safe. If you're choosing a web server, you essentially have two open source servers to choose from: Nginx, which, when I looked today, is now one-third of all the web servers out there, or Apache, which is right behind it at one quarter of all web servers. And arguably, right behind that is the Cloudflare server, which is not a standalone piece of software itself. It's a tool to manage traffic in and out. So it's entirely possible that you could stand up a web service that's using Nginx, but then using Cloudflare for your protection or for your CDN, and have layers of open source involved in that. You don't necessarily have to get involved in the licenses in that, but you need to be aware that it is one of the problems that may pop up.
Logan Gallagher:Yeah, we're not gonna get into the weeds on the differences between Apache versus GNU versus MIT. That would be a whole different podcast.
Jon Gallagher:And and one in which we would just be burbling.
Logan Gallagher:Now, on the topic of staying safe as a developer, the other reality check and growing pain that we have experienced in open source, especially in the cloud era, has been security. Many people will implicitly trust that the software package that they found online is trustworthy and that they can use it in their application and rely on it. And we have seen that trust exploited, especially in recent years. We covered an example of that probably eight months ago now, maybe even a year ago, with the XZ Utils backdoor exploit, where there was a small open source compression library that had been adopted by certain Linux operating systems. And it appears that hackers backed by China were able to convince the maintainers of that small open source package to add their code to the XZ software. And their code was malware. Their code was being used to infiltrate and run their software inside of someone's system, root around and discover vulnerabilities, and see if they could perform additional attacks like a privilege escalation attack. And it really highlighted what an extremely sophisticated actor can do. Here we have these state-backed hackers being able to use this small compression library for a supply chain attack and get into people's systems. We've seen some far less sophisticated attacks even in recent weeks. Last week, social media was full of discussion of packages on the popular package registry NPM. NPM is the registry for Node.js, the JavaScript runtime. Node.js is another open source success story. NPM is where people can host the packages and libraries that they've written for Node.js for other members of the Node.js community to download and use. And bad actors, hackers, have uploaded packages that, when you download them, look to see if you have any Bitcoin or any crypto wallets on your computer that they can steal.
And so this is a far less sophisticated supply chain attack compared to what the XZ Utils backdoor showed us is possible. But both of these attacks really underline that you have to make sure that you are using software from trusted sources and that you're being diligent about where you're downloading from and what you're trusting. And then turning back a little bit further in history, there was an incident.
Jon Gallagher:Now, we are not specifically picking on Node. NPM, the Node.js...
Logan Gallagher:It's not our fault there are so many illustrative examples from Node.
Jon Gallagher:Yeah. The Node.js package manager, NPM, allows you to specify: I need this specific utility, and I need this specific version of the utility. It is a great way of ensuring that you have the right capabilities loaded into your software at the right time. NPM was one of the first to do this, the ability to choose versions and create the specific environment that you want, because it used to drive people crazy: I'm compiling the same code as I did before, and something broke. Well, you had a version dependency. The versions have marched on, but you were depending on something previous to that. So NPM had a great way of dealing with this. You could specify, I need this version, or I need any version before version 3.0, or any version after version 3.0. It is a very powerful approach to ensuring your code is aligned with the libraries that it needs. Now, this anecdote is about someone who's not a bad actor, but who just got into a conflict with someone else in the Node community. So back in 2016, a gentleman by the name of, and I'm gonna butcher this name, I am so sorry, Azer Koçulu, K-O-C with a little tail at the bottom, U-L-U, wrote some utilities for Node.js. One of them was called LeftPad. LeftPad did something very simple but very powerful. You took a string and you added specified characters to the left side of the string, and padded it out. So let's say that you have a requirement for loan numbers. This happens all the time. For some reason, people who generate loans don't realize that numbers are numbers and text is text. And they think that a loan number should be 0009876543, not realizing that's a text string. The real number would be 9876543.
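Editor's sketch: the version constraints Jon describes earlier in this turn (an exact version, anything before 3.0, anything from 3.0 on) can be illustrated in a few lines. The constraint syntax here (`=`, `<`, `>=`) is a simplification invented for this sketch; it is not NPM's actual range grammar, which is considerably richer.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against a toy constraint: '=X', '<X', or '>=X'."""
    v = parse(version)
    if constraint.startswith(">="):
        return v >= parse(constraint[2:])
    if constraint.startswith("<"):
        return v < parse(constraint[1:])
    if constraint.startswith("="):
        return v == parse(constraint[1:])
    raise ValueError(f"unsupported constraint: {constraint}")


# Hypothetical list of published versions of some library.
available = ["2.4.1", "2.10.0", "3.0.0", "3.1.2"]

before_three = [v for v in available if satisfies(v, "<3.0.0")]
from_three = [v for v in available if satisfies(v, ">=3.0.0")]
print(before_three)
print(from_three)
```

Comparing tuples of integers rather than raw strings matters: as text, "2.10.0" would sort before "2.4.1".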
So when you're doing things in finance and you have to take a number and turn it into the proper format by putting in leading zeros, this left pad was very powerful. You could say, okay, I want this to be a numeric character string. If it doesn't have 12 digits, then pad it out with this particular character. So if you have the number 987654321 and you wanted to pad it out to twelve, you would call left pad and say zero and say pad it out to twelve. So it adds three zeros at the beginning. If you haven't dealt with financial things, or particularly residential loan numbers, be glad you haven't. So anyway, it gets used a lot, left pad. Everything's fine, it's a benign utility, but then in 2016, Koçulu gets in a dispute with a company called Kik Interactive. Now, Koçulu had named one of his packages Kik, K-I-K. It means something in his native language. But Kik insisted that they held the trademark on the three letters K, I, and K. So they said, you'll have to give up your control of the KIK package name and allow us to have KIK. So Koçulu said fine, or probably said something scatological, and said, okay, it's yours. I'm just gonna take my code back. So he basically removed his utilities from the NPM registry. And they were his; at the time he had control of them, and he had every right to do that. No one realized how much of NPM depended on this ability to pad out a string on the left side. NPM collapsed, programs collapsed all over the world. People couldn't use Node.js for some period of time. Logan, you were working with Node.
Logan Gallagher:I was working with developers that were using it. I have never been a Node.js developer, but I know that it caused a lot of folks to scramble, because all of a sudden, if your code referenced this package, or if a package you were referencing referenced that package, and it was no longer present on npm, it caused lots of things to break.
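Editor's sketch: to show how small the utility at the center of this story is, here is a rough Python equivalent of the left-padding behavior Jon describes. This is an illustration written for these notes, not the original package's code; the function name and parameters are assumptions.

```python
def left_pad(value, length: int, fill: str = " ") -> str:
    """Pad the text form of `value` on the left with `fill` up to `length`.

    Values already at or beyond `length` are returned unchanged.
    """
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    text = str(value)
    missing = length - len(text)
    return fill * missing + text if missing > 0 else text


# The loan-number example from the conversation: seven digits padded to ten.
loan_number = left_pad(9876543, 10, "0")
print(loan_number)

# Python's standard library already covers this case.
builtin = str(9876543).rjust(10, "0")
```

Many languages ship an equivalent in their standard library, which is part of why the incident became a cautionary tale about pulling in a dependency for a few lines of code.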
Jon Gallagher:So the npm maintainers literally had to put that code back in. They took it away from Mr. Koçulu and put left pad back in, because so much of Node.js depended on that utility. At the time they basically said they fixed the internet: the internet had been broken by removing left pad, and they fixed the internet by putting it back in. Now, what this told us is that this idea of having publicly maintained libraries that essentially don't go through a review process for submissions or deletions is fragile. So we learned a lot about the software supply chain. The fix for that now is that you cannot unpublish a package that's been published for more than 24 hours, and certainly you cannot unpublish a package that has dependencies on it within the NPM environment. So it's a reasonable approach, and something that software engineers should think about. I mean, we all are supposed to do code reviews for additions and make sure that we aren't inserting bugs, but when we are deleting things, we need to be just as aware. Are we deleting things that have dependencies? And we've all done this. We've all said, oh, no one ever calls this function, and kaboom, our package collapses. We bring this up because we want you to be sensitive to the idea of what security is. Security in a compute environment is often thought of as preventing someone outside from accessing the things that they shouldn't. That's only one part of security. A system is secure if outsiders are prevented from access, and insiders are only allowed the access that they're authorized to have, with an authorization structure that makes sure that the people who only need to read information don't have the ability to write information, for example. We also need to ensure that the information remains resilient, remains reliable.
So that if we are looking at a database, no one has gone in and maliciously changed the database, or scrambled the database. And then finally, acts of God and acts of evil people can frequently cause our data to be unavailable to us, and we have to have the ability to restore that data. So our data needs traditional security, but also resiliency, and also the ability to be restored. That is something that you need to understand about the open source world. If you are depending on an open source package, do you have a copy of the original? You cannot depend on the internet. They say the internet never forgets, unless you really need it, in which case Murphy gets in there and deletes that package on you. So open source imposes a responsibility on you as a user that you have to acknowledge: there will be no XYZ company that I've paid a support contract to that I can turn to to maintain this for me. I need to maintain it. Because my version may go away. Or there may be a dispute between a developer and someone who wants his trademark. There may be bad actors in the supply chain. You need to take responsibility for the tools you're using.
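Editor's sketch: one common way to act on "do you have a copy of the original?" is to keep your own mirror of each dependency and record a cryptographic digest of the copy you vetted, then refuse anything that no longer matches. The example below uses in-memory bytes as a stand-in for a mirrored package file; the contents and variable names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_pin(data: bytes, pinned_digest: str) -> bool:
    """True only if `data` is byte-for-byte the artifact that was pinned."""
    return sha256_hex(data) == pinned_digest


# Stand-in for the package contents you reviewed and mirrored (hypothetical).
vetted_copy = b"module.exports = leftPad;"
PINNED = sha256_hex(vetted_copy)  # recorded once, at review time

# Stand-in for a later download that has been altered upstream.
tampered_copy = vetted_copy + b" stealWallets();"

print(matches_pin(vetted_copy, PINNED))
print(matches_pin(tampered_copy, PINNED))
```

Lockfiles in several package managers record digests like this for you; the point of the sketch is only to show what that check amounts to.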
Logan Gallagher:Absolutely. I think the thread running through these different examples today is that open source has had to go through a process of growing up, and it has had some growing pains. There are steps we can take to address each of these issues. We can ensure that whenever we're deciding to use a certain open source project, we're familiar with its license, we're familiar with the organization that's maintaining the software, and we have confidence that they will be around and will continue to deliver a high-quality project. We can make sure that we are using tools to scan the packages that we're using for vulnerabilities and exploits, and make sure that we are not inadvertently downloading and using malware in our applications. We've had to learn some lessons the hard way: that you can't just blindly trust an open source project to always be there in its exact same state, that changes can occur that we might have to adapt to, and that we need to make sure that we protect ourselves on our side.
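Editor's sketch: at its core, the scanning Logan mentions compares the exact versions you have locked against a list of versions known to be bad. The toy example below shows only that comparison. The package names, versions, and advisory data are all invented; real software composition analysis tools pull advisories from maintained vulnerability databases.

```python
# Hypothetical advisory feed: package name -> versions known to be compromised.
ADVISORIES = {
    "compress-lib": {"1.2.0", "1.2.1"},
    "wallet-helper": {"0.9.3"},
}

# Hypothetical lockfile contents: package name -> exact pinned version.
LOCKED = {
    "compress-lib": "1.2.1",
    "string-utils": "1.3.0",
    "wallet-helper": "0.9.4",
}


def vulnerable(locked: dict[str, str], advisories: dict[str, set[str]]) -> list[str]:
    """List the locked packages whose pinned version appears in an advisory."""
    return sorted(
        name
        for name, version in locked.items()
        if version in advisories.get(name, set())
    )


findings = vulnerable(LOCKED, ADVISORIES)
for name in findings:
    print(f"known-bad version pinned: {name}=={LOCKED[name]}")
```

A check like this only works when versions are pinned exactly, which is one more argument for lockfiles over open-ended version ranges.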
Jon Gallagher:Absolutely. And by the way, going back to the stories that we started off with, Microsoft owning GitHub: GitHub is a great resource. When you use GitHub, it will review the libraries you're using and help ensure that you don't have a bad supply chain. I get those emails all the time: we've discovered a vulnerability in a particular version of XYZ that you're using. So the infrastructure is there for you to lean on to make sure that your systems are safe, but you have to use it. As Logan said, you can't just blindly use these tools. You have to be, like any craftsperson, aware of the tool's capabilities, its limitations, and what your responsibilities are as a tool user.
Logan Gallagher:As a bit of a preview of what we'd like to talk about next time: what has been funny to watch is that these growing pains that we have noticed open source go through, especially in recent years, we are now watching replay in the world of AI. AI has been going through a moment of enormous growth, adoption, and popularity. And so it is retreading some of the same good things and mistakes that we watched open source and the cloud go through in the past 10 to 15 years. So that's what we hope to talk about next time. But in our first episode, we painted a pretty rosy picture of this idealistic world of open source. We hope to have added a few asterisks there to underline that you do have to be a responsible consumer of open source.
Jon Gallagher:And in the upcoming episode we'll be talking about the blossoming of all of this. And many things blossom. Flowers blossom, fruit trees blossom, and also the death flower blossoms. So you cannot avoid the responsibility that you have as a tool user and tool implementer for being aware of what your tools do, what they can't do, and how to maintain them. Okay, that's it for this time. Thank you all for joining us.
Announcer:Thank you. Thank you for listening to Cloud Out Loud Podcast. Please let us know in comments if you caught either of the gents calling a product or technology by the wrong name. Other information and suggestions are welcome too. Or feel free to tweet us at at cloud outloud pod or email us at cloudoutloud at ndhsw.com. We hope to see you again next week for another episode of Cloud Out Loud.