#135: Infrastructure-as-Code

In this episode I introduce Infrastructure-as-Code - a way of defining your operations infrastructure as code - an example of DevOps in practice, with our Ops teams learning from our Dev teams.

Why you may be interested in this episode:

  • For an explanation of Infrastructure-as-Code
  • The advantages of using it
  • And some of the common excuses for not using it - and how to address them

Published: Wed, 08 Jun 2022 17:28:46 GMT


Hello and welcome back to the Better ROI from Software Development Podcast.

In this episode I want to introduce you to Infrastructure-as-Code.

So why might you be interested in this episode?

First off, I'll give you a basic introduction to Infrastructure-as-Code - potentially, you're hearing about it from your teams, maybe they're asking to invest time in it. I'll talk about some of the advantages you get from its use. And I'll look at some common excuses for not using it.

Let's start with what Infrastructure-as-Code is.

It's a practice that comes out of DevOps, and it's where the Ops team are starting to learn from the development team. Infrastructure-as-Code gives us a way to automate previously manual tasks.

And we're learning these practices from the development team.

In doing so, we're documenting repeated tasks in code which allows us to reduce the manual effort and thus work faster. It reduces the manual mistakes and it provides an audit trail of the who, what, when and why.

If I equate this to software development, I remember back in my history where I've had to write chapter and verse documents for releasing software. In one organisation, the release of my software had to go through release management, which amounted to an individual, that wasn't me, actually having to install my software on the servers.

And in doing that, I had to provide an exceptionally detailed Word document. I had to tell that individual exactly which folder to copy individual files to, which configuration settings to set on each machine - it had to go to the level of "right click here to see this, then type that".

To work for another person who didn't understand what I was trying to achieve, it had to be exceptionally detailed and correct.

Now, obviously, that took a lot of time to get right - and it was error prone. I only needed to miss one step out of that document, and that release manager would struggle to actually do the job.

Moving forward into Continuous Integration, Delivery and Deployment, we now automate those processes. So where previously I would have written a step by step guide in Word for somebody to follow, I now create instructions for the computer to do it - which is something that I can test and refine over time.

I can try it on various environments. I can see that it does exactly what I expected it to do in our test environment. Thus, I can have confidence it will do the same when I try to use it in the production environment.

That's because that automation makes it repeatable.
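
To make that contrast concrete, here is a minimal sketch in Python of a release written as instructions for the computer rather than as a Word document. Everything in it is hypothetical: the `Environment` class, the step names and the settings are illustrative only and are not taken from any real deployment tool.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """A stand-in for a set of servers; hypothetical, for illustration only."""
    name: str
    files: dict = field(default_factory=dict)
    settings: dict = field(default_factory=dict)


def copy_release_files(env: Environment) -> None:
    # Replaces "copy these files into that folder" in the written guide.
    env.files["app/service.bin"] = "v1.4.0"


def apply_configuration(env: Environment) -> None:
    # Replaces "right click here, then type that" in the written guide.
    env.settings["log_level"] = "info"
    env.settings["max_connections"] = 50


# The release is an ordered list of steps, so no step can be forgotten.
RELEASE_STEPS = [copy_release_files, apply_configuration]


def deploy(env: Environment) -> Environment:
    """Run every release step, in order, against one environment."""
    for step in RELEASE_STEPS:
        step(env)
    return env


test_env = deploy(Environment("test"))
prod_env = deploy(Environment("production"))

# The same instructions produce the same result in both environments.
print(test_env.files == prod_env.files)
print(test_env.settings == prod_env.settings)
```

Because the steps live in one list that the computer runs, what I tried in the test environment is exactly what runs in production.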

And we gain benefits from that automation. It's less work when I want to deploy it. It's faster because of it, and more reliable. And of course, because we have that level of reliability, we also have auditability and non-repudiation because we have those instructions written down as to what the computer carried out.

And because we're doing this in an automated way, there are no manual interactions - so it becomes exceptionally good at providing evidence for a security review or some other form of audit.

Everything goes through the automation and thus is tracked and verified by it.

Now Infrastructure-as-Code provides that same benefit for the Ops team.

We would traditionally have to manually configure and set up our servers, our network, our storage, etc - very similar to that release process I described for delivering my software. If we were lucky, we would have a Word document with all the steps written out. In many cases, though, we found that the knowledge was locked away in an individual. If so, that knowledge was not repeatable, certainly wasn't scalable, and definitely wasn't auditable.

And over time, inevitably, there was drift between what was documented (assuming there was a document at all) and production, and nobody knew what had happened. Many people would have had their hands on those servers, on that equipment.

And this was making unique, non-repeatable infrastructures.

If we lost them for any reason, they were almost impossible to recreate.

We'd be unable to clone them into another environment. For example, if you wanted to create a dev and test version of production, you couldn't - which led to very little confidence when we wanted to use those testing environments for our software prior to deploying them into production.
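
As a rough sketch of the alternative, assume the infrastructure is described once as data, and a small function works out what has to change to make an environment match it. The `DESIRED` definition, the resource names and the `plan` and `build` functions below are all hypothetical, written in Python purely for illustration; real Infrastructure-as-Code tools have their own formats and commands.

```python
# Hypothetical definition of the infrastructure we want, held as data.
DESIRED = {
    "web-server": {"cpu": 2, "memory_gb": 4},
    "database": {"cpu": 4, "memory_gb": 16},
    "file-storage": {"size_gb": 500},
}


def plan(desired: dict, actual: dict) -> dict:
    """Work out the changes needed to make `actual` match `desired`."""
    return {
        "create": sorted(k for k in desired if k not in actual),
        "update": sorted(
            k for k in desired if k in actual and actual[k] != desired[k]
        ),
        "delete": sorted(k for k in actual if k not in desired),
    }


def build(desired: dict) -> dict:
    """Return a fresh environment that matches the definition exactly."""
    return {name: dict(spec) for name, spec in desired.items()}


# Production after years of changes made by hand: it has drifted.
production = {
    "web-server": {"cpu": 2, "memory_gb": 8},   # memory changed by hand
    "database": {"cpu": 4, "memory_gb": 16},
    "old-ftp-box": {"cpu": 1, "memory_gb": 1},  # nobody remembers why
}

drift = plan(DESIRED, production)
print(drift)

# A test environment is built from the same definition as production.
test_env = build(DESIRED)
print(plan(DESIRED, test_env))
```

The first `plan` reports what has drifted in production; the second shows that an environment built from the definition needs no changes, which is where the confidence in a test environment comes from.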

Infrastructure-as-Code is one of the benefits of DevOps - where Ops have learnt from some of the Dev practices. And certainly over recent years there's been an explosion in the use of Infrastructure-as-Code.

There are many different solutions for Infrastructure-as-Code, and I'll not talk about the specifics in this episode. However, I would say the big players in the market, such as AWS and Azure, invest heavily in the ability to use Infrastructure-as-Code - they recognise, as does the industry, that it's a good way of managing environments, a good way of creating servers, network, and storage.

And through using Infrastructure-as-Code, we gain considerable benefits.

We're able to track changes using source control in the same way as we do with software development. We get to tell when somebody has made a change - we get a what, when, who and why of a change.

We have the ability to revert to a prior known good state.

We have that auditability - we can tell who's changed what, and when.

We remove the risk of manual mistakes, and we remove some of the security risks that we would have had with changes made by hand.
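
A minimal sketch of that idea, assuming a very simplified stand-in for source control: every change is recorded with the who, what, when and why, and an earlier state can be restored. The `Change` and `History` classes, the names and the dates are all hypothetical; in practice a source control system such as Git would play this role.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Change:
    who: str
    when: datetime
    why: str
    definition: dict  # the "what": the full infrastructure definition


class History:
    """A toy stand-in for source control; hypothetical, for illustration."""

    def __init__(self) -> None:
        self.changes = []

    def commit(self, who: str, when: datetime, why: str, definition: dict) -> None:
        # Store a copy so later edits cannot alter the recorded state.
        self.changes.append(Change(who, when, why, dict(definition)))

    def current(self) -> dict:
        return dict(self.changes[-1].definition)

    def revert(self, who: str, when: datetime, why: str) -> None:
        # Reverting is itself a recorded change back to the previous state.
        previous = self.changes[-2].definition
        self.commit(who, when, why, previous)


history = History()
history.commit("alice", datetime(2022, 6, 1, 9, 0),
               "initial web servers", {"web_servers": 2})
history.commit("bob", datetime(2022, 6, 2, 14, 30),
               "scale up for the sale", {"web_servers": 20})
history.revert("alice", datetime(2022, 6, 2, 15, 0),
               "20 was a typo, back to the known good state")

print(history.current())
for change in history.changes:
    print(change.when, change.who, change.why)
```

Note that the revert does not erase the mistake; it adds a third entry, so the audit trail still shows what happened and why.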

So let's talk about why it might not be used.

Now, I've certainly seen a reluctance in some historical operations teams. It's a change to how they've done things traditionally, and they're starting to be asked to behave and do things in a "development" way. Now, that might not sit well with many people that have done their job for many years in a certain way - so certainly it can be scary.

You can also find sometimes the time is simply not allowed for it.

There is an investment in setting that Infrastructure-as-Code up. It does take time, and it can be seen as an extra job. If a project, for example, is short on time, it can be seen as gold plating - "we just need to get it done". They focus on the short term of just getting in there and doing it, versus the long term maintainability and reliability and the benefits you get from it.

To address these, I think you have to look at training and encouragement.

So in terms of training, you have to bring that team on to understand these are the benefits, this is how you do it, here are examples of good work and why.

And then the encouragement of actually "no, we do believe this is important", "we do believe there is value in you doing this", "we are not going to make you take shortcuts in favour of the short term win over the longer term win".

That comes down to the need to have a certain maturity in that technical management.

I've talked previously about professionalism within software development, and I think there are certain characteristics that have to be shown by a software developer to be really considered professional.

And many of those same characteristics are here as well for Infrastructure-as-Code. By following many of those principles - making sure that it's repeatable, understood, auditable, and has some level of control around it - it's providing a level of professionalism. And as such, most organisations should be asking for this as basic table stakes.

In this episode, I've introduced you to what Infrastructure-as-Code is - it's taking away that manual effort of building our systems, our servers, our network, our storage and defining it in code.

I've talked about the advantages of doing this, having it in code in the same way as we have software code. We get an audit trail of who did what, when and why, so we understand the changes that were made. We reduce the risk of manual mistakes. We reduce some of the security risks of manual work. We gain repeatability - being able to repeat those changes from our production environment into our test environment, into our development environment. We gain confidence that those environments feel and behave the same way as production. Thus we have greater confidence when we're testing our software on them.

And I've covered some of the excuses as to why it wouldn't have been done - that scariness of a change of job role for the traditional operations staff, or the short term thinking brought on by the organisation of "just get it done" - doing that short term thing which will hurt us in the longer term, rather than investing the effort in the training and the encouragement to make sure that our teams are capable and confident in using Infrastructure-as-Code.

Thank you for taking the time to listen to this episode. I look forward to speaking to you again next week.