There are three steps to deploying software:
- Creating and managing the infrastructure in which your application will run (hardware, networking, middleware, external services)
- Installing the correct version of your application into it
- Configuring the application, including any data or state it requires
Creating and managing the infrastructure
Many projects fail because of people problems rather than technical problems. Almost all medium and large companies separate the activities of development and infrastructure management (or operations) into two groups or silos. These two groups of stakeholders often have an uneasy relationship, because development teams are incentivized to deliver changes as rapidly as possible whereas operations teams aim for stability.
Probably the most important thing to keep in mind is that all stakeholders have a common goal: making the release of valuable software a low-risk activity. As discussed earlier, the best way to do this is to keep releasing small increments of software through a Continuous Delivery pipeline. This ensures that there are as few changes as possible between releases. Given this context, here are some of the most important high-level concerns of operations teams.
- Documentation and Auditing : Operations managers want to ensure that any change to any environment they control is audited and documented, so that if things go wrong they can find the relevant changes that caused the problem.
- Alerts for Abnormal Events : Operations managers will have systems in place to monitor their infrastructure and the applications running on it, and will want to be alerted when an abnormal condition occurs in any of the systems they manage so that they can minimize downtime.
- Logging : Your applications should log at WARNING level every time a network connection times out or is found to be unexpectedly closed, and at INFO or DEBUG level every time you close a connection. If something goes wrong, the operations team should have forensic tools available to help them re-create the event in their testing environment so that they can prevent it from happening again.
- IT Service Continuity Planning : Operations managers will be involved in the creation, implementation, testing and maintenance of their organization's IT service continuity plan. As part of business continuity testing, you should have tested backing up, recovering and archiving your application's data, as well as retrieving and deploying any given version of your application. Your release plan should give the operations team the process for performing each of these activities.
- Use the Technology the Operations Team Is Familiar With : Operations teams want changes to their environments to be made using technology that is familiar to them, so that they can own and maintain those environments. The development team and the operations team should sit down at the beginning of every project and decide how deployments of the application will be performed. It may be necessary for either team to learn an agreed-upon technology, perhaps a scripting language such as Perl, Ruby or Python. It is extremely important that both teams understand the deployment system, because the same process must be used to deploy changes to every environment.
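The logging guidance in the list above can be sketched in Python with the standard logging module. This is only an illustration under stated assumptions: the logger name app.network, the function read_message and the peer label are hypothetical and do not come from any particular application.

```python
import logging
import socket
from typing import Optional

# Hypothetical logger name; use whatever naming scheme your operations team expects.
logger = logging.getLogger("app.network")


def read_message(conn: socket.socket, peer: str) -> Optional[bytes]:
    """Read up to 1 KiB from an open connection, logging at the levels described above."""
    try:
        data = conn.recv(1024)
    except socket.timeout:
        # A timed-out connection is reported at WARNING level.
        logger.warning("connection to %s timed out", peer)
        return None
    else:
        if not data:
            # An empty read means the peer closed the connection on us.
            logger.warning("connection to %s was closed unexpectedly", peer)
            return None
        return data
    finally:
        # Closing a connection ourselves is routine, so it is logged at INFO level.
        conn.close()
        logger.info("closed connection to %s", peer)
```

Because the messages carry the peer name and the reason, an operations team can filter on WARNING to find the connections worth investigating.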
Modeling and Managing Infrastructure
There are many different classes of configuration information at play in any environment, all of which should be provisioned and managed in an automated fashion.
Even if you do not have control over the selection of your infrastructure, if you intend to fully automate your build, integration, testing and deployment, you must address the following questions.
- How will we provision our infrastructure?
- How will we deploy and configure the various bits of software that form part of our infrastructure?
- How do we manage our infrastructure once it is provisioned and configured?
As with every other aspect of your delivery process, you should keep everything you need to create and maintain your infrastructure under version control. At a minimum, that means:
- Operating System install definitions
- Configuration for data center automation tools such as Puppet
- General infrastructure configurations such as DNS files, SMTP settings etc
- Any scripts you use for managing your infrastructure
These files in version control form inputs to the deployment pipeline in the same way that source code does. The job of the deployment pipeline in the case of infrastructural changes is threefold:
- First, it should verify that all applications will work with any infrastructural changes before they get pushed out to the production environment, ensuring that every affected application's functional and nonfunctional tests pass against the new version of the infrastructure.
- Second, it should be used to push changes out to operations-managed testing and production environments.
- Third, the pipeline should perform deployment tests to ensure that the new infrastructure configuration has been deployed successfully.
The following are some of the things necessary for controlling your infrastructure configuration:
- Controlling access to your Infrastructure :
o Controlling access to prevent anyone from making a change without approval
o Defining an automated process for making changes to your infrastructure
o Monitoring your infrastructure to detect any issues as soon as they occur
- Making Changes to Infrastructure
o Every change, whether it is updating a firewall rule or deploying a new version of your service, should go through the same change management process
o This process should be managed using a single ticketing system that everybody can log into
o The exact change that is made should be logged so that it can be easily audited
o It should be possible to see a history of changes made to every environment including deployments
o The changes you want to make should have been tested on one of your production-like testing environments
o The changes should be made to version control and then applied through your automated process for deploying infrastructural changes
o There should be a test to verify that the change has worked
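The requirements above (one ticket per change, an exact record of what changed, and a per-environment history) can be sketched as a small in-memory change log. This is a minimal illustration rather than a real ticketing integration: ChangeRecord, ChangeLog and every field name are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    """One audited change; all field names here are illustrative assumptions."""
    ticket_id: str     # ticket in the single shared ticketing system
    environment: str   # e.g. "staging" or "production"
    description: str   # the exact change that was made
    revision: str      # version control revision that was applied
    applied_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ChangeLog:
    """Append-only record of changes, queryable per environment."""

    def __init__(self) -> None:
        self._records: List[ChangeRecord] = []

    def record(self, ticket_id: str, environment: str,
               description: str, revision: str) -> ChangeRecord:
        # Refuse changes that did not go through the ticketing process.
        if not ticket_id:
            raise ValueError("every change needs a ticket")
        entry = ChangeRecord(ticket_id, environment, description, revision)
        self._records.append(entry)
        return entry

    def history(self, environment: str) -> List[ChangeRecord]:
        """Return every change applied to one environment, oldest first."""
        return [r for r in self._records if r.environment == environment]
```

In practice the records would live in the ticketing system or a database rather than in memory; the point is that each entry ties a ticket to a version control revision and an environment.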
Managing Server Provisioning and Configuration
Provisioning servers and managing their configuration is often overlooked in small and even medium-sized organizations.
- Provisioning Servers : At a high level, provisioning servers, whether for testing or production environments, starts with putting a new box in your data center and wiring it in. There are several ways of creating operating system baselines:
o A fully manual process
o Automated remote installation
- Virtualization : The fundamental enabler of the cloud is virtualization over hundreds of thousands of hosts accessible over the internet. In cloud computing, a virtual machine (VM) is an emulation of a physical machine. A VM image is a file that contains a bootable operating system and some software installed on it. A VM image provides the information required to launch a VM.
Three of the unique aspects of the cloud that impact DevOps are:
- The ability to create and switch environments
- The ability to create VMs easily
- Management of Databases
Virtualization has the following benefits:
o Fast response to changing requirements
o Consolidation
o Standardizing hardware
o Easier to maintain baselines
- Ongoing Management of Servers : Once you have the operating system installed, you will need to ensure that the configuration doesn’t change in an uncontrolled manner. This means ensuring first that nobody is able to log into the boxes except the operations team, and second that any changes are performed using an automated system. That includes applying OS service packs and upgrades, installing new software, changing settings and performing deployments.
- Highly Parallel Testing with Virtual Environments : Virtualization provides an excellent way to handle multi-platform testing. Simply create virtual machines with examples of each of the potential environments that your application targets and create VM templates from them. Then run all of the stages in your pipeline on all of them in parallel.
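Running one pipeline stage against several VM templates in parallel can be sketched with Python's concurrent.futures. The template names and the fake_acceptance_stage function below are placeholders; a real stage would start a VM from the template and run the test suite inside it.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_stage_everywhere(stage: Callable[[str], bool],
                         templates: Iterable[str]) -> Dict[str, bool]:
    """Run one pipeline stage against every VM template at the same time."""
    names = list(templates)
    # One worker per template so that no environment waits for another.
    with ThreadPoolExecutor(max_workers=len(names) or 1) as pool:
        # pool.map yields results in the same order as the inputs.
        return dict(zip(names, pool.map(stage, names)))


def fake_acceptance_stage(template: str) -> bool:
    """Placeholder stage: pretends that legacy templates fail their tests."""
    return not template.startswith("legacy-")
```

A call such as run_stage_everywhere(fake_acceptance_stage, ["ubuntu-22.04", "windows-2022"]) returns a pass or fail result per template, which the pipeline can use to decide whether the stage as a whole succeeded.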
Managing Data
Data and its management and organization pose a particular set of problems for testing and deployment processes for two reasons.
- First, there is the sheer volume of information that is generally involved. The bytes allocated to encoding the behavior of our application—its source code and configuration information—are usually vastly outweighed by the volume of data recording its state.
- Second is the fact that the lifecycle of application data differs from that of other parts of the system. Application data needs to be preserved—indeed, data usually outlasts the applications that were used to create and access it. Crucially, data needs to be preserved and migrated during new deployments or rollbacks of a system.
In most cases, when we deploy new code, we can erase the previous version and wholly replace it with a new copy. In this way we can be certain of our starting position. While that option is possible for data in a few limited cases, for most real-world systems this approach is impossible. Once a system has been released into production, the data associated with it will grow, and it will have significant value in its own right. Indeed, arguably it is the most valuable part of your system. This presents problems when we need to modify either the structure or the content. As systems grow and evolve, it is inevitable that such modifications will be required, so we must put mechanisms into place that allow changes to be accomplished while minimizing disruption and maximizing the reliability of the application and of the deployment process. The key to this is automating the database migration process. A number of tools now exist that make automating of data migration relatively straightforward, so that it can be scripted as part of your automated deployment process. These tools also allow you to version your database and migrate it from any version to any other. This has the positive effect of decoupling the development process from the deployment process—you can create a migration for each database change required, even if you don’t deploy every schema change independently. It also means that your database administrators (DBAs) don’t need a big up-front plan—they can work incrementally as the application evolves.
Database Scripting
As with any other change to your system, any changes to any databases used as part of your build, deploy, test, and release process should be managed through automated processes. That means that database initialization and all migrations need to be captured as scripts and checked into version control. It should be possible to use these scripts to manage every database used in your delivery process, whether it is to create a new local database for a developer working on the code, to upgrade a systems integration testing (SIT) environment for testers, or to migrate production databases as part of the release process. Of course, the schema of your database will evolve along with your application. This presents a problem because it is important that the database has the correct schema to work with a particular version of your application. For example, when deploying to staging, it is essential to be able to migrate the staging database to the correct schema to work with the version of the application being deployed. Careful management of your scripts makes this possible. Finally, your database scripts should also be used as part of your continuous integration process. While unit tests should not, by definition, require a database in order to run, any kind of meaningful acceptance tests running against a database-using application will require the database to be correctly initialized. Thus, part of your acceptance test setup process should be creating a database with the correct schema to work with the latest version of the application and loading it with any test data necessary to run the acceptance tests. A similar procedure can be used for later stages in the deployment pipeline.
Initializing Databases
An extremely important aspect of our approach to delivery is the ability to reproduce an environment, along with the application running in it, in an automated fashion. Without this ability, we can’t be certain that the system will behave in the way we expect. This aspect of database deployment is the simplest to get right and to maintain as your application changes through the development process. Almost every data management system supports the ability to initialize a data store, including schemas and user credentials, from automated scripts. So, creating and maintaining a database initialization script is a simple starting point. Your script should first create the structure of the database, database instances, schemas, and so on, and then populate the tables in the database with any reference data required for your application to start.
The simplest process for deploying a database afresh is as follows:
- Erase what was there before
- Create the database structure, database instances, schemas, etc
- Load the database with data
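The three steps above can be sketched as a single initialization function. The example uses Python's built-in sqlite3 module so that it is self-contained; the country table and its reference data are made up for illustration.

```python
import sqlite3

# Hypothetical schema and reference data, for illustration only.
SCHEMA = """
CREATE TABLE country (
    code TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
"""
REFERENCE_DATA = [("IN", "India"), ("US", "United States")]


def initialize_database(conn: sqlite3.Connection) -> None:
    """Bring the database to a known starting state."""
    # Step 1: erase what was there before.
    conn.execute("DROP TABLE IF EXISTS country")
    # Step 2: create the database structure.
    conn.executescript(SCHEMA)
    # Step 3: load the reference data the application needs in order to start.
    conn.executemany("INSERT INTO country (code, name) VALUES (?, ?)", REFERENCE_DATA)
    conn.commit()
```

Because the function always starts by erasing the old structure, running it twice produces the same result as running it once, which is what makes the environment reproducible.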
Incremental Change
Continuous integration demands that we are able to keep the application working after every change made to it. This includes changes to the structure or content of our data. Continuous delivery demands that we must be able to deploy any successful release candidate of our application, including the changes to the database, into production (the same is also true for user-installed software that contains a database). For all but the simplest of systems, that means having to update an operational database while retaining the valuable data that is held in it. Finally, due to the constraint that the data in the database must be retained during a deployment, we need to have a rollback strategy should a deployment go wrong for some reason.
Versioning Your Database
The most effective mechanism to migrate data in an automated fashion is to version your database. Simply create a table in your database that contains its version number. Then, every time you make a change to the database, you need to create two scripts: one that takes the database from a version x to version x + 1 (a roll-forward script), and one that takes it from version x + 1 to version x (a roll-back script). You will also need to have a configuration setting for your application specifying the version of the database it is designed to work with (this can be kept as a constant in version control and updated every time a database change is required). At deployment time, you can then use a tool which looks at the version of the database currently deployed and the version of the database required by the version of the application that is being deployed. The tool will then work out which scripts to run to migrate the database from its current version to the required version, and run them on the database in order. For a roll forward, it will apply the correct combination of roll-forward scripts, from oldest to newest; for a roll back, it will apply the relevant roll-back scripts in reverse order.
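A minimal sketch of such a tool, again using sqlite3 so that it runs on its own, might look like this. The schema_version table name, the MIGRATIONS mapping and both example migrations are assumptions made for illustration; real migration tools add checksums, locking and error handling.

```python
import sqlite3
from typing import Dict, Tuple

# version -> (roll-forward SQL that reaches it, roll-back SQL that undoes it).
# Both migrations are hypothetical examples.
MIGRATIONS: Dict[int, Tuple[str, str]] = {
    1: ("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE customer"),
    2: ("CREATE TABLE customer_order (id INTEGER PRIMARY KEY, customer_id INTEGER)",
        "DROP TABLE customer_order"),
}


def current_version(conn: sqlite3.Connection) -> int:
    """Read the version table, creating it at version 0 on first use."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        return 0
    return row[0]


def migrate(conn: sqlite3.Connection, target: int) -> int:
    """Move the database from its current version to the target version."""
    if not 0 <= target <= max(MIGRATIONS):
        raise ValueError(f"unknown target version {target}")
    version = current_version(conn)
    while version != target:
        if version < target:
            version += 1
            conn.execute(MIGRATIONS[version][0])  # roll forward, oldest first
        else:
            conn.execute(MIGRATIONS[version][1])  # roll back, newest first
            version -= 1
        conn.execute("UPDATE schema_version SET version = ?", (version,))
    conn.commit()
    return version
```

At deployment time the target would come from the configuration setting that records which database version the application expects.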
Managing Orchestrated Changes
In many organizations, it is common to integrate all applications through a single database. This is not a practice we recommend; it’s better to have applications talk to each other directly and factor out common services where necessary (as, for example, in a service-oriented architecture). However, there are situations in which it either makes sense to integrate via the database, or it is simply too much work to change your application’s architecture. In this case, making a change to a database can have a knock-on effect on other applications that use the database. First of all, it is important to test such changes in an orchestrated environment: in other words, in an environment in which the database is reasonably production-like, and which hosts versions of the other applications that use it. Such an environment is often known as a systems integration testing (SIT) environment, or alternatively staging. In this way, assuming tests are frequently run against the other applications that use the database, you will soon discover if you have affected another application.
Rollback Databases and Zero Downtime releases
Once you have roll-forward and roll-back scripts for each version of your application, it is relatively easy to use an application at deploy time to migrate your existing database to the correct version required by the version of the application you are deploying.
Rolling Back without Losing Data
In the case of a rollback, your roll-back scripts (as described in the previous section) can usually be designed to preserve any transactions that occur after the upgrade took place.
Decoupling Application Deployment from Database Migration
Another strategy is to decouple the database migration process from the application deployment process and perform them independently.
Configuration Management
Configuration management is all about trying to ensure that the files and software you are expecting to be on a machine are present, configured correctly, and working as intended.
When you have only a single machine, this is fairly simple. When you have five or ten servers, it is still possible to do this manually, but it may take all day. When your infrastructure scales up into the thousands of servers, however, you need a better way of doing things.
Version Control
What is “version control”, and why should you care? Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. For the examples in this book you will use software source code as the files being version controlled, though in reality you can do this with nearly any type of file on a computer.
If you are a graphic or web designer and want to keep every version of an image or layout (which you would most certainly want to), a Version Control System (VCS) is a very wise thing to use. It allows you to revert files back to a previous state, revert the entire project back to a previous state, compare changes over time, see who last modified something that might be causing a problem, who introduced an issue and when, and more. Using a VCS also generally means that if you screw things up or lose files, you can easily recover. In addition, you get all this for very little overhead.
Best Practices of Version Control
- Keep Absolutely Everything in Version Control : Developers should use version control for source code (of course), but also for tests, database scripts, build and deployment scripts, documentation, libraries and configuration files for their applications.
- Check In Regularly to Trunk : Once changes are checked into version control, they are available to the entire team.
- Use Meaningful Commit Messages : Always write detailed, multi-paragraph commit messages when you check in. This might save hours of debugging later if an error occurs. In a multi-paragraph commit message, the first paragraph is a high-level summary and the remaining paragraphs give the full details.
Managing Components and Dependencies
- Managing External Libraries : External libraries usually come in binary form, unless you are using an interpreted language. There are two reasonable ways of managing them.
o Check them into version control. This approach is the simplest and works fine for small projects. For larger projects and larger libraries, however, it may make the version control repository too heavy to be viable.
o Declare them and use a tool like Maven or Ivy to download libraries from Internet repositories to your own artifact repository.
- Managing Components : It is a good practice to split your application into smaller components. Doing so limits the scope of the changes to your application, reducing regression bugs. Also it encourages reuse and enables a much more efficient development process on large projects.
Managing Software Configuration
Configuration is one of the three key parts that comprise an application, along with its binaries and its data. Configuration information can be used to change the behavior of software at build time, deploy time and run time. Delivery teams need to consider carefully what configuration options should be available, how to manage them throughout the application's life, and how to ensure that configuration is managed consistently across components, applications and technologies. You should treat the configuration of the system in the same way you treat your code: subject it to proper management and testing. There are three questions to consider when managing your application configuration:
- How do you represent your configuration information?
- How do your deployment scripts access it?
- How does it vary between environments, applications and versions of applications?
Each configuration setting can be modeled as a tuple (a data structure consisting of multiple parts, typically an ordered set of values). The set of tuples available and their values typically depend on three things:
- The Application
- The version of the application
- The environment it runs on
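One way to picture this is a lookup keyed on those three things. The sketch below is an assumption about how such a model might look, not a prescribed design: ConfigKey, SETTINGS, the billing application and the host names are all invented for illustration.

```python
from typing import Dict, NamedTuple


class ConfigKey(NamedTuple):
    """The three things a configuration value depends on."""
    application: str
    version: str
    environment: str


# Hypothetical settings for one application in two environments.
SETTINGS: Dict[ConfigKey, Dict[str, str]] = {
    ConfigKey("billing", "1.4", "staging"): {
        "db_host": "db.staging.example",
        "log_level": "DEBUG",
    },
    ConfigKey("billing", "1.4", "production"): {
        "db_host": "db.prod.example",
        "log_level": "WARNING",
    },
}


def lookup(application: str, version: str, environment: str, name: str) -> str:
    """Return one setting, failing loudly if it has not been defined."""
    key = ConfigKey(application, version, environment)
    try:
        return SETTINGS[key][name]
    except KeyError:
        raise KeyError(f"no value for {name!r} under {key}") from None
```

Failing loudly on a missing combination is a deliberate choice here: a deployment script that silently falls back to a default can hide the fact that an environment was never configured.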
Principles of managing Software Configurations
Some of the principles of managing configuration are as follows:
- Consider where in your application lifecycle it makes sense to inject a particular piece of configuration.
- Keep the available configuration options for your application in the same repository as its source code.
- Values of the configuration should be managed separately.
- Configuration should always be performed by an automated process using values taken from your configuration repository.
- Use clear naming conventions and avoid obscure ones.
- Do not repeat the information.
- Be minimalist. Keep the configuration information as simple as possible.
- Avoid over-engineering the configuration system.
- Ensure you have tests for your configurations.
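The last principle, testing your configuration, can be sketched as a validation function that a pipeline stage runs before deployment. The required keys, the valid log levels and the port rule below are illustrative assumptions; your own rules will depend on what your application actually reads.

```python
from typing import Dict, List

# Hypothetical rules for one application's configuration.
REQUIRED_KEYS = {"db_host", "db_port", "log_level"}
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def validate_config(config: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the configuration passed."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(config))]
    port = config.get("db_port")
    if port is not None and not (port.isdecimal() and 1 <= int(port) <= 65535):
        problems.append(f"db_port out of range: {port}")
    level = config.get("log_level")
    if level is not None and level not in VALID_LOG_LEVELS:
        problems.append(f"invalid log_level: {level}")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets whoever maintains the configuration fix a broken file in a single pass.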
Please refer to the following links to know more about DevOps Infrastructure