Part VI: The Technical Practices of Integrating Information Security, Change Management, and Compliance
The DevOps goal is to make security a part of everyone's job. We'll look for opportunities to augment our controls with auditable automation, minimizing the need for the separation of duties and change approvals that unnecessarily impede the value chain. Once automated and baked into everyone's daily work, the controls are less variable, more auditable, and significantly stronger than the manual controls they replace. Some critical controls will remain manual.
We do this by:
- Making security a part of everyone’s job
- Integrating preventative controls into our shared source code repository
- Integrating security with our deployment pipeline
- Integrating security with our telemetry to better enable detection and recovery
- Protecting our deployment pipeline
- Integrating our deployment activities with our change approval processes
- Reducing reliance on separation of duties
Information security as everyone's job every day
We integrate security:
- Into development iteration demos - security requirements appear either as their own stories or as acceptance criteria on all relevant stories
- Into defect tracking and post-mortems - security defects should be tracked alongside all other defects, and both security incidents and the security implications of any incident are subject to post-mortem review
We must concern ourselves not only with application and data center security but with end-to-end value chain security.
Like all of our code, whether application, infrastructure, or operations, security capabilities reside in our shared source code repository, making them easy to find, understand, and use. We may include items such as:
- Code libraries and their recommended configurations (e.g., 2FA [two-factor authentication library], bcrypt password hashing, logging)
- Secret management (e.g., connection settings, encryption keys) using tools such as Vault, sneaker, Keywhiz, credstash, Trousseau, Red October, etc.
- OS packages and builds (e.g., NTP for time syncing, secure versions of OpenSSL with correct configurations, OSSEC or Tripwire for file integrity monitoring, syslog configuration to ensure logging of critical security events into our centralized ELK stack)
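The text names bcrypt as an approved password-hashing library. As a minimal sketch of the kind of blessed helper a security team might publish in the shared repository, here is a stdlib stand-in using PBKDF2-HMAC-SHA256 (the function names and iteration count are assumptions, not a prescribed API):

```python
import hashlib
import hmac
import os

# Hypothetical helper a security team might publish in the shared repo.
# PBKDF2-HMAC-SHA256 stands in here for the bcrypt library named in the text.
ITERATIONS = 600_000  # assumption: tune per current hashing guidance

def hash_password(password: str) -> bytes:
    """Return salt || derived key, suitable for storage."""
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt + dk

def verify_password(password: str, stored: bytes) -> bool:
    """Recompute the derived key and compare in constant time."""
    salt, dk = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, dk)
```

Publishing one such helper, with its recommended configuration baked in, is what makes the secure path also the easy path.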
Next, we'll integrate security into our deployment pipeline by including as many automated security tests as we can to run alongside all our other automated tests.
Tools such as Gauntlt have been designed to integrate into deployment pipelines and run automated security tests against our applications, our application dependencies, our environment, etc. Notably, Gauntlt puts all its security tests in Gherkin syntax test scripts, which is widely used by developers for unit and functional testing, so security testing lands in a framework developers are likely already familiar with. This also allows security tests to run easily in a deployment pipeline on every committed change, whether static code analysis, checking for vulnerable dependencies, or dynamic testing.
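As an illustration, a Gauntlt attack file is plain Gherkin; a minimal sketch (the hostname and expected ports are assumptions) might look like:

```gherkin
Feature: Verify that only expected ports are open
  Scenario: Fast nmap scan of the web tier
    Given "nmap" is installed
    And the following profile:
      | name     | value           |
      | hostname | web.example.com |
    When I launch an "nmap" attack with:
      """
      nmap -F <hostname>
      """
    Then the output should match /80\/tcp\s+open/
    And the output should not match /25\/tcp\s+open/
```

Because this reads like any other Gherkin scenario, developers can review and extend security tests the same way they do functional ones.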
Ensure the security of the application
Developers, often focused on happy-path tests of correctness, will need security training to ensure they also write sad-path and bad-path automated tests, using tools such as:
- Static analysis - examines code at rest; tools include Brakeman and Code Climate
- Dynamic analysis - focuses on runtime behavior; tools include Arachni, OWASP ZAP, Nmap, and Metasploit
- Dependency scanning - checks for malicious or vulnerable binaries
- Source code integrity and signing - all developers are identified and use a security key (e.g., PGP), and all packages generated by continuous integration should be signed and inventoried for auditability
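In practice, signing is done with tools such as GPG; as a simplified sketch of the inventory half of that control, a CI step might record each artifact's digest so packages can later be checked for tampering (the function names are hypothetical, and a digest inventory is a stand-in for real cryptographic signatures):

```python
import hashlib
from pathlib import Path

# Hypothetical CI step: record each build artifact's digest in an
# inventory so packages can later be verified for integrity.
# Real pipelines would sign artifacts (e.g., with GPG) rather than
# rely on digests alone.
def record_artifact(path: Path, inventory: dict) -> str:
    """Compute and store the artifact's SHA-256 digest."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    inventory[path.name] = digest
    return digest

def verify_artifact(path: Path, inventory: dict) -> bool:
    """Recompute the digest and compare against the inventory record."""
    return inventory.get(path.name) == hashlib.sha256(path.read_bytes()).hexdigest()
```

Keeping the inventory in version control gives auditors a durable record of exactly which binaries were produced by which build.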
Ensure the security of our environments
In this step, we should do whatever is required to help ensure that the environments are in a hardened, risk-reduced state. Although we may have created known, good configurations already, we must put in monitoring controls to ensure that all production instances match these known good states.
We do this by generating automated tests to ensure that all appropriate settings have been correctly applied for configuration hardening, database security settings, key lengths, and so forth. Furthermore, we will use tests to scan our environments for known vulnerabilities.
Another category of security verification is understanding actual environments (i.e., “as they actually are”). Examples of tools for this include Nmap to ensure that only expected ports are open and Metasploit to ensure that we’ve adequately hardened our environments against known vulnerabilities, such as scanning with SQL injection attacks. The output of these tools should be put into our artifact repository and compared with the previous version as part of our functional testing process. Doing this will help us detect any undesirable changes as soon as they occur.
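The comparison step above can be sketched simply: parse the open ports out of each scan's output and diff the current scan against the baseline stored in the artifact repository (the parsing regex assumes nmap-style `PORT/tcp open` lines):

```python
import re

# Sketch: compare the current scan's open ports against the baseline
# stored in the artifact repository, flagging any drift.
PORT_RE = re.compile(r"^(\d+)/tcp\s+open", re.MULTILINE)

def open_ports(nmap_output: str) -> set[int]:
    """Extract open TCP port numbers from nmap-style text output."""
    return {int(m) for m in PORT_RE.findall(nmap_output)}

def port_drift(baseline: str, current: str) -> tuple[set[int], set[int]]:
    """Return (newly opened, newly closed) ports versus the baseline."""
    before, after = open_ports(baseline), open_ports(current)
    return after - before, before - after
```

Any non-empty drift is a signal worth investigating, whether it came from an approved change or an intrusion.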
Incorporate security into telemetry
We should create telemetry for security-relevant events, such as:
- OS changes (e.g., in production, in our build infrastructure)
- Security group changes
- Changes to configurations (e.g., OSSEC, Puppet, Chef, Tripwire)
- Cloud infrastructure changes (e.g., VPC, security groups, users and privileges)
- XSS attempts (i.e., “cross-site scripting attacks”)
- SQLi attempts (i.e., “SQL injection attacks”)
- Web server errors (e.g., 4XX and 5XX errors)
- Successful and unsuccessful user logins
- User password resets
- User email address resets
- User credit card changes
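Telemetry like the login events above becomes actionable once we aggregate it; a minimal detector might count unsuccessful logins per user and flag anyone over a threshold (the event schema and threshold are assumptions for illustration):

```python
from collections import Counter

# Sketch: count unsuccessful logins per user from structured telemetry
# events and flag users exceeding a threshold.
FAILED_LOGIN_THRESHOLD = 5  # assumption: tune to your traffic patterns

def suspicious_users(events: list[dict]) -> set[str]:
    """Return users whose failed-login count meets or exceeds the threshold."""
    failures = Counter(
        e["user"] for e in events
        if e.get("type") == "login" and not e.get("success")
    )
    return {u for u, n in failures.items() if n >= FAILED_LOGIN_THRESHOLD}
```

Feeding alerts like this into the same dashboards everyone already watches is what makes detection and recovery a shared responsibility.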
Protect our development pipeline
- Hardening continuous build and integration servers and ensuring we can reproduce them in an automated manner
- Reviewing all changes introduced into version control
- Instrumenting our repository to detect when test code contains suspicious API calls
- Ensuring every CI process runs on its own isolated container or VM
- Ensuring the version control credentials used by the CI system are read-only
Protect our deployment pipeline
Integrate security and compliance into the change approval processes
Change management processes typically address three types of changes:
- Standard - low risk, may be pre-approved
- Normal - higher risk, typically requiring multi-party review
- Urgent - high risk, often requiring executive approval
Our goal is to demonstrate that, as a result of all the automated and manual controls we have in place, the large majority of changes are standard changes, and similarly that many urgent changes can be treated as normal changes.
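One way to make that demonstration concrete is to classify each change from automated evidence; a minimal sketch (the field names and routing rules are assumptions, not a prescribed schema) might look like:

```python
# Sketch: route a change to a risk tier based on automated evidence.
# Field names and rules are assumptions, not a prescribed schema.
def classify_change(change: dict) -> str:
    if change.get("emergency"):
        return "urgent"      # high risk, often needs executive approval
    if change.get("tests_passed") and change.get("peer_reviewed") \
            and not change.get("touches_security_config"):
        return "standard"    # low risk, may be pre-approved
    return "normal"          # higher risk, needs multi-party review
```

Capturing the classification decision alongside the evidence gives auditors a traceable record of why each change took the path it did.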
Reduce reliance on separation of duty controls
When we did production deployments less frequently (e.g., annually) and when our work was less complex, compartmentalizing our work and doing hand-offs were tenable ways of conducting business. However, as complexity and deployment frequency increase, performing production deployments successfully increasingly requires everyone in the value stream to quickly see the outcomes of their actions.
Separation of duty often can impede this by slowing down and reducing the feedback engineers receive on their work. This prevents engineers from taking full responsibility for the quality of their work and reduces a firm’s ability to create organizational learning.
Consequently, wherever possible, we should avoid using separation of duties as a control. Instead, we should choose controls such as pair programming, continuous inspection of code check-ins, and code review. These controls can give us the necessary reassurance about the quality of our work. Furthermore, by putting these controls in place, if separation of duties is required, we can show that we achieve equivalent outcomes with the controls we have created.
To accomplish this, we need to ensure we have documentation and proof for auditors and compliance officers.
As technology organizations increasingly adopt DevOps patterns, there is more tension than ever between IT and audit. These new DevOps patterns challenge traditional thinking about auditing, controls, and risk mitigation.
As Bill Shinn, a principal security solutions architect at Amazon Web Services, observes, “DevOps is all about bridging the gap between Dev and Ops. In some ways, the challenge of bridging the gap between DevOps and auditors and compliance officers is even larger. For instance, how many auditors can read code and how many developers have read NIST 800-37 or the Gramm-Leach-Bliley Act? That creates a gap of knowledge, and the DevOps community needs to help bridge that gap.”