Introduction to Continuous Integration

Sometimes you have the right people, a good language and good frameworks, yet working on the project is a mess: nobody knows what will happen when you update the code and try to compile it or run any functionality.

When you are working in a team on big, complex systems and trying to meet deadlines, there are certain issues that you have to avoid. Tired of facing the same problems, some years ago I started researching Continuous Integration as the key to reaching a new level of quality.

What problems could be solved?

Integration tends to be an unpredictable process.

I remember one of my first assignments at university (Algorithms III): I coded a chess game with another student. After one or two weeks we had all the code. Unfortunately, it took around twice that time to make both parts of the system run together. It was only after doing some unit and system testing (which we also learned in that course) that the system worked. The same situation tends to happen in important software companies, where individual portions of a system manage to break all of it, or some of its functionality, sometimes very frequently.

Merging code from different sources makes bugs hard to detect.

Sometimes we realize that there is a flaw in some functionality that appeared some (unknown) time before. Was it caused by an external component or by in-house code? In any case, no one should waste time hunting for a (usually) preventable problem.

Untouchable code.

Ugly code survives in a project because of the fear of breaking something. This is a common issue, especially in old systems where you can find different “layers” of functionality. These layers don’t follow MVC or any other pattern, only the age of the code. So if new functionality has to be added, a hack will usually be used. If a portion of code deserves a redesign or, better, a refactoring, none of this will happen, for fear of breaking some functionality in that intricate, bloated code. “If it works, don’t touch it” is the common phrase that sums up all the missed opportunities to enhance a system or, better, to learn something new from breaking it.

Clearly, if we have more information about exactly what still works correctly after a modification, we can touch things with more confidence.

More problems

svn update / p4 sync / git pull. After an update, the system starts failing!

It doesn’t matter which software configuration management (SCM) tool you are using. If there is no automatic way to ensure the quality of the software and, more importantly, the quality of the artifacts you download from the repository, you can end up with compilation or runtime errors after updating. An SCM tool is the proper way to track changes and roll back to a previous state, but if you don’t get automatic information about exactly what is going on in each revision, you can spend a lot of time figuring out the true state of the latest version.

No one noticed that XYZ functionality stopped working weeks ago.

Stealthy like a ninja, a bug can be added and go unnoticed for a long time. This can ruin trust in a system. Usually the solution is to add some layers of tests around the code, but unless they are executed regularly, without relying on previously built artifacts, special environment configurations or data schemas, they can’t be as helpful as they should be. A Continuous Integration system can help with this.

No one knows what change broke the code.

If we already have a repository where we commit and get updates, a Continuous Integration system can help by tracking and testing every single revision of our system.

Delays. A one-hour task ends up lasting half a day.

Usually you feel (and describe) something as an easy and quick task. But many times the code manages to surprise you, wrecking your estimates and your boss’s expectations. Software shouldn’t be unpredictable, and you can simply let a third party (a CI system) test every single commit to the repository.
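As a rough illustration of why per-revision results help, here is a minimal sketch in Python. It assumes the CI system keeps an ordered list of (revision, passed) records; the function name `breaking_revisions` and the revision identifiers are hypothetical and not taken from any particular CI tool.

```python
from typing import List, Tuple


def breaking_revisions(history: List[Tuple[str, bool]]) -> List[str]:
    """Return the revisions where the build went from passing to failing.

    `history` holds (revision_id, passed) pairs in commit order.
    The state before the first record is assumed to be passing.
    """
    broken = []
    previous_passed = True
    for revision, passed in history:
        # A revision "broke the build" only if the one before it was healthy.
        if previous_passed and not passed:
            broken.append(revision)
        previous_passed = passed
    return broken


if __name__ == "__main__":
    # Hypothetical build history, oldest revision first.
    sample = [("r101", True), ("r102", True), ("r103", False), ("r104", False)]
    print(breaking_revisions(sample))
```

Because every revision has its own result, the culprit is the first failing revision after a passing one, and nobody has to guess which of several changes caused the breakage.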

Solutions

Catch the problems as soon as they happen.

A proactive attitude toward this kind of problem is not just to say “we have to be very careful with these errors”, as I have heard many times, because it doesn’t matter how careful you are when crossing an avenue that lacks a traffic light. We should accept that human errors will always happen, but with the proper process we can address most of the issues named above.

Treat integration as a non-event.

Working in a group or with different components means that, at least sometimes, we will have to merge different pieces of code. Instead of postponing it to some point in the future, we should treat integration as a “continuous” event. That’s where the name comes from.

Origin of Continuous Integration

Some of the earliest references to Continuous Integration techniques appear in Grady Booch’s book “Designing Strategies for Object Technology” (1997):

“In the context of continuously integrating a system’s architecture, establish a projects rhythm by driving to closure certain artifacts at regular intervals”

“Rather than set aside a period of formal testing at the end of the life-cycle, the object-orientated life cycle tends to integrate the parts of software (and possibly hardware) at more regular intervals…”

“..at regular intervals, the process of continuous integration yields executable releases that grow in functionality each release…”

But it was with the rise of Extreme Programming (XP) that this and other techniques gained traction.

From the Agile Manifesto

“Our highest priority is to satisfy the customer through early and continuous delivery of valuable software…”

The first of the principles behind the Agile Manifesto describes software development as a flow of increasing value for the customer. I can’t imagine a better way of accomplishing this than with a Continuous Integration system.

From the XP (Extreme Programming) rules.

  • Continuous integration
  • Design improvement
  • Small releases

These three XP rules are grouped together as “Continuous Process”, pointing not only to the use of Continuous Integration but also to some practices around it, such as taking the opportunity to refactor and redesign while making a correction. For instance, if our CI server reports a mistake, we could refactor the fix into a general function used wherever it makes sense, or, if we see an error in lengthy, complex code or a large class, we could split it into smaller, cohesive pieces. Of course there are other, less obvious opportunities to redesign, and XP tells us to do so whenever it clearly adds value to the system, while a CI system lets us know the health of the system after the refactoring.

Another good technique in a symbiotic relationship with CI is the custom of making small and frequent releases, simply to avoid looking for errors in a lengthy bunch of lines and files of code.

So, what is Continuous Integration?

At this point we surely have a good picture of what Continuous Integration is. Lacking a proper taxonomy in CS, what I have collected as the main principles of this practice includes:

Team members integrate their work early and often

As the name suggests, there should be a continuous flow of commits to the code repository; one commit per developer per day is a good target to start with.

Each integration is verified by an automated build

There should be something acting as the Continuous Integration server that takes the latest state of the code and builds it in an independent environment. This process should also run as many tests as possible, so that we get better knowledge of the state of the system.

It is much more than installing a tool

As always, there are no silver bullets, and implementing Continuous Integration is much more than choosing and installing a CI server. Of course such a tool can be a great help, but we need to be sure that we are following all the practices so that the process is a real gain for our team and our product.
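To make the idea of an automated build concrete, here is a minimal sketch in Python of what such a server does at its core: run a list of steps in order and stop at the first failure. The function name `run_build` and the example steps are assumptions for illustration; a real project would plug in its own compile and test commands.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple


def run_build(steps: Sequence[Tuple[str, List[str]]]) -> Tuple[bool, Optional[str]]:
    """Run each (name, command) step in order and stop at the first failure.

    Returns (True, None) if every step exits with code 0,
    otherwise (False, name_of_the_failed_step).
    """
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            return False, name
    return True, None


if __name__ == "__main__":
    # Hypothetical steps: stand-ins for a real compile and a real test run.
    example_steps = [
        ("compile", [sys.executable, "-c", "print('compiling')"]),
        ("unit tests", [sys.executable, "-c", "print('running tests')"]),
    ]
    succeeded, failed_step = run_build(example_steps)
    print("build ok" if succeeded else f"build failed at: {failed_step}")
```

Stopping at the first failed step keeps the feedback short and points directly at the stage that needs attention.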

Best practices while implementing a CI server

There are some simple practices that will let us get the most out of our CI server:

-Including all elements in the repository.

Without controlling all the elements in a single place, we will never be able to reliably deploy a new instance on our server or to track the changes in our product that could affect any part of it.

-Tests. Included within the build process.

If we only compile the system on the CI server, we will be losing important information and will never find out about any affected functionality. Today there are many xUnit frameworks and automated testing tools capable of adding valuable information to the integration process.

-Use an integration machine, in an isolated and clean environment.

Ideally, a single script should set up all the environment variables, servers, schema instances, libraries and so on, automatically and absolutely from scratch. With this “big bang” way of working, our system will be far more reliable than one that needs special screws and screwdrivers to be adjusted. As a positive side effect, it takes no effort to set up a new hire’s environment or to deploy a new instance, and the team can focus on adding new business value.
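A small part of that “from scratch” idea can be sketched in Python: run a command inside a brand-new, empty working directory with an explicitly declared environment, so nothing depends on leftovers from a developer’s machine. The helper name `run_in_clean_workspace` and the `APP_MODE` variable are hypothetical; a real setup script would also provision servers, schemas and libraries, which this sketch does not attempt.

```python
import os
import subprocess
import sys
import tempfile
from typing import Dict, List, Tuple


def run_in_clean_workspace(command: List[str], env_vars: Dict[str, str]) -> Tuple[int, str]:
    """Run `command` in a fresh temporary directory with a minimal environment.

    Only PATH is inherited from the caller; every other variable must be
    declared explicitly in `env_vars`. Returns (exit_code, captured_stdout).
    """
    with tempfile.TemporaryDirectory() as workspace:
        clean_env = {"PATH": os.environ.get("PATH", "")}
        clean_env.update(env_vars)
        result = subprocess.run(
            command, cwd=workspace, env=clean_env, capture_output=True, text=True
        )
        return result.returncode, result.stdout


if __name__ == "__main__":
    # Hypothetical check: the command sees only what the script declared.
    code, output = run_in_clean_workspace(
        [sys.executable, "-c", "import os; print(os.environ.get('APP_MODE'))"],
        {"APP_MODE": "ci"},
    )
    print(code, output.strip())
```

Anything the build needs but the script did not declare shows up as a failure here, which is exactly the kind of hidden dependency this practice is meant to expose.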

-Frequent submits.

Subdivide the work to find bugs sooner and to communicate to the team what we are doing.

-Fast! Every minute counts. Keep the build fast to optimize everybody’s work time.

The whole process should be prepared to give feedback as soon as possible. If we find that, for example, some of the tests take too long, or a whole compilation takes more than an hour, we could configure both a nightly and a daily integration. While the first takes the long path, the daily integration should catch most of the errors in a few minutes. We should wisely apply the Pareto principle (#6) and include the 20% most important tests in the daily process, aiming to catch 80% of the errors.

-Communication. Everyone should easily know the state of the system and the changes made to it.

A common repository, plus daily submits, plus a Continuous Integration server also works as a communication tool. Everybody on the team will feel better and stay up to date on everybody else’s work.

-Production. Run the integration and tests in a production clone.

It is highly desirable to clone every aspect of production on our CI server, to leave as few points of failure as possible untested.
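One possible way to choose the tests for the fast integration is sketched below in Python. It assumes we have, for each test, its duration and how many failures it has caught in the past; the field names, the sample figures and the function `pick_fast_suite` are all hypothetical. Tests are ranked by failures caught per second and added greedily until a time budget is used up; everything left over belongs to the nightly run.

```python
from typing import Dict, List, Union

TestRecord = Dict[str, Union[str, float]]


def pick_fast_suite(tests: List[TestRecord], budget_seconds: float) -> List[str]:
    """Pick tests for the fast build, most failures caught per second first.

    Each record needs 'name', 'seconds' (must be > 0) and 'failures_caught'.
    A test is skipped if adding it would exceed the time budget.
    """
    ranked = sorted(
        tests, key=lambda t: t["failures_caught"] / t["seconds"], reverse=True
    )
    chosen, used = [], 0.0
    for test in ranked:
        if used + test["seconds"] <= budget_seconds:
            chosen.append(test["name"])
            used += test["seconds"]
    return chosen


if __name__ == "__main__":
    # Hypothetical history of test durations and failures caught.
    sample = [
        {"name": "login", "seconds": 10, "failures_caught": 8},
        {"name": "report_export", "seconds": 600, "failures_caught": 3},
        {"name": "checkout", "seconds": 30, "failures_caught": 12},
    ]
    print(pick_fast_suite(sample, budget_seconds=60))
```

This greedy ranking is only a heuristic, not an optimal selection, but it matches the spirit of the practice: spend the few minutes of the fast build on the tests most likely to catch something.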

Benefits

Some of the benefits of implementing this practice:

  • Reduced risks.
  • Less debugging, since you can always return to a known working state.
  • Fact. Projects that use CI techniques usually have fewer bugs.
  • You will always be developing on a known stable base.
  • Solution to the Broken Window syndrome.
  • Frequent deployment.
  • Early warnings.
  • Integration is a natural continuous process.
  • Constant availability of a “current” build for testing, demo, or release purposes.
  • Incentive for developers to code incrementally.
  • Quality, thanks to metrics generated from automated testing and CI (such as code coverage, code complexity and features completed).

References:
