Sunday, May 1, 2011

Why is build automation important?

Build automation is the lifeblood of any software organization. When it works, everyone can practically ignore it and focus on the bottom line. When it doesn't work, it affects every decision made by each person in the organization.

The biggest observable problems a build can have are:
  1. Slow incremental build speed.
  2. Poorly designed build scripts.
Slow incremental builds can be caused by many factors. False dependencies cause targets to be rebuilt even though nothing they actually use has changed. Bad or missing dependencies create mistrust of the build, and require a full rebuild (maximum cost!) to work around.
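
To make the two failure modes concrete, here is a minimal sketch of a timestamp-based staleness check. The file names and modification times are made up for illustration, and real build tools use more elaborate checks than this.

```python
def needs_rebuild(target, deps, mtimes):
    """Return True if target is missing or older than any declared dependency."""
    if target not in mtimes:
        return True
    return any(mtimes[d] > mtimes[target] for d in deps)

# Hypothetical modification times (larger = newer).
mtimes = {"main.c": 1, "util.h": 5, "README": 9, "main.o": 3}

# Correct dependency list: util.h changed after main.o was built.
correct = needs_rebuild("main.o", ["main.c", "util.h"], mtimes)

# Missing dependency: util.h is not declared, so the stale main.o is kept.
missing = needs_rebuild("main.o", ["main.c"], mtimes)

# False dependency: nothing main.o really uses has changed (util.h is old),
# but an unrelated README edit forces a rebuild anyway.
mtimes_quiet = {"main.c": 1, "util.h": 2, "README": 9, "main.o": 3}
false_dep = needs_rebuild("main.o", ["main.c", "util.h", "README"], mtimes_quiet)

print(correct, missing, false_dep)
```

The missing-dependency case is the dangerous one: the check reports that nothing needs rebuilding, the output is wrong, and the only workaround a developer trusts is a full rebuild.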

Poorly designed build scripts are the other big problem. Build scripts are not amenable to change if a large number of files must be edited in order to re-arrange code, branch a module within the overall repo, or add a custom build step. A high degree of coupling between modules through the build system is another source of maintenance issues.

Slow builds bring individual developer productivity to a halt.
Developers work on an iteration cycle of repeat { edit -> build -> test }. The faster a dev can iterate, the faster they can converge upon a solution to a given problem. When build and test time are zero, then you are purely limited by developer editing speed -- that is the ultimate goal.
Qualitatively, who do you think will achieve more success: a team with an incremental build time of 1 second, 30 seconds, or 2 hours?
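
A rough back-of-the-envelope version of that question, as a sketch. The 8-hour day, the 60 seconds of editing per iteration, and the zero test time are all assumptions picked for illustration, not measurements.

```python
WORKDAY_SECONDS = 8 * 60 * 60  # assumed 8-hour working day
EDIT_SECONDS = 60              # assumed editing time per iteration
TEST_SECONDS = 0               # assumed free, to isolate the build cost

def iterations_per_day(build_seconds):
    """Whole edit -> build -> test cycles that fit in one working day."""
    cycle = EDIT_SECONDS + build_seconds + TEST_SECONDS
    return WORKDAY_SECONDS // cycle

for label, build_seconds in [("1 second", 1), ("30 seconds", 30), ("2 hours", 2 * 60 * 60)]:
    print(label, "->", iterations_per_day(build_seconds), "iterations per day")
```

Under these assumptions the three teams get 472, 320, and 3 iterations per day respectively.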

Slow builds discourage active code-reviews and participation.
In my experience, requiring code-review before committing is a huge determinant of code quality. In this kind of successful environment, reviewers must apply patches, build the software, and potentially tweak it as they find bugs.
What is the single greatest deterrent to applying a patch? The impending doom of a 2-hour build, of course! A 2-hour build forces each developer to think, "Do I get to work on my own objectives today, or will I lose the day to this review?"

Slow builds discourage good source-control practices.
You know how everyone says to keep commits minimal and orthogonal? That's really hard to do when each one-line fix costs you 2 hours to verify! So let's say you make the decision to commit without building first (if you're awesome enough to try this, of course). There's a good chance that the continuous build server will do something dumb and spend 2 hours on each of your three one-liner back-to-back commits.
Long build times tend to encourage developers to lump larger changes into single commits. And that's a bad thing.

Slow builds discourage developers from updating from source-control as often.
If it costs 2 hours of build time after you update/sync/rebase to the latest version of the code-base, then at most you will do this once per day.
Good-bye to best practices like syncing and building one last time before committing your code, which of course implies that the build will be broken a higher percentage of the time.

Slow incremental builds make it harder to pin down build failures.
When the build takes 2 hours, continuous build servers will always be processing large batches of commits at a time. So when the build breaks due to one of the commits, you have a slew of problems:
  1. You found out at best 2 hours after the commit occurred. More likely, you found out after 4 hours (2 hours for previous build to complete + 2 hours for the breaking change).
  2. Which commit is it? You probably have 10 commits to look through.
  3. Back-to-back breaking changes (particularly by different submitters) are very hard to work through.
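
To put a number on "which commit is it?", here is a sketch of a binary search over a batch of commits. It assumes exactly one commit in the batch broke the build, so the results go pass, pass, ..., fail, fail. The `build_passes` callback and the commit names are hypothetical stand-ins for a real build.

```python
def first_bad_commit(commits, build_passes):
    """Binary-search for the first commit that breaks the build.

    Assumes the build passed before commits[0] and fails with all of
    them applied. build_passes(n) is a hypothetical callback that builds
    the tree with the first n commits applied and returns True on success.
    Returns (culprit, number_of_builds_run).
    """
    lo, hi = 0, len(commits)  # lo commits applied: passes; hi applied: fails
    builds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        builds += 1
        if build_passes(mid):
            lo = mid
        else:
            hi = mid
    return commits[hi - 1], builds

# A batch of 10 commits where the 7th one is the breaking change.
commits = [f"c{i}" for i in range(1, 11)]
culprit, builds = first_bad_commit(commits, lambda n: n < 7)
print(culprit, builds)
```

Here the search needs 3 builds to land on c7. At 2 hours per build that is another 6 hours on top of the time it took to notice the breakage, and the search only finds the first breaking change, so back-to-back breakages mean doing it again.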
If a developer leaves work at time T, he will not commit code after T - buildTime. Therefore, with D developers, you are losing O(D*buildTime) time on your team.
Developers need to be around in case their commits break the build. If a build takes 2 hours, the developer will be sitting on a set of changes for 2 hours at the end of each day! What a waste of time. Hopefully he has some orthogonal piece of work, or else he's literally blocked.

Slow builds discourage the creation of multiple branches.
You're telling me I have to integrate these N changes into the Developer and Release branches, and build both? Woe unto me if the build mucks with some system-global state -- then I can't even run the two branches' builds in parallel!

Inflexible build scripts lead to more bad practices.
Example: if you can't easily add a code-generation target to the build, you might just check in the generated files. Now you have another maintenance problem!

Build scripts that don't implement 'clean' correctly have a high cost.
When 'clean' doesn't work, then developers are always manually deleting stuff in the file system.
Couple this with bad or missing file-level dependency checking, and you find yourself manually deleting files and rebuilding on every edit. Ouch.

Hopefully all these examples have convinced you of the importance of a correct, fast, and maintainable build.

And finally, an analogy to make this all feel a bit more "real".
Consider each developer to be a thread; the build server and repository server are also threads. Each thread has a copy of the repo in its TLS (thread-local storage).
The repository has a commit FIFO (a write FIFO); developer threads produce commits and enqueue them in some order. There is a similar FIFO on the build server, which the repository server pushes into after each commit has gone through.
The build time determines the frequency at which developer threads produce commits. Faster build time implies greater throughput to the build server. The build time also determines the build server's frequency of producing a result.
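
The build-server half of the analogy can be sketched as a small deterministic simulation instead of real threads. The batching policy (take everything queued whenever the server goes idle) and the commit schedule are assumptions chosen for illustration.

```python
from collections import deque

def simulate(commit_times, build_time):
    """Simulate one build server draining a commit FIFO.

    commit_times: sorted times (in hours) at which commits are enqueued.
    Whenever the server is idle and the FIFO is non-empty, it takes every
    queued commit as one batch and stays busy for build_time hours.
    Returns a list of (finish_time, batch) tuples.
    """
    fifo = deque(commit_times)
    results = []
    now = 0.0
    while fifo:
        now = max(now, fifo[0])  # idle until the next commit arrives
        batch = []
        while fifo and fifo[0] <= now:
            batch.append(fifo.popleft())
        now += build_time
        results.append((now, batch))
    return results

def worst_latency(results):
    """Longest wait between a commit landing and its build result."""
    return max(finish - batch[0] for finish, batch in results)

# Ten commits, one every half hour.
commit_times = [i * 0.5 for i in range(10)]

slow = simulate(commit_times, build_time=2.0)
fast = simulate(commit_times, build_time=0.25)

print([len(batch) for _, batch in slow], worst_latency(slow))
print([len(batch) for _, batch in fast], worst_latency(fast))
```

With the 2-hour build the server ends up processing batches of 1, 4, 4, and 1 commits, and the unluckiest commit waits 3.5 hours for a result. With the 15-minute build every commit gets its own build and an answer in 0.25 hours.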

I leave you with this mighty pbrush creation:

