Thursday, May 5, 2011

Continuous Integration for Sitecore projects

Continuous Integration is becoming an essential part of efficient development. At Sitecore we understood this quite some time ago, and we are now even looking for a CI specialist who will maintain our build framework full-time.

However, the goal of this article is not to describe CI at Sitecore, but to find the best CI practices for all Sitecore-based solutions.

What distinguishes Sitecore solutions from other solutions?

  • Every solution is a “superstructure” on top of the CMS
  • A solution contains not only files but also content items, which cannot simply be copied as files
  • Heavy dependency on the underlying CMS version
  • The steps related to the CMS-based part of a solution are similar from one solution to another

Let’s see how to make building Sitecore solutions effective. I’ve grouped the principles into a few categories depending on where they are applied. We follow these principles at Sitecore.
Hereafter I assume CruiseControl.NET, nAnt and SVN, though the principles below should be valid for any set of tools. The CC.NET + nAnt + SVN bundle looks optimal since it is free, easy to configure and still actively developed.


Architecture of CI process

1.      Separate build server and project scripts; treat the build server as a live project
Reuse common operations that are not specific to a particular project, e.g. checkout, cleanup, etc. The best way is to leave them on the build machine. This way you not only avoid duplicating that code, but can also maintain this part separately and be sure it works for any project.
A build server is a machine which:
o   contains an up-to-date build framework
o   is responsible for common actions
o   calls the project build script
o   evaluates the result of the project build script execution
If you treat the build framework as a regular project and store it in a version control system, you will not need to take care of deploying changes to the build server. Add a project on the build server which builds the build framework. If you change its source, you can be sure the change will be picked up by each server and “deployed” locally.
The same applies to configuration files: store config files in the repository and let CC.NET update its own configuration. When you need to change the server’s configuration, you can do that by committing a new configuration file to the repository.
A side effect of this approach is security: you don’t need access to a build server to manage it. Since build servers deal with source code, they should be protected appropriately. You can store config files in a less secured repository (the build server reads it, not vice versa) and therefore grant access to more people, so that you are not a bottleneck and do not have to manage the server over a VPN connection. The approach is described in detail at http://confluence.public.thoughtworks.org/display/CCNET/Configure+CruiseControl.Net+to+Automatically+Update+its+Config+File
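
To illustrate the split of responsibilities described above, here is a minimal Python sketch (not our actual framework, which is CC.NET + nAnt). The step names in COMMON_STEPS and the helpers run_build and run_process are hypothetical; the only point is that the server owns the common steps, calls the project's script and judges the result by its exit code.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical common steps owned by the build server, not by any project.
COMMON_STEPS = ("checkout", "cleanup")


def run_build(project_command: Sequence[str],
              run_step: Callable[[str], None],
              run_command: Callable[[Sequence[str]], int]) -> bool:
    """Run the server-owned steps, then the project's own build script.

    Returns True when the project script exits with code 0.
    """
    for step in COMMON_STEPS:
        run_step(step)
    return run_command(project_command) == 0


def run_process(command: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.call(list(command))
```

A call could look like run_build(["nant", "-buildfile:project.build"], do_step, run_process), where do_step is whatever the server uses to perform a named common step and project.build is an assumed file name.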

2.      Use CC.NET config validation
When you make the build server update its own configuration, be sure to validate the new config file first; otherwise the whole server goes down. The ccnet.exe console application has a -validate option, which lets you check that CC.NET can be started with a given configuration file.
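
A "validate first, then replace" step could be sketched in Python as follows. This is only an illustration: install_config and ccnet_validate are hypothetical helpers, ccnet.exe is assumed to be on the PATH, and treating its exit code as the validation result is an assumption you should verify against your CC.NET version.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable


def ccnet_validate(config_path: Path) -> bool:
    """Ask the CC.NET console to validate a config file.

    Assumption: a zero exit code means the configuration is valid.
    """
    result = subprocess.run(["ccnet.exe", f"-config:{config_path}", "-validate"])
    return result.returncode == 0


def install_config(candidate: Path, live: Path,
                   validate: Callable[[Path], bool] = ccnet_validate) -> bool:
    """Copy the candidate over the live config only if it validates."""
    if not validate(candidate):
        return False  # the running configuration stays untouched
    shutil.copyfile(candidate, live)
    return True
```

The validator is passed in as a parameter so the replace logic can be tried on a developer PC without CC.NET installed.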

3.      Build server scalability
You can make sure that every build server updates the build framework by having a mandatory configuration shared by all servers. This can be achieved by splitting the config files and using the include feature.
Imagine the setup:
svn://server/builds
    |__ ccnet.common.config
    |__ buildserver1.config
    |__ buildserver2.config
    |__ buildserver3.config

Each buildserver*.config file includes the common configuration file.
The name of each *.config file corresponds to the name of a server, so each build server takes the config with its own name.
Adding a new server means restoring it from a disk image and committing a new configuration file to the repository.
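
The "a server takes the config with its own name" rule can be sketched like this in Python. The helper names are hypothetical, and lowercasing the machine name is my assumption for the example, not something CC.NET requires.

```python
import socket
from pathlib import Path


def config_for_host(checkout_dir: Path, hostname: str) -> Path:
    """Map a machine name to its config file in the checked-out builds folder."""
    # Assumption: config files are named after the lowercased host name.
    return checkout_dir / f"{hostname.lower()}.config"


def local_config(checkout_dir: Path) -> Path:
    """Config file for the machine this script is running on."""
    return config_for_host(checkout_dir, socket.gethostname())
```

With this convention a freshly restored server needs no manual setup beyond its machine name.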

4.      Let the project build itself; developers are responsible for their own project
Depending on project size, the build process can either be handed over to a separate integration team or done by developers. In general, the person (or team) responsible for delivery should be responsible for building. Given the size of Sitecore solutions, passing the build script to a separate team is not the most efficient way and leads to slow turnaround. The build script is part of the source code and is maintained by developers (who are in charge of delivery and know best what should be delivered).
The build infrastructure should be the same across a company and not built for each project separately. That is, the company builds the build infrastructure once, and then developers just use its facilities without digging into how the build servers work; they interact with the build infrastructure according to common rules.
The build script is backed up as part of the source code and is therefore never lost.
When you get back to a project later, or something happens to a build server, you should not have to assemble the parts of a project to resume work on it: everything is in one bundle.

5.      Ensure the script runs on developers’ PCs
A good thing about having the build script as part of the source code is that you can build the project on any PC with the proper environment (e.g. nAnt). A developer can test and debug the build script on their own machine before committing it to the source repository (just as they compile before committing).

6.      Ensure progressive build logic
Each project can be covered by several build projects with an increasing number of steps:
-        short one (cruise): checks that a new piece of code doesn’t break compilation of the project trunk (or working branch). It runs on each commit and does a minimal compilation to make sure it is worth running the wider steps.
-        full one (nightly): the complete build, run at night for the project trunk (or working branch). It goes through all the steps necessary for delivery (thanks to the cruise build the code is known to compile) and might include running autotests; the result is usually passed to QA for testing.
-        full one (release): the same as nightly, but can be run for svn tags. Having a separate release build saves the time needed to switch a sandbox to a new location; you can also trigger it by other special conditions (like creating a new tag in the repository).
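
The three build kinds above can be written down as data. This Python sketch is only a model of the idea: the step names "package" and "autotests" and the trigger wording are placeholders I made up for the example, and a real setup would express this in the CC.NET configuration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BuildKind:
    name: str
    source: str               # where the code comes from
    trigger: str              # what starts the build
    steps: Tuple[str, ...]    # ordered steps of the build


CRUISE = BuildKind("cruise", "trunk", "every commit", ("compile",))
NIGHTLY = BuildKind("nightly", "trunk", "nightly schedule",
                    ("compile", "package", "autotests"))
# Release repeats the nightly steps, only against a tag.
RELEASE = BuildKind("release", "tag", "forced or new tag", NIGHTLY.steps)


def is_progressive(narrow: BuildKind, wide: BuildKind) -> bool:
    """True when the wider build starts with the narrower build's steps."""
    return wide.steps[:len(narrow.steps)] == narrow.steps
```

Checking is_progressive(CRUISE, NIGHTLY) is a cheap way to confirm that a green cruise build really tells you something about the nightly one.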

7.      Include automated tests in CI
It is an undoubted disaster when QA returns a build immediately because of silly errors. Try to include as many automated tests as possible so that developers can see the results before QA picks up the build.
With nightly builds, test execution shifts to the night, which saves working time.
A separate server can be used for running tests. It hosts a special test environment and frees the build server for other projects. Consider this setup: the build script copies the project output to a shared folder over the network, Sitecore is installed, and the tests are run (all of this is possible with plain CC.NET + nAnt).

8.      Separate a project and the underlying CMS
It’s often said that a project should be ready to run right after checkout. However, I cannot agree that this is true in all cases. E.g. a CMS upgrade can lead to extensive changes to the source code even though nothing in the project itself has actually changed. If you store only your project files, you can check them out into the website root folder of any CMS installation. Even if you want to tie your code to a specific CMS version, keep the CMS in source control as a ZIP file and make unzipping it part of the build process.
The same applies to configuration files: either use include files or patch the files during the build.
You can also restore items from serialization during the build.
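
The "unzip the CMS, then lay the project on top" step might look like this in Python (3.8 or later, because of dirs_exist_ok). The function name and the folder layout are assumptions for the example; in our setup the same thing would be done with nAnt tasks.

```python
import shutil
import zipfile
from pathlib import Path


def assemble_site(cms_zip: Path, project_dir: Path, site_root: Path) -> None:
    """Unpack the CMS archive and overlay the project files on top of it."""
    site_root.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(cms_zip) as archive:
        archive.extractall(site_root)
    # Project files win over CMS defaults that have the same relative path.
    shutil.copytree(project_dir, site_root, dirs_exist_ok=True)
```

Because the project is copied last, a project-level file silently replaces the CMS file with the same name, which is why include or patch files are the safer choice for configuration.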

9.      Share resources
Create a storage (e.g. an SVN repository) that hosts resources used by several projects. A great example here is the CMS: if several projects use one CMS version as a base, it’s not efficient to keep it along with each project’s source code. Also, svn operations on binary files (checking for modifications) are slow.

10.   Build in the sandbox, avoiding unnecessary copying
I haven’t made any special measurements, but disk operations remain the slowest part, especially with a huge number of small files. Build in the sandbox and clean it up afterwards; that will be much faster than making a complete copy of the sandbox.
Cleanup can be one of the build server’s tasks rather than being repeated in each script. The build server reverts changes in the sandbox before it starts building the project itself.
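
One possible shape of such a server-side cleanup, sketched in Python: revert local modifications, then delete whatever svn status reports as unversioned. The helper names are hypothetical, the svn client is assumed to be on the PATH, and the parsing assumes paths without leading or trailing spaces.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def unversioned_paths(status_output: str) -> List[str]:
    """Extract paths that `svn status` marks with '?' in the first column."""
    return [line[1:].strip()
            for line in status_output.splitlines()
            if line.startswith("?")]


def clean_sandbox(sandbox: Path) -> None:
    """Revert local changes and remove build leftovers from the sandbox."""
    subprocess.run(["svn", "revert", "--recursive", "."], cwd=sandbox, check=True)
    status = subprocess.run(["svn", "status"], cwd=sandbox, check=True,
                            capture_output=True, text=True)
    for relative in unversioned_paths(status.stdout):
        target = sandbox / relative
        if target.is_dir():
            shutil.rmtree(target)
        elif target.exists():
            target.unlink()
```

Only the changed and leftover files are touched, so the cost is proportional to what the build produced rather than to the size of the whole sandbox.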

11.   Special read-only account to access sources
To avoid accidentally breaking the source code and to tighten security, have the build script on the build server access the source code through a special read-only account.
The build server can store cached authorization, so no access credentials are present in the build script.

12.   Launching Sitecore during build
It’s pretty easy to launch Sitecore during a build in order to execute some operations in the Sitecore context using the Sitecore API. Use Microsoft’s free development web server, WebDev.WebServer.exe, on a port reserved for the build server.
To launch Sitecore and call its API without using its backend, you don’t need the complete installation, so you can save time on this step.
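
As a rough sketch, the command line for the development web server can be assembled like this in Python. The /port, /path and /vpath switches are the ones WebDev.WebServer.exe accepts; the location of the executable (it depends on the .NET and Visual Studio version), the port 8099 and the page name used below are assumptions for the example.

```python
from typing import List


def webdev_command(exe: str, site_root: str, port: int, vpath: str = "/") -> List[str]:
    """Build the argument list for starting WebDev.WebServer.exe."""
    return [exe, f"/port:{port}", f"/path:{site_root}", f"/vpath:{vpath}"]


def page_url(port: int, page: str) -> str:
    """URL of a page served by the locally started web server."""
    return f"http://localhost:{port}/{page.lstrip('/')}"
```

The build script would start the process with webdev_command(...), request something like page_url(8099, "/build/run.aspx") to run code inside the Sitecore context, and stop the process afterwards.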

1 comment:

Anonymous said...

Hi Dmitry,

This is a great blog post, CI has become a must have on any Sitecore project but there really hasn't been a lot of information out there about it.

I've also been recommending a stress testing tool such as JMeter constantly polling the CI server. With this you can watch the effect new code has on the project and ensure the overall performance doesn't drop below the requirements.

Cheers,

Steve Green