Subversion Best Practices

    Subversion 1.4.x

    These best practices were compiled by the Subversion open source project.

    This is a quick set of guidelines for making the best use of Subversion in your day-to-day software development work. These best practices are taken from the book Version Control with Subversion, which can be downloaded from openCollabNet.


    Use a Sane Repository Layout

    There are many ways to lay out your repository. Because branches and tags are ordinary directories, you need to account for them in your repository structure. The Subversion project officially recommends the idea of a project root, which represents an anchoring point for a project. A project root contains exactly three subdirectories:

    • /trunk
    • /branches
    • /tags

    A repository may contain only one project root, or it may contain a number of them.
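As a minimal sketch of the layout above, the skeleton can be built in a scratch directory and then imported in a single commit. The helper name make_project_skeleton, the project name calc, and the repository URL are assumptions chosen for illustration.

```shell
# Build the recommended project-root skeleton in a local scratch directory.
# make_project_skeleton is a hypothetical helper, not part of Subversion.
make_project_skeleton() {
    staging_dir="$1"   # empty scratch directory
    project="$2"       # name of the project root
    mkdir -p "$staging_dir/$project/trunk" \
             "$staging_dir/$project/branches" \
             "$staging_dir/$project/tags"
}

# Typical use (not run here; the repository URL is a placeholder):
#   make_project_skeleton /tmp/layout calc
#   svn import /tmp/layout file:///path/to/repos -m "Initial repository layout"
```

Calling the helper once per project before the import gives a repository with several project roots.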

    Commit Logical Changesets

    When you commit a change to the repository, make sure your change reflects a single purpose: fixing a specific bug, adding a new feature, or some particular task. Your commit creates a new revision number which can forever be used as a "name" for the change. You can mention this revision number in your bug database, or use it as an argument to svn merge if you want to undo the change or port it to another branch.
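When a commit has a single purpose, undoing it is one reverse merge of that revision. The sketch below assumes a working copy of the affected branch as the current directory; the helper name undo_revision, the SVN override variable, and the example URL are illustrative assumptions.

```shell
# SVN can be overridden (for example SVN=echo) to preview the command.
SVN="${SVN:-svn}"

# Reverse-merge a single revision into the working copy in the current directory.
# undo_revision is a hypothetical helper; $2 is the URL the change was committed to.
undo_revision() {
    rev="$1"
    source_url="$2"
    "$SVN" merge -r "$rev:$((rev - 1))" "$source_url"
}

# Typical use (not run here):
#   undo_revision 303 http://svn.example.com/repos/calc/trunk
#   svn diff      # review the reversed change
#   svn commit -m "Undo r303."
```

Porting a change to another branch uses the same command with the range in forward order (N-1:N) inside a working copy of the target branch.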

    Use the Issue Tracker Wisely

    Try to create as many two-way links between Subversion changesets and your issue-tracking database as possible:

    • If possible, refer to a specific issue ID in every commit log message.
    • When appending information to an issue (to describe progress or to close the issue), name the revision number(s) responsible for the change.
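One way to encourage the first link is a pre-commit hook that rejects log messages without an issue reference. This is only a sketch: the "issue #N" convention and the helper name mentions_issue are assumptions, so adapt the pattern to your own tracker.

```shell
# Succeeds when the log message mentions an issue ID such as "issue #1234".
# The "issue #N" pattern is an assumed convention, not a Subversion feature.
mentions_issue() {
    printf '%s\n' "$1" | grep -Eiq 'issue #[0-9]+'
}

# Inside a pre-commit hook, where REPOS="$1" and TXN="$2" (not run here):
#   msg=$(svnlook log -t "$TXN" "$REPOS")
#   mentions_issue "$msg" || { echo "Log message must cite an issue ID" >&2; exit 1; }
```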

    Understand Mixed-Revision Working Copies

    Your working copy's directories and files can be at different "working" revisions: this is a deliberate feature which allows you to mix and match older versions of things with newer ones. But there are a few facts you must know:

    1. After every svn commit, your working copy has mixed revisions. The things you just committed are now at the HEAD revision, and everything else is at an older revision.
    2. Certain commits are disallowed:
      1. You cannot commit the deletion of a file or directory that doesn't have a working revision of HEAD.
      2. You cannot commit a property change to a directory that doesn't have a working revision of HEAD.
    3. svn update brings your entire working copy to one working revision, and is the typical solution to the problems mentioned in point #2.
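To see whether a working copy is currently mixed, the svnversion program prints a single number for a uniform working copy and a range such as 4123:4168 for a mixed one. The check below is a small sketch around that output; the helper name is_mixed_revision is an assumption.

```shell
# Succeeds when an svnversion-style string describes a mixed-revision working copy.
# svnversion prints "4168" for a single revision and "4123:4168" for a range;
# a trailing M or S marks local modifications or switched paths.
is_mixed_revision() {
    case "$1" in
        *:*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Typical use (not run here):
#   if is_mixed_revision "$(svnversion .)"; then svn update; fi
```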

    Be Patient with Large Files

    A nice feature of Subversion is that by design, there is no limit to the size of files it can handle. Files are sent "streamily" in both directions between Subversion client and server, using a small, constant amount of memory on each side of the network. Of course, there are a number of practical issues to consider. While there's no need to worry about kilobyte-sized files (for example, typical source-code files), committing larger files (for example, files that are dozens or hundreds of megabytes) can take a tremendous amount of both time and space.

    To begin with, remember that your Subversion working copy stores pristine copies of all version-controlled files in the .svn/text-base/ area. This means that your working copy takes up at least twice as much disk space as the original dataset. Beyond that, the Subversion client follows a (currently unadjustable) algorithm for committing files:

    1. Copies the file to .svn/tmp/
      This can take a while, and temporarily uses extra disk space.
    2. Performs a binary diff between the tmpfile and the pristine copy, or between the tmpfile and an empty file if the file is newly added.
      This can take a very long time to compute, even though only a small amount of data might ultimately be sent over the network.
    3. Sends the diff to the server, then moves the tmpfile into .svn/text-base/

    So while there's no theoretical limit to the size of your files, understand that very large files may require quite a bit of patient waiting while your client chugs away. You can rest assured, however, that unlike CVS, your large files won't incapacitate the server or affect other users.
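Before a commit, it can help to know which files are likely to be slow. The following sketch lists files above a size threshold while skipping the .svn administrative areas; the helper name list_large_files and the 50 MB threshold in the usage line are assumptions.

```shell
# List files under a directory that are larger than a threshold in kilobytes,
# skipping Subversion's .svn administrative directories (pristine copies live there).
# list_large_files is a hypothetical helper, not part of Subversion.
list_large_files() {
    dir="$1"
    min_kb="$2"
    find "$dir" -name .svn -prune -o -type f -size +"${min_kb}k" -print
}

# Typical use (not run here): files over roughly 50 MB in the current working copy
#   list_large_files . 51200
```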

    Work Around Commands that Don't Understand Copies/Renames

    When a file or directory is copied or renamed, the Subversion repository tracks that history. Unfortunately, in Subversion the only client subcommand which actually takes advantage of this feature is svn log. A number of other commands (such as svn diff and svn cat) ought to automatically follow rename-history, but aren't doing so yet.

    In all of these cases, a basic workaround is to use svn log -v to discover the proper path within the older revision. For example:

    1. Suppose you copied /trunk to /branches/mybranch in revision 200, and then committed some changes to /branches/mybranch/foo.c in subsequent revisions.
    2. Now you want to compare revisions 80 and 250 of the file.
    3. If you have a working copy of the branch and run svn diff -r80:250 foo.c, you see an error about /branches/mybranch/foo.c not existing in revision 80.
    4. To remedy this, you run svn log -v on your branch or file to discover that it was named /trunk/foo.c prior to revision 200, and then compare the two URLs directly:
      $ svn diff http://.../trunk/foo.c@80 \
                 http://.../branches/mybranch/foo.c@250
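The copy information in the log can also be picked out mechanically. In svn log -v output, a copied path appears as a line such as "   A /branches/mybranch (from /trunk:199)". The filter below is a sketch that extracts those lines; the helper name copy_sources is an assumption.

```shell
# Read `svn log -v` output on stdin and print "destination source revision"
# for every path that was added (A) or replaced (R) as a copy.
# copy_sources is a hypothetical helper, not part of Subversion.
copy_sources() {
    sed -n 's/^ *[AR] \(.*\) (from \(.*\):\([0-9][0-9]*\))$/\1 \2 \3/p'
}

# Typical use (not run here), from a working copy of the branch:
#   svn log -v | copy_sources
```

Each output line names the path to use for revisions at or before the copy source revision.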

    Know When to Create Branches

    This is a hotly debated question, and when you create branches really depends on the culture of your software project. Rather than prescribe a universal policy, here are three common suggestions:

    The Never-Branch System

    This is often used by nascent projects that don't yet have runnable code.

    • Users commit their day-to-day work on /trunk .
    • Occasionally /trunk "breaks" (doesn't compile or fails functional tests) when a user begins to commit a series of complicated changes.

    Pros: Very easy policy to follow. New developers have a low barrier to entry. Nobody needs to learn how to branch or merge.

    Cons: Chaotic development; code could be unstable at any time. A side note: this sort of development is a bit less risky in Subversion than in CVS. Because Subversion commits are atomic, it's not possible for a checkout or update to receive a "partial" commit while someone else is in the process of committing.

    The Always-Branch System

    This is often used by projects that favor heavy management and supervision.

    • Each user creates/works on a private branch for every coding task.
    • When coding is complete, someone (original coder, peer, or manager) reviews all private branch changes and merges them to /trunk .

    Pros: /trunk is guaranteed to be extremely stable at all times.

    Cons: Coders are artificially isolated from each other, possibly creating more merge conflicts than necessary. Requires users to do lots of extra merging.

    The Branch-When-Needed System

    This is the system used by the Subversion project itself.

    • Users commit their day-to-day work on /trunk .
    • Rule #1: /trunk must compile and pass regression tests at all times. Committers who violate this rule are publicly humiliated.
    • Rule #2: a single commit (changeset) must not be so large as to discourage peer-review.
    • Rule #3: if rules #1 and #2 come into conflict (for example, if it's impossible to make a series of small commits without disrupting the trunk), then the user creates a branch and commits a series of smaller changesets there. This allows peer-review without disrupting the stability of /trunk .

    Pros: /trunk is guaranteed to be stable at all times. The hassle of branching/merging is somewhat rare.

    Cons: Adds a bit of burden to users' daily work: they must compile and test before every commit.
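When rule #3 applies, creating the branch is a single server-side copy of /trunk. The sketch below assumes the trunk/branches/tags layout under a project root; the helper name create_task_branch, the SVN override variable, and the example URL are illustrative assumptions.

```shell
# SVN can be overridden (for example SVN=echo) to preview the command.
SVN="${SVN:-svn}"

# Copy /trunk to /branches/<name> under the given project root URL.
# create_task_branch is a hypothetical helper, not part of Subversion.
create_task_branch() {
    project_root="$1"   # URL of the project root
    branch_name="$2"
    "$SVN" copy -m "Create branch $branch_name for a multi-commit change" \
        "$project_root/trunk" "$project_root/branches/$branch_name"
}

# Typical use (not run here):
#   create_task_branch http://svn.example.com/repos/calc big-refactor
#   svn switch http://svn.example.com/repos/calc/branches/big-refactor
```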