• What Is Subversion?

    Product:
    Subversion 1.4.x

    Summary:

    This article discusses Subversion from the viewpoint of a Subversion tools developer and consultant.



     

    What Is Subversion?

    Depending on whom you ask, Subversion can be many things to many people. This article explains, from my perspective, what Subversion is. Along the way, I will step into the shoes of a few key kinds of Subversion users to explain how each of them sees Subversion and how those views differ. Before we meet those users, let’s look at what Subversion is from a high level.

    Subversion at a Glance

    Out of the box, and in its simplest form, Subversion is nothing more than an advanced, open source version-control system. Its sole purpose is to help you track changes to the directories and files under version control. This isn't to say that Subversion cannot be the cornerstone of your build management, release management, and continuous integration efforts, which we will discuss later. But out of the box, Subversion cares only about the directories and files whose changes it is supposed to track.

    Subversion History Abridged

    Back in 2000, CollabNet decided to create a replacement for CVS. The decision came after running into the problems and limitations of CVS, both in day-to-day development and in the CVS integration in their flagship product, CollabNet Enterprise Edition (a collaboration and development platform for distributed development). CollabNet reached out to Karl Fogel, author of Open Source Development with CVS, to ask if he would like to be involved. Coincidentally, he and Jim Blandy had already started talking about such a tool, and they agreed to work on the project. Their plan was to create a tool that did not deviate too much from CVS's development and usage model but fixed CVS's apparent problems. To make a long story short, Subversion was born.

    Subversion Features

    Let’s look at a few of its more impressive features to get a better understanding of what Subversion brings to the table.

    Merge Tracking

    Merge tracking is Subversion’s ability to record information about merge history and then use that history to help avoid the problematic duplicate merge scenario. Merge tracking also allows for cherry-picking. With Subversion’s merge tracking, you can do merges more easily and have access to a wealth of information about merges and paths impacted by merges.

    Directory Versioning

    Directory versioning is the idea of versioning a directory's structure, just as you version the content of a file. Subversion uses a virtual filesystem to make directory versioning possible. The end result is that you can track changes to directory structures just as you can track changes to the contents of files.

    True Version History

    True versioning allows you to copy and rename resources so that the newly created resource has its own history and is seen as a new object. Because copying and renaming resources is extremely common, true version history is a nice feature: it allows you to view each object as its own entity, regardless of whether the new entity was the result of a copy or rename.

    Atomic Commits

    An atomic commit is either committed in its entirety or not committed at all. With non-atomic commits, a failure partway through can leave a partial commit in the repository. With atomic commits, Subversion can undo every portion of the commit transaction if a problem arises. This means that an interrupted commit operation does not leave the repository in a corrupt or inconsistent state.
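
    The all-or-nothing behavior described above can be sketched with a small in-memory model. This is only an illustration of the idea, not Subversion's storage code: the ToyRepository class, its commit method, and the paths used are all hypothetical names invented for this sketch.

```python
class ToyRepository:
    """In-memory stand-in for a repository (hypothetical, not Subversion's code)."""

    def __init__(self):
        self.revision = 0  # latest committed revision number
        self.files = {}    # path -> contents at the latest revision

    def commit(self, changes):
        """Apply every change (path -> new contents, None = delete) or none."""
        staged = dict(self.files)  # work on a private copy, like a transaction
        for path, contents in changes.items():
            if contents is None:
                if path not in staged:
                    raise ValueError(f"cannot delete missing path: {path}")
                del staged[path]
            else:
                staged[path] = contents
        # Only reached when every change succeeded: publish in one step.
        self.files = staged
        self.revision += 1
        return self.revision


repo = ToyRepository()
repo.commit({"trunk/a.txt": "one", "trunk/b.txt": "two"})

try:
    # The second change is invalid, so the first must not be applied either.
    repo.commit({"trunk/a.txt": "changed", "trunk/missing.txt": None})
except ValueError:
    pass

print(repo.revision, repo.files["trunk/a.txt"])
```

    Because the failed commit only ever touched the staged copy, the repository stays at its previous revision with its previous contents.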

    Versioned Metadata

    Versioned metadata is the ability to attach key-value pairs to a versioned object. Each such pair is called a property, and properties are versioned just like the objects to which they are applied.

    Choice of Network Layers

    Subversion's access layer has been abstracted to allow multiple ways of reaching a repository. You can use one of the existing access methods or develop your own. This flexibility means that you can use what works in your environment instead of being forced into a particular access model. Another layer of flexibility is Subversion's use of WebDAV, which allows repository interaction over HTTP/HTTPS and so usually poses no problem when clients sit behind a firewall and/or proxy.

    Consistent Data Handling

    When storing version history, Subversion uses a binary differencing algorithm that works the same on text files and binary files. As a result, Subversion versions text and binary files with the same process, stores files and differences on the server the same way regardless of file type, and sends differences across the wire the same way regardless of file type.

    Efficient Branching and Tagging

    With Subversion's approach, the cost of branching and tagging is not proportional to the size of the project being branched or tagged. When a branch or tag is created, Subversion uses something similar to a hard link on the server side. This means that branching and tagging in Subversion take a very small amount of time and storage regardless of your project's size.
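
    To see why a link-like copy is cheap, consider the following sketch. It is a simplified model written for this article, not Subversion's actual filesystem code: the Node class and the with_child helper are hypothetical. A branch is created by adding one directory entry that points at the existing tree, and a later change on the branch copies only the node being changed.

```python
class Node:
    """Directory node: maps a name to a child (another Node or file contents)."""

    def __init__(self, children):
        self.children = dict(children)


def with_child(node, name, child):
    """Return a new node that shares every child of `node` except `name`."""
    updated = dict(node.children)
    updated[name] = child
    return Node(updated)


# A project with many files under trunk.
trunk = Node({f"file{i}.c": f"contents {i}" for i in range(10_000)})
root_r1 = Node({"trunk": trunk})

# "Branching" adds a single entry that refers to the existing tree;
# none of the 10,000 files is copied.
root_r2 = with_child(root_r1, "branches", Node({"release-1.0": trunk}))
branch = root_r2.children["branches"].children["release-1.0"]
print(branch is trunk)

# Changing one file on the branch creates a new node for the branch only.
patched = with_child(branch, "file0.c", "patched")
print(trunk.children["file0.c"], "|", patched.children["file0.c"])
```

    The first print shows that the branch and trunk are the same object until something changes, which is why the cost does not grow with project size.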

    Hackability

    Subversion is its own project, built from the ground up with a well-defined C API. This means that you can maintain, extend, and integrate Subversion into other projects easily. It is also worth noting that Subversion has bindings for many languages, such as Java, Perl, and Python.

    Subversion in Detail

    The list of features above isn't fully comprehensive, but it can give you an idea of what Subversion can do. Let’s now outline Subversion’s functionality and concepts.

    Automatable and Scriptable

    Subversion's output is both human readable and parseable. This means that those of you who want to automate or script any part of Subversion should have no issues doing so.
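
    As one example of parseable output, the svn log command accepts an --xml option. The sketch below parses a small sample shaped like that output using Python's standard library. The sample revisions, authors, and messages are made up for illustration, and parse_log is a hypothetical helper name rather than part of any Subversion API.

```python
import xml.etree.ElementTree as ET

# Illustrative sample shaped like `svn log --xml` output; the data is invented.
SAMPLE_LOG = """<?xml version="1.0"?>
<log>
<logentry revision="3">
<author>sally</author>
<date>2007-01-15T10:02:11.000000Z</date>
<msg>Fix typo in README.</msg>
</logentry>
<logentry revision="2">
<author>harry</author>
<date>2007-01-14T16:45:03.000000Z</date>
<msg>Add build script.</msg>
</logentry>
</log>
"""


def parse_log(xml_text):
    """Turn log XML into a list of dictionaries, one per revision."""
    entries = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        entries.append({
            "revision": int(entry.get("revision")),
            "author": entry.findtext("author", default=""),
            "date": entry.findtext("date", default=""),
            "message": entry.findtext("msg", default="").strip(),
        })
    return entries


for item in parse_log(SAMPLE_LOG):
    print(item["revision"], item["author"], item["message"])
```

    In a real script, the XML text would come from running the svn client (for example through Python's subprocess module) instead of from a hard-coded string.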

    Change Sets

    Subversion was built to be efficient over the wire and on the disk. In practice, this means that Subversion tries to send as little data as possible across the wire and to store as little information as possible on the disk. Subversion does this via change sets. Every time you create a commit, you create a change set. Each change set contains the changes required to reproduce that commit. Since Subversion doesn't do file-level versioning, change sets are Subversion’s way of communicating changes between revisions. This is efficient because it allows Subversion to send and store only what is required to reproduce the commit that creates the next revision. In the end, the costs are proportional to change size and not to file size.
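
    The claim that cost follows change size rather than file size can be illustrated with a rough sketch. Note the assumption: this uses Python's difflib module as a stand-in for a differencing step, whereas Subversion uses its own binary differencing algorithm, so the example shows the principle only.

```python
import difflib

# A 1,000-line file in which a single line is edited.
old_lines = [f"line {i}\n" for i in range(1000)]
new_lines = list(old_lines)
new_lines[500] = "line 500, edited\n"

matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

# Keep only the operations that describe a difference.
change_set = [op for op in matcher.get_opcodes() if op[0] != "equal"]

# Count the lines of new content the change set has to carry.
payload_lines = sum(j2 - j1 for _tag, _i1, _i2, j1, j2 in change_set)

print(len(new_lines), payload_lines)
```

    Only the edited line has to travel with the change set; the unchanged lines are described by reference to the previous revision.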

    Choice of Client

    Because Subversion abstracts the access and interaction into well-defined APIs, you can use the particular Subversion client that fits your needs or environment. You can even mix-and-match which clients you use, depending on your interaction needs.

    Choice of Parallel Development Model

    Subversion lets you pick and choose which parallel development methodology you want to use and when. This means that if you want to use the Lock-Modify-Unlock model for your binary files, so be it. If you want to use the Copy-Modify-Merge model for all non-binary files, that is great. You can even mix and match, depending on your specific likes and needs.

    Internationalization

    Subversion was built for global consumption and this commitment is shown by its internationalized messages.

    Global Revisioning

    Subversion uses a global revision number, as opposed to file-level revision numbers. This means that each revision captures the state of the entire repository as it existed at that revision. Features described earlier, such as atomic commits and directory versioning, build on this model.
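
    A minimal sketch of the idea follows, assuming a toy in-memory store that keeps a full snapshot per revision (Subversion itself stores differences, as described under Change Sets). RevisionedStore, commit, and cat are hypothetical names chosen for this example.

```python
class RevisionedStore:
    """Toy store in which every commit creates one repository-wide revision."""

    def __init__(self):
        self.snapshots = [{}]  # list index = revision number; revision 0 is empty

    def commit(self, changes):
        """Record `changes` (path -> contents) and return the new revision."""
        snapshot = dict(self.snapshots[-1])
        snapshot.update(changes)
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def cat(self, path, revision):
        """Return the contents of `path` as of `revision`."""
        return self.snapshots[revision][path]


store = RevisionedStore()
r1 = store.commit({"trunk/a.c": "a v1", "trunk/b.c": "b v1"})
r2 = store.commit({"trunk/a.c": "a v2"})  # b.c is not touched by this commit

print(r1, r2)
print(store.cat("trunk/a.c", r1), "|", store.cat("trunk/a.c", r2))
print(store.cat("trunk/b.c", r2))
```

    Even though the second commit did not touch b.c, the file is still visible at the second revision, because a revision number names the state of the whole tree rather than the state of one file.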

    Historical Tracking

    Subversion's built-in capabilities are not limited to versioning the files and directories you place under its control. Subversion also comes with a complete toolkit for analyzing the history of those files and directories. Change reports, release management, and many other features are at your fingertips, thanks to Subversion's built-in historical tracking capabilities.

    Subversion in Use

    We now know what Subversion is, but we still haven't really considered Subversion through the eyes of its users. Let’s do that now. These users are a product developer, a product manager, a release manager, a repository administrator, and a network/systems administrator. I will not write a book on each, but the idea is to look at Subversion from each user's point of view and to see how Subversion accommodates that user.

    The Product Developer

    A product developer is concerned with Subversion only to the extent that it tracks the history of the files and directories the developer works on. Nothing more. Developers need to be able to locate resources, compare differences between revisions of resources, and work on multiple products, releases, or efforts at the same time. Subversion accommodates this because it facilitates parallel development by design. Its simplicity in interaction allows the developer to worry more about the product than about the intricacies of the version-control tool. These are the most important things to a product developer:

    • Simplicity: Each Subversion tool is extremely well documented and is designed to allow for the simplest migration path from another version-control tool. Also, there are only a handful of Subversion features that a developer needs to understand to be able to do day-to-day development.

    • Flexibility: Developers have the ability to use the client that best fits their needs. This means that you can choose whatever client that makes you the most efficient. Clients are not the only level of flexibility in the eyes of a developer: Subversion users also can pick and choose which development methodology they want when interacting with a Subversion repository. This allows development teams to build their own development processes.

    • Traceability: Beyond the typical interaction with the repository during development, developers also need to be able to do minor historical tracking. Whether they need to know who added a particular line of code or who deleted a file, product developers must be able to get historical data. Subversion's built-in historical capabilities are more than enough for creating traceability for a development project.

    Developers are probably the easiest to please with respect to Subversion. With its efficiency over the wire, simple and documented commands, and historical tracking capabilities, Subversion is an excellent candidate for a version-control system in the eyes of a developer.

    The Product Manager

    While the product developer is mainly concerned with the simplicity of interaction with the repository, a product manager probably wants to do more historical tracking in order to properly manage the team working on the product. The manager also wants to work on multiple releases of the product in parallel. (Think about working on the current release, a bug fix release, and a proof-of-concept release at the same time.) To a product manager, the following are the most important:

    • Branching: To facilitate parallel development, a requirement when working on multiple releases at the same time, a product manager is interested in Subversion's branching capabilities. Branching is the cornerstone of allowing parallel development on multiple efforts at the same time.

    • Traceability: Traceability is where the developers’ and managers’ needs slightly overlap. While developers need traceability to be able to understand code changes, product managers need traceability for other reasons. Product managers manage developers, so traceability is important for code reviews, change reports, defect reports, and release reports. Subversion accommodates these needs with its full-featured historical tracking.

    • Simplicity: Most managers want to be able to manage without having to fully understand the underlying tooling. Subversion abstracts the access layer so that managers can use WebDAV clients, like Windows Web Folders, to simplify Subversion repository interaction. This, coupled with thoroughly documented commands, makes a manager’s job easy when managing a project using Subversion for the version-control system.

    Product managers are extremely easy to please when it comes to Subversion. They want an easy way to interact with the repository, an easy way to trace releases and developer contributions, and the ability to manage multiple releases at the same time. Subversion makes a manager's job easy and I'm sure managers agree.

    The Release Manager

    Think of the release manager as the same as a product manager, but while a product manager manages the developers of the project, a release manager manages the releases of the projects. Release managers are solely concerned with being able to work on multiple releases in parallel and being able to trace changes between releases. Here is how Subversion accommodates release managers:

    • Branching: As with product managers, release managers need to be able to make sure that multiple releases are being developed in parallel without cross-contaminating one release with the needs of another. Since branching is the only real way to facilitate parallel development in isolation, branching is a hot topic for release managers.

    • Tagging: Release managers need to be able to archive releases: Subversion allows you to do this with tags. A tag is basically a human-readable name given to a particular revision of a directory tree. Tagging makes life easier in that release managers can locate the tags directory and identify which releases have shipped, without having to memorize or document the underlying revision of the directory tree to locate a release point. Releases are as simple as having a tag with the release name, like "Release 1.0."

    • Traceability: Release managers need traceability in order to identify what was added, removed, or fixed from one release to another. Subversion's historical tracking capabilities make this simple in that you can create a change log between releases, you can create defect reports between releases (with the proper process to facilitate this), and you can even create other more detailed reports from one release to another, depending on your business needs.

    You can begin to see that Subversion's historical tracking can be extremely powerful and useful. Beyond that, release managers' lives are made much easier by a few convenience mechanisms, like tagging, that Subversion provides.

    The Repository Administrator

    The repository administrator has one central focus: repository layout and permissions. Here are the areas of concern for a repository administrator:

    • Flexibility: Subversion does not require or mandate any particular repository layout. Subversion also allows you to change just about any aspect of your repository whenever you feel the need to. Want to change from a single project repository to a multi-project repository? Or use a non-standard repository layout? Subversion allows you to make the decisions and even allows you to change your mind easily, with minimal downtime and effort.

    • Permissions: Depending on your server configuration, a Subversion repository administrator can integrate into many external authentication schemes for repository access. Once access is granted, the administrator can even do file-level access control via a simple text file. There are no difficult configurations or administrative needs to create a fully secure Subversion repository.

    • Backup/Recovery: Subversion's backup and recovery tools are very simple to use. Subversion's scriptability makes the process easy to automate and easy to repeat.
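
    The text file mentioned under Permissions uses an INI-like layout, with a [groups] section and one section per repository path. The sketch below reads a made-up sample of such a file and answers "what access does this user have to this path?". The lookup is a deliberately simplified assumption (nearest matching section wins) and not a reimplementation of Subversion's own access rules; the user names, paths, and the load_rules and access_for helpers are hypothetical.

```python
import configparser
import posixpath

# Made-up sample of a path-based access file.
SAMPLE_AUTHZ = """
[groups]
developers = harry, sally

[/]
* = r

[/trunk]
@developers = rw

[/trunk/secret]
* =
sally = rw
"""


def load_rules(text):
    """Parse the file into (groups, rules) dictionaries."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep user and group names case-sensitive
    parser.read_string(text)
    groups = {}
    if parser.has_section("groups"):
        for name, members in parser.items("groups"):
            groups[name] = {m.strip() for m in members.split(",") if m.strip()}
    rules = {s: dict(parser.items(s)) for s in parser.sections() if s != "groups"}
    return groups, rules


def access_for(user, path, groups, rules):
    """Return 'rw', 'r', or '' from the nearest section that covers the user."""
    current = path
    while True:
        section = rules.get(current, {})
        if user in section:
            return section[user]
        for name, value in section.items():
            if name.startswith("@") and user in groups.get(name[1:], set()):
                return value
        if "*" in section:
            return section["*"]
        if current == "/":
            return ""
        current = posixpath.dirname(current)  # fall back to the parent path


groups, rules = load_rules(SAMPLE_AUTHZ)
for user, path in [("sally", "/trunk/secret/key.txt"),
                   ("harry", "/trunk/secret/key.txt"),
                   ("harry", "/trunk/main.c"),
                   ("guest", "/trunk/main.c")]:
    print(user, path, repr(access_for(user, path, groups, rules)))
```

    An empty value means no access, so in this sample only sally can read or write under /trunk/secret, while everyone else falls back to the rules on the parent paths.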

    Subversion was built to make things simple in all aspects, including repository administration. Repository administrators have the flexibility to choose the best practice for repository layout for their projects, and can even change the repository configuration at any time, thanks to Subversion's design.

    The Network/Systems Administrator

    Network/systems administrators are concerned only with security of the server and the network to which the server is attached. Subversion's access capabilities make their job a lot easier:

    • Unobtrusive: Subversion gives you the flexibility to choose which network layer to use to expose your repository. With this flexibility comes the ability to expose a repository without having to involve network and systems administrators in most cases. Since you can access a well-configured Subversion repository via HTTP/HTTPS, you can usually provide access to a Subversion repository from behind a corporate firewall and/or proxy without having to create access rules to open new ports and so on.

    Thanks to its unobtrusive nature, Subversion can usually be installed without really needing to talk to a network or system administrator. This makes things a lot easier for implementing Subversion into your corporation securely.

    Summary

    As you can see, Subversion has a lot to offer to a lot of people. Out of the box, Subversion is a commercial-quality version-control system, but Subversion's real value proposition is in the eye of the beholder. Developers enjoy Subversion's ease of use and flexibility. Product managers appreciate Subversion’s ability to handle multiple efforts being tracked concurrently. Release managers welcome the ease of tracing releases. Repository administrators appreciate the flexibility Subversion gives them in laying out a repository and providing access to it.

    Regardless of how you use Subversion, there is a lot to be gained by using Subversion. Subversion was built to be simple, flexible, and powerful. Subversion provides many innovative features that give you the flexibility and power that you need from your version-control system.