Wednesday, October 21, 2009

Values Engendered by Revision Control

Introduction
This paper presents a Value Sensitive Design (VSD) conceptual investigation of revision control. It focuses on revision control as it is employed when constructing software applications. First, the sociotechnical problem space is explicated by 1) defining revision control and 2) explaining how organizations can implement revision control through the use of specialized tools. Next, implicated values are identified and defined in terms of interactions with revision control tools. Finally, stakeholders are identified and the effect of the implicated values on those stakeholders is analyzed.

Sociotechnical Problem Space
Revision control (also known as version control) is used to manage changes in documents, source code, or any other type of file stored on a computer. Revision control is typically used in software engineering when many developers are contributing changes to the same source code files. As changes to a source code file are committed, a new version of the file is created. Versions of a file are identified either by date or by a sequence number (e.g., “version 1”, “version 2”, etc.). Each version of the file is stored for accountability, and stored revisions can be restored, compared, and merged.
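The idea of numbered versions that can be restored and compared can be illustrated with a minimal Python sketch. It assumes a single file whose versions are kept in an in-memory list; the function names (commit, restore, compare) are hypothetical and do not correspond to any particular tool.

```python
import difflib

# Hypothetical in-memory history of one file; versions[0] is "version 1".
versions = []

def commit(text):
    """Store a new version and return its sequence number."""
    versions.append(text)
    return len(versions)

def restore(number):
    """Return the stored content of the given version number."""
    return versions[number - 1]

def compare(old, new):
    """Return a unified diff between two stored versions as a list of lines."""
    return list(difflib.unified_diff(
        restore(old).splitlines(),
        restore(new).splitlines(),
        fromfile=f"version {old}",
        tofile=f"version {new}",
        lineterm="",
    ))

commit("print('hello')\n")
commit("print('hello, world')\n")
diff = compare(1, 2)  # lines removed start with "-", lines added with "+"
```

Real tools generally store versions far more compactly than full copies, but the operations exposed to the developer are similar in spirit.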

The fundamental issue that revision control addresses is the race condition that arises when multiple developers read and write the same files. For instance, Developer A and Developer B each download a copy of a source code file from a server to their local PCs. Both developers begin editing their copies of the file. Developer A completes his/her edits and publishes his/her copy of the file to the server. Developer B then completes his/her edits and publishes his/her copy of the file, thereby overwriting Developer A’s file and erasing all of Developer A’s work.
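The scenario above can be reproduced in a few lines of Python. This is only a sketch: the "server" is a plain dictionary, and the file name and the download and publish helpers are hypothetical.

```python
# Hypothetical shared server holding one source file.
server = {"main.c": "line 1\nline 2\n"}

def download(name):
    """Return a private copy of the file's current content."""
    return server[name]

def publish(name, content):
    """Overwrite the server copy without checking for newer changes."""
    server[name] = content

# Both developers start from the same content.
copy_a = download("main.c")
copy_b = download("main.c")

# Each edits a private copy.
copy_a += "A's fix\n"
copy_b += "B's feature\n"

publish("main.c", copy_a)  # Developer A publishes first
publish("main.c", copy_b)  # Developer B overwrites A's version

work_lost = "A's fix" not in server["main.c"]  # True: A's edit is gone
```

Nothing in publish checks whether the server copy changed since it was downloaded, which is exactly the gap that the two paradigms below close.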

There are two paradigms that can be used to solve the race condition issue: file locking and copy-modify-merge (Collins-Sussman).

File locking is a simple concept that permits only one developer to modify a file at any given time. To work on a file, Developer A must “check out” the file from the repository and store a copy of the file on his/her local PC. While the file is checked out, Developer B (or any other developer) cannot make edits to the file. Developer B may begin to make edits only after Developer A has “checked in” the file to the repository. File locking works, but it has drawbacks. First, it can cause administrative problems. For example, a developer may forget to check in a file, effectively locking it and preventing any other developer from working on it. Second, it causes unnecessary serialization. For example, two developers may want to edit different parts of a file that do not overlap. No problems would arise if both developers could modify the file and then merge their changes together, but file locking prevents concurrent updates, so the work has to be done in turn.
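A minimal Python sketch of the check-out/check-in discipline follows. The LockingRepository class and its method names are hypothetical and are not taken from any real tool; the sketch only shows how a lock serializes access.

```python
class LockingRepository:
    """Hypothetical repository that allows one editor per file at a time."""

    def __init__(self, files):
        self.files = dict(files)
        self.locks = {}  # file name -> developer currently holding the lock

    def check_out(self, name, developer):
        """Lock the file for this developer and return a copy of its content."""
        holder = self.locks.get(name)
        if holder is not None and holder != developer:
            raise PermissionError(f"{name} is locked by {holder}")
        self.locks[name] = developer
        return self.files[name]

    def check_in(self, name, developer, content):
        """Store the new content and release the lock."""
        if self.locks.get(name) != developer:
            raise PermissionError(f"{developer} does not hold the lock on {name}")
        self.files[name] = content
        del self.locks[name]


repo = LockingRepository({"main.c": "v1"})
text = repo.check_out("main.c", "Developer A")

try:
    repo.check_out("main.c", "Developer B")
    blocked = False
except PermissionError:
    blocked = True  # Developer B must wait until A checks in

repo.check_in("main.c", "Developer A", text + " + A's edits")
text_b = repo.check_out("main.c", "Developer B")  # now succeeds
```

If Developer A never calls check_in, the lock is never released, which is the administrative problem described above.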

In the copy-modify-merge paradigm, each developer creates a local working copy that mirrors the files in the project repository. Developers can work simultaneously and independently of one another on their local copies. Once updates are complete, the developers commit their changes to the project repository, where all changes are merged together into a final version. For example, Developer A and Developer B make changes to the same file within their own working copies. Developer A saves his/her changes to the repository first. When Developer B attempts to save his/her changes, he/she is informed that his/her copy is out of date (i.e., other changes were committed while he/she was working on the file). Developer B can then request that Developer A’s changes be merged into his/her copy. Once the changes are merged, and if there are no conflicts (i.e., no changes overlap), Developer B’s copy is saved to the repository. If there are conflicts, Developer B must resolve them before saving the final copy to the project repository.
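The same example can be sketched in Python. This is a simplified illustration, not how Subversion is implemented: the Repository class and the merge3 helper are hypothetical, and the merge assumes that edits only replace existing lines, so that all versions have the same number of lines.

```python
def merge3(base, mine, theirs):
    """Line-wise three-way merge (assumes equal line counts in all versions).

    Returns the merged lines and the indexes of conflicting lines.
    """
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t or t == b:
            merged.append(m)      # both sides agree, or only I changed the line
        elif m == b:
            merged.append(t)      # only the other developer changed the line
        else:
            merged.append(m)      # both changed it differently: conflict
            conflicts.append(i)
    return merged, conflicts


class Repository:
    """Hypothetical repository that rejects commits from stale copies."""

    def __init__(self, lines):
        self.history = [list(lines)]  # history[n] is revision n

    @property
    def head(self):
        return len(self.history) - 1

    def checkout(self):
        """Return the current revision number and a private working copy."""
        return self.head, list(self.history[self.head])

    def commit(self, base_rev, lines):
        if base_rev != self.head:
            raise RuntimeError("working copy is out of date; update first")
        self.history.append(list(lines))
        return self.head


repo = Repository(["a", "b", "c"])
rev_a, copy_a = repo.checkout()
rev_b, copy_b = repo.checkout()

copy_a[0] = "A"            # Developer A edits the first line
copy_b[2] = "C"            # Developer B edits the last line

repo.commit(rev_a, copy_a)  # A commits first

try:
    repo.commit(rev_b, copy_b)
    rejected = False
except RuntimeError:
    rejected = True         # B is told the working copy is out of date

# B merges A's committed changes into the working copy, then commits.
merged, conflicts = merge3(repo.history[rev_b], copy_b, repo.history[repo.head])
if not conflicts:
    repo.commit(repo.head, merged)
```

Because the two edits touch different lines, the merge reports no conflicts and both changes survive. Had both developers changed the same line, its index would appear in conflicts and Developer B would need to resolve it by hand before committing.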

A development organization may implement revision control through the use of specialized tools dedicated to source code management. Several open-source and commercial tools are available, each with its own advantages and drawbacks. Subversion, an open-source software package, is a well-known and widely used tool (Tigris). Subversion (“SVN”) uses a client-server model. Source code files are stored on the SVN server (the “repository”) and can be accessed by any PC running the SVN client. This allows many developers to work on source code files from different locations and PCs. Some key features of SVN are: use of copy-modify-merge (and file locking if needed), full directory versioning, atomic commits to the repository, and versioned metadata for each file and directory.

Values Defined
The use of a good revision control methodology engenders several values within a development organization. This section identifies and defines some of these values.

By leveraging revision control, an organization fosters collaboration between its developers. Gray defines collaboration as “a process of joint decision making among key stakeholders of a problem domain about the future of that domain” (Gray, p.11). Revision control permits developers to work in teams where each individual can contribute to the overall goal of delivering a quality software product. Each individual makes decisions about which piece of code will best serve that goal. The future of the domain, here the software release, is defined by the collaborative effort of developers within the workspace.

Revision control usage also engenders accountability. In their book chapter, Friedman et al. write: “accountability refers to the properties that ensure the actions of a person, people, or institution may be traced uniquely to the person, people, or institution” (Friedman). When a change is committed (i.e., submitted to the repository), revision control tools record the responsible developer and place a timestamp on the new version of the file. Moreover, the developer can enter comments to describe the changes that he/she has made. For these reasons, revision control tools provide a good mechanism for accountability, as a complete audit trail of changes is recorded.
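The audit trail described above can be pictured as a list of commit records. The following Python sketch is illustrative only; the Commit and AuditLog names, their fields, and the sample authors and messages are assumptions, not the data model of any specific tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Commit:
    """One immutable entry in the audit trail (hypothetical structure)."""
    revision: int
    author: str
    timestamp: datetime
    message: str


class AuditLog:
    """Hypothetical append-only record of who changed what, and when."""

    def __init__(self):
        self.commits = []

    def record(self, author, message):
        """Append a commit stamped with the author and the current UTC time."""
        commit = Commit(
            revision=len(self.commits) + 1,
            author=author,
            timestamp=datetime.now(timezone.utc),
            message=message,
        )
        self.commits.append(commit)
        return commit

    def by_author(self, author):
        """Trace all recorded changes back to one developer."""
        return [c for c in self.commits if c.author == author]


log = AuditLog()
log.record("Developer A", "Fix null check in parser")
log.record("Developer B", "Add export feature")
log.record("Developer A", "Update parser tests")

a_changes = log.by_author("Developer A")  # revisions 1 and 3
```

Because each record is frozen and the log only appends, every change stays traceable to a single author and point in time.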

Another value brought about by revision control is work efficiency. This is especially true when the copy-modify-merge paradigm is used. The major advantage of this paradigm is that it allows developers to work individually and concurrently, thereby maximizing available development time. Compare this to the file-locking paradigm, where developers can be locked out of a file at any given time. Additionally, copy-modify-merge minimizes the coordination effort and expense between developers.

Along with the values stated above, revision control also enhances communication between developers, prevents loss of work through backups, enables better coordination of efforts, manages code merges, and provides code stability by allowing organizations to roll back to previous versions of the code (O'Sullivan).

Stakeholders
The most apparent direct stakeholders are the software developers. Revision control benefits developers by providing them with a more stable work environment. Without revision control, it is very easy to lose work. Race conditions can occur if multiple developers share the same copy of files. The danger of overwriting updates is real, and it grows as the project and the organization grow. Moreover, a complete loss of data can be avoided because copies of code files are constantly being generated and backed up.

Another benefit for developers is comprehensibility of the system code lifecycle. Developers can review the ancestry of files, and by reading other developers’ comments they can infer the reasoning behind code changes. This information helps ensure that they stay the course of the current branch of development.

In a hierarchical organization, the indirect stakeholders include members of management (e.g., IT Team Leaders). IT Team Leaders are rated on how well their teams meet project timeline and budgetary expectations. Development teams have a better chance of hitting targets with a revision control strategy, as pitfalls that cause delays and unexpected costs can be avoided. Consequently, the benefits of meeting targets cascade up to higher levels of management within the organization.

End users of the constructed software product are also indirect stakeholders. The benefits garnered from revision control ultimately carry over into a more usable and functionally accurate software product for end users.


References
Collins-Sussman, Ben. "Version Control with Subversion". Tigris.org. 10/20/2009 <http://svnbook.red-bean.com/en/1.4/index.html>.

Friedman, B., Kahn, P. H., Jr., & Borning, A. (2006). Value Sensitive Design and information systems. In P. Zhang & D. Galletta (eds.), Human-Computer Interaction in Management Information Systems: Foundations, (pp. 348-372). Armonk, New York: M.E. Sharpe.

Gray, Barbara. Collaborating: Finding Common Ground for Multiparty Problems. San Francisco: Jossey-Bass, 1989.

O'Sullivan, Bryan. "Making Sense of Revision-control Systems". ACM. 10/20/2009 <http://queue.acm.org/detail.cfm?id=1595636>.

Tigris. "Subversion Home Page". Tigris.org. 10/19/2009 <http://subversion.tigris.org/>.
