What is Version Control and Why You Need It
This page is a work in progress. It is currently incomplete but is slowly expanding. Last updated November 18th, 2018.
Single Source of Truth
I'd like to begin by telling a story. Once upon a time, when my living arrangements were somewhat in flux, I had a P.O. box and all of my mail was directed there. I also had a checking account with a large national bank and they sent all statements and correspondence to my P.O. box.
One day, I moved to a place (where I stayed put for 13 years) and decided it was high time to get rid of that P.O. box and start getting my mail at home like normal people. I called my bank and told them the good news. They punched my new address into the system and my statements started coming to my house.
Since the P.O. box is paid in advance for six months or a year at a time, I still had my P.O. box for several additional months, and I continued to monitor it. I discovered that while my statements were coming home, other correspondence from the bank was still going to the P.O. box! It took several additional calls to the bank to sort this out, and they explained to me that, well, they have several different systems and sometimes when they update something in one system, it doesn't necessarily update in some or all of the other systems.
This "split-brain" situation is actually a very common problem in any type of organization. The larger the organization, the larger the problem.
It begins with the fact that organizations are usually divided up into different departments. Each department has a different job, and each department has its own "system" for keeping track of information. Some of that information is unique to each department, but other information is duplicated in several systems across departments. My change of address is an example of such an instance, where the department that generates statements received my updated address but the department that generates marketing materials didn't get the memo, so to speak.
Some time ago, I read an article in a business-oriented magazine, but don't ask me which one because I don't remember. (In fact, I tried to find the article before I wrote this so that I could cite it properly and look all official, but alas, I just don't remember which magazine it was in.) Anyway, this article described a study done by some group of experts at some large organization. They were looking for exactly these types of discrepancies between systems, and they found them. As a very basic example, the organization being studied had a supplier whose name was written differently in each of three different systems. From my hazy memory, the supplier was called ACME in one system, Acme in another, and A.C.M.E. in a third.
While that is only a minor discrepancy in some detail that may never affect the organization's ability to function, chances are that the same sort of thing happens in other places, where it is important.
In Figure 1, we see an example organization:
Figure 1: An example organization
This organization has five departments, called Department A through Department E, each represented by a different shape. Each department needs various types of information to do its job, and has its own system for keeping track of that information.
But departments do not exist in a vacuum. In Figure 2, we see what happens if there is no big-picture strategy for dealing with information. You end up with each department having to chase after each other department to get updated information. The arrows show the various directions that information moves through our example organization.
Figure 2: Flow of information through our organization
What a plate of spaghetti! And imagine what happens if some departments are less diligent than others in chasing down information, or in updating other departments when information changes. If you look closely, you'll notice that departments C and E do not communicate with each other, nor do departments B and D. If they get any updated information from each other, it's through the grapevine.
To make matters worse, each department houses information in an incompatible system, so there is no straightforward automatic way to compare and see which information has changed. This leads to the situation where you count on everyone in the organization to generate memos and notify everyone else at the exact moment when some piece of information changes, or else the fact that it has changed is soon forgotten and the rest of the organization is not brought up to date. If some or all of the memos get lost or misplaced along the way, you end up with discrepancies. Even if you have a foolproof memo delivery system in place and no memos are lost, the information may end up modified by mistake during data entry, or (as with the ACME/Acme/A.C.M.E. example) by someone choosing to capitalize or punctuate text differently.
This is why you need Single Source of Truth.
Single Source of Truth means that while you may have disparate systems throughout your organization, you have one "blessed" system that is considered to hold the "truth." Everyone in the organization must know that only what's in that "blessed" system counts. If a copy of some information elsewhere disagrees with the "blessed" system and you don't know why (e.g., it's not because you are, right this minute, in the process of updating a piece of information like my change of address), then the information in the "blessed" system is considered to be the truth and must take precedence.
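To make the "blessed system wins" rule concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the department names, the field names, the addresses, and the helper functions `find_discrepancies` and `reconcile` do not come from any real system.

```python
# Hypothetical data: one customer record as seen by different systems.
# The "blessed" record is the Single Source of Truth.
blessed = {"name": "Acme", "address": "123 Home St"}
copies = {
    "statements": {"name": "Acme", "address": "123 Home St"},
    "marketing": {"name": "ACME", "address": "P.O. Box 42"},
}


def find_discrepancies(blessed, copy):
    """Return {field: (copy_value, blessed_value)} for every field that differs."""
    return {
        field: (copy.get(field), value)
        for field, value in blessed.items()
        if copy.get(field) != value
    }


def reconcile(blessed, copy):
    """Overwrite the copy with the blessed values and report what changed."""
    diffs = find_discrepancies(blessed, copy)
    copy.update(blessed)  # the blessed system takes precedence
    return diffs


for dept, record in copies.items():
    changed = reconcile(blessed, record)
    print(dept, "fields corrected:", sorted(changed))
```

Note that the sketch never asks which copy is "right." When a copy disagrees with the blessed record for no known reason, the copy is the one that gets overwritten.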
When you have such a system, the flow of information in the organization becomes this:
Figure 3: Flow of information through our organization when we have implemented Single Source of Truth
This is much easier to understand. This is much easier to explain. This is much easier to implement and maintain.
Ideally, the Single Source of Truth system should automatically provide the information to the other systems, and (as in the case of my change of address) if the information is updated in any of those systems, there should be a mechanism in place that notifies the Single Source of Truth of the change. When that happens, the Single Source of Truth will update itself and then update the other systems to match.
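The notify-then-update mechanism described above might look something like the following Python sketch. It is only an illustration under my own assumptions: the `SingleSourceOfTruth` class, its `register` and `report_change` methods, the `log` list, and the addresses are all hypothetical names invented for this example.

```python
class SingleSourceOfTruth:
    """Hypothetical hub that holds the authoritative record and pushes it out."""

    def __init__(self, record):
        self.record = dict(record)
        self.systems = {}  # department name -> that department's local copy
        self.log = []      # every reported change, in the order it arrived

    def register(self, name):
        """Give a department system its own copy of the current record."""
        self.systems[name] = dict(self.record)
        return self.systems[name]

    def report_change(self, source, field, value):
        """A department reports a change: update the hub, then every copy."""
        self.log.append((source, field, value))
        self.record[field] = value
        for local_copy in self.systems.values():
            local_copy[field] = value


hub = SingleSourceOfTruth({"address": "P.O. Box 42"})
statements = hub.register("statements")
marketing = hub.register("marketing")

# The customer tells the statements department about a new address.
hub.report_change("statements", "address", "123 Home St")
print(marketing["address"])  # marketing's copy follows the hub
```

The point of the design is that departments never talk to each other directly. Each one talks only to the hub, which is what turns the spaghetti of Figure 2 into the star shape of Figure 3.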
The very first thing that a Version Control System is supposed to do is to give you this Single Source of Truth, with the automatic mechanism to deliver information from the "blessed" system to everyone who needs it, to allow changes to be made and reported back to the "blessed" system, and to get updates from the "blessed" system when you're ready for them. It keeps teams large and small on the same page, helps people collaborate, and addresses a whole range of human errors that occur when you don't use Version Control.
History
Why do we bother to study history?
Since I began the previous section with a story, I think I'll do the same thing here.
Like many people who live in L.A. (including famous ones, such as former California "governator" Arnold Schwarzenegger), I attended Santa Monica College. And I took a math course there.
Santa Monica College has a bunch of fancy new buildings, but my math class was on the second floor of one of the old buildings that still remain. This building is long and narrow, and on my way to math, I used to pass a history class taught by Professor Harvey Stromberg. Now, someone like me who isn't exactly a history buff might expect to see a bunch of uninterested, exhausted, and sleeping students in a history class, but this was not the case with Professor Stromberg, who taught at Santa Monica College for forty years! Not only were his students awake and paying full attention, but all of them looked totally captivated by his lecture. If they had paid to attend a play, they wouldn't have paid such close attention. I used to stop outside his open classroom door and listen for a few moments before proceeding to my math class.
I had to take history to fulfill my transfer requirements but up to this point I had been putting it off. When I discovered Professor Stromberg, I knew it was his class I had to take. But it was incredibly hard to get into. His sections were already completely full before my enrollment priority came up, so on the first day of classes, I showed up, as did many other students, in an attempt to add the class.
On that first day of class, I found out why Professor Stromberg's students were so captivated. Not only did he have a great sense of humor, but he didn't simply speak or lecture the way most professors do. He explained things in a roundabout way that forced us to think, and he asked questions that seemed to beg obvious answers which he quickly contradicted by offering an unexpected alternative view. In this manner, he discussed study skills, showing us that most students are focused on their grade instead of on learning. He explained that you need to make learning enjoyable, to learn because you want to learn, and that learning will become easier as a result.
But the biggest theme of that first day of class was the question, "Why do we bother to study history?" It was a question he posed but didn't answer. He told us that we'd understand on our own by the end of the semester.
(Unfortunately, I didn't get to do that. All the seats were filled and none of us un-enrolled students succeeded in adding the class. Furthermore, it was Professor Stromberg's last semester of teaching at Santa Monica College, so I couldn't try again the following semester! I had to settle for another history class, which was still a good experience, even if it was disappointing to miss out on the experience I'd hoped for.)
Okay. So why we bother to study history (like world history) is an exercise I'll leave for the reader, but why history is important from a Version Control standpoint is a question I'll begin to explore... right... now:
To be continued!
This article will be greatly expanded to take you through the whole journey of Version Control and why you need it. Bookmark this page and check back soon!
Send feedback to: the name of this blog at mail dot com.