Sunday, December 8, 2013

Work Life: Software Versioning - Understanding the basics

The main reason anyone would be interested in versions is to identify which features are in a particular version, or which code to update to remedy a defect. Otherwise, a version id is rather useless, except maybe as a marketing gimmick, which serves no functional purpose in software development.

At the most fundamental level, a version id lets the developer know exactly which code base to look at or modify. In other words, the version id does not have to follow any sort of schema, rules, or process. You can use numbers (version 1), groups of numbers (version 1.2.3.4), letters (version A), a mix (version 1.2.a.3.b), pictures, or names (Windows XP, Windows Vista, Android Jelly Bean, Android Ice Cream Sandwich). It can be basically anything you want, as long as you know which version id goes with which code base.

To expand on the numbers and letters, you will commonly see some incremental process. This is also not required at the fundamental level. You can jump from v3 to v1.1 to v2.0.4 to vA to vB, much like the Windows release names jumped from 3.1 to 95 to 2000 to ME to XP to Vista to 8. No rhyme or reason is needed for a version id, except that it should at least be unique. I say 'should' because you could technically reuse the same version id, but that really complicates matters. I won't cover it here because who knows why some people do that (but it really does happen).
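The "anything goes, as long as it is unique" idea can be sketched as a small lookup table. This is only an illustration: the `VersionRegistry` class and the `rev-000N` code references are hypothetical names invented for the example, not part of any real tool.

```python
class VersionRegistry:
    """Maps arbitrary version ids to code-base references (hypothetical helper)."""

    def __init__(self):
        self._by_id = {}

    def register(self, version_id, code_ref):
        # The only rule enforced is uniqueness; no format or ordering is assumed.
        if version_id in self._by_id:
            raise ValueError(f"version id {version_id!r} is already in use")
        self._by_id[version_id] = code_ref

    def lookup(self, version_id):
        # Answers the one question a version id must answer: which code is this?
        return self._by_id[version_id]


registry = VersionRegistry()
# Ids deliberately follow no pattern; the code references are made up.
for version_id, code_ref in [("3", "rev-0001"), ("1.1", "rev-0002"), ("A", "rev-0003")]:
    registry.register(version_id, code_ref)

print(registry.lookup("A"))  # rev-0003
```

Registering "A" a second time raises a `ValueError`, which is the one constraint the paragraph above asks for.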

But because we remember versions best in chronological order, many companies use an incremental pattern. Not only is it easier to remember, it is also easier to automate. Every time you have an update, you increment by 1 (version 1, then 2, then 3).

Eventually, you'll find it cumbersome to remember the difference between version 15, version 32, and version 75. So many companies compartmentalize the changes by adding a sub-group (version 1.1, then 1.2, then 1.3, etc.). When you reach an arbitrarily chosen major update, you increment the super-group from 1.X to 2.0 (or 2.1 depending on your preferences, or even 3.X if you want people to go slightly insane). Here, I use X because it can be any number. You may find that you go from 1.9 to 1.10, which does not seem to make mathematical sense. That is because a version id is not a decimal number: it is technically two separate numbers, so 1.10 is not the same as 1.1. This is easier to see once another sub-group is added, as in 1.10.20, which completely breaks any resemblance to a mathematical number.
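To make the 1.9 versus 1.10 point concrete, here is a minimal Python sketch that treats a version id as a tuple of integers rather than a decimal. The helper names `parse_version` and `bump` are made up for this example, and resetting the lower groups to 0 on a bump is just one possible policy (as noted above, some teams prefer 2.1 over 2.0).

```python
def parse_version(text):
    """Split a dotted version id into a tuple of integers, e.g. '1.10' -> (1, 10)."""
    return tuple(int(part) for part in text.split("."))


def bump(version, level):
    """Increment the group at `level` (0 = super-group) and reset lower groups to 0.

    Resetting to 0 is an assumed policy for this sketch, not a rule.
    """
    parts = list(version)
    parts[level] += 1
    for i in range(level + 1, len(parts)):
        parts[i] = 0
    return tuple(parts)


# Read as decimals, 1.10 is just 1.1, so 1.9 looks like the later version.
print(float("1.9") < float("1.10"))                  # False
# Read as groups of numbers, 1.10 correctly comes after 1.9.
print(parse_version("1.9") < parse_version("1.10"))  # True

print(bump(parse_version("1.9"), 1))   # (1, 10)
print(bump(parse_version("1.10"), 0))  # (2, 0)
```

Python compares tuples element by element, which happens to match how grouped version ids are usually ordered.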

You'll find that a 2- or 3-level system works for most groups and companies. But for software managed by many people (20-100 developers), you'll notice that there are more levels and that they do not increment in the normal fashion. The reason for this is code branching. Because so many people are developing at the same time, there is a root application that they all work from. So everyone could be working from code version 1.2.1. Developer A will be writing code version 1.2.1.5 while developer B will be writing code version 1.2.1.8 (maybe with an intern working on 1.2.1.8.2). Someone will eventually decide that it is time to bring some of the changes together to create version 1.2.1.22. If they find that it works, they could publish version 1.2.1.22. Of course, you could also have a policy that simply increments the third level whenever you merge, in which case this would have been version 1.2.2.

But let's say we stayed with 1.2.1.22 because it is a patch update and not really an application update. During testing, this version did not work and was deemed too difficult to troubleshoot, so version 1.2.1.22 is discarded. After the major items are addressed, someone will eventually merge a different set of versions to create version 1.2.1.45. Then they repeat this until they have a working version. If 1.2.1.45 works, they may just publish that, so the public will never see version 1.2.1.22. If you use the other policy, the failed merge would have used up 1.2.2, so the public would not see 1.2.2 but 1.2.3.
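The gap in published versions under the "increment the third level on every merge" policy can be simulated in a few lines. This is a sketch under stated assumptions: the function name `publish_history` and the list of pass/fail test results are invented for the example.

```python
def publish_history(start, merge_results):
    """Bump the third level on every merge attempt.

    `start` is the root version as (major, minor, patch); `merge_results` is a
    list of booleans saying whether each merged build passed testing.
    Returns (all attempted version ids, version ids that were published).
    """
    major, minor, patch = start
    attempted, published = [], []
    for works in merge_results:
        patch += 1  # every merge consumes a version id, whether it works or not
        version = f"{major}.{minor}.{patch}"
        attempted.append(version)
        if works:
            published.append(version)
    return attempted, published


# Root is 1.2.1; the first merge fails testing, the second one works.
attempted, published = publish_history((1, 2, 1), [False, True])
print(attempted)  # ['1.2.2', '1.2.3']
print(published)  # ['1.2.3']
```

Development knows about both ids, but the public only ever sees 1.2.3, which is where the apparent skipped number comes from.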

At this point, you can also see how this looks from the end user's (i.e. the customer's) side, and why it may concern them, especially with a new vendor. Why did the vendor skip a number if they always had an incremental system? This is why larger corporations that invest in PR develop their own version schema. Company A may say that they don't want to confuse users with an update from 1.2.1 to 1.2.3, so publicly they are going to stick with version 1.2.2.

But on the development side, you cannot simply rename 1.2.3 to marketing's 1.2.2, because 1.2.2 was already used for the discarded build. Thus a cross-reference table needs to be maintained to track marketing's version against development's version.
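A cross-reference table like this can be as simple as a dictionary. The sketch below uses the version numbers from the example above. The names `MARKETING_TO_DEV` and `build_reverse` are hypothetical, and a real team might keep this in a spreadsheet or database instead.

```python
# Marketing's public version id -> development's internal version id.
MARKETING_TO_DEV = {
    "1.2.1": "1.2.1",
    "1.2.2": "1.2.3",  # dev 1.2.2 was the discarded build, so marketing 1.2.2 is dev 1.2.3
}


def build_reverse(table):
    """Build the development -> marketing lookup, refusing ambiguous entries."""
    reverse = {}
    for marketing, dev in table.items():
        if dev in reverse:
            raise ValueError(f"development version {dev!r} is mapped twice")
        reverse[dev] = marketing
    return reverse


DEV_TO_MARKETING = build_reverse(MARKETING_TO_DEV)

# A customer reports a defect in "1.2.2"; development needs to open 1.2.3.
print(MARKETING_TO_DEV["1.2.2"])      # 1.2.3
# The discarded dev build has no public name at all.
print("1.2.2" in DEV_TO_MARKETING)    # False
```

Keeping both directions lets support go from what the customer says to the right code base, and lets developers see what a given build is called publicly.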

On a higher level, you can easily see where development and marketing would diverge in version id when a change affects the technology but not the usage, or the other way around. For example, development may need to increment its major version (level 1) because of a new software design that does not affect functionality. To marketing this is not a major change. So development could move to version 2.0.1 while marketing may choose to remain in the 1.x.x format. On the other hand, marketing may see a new feature that will change how they sell the product even though it was not a major change to the code (like backup and restore, or moving to SSO). In that case marketing will sell it as version 3.0.1 while development keeps its 2.x.x format.

After some time, marketing will start to simplify or complicate the version id according to their market research. That is why you see so many odd versions in today's world. You have Android's dessert theme, where the first letter of each name increments alphabetically. You have Apple's big-cat theme for OS X, where the names follow no alphabetical order. Or you have Windows' seemingly random scheme. Their versions are easier to remember, maybe because we can relate more things to a name than to numbers, which we see every day. Names also make searching easier: all applications use numbers, so searching for version 1.1 will return many unrelated results, while Vista is more unique. Or you can complicate the scheme, in which case many people will not remember the older versions.

So in conclusion, version ids can be anything. Best practices are good guidelines. Choose and/or adapt one that fits your business model. I do not recommend random version ids for development, as there is little value in that. But who knows, perhaps it has value if you are trying to make it difficult for other people to work out the chronological progression of your software development (like in a spy-world TV show).