Storing your OpenOffice, Xmind, ... zippy documents more efficiently in an SCM

I have really liked keeping my source code and documents in an SCM ever since I first discovered CVS about 14 years ago. I introduced it in two companies after that. One of them had tried to use VSS, which was not really usable at the time: you had to lock files for editing, which meant calling the VSS admin whenever your colleague was not available, and two people could not work on the same document at all. In the other, developers had only been using timestamped ZIP files, which made teamwork really hard. In my current company I (maybe) made a mistake by pushing the switch from CVS to SVN about five years ago.
Back then I took a look at one of the first DVCS systems (arch) but found it too confusing (at least for me, YMMV). About three years ago I discovered Mercurial and have really liked it ever since, especially as I really like Python. I tried Bazaar as well because it promised better integration with Subversion, but it used several different, incompatible repository formats, so I had problems even checking out a remote repository more than once, and the speed was not convincing either. Nowadays I sometimes use Git, which I like as well, and I am especially impressed by its simple underlying concept of storing things. However, I still feel more comfortable with Mercurial right now and use Bitbucket a lot.
After having used a DVCS you feel almost crippled by SVN's bad merging support, and the idea of having no distinction between branches and tags no longer seems so clever. We have had some hard times using standard SVN tools after a decision to put release tags in a directory called releases. We are sometimes still struggling to find a common point of view on the correct position of trunk and on what to store beneath release tags in repositories used by more than one project, so that the layout is both unambiguous for our tool chain and understandable for humans.
Well, back to the topic: nowadays a lot of software uses ZIP containers to store its data. These will bloat your SCM because every new ZIP is so different from its predecessor, even if you only added a single word: the compression and the embedded preview picture make the new version very different from the old one at the byte level, so the SCM cannot store a small delta. So I wrote a little Python script which uncompresses the container, deletes the included preview and puts the remaining files back into an uncompressed ZIP using the stored method.
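The original script is not reproduced here, but the idea can be sketched in a few lines with Python's standard zipfile module. This is only a minimal sketch under some assumptions: the function name repack_stored and the PREVIEW_PREFIXES tuple are made up for illustration, and the preview is assumed to live under a Thumbnails/ directory, as it does in OpenDocument files (other formats may use a different path). The sketch also leaves any manifest inside the container untouched, even if it still lists the removed preview.

```python
import zipfile

# Assumption: preview images are stored under this prefix. This matches
# OpenDocument files; adjust it for other ZIP-based formats.
PREVIEW_PREFIXES = ("Thumbnails/",)


def repack_stored(src_path, dst_path, skip_prefixes=PREVIEW_PREFIXES):
    """Copy a ZIP container to dst_path, uncompressed and without previews."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_STORED) as dst:
        # infolist() keeps the original entry order, so a leading
        # "mimetype" entry stays first.
        for info in src.infolist():
            if info.filename.startswith(skip_prefixes):
                continue  # drop the preview picture
            # Keep name, timestamp and attributes, but force the stored
            # (no compression) method for the new entry.
            new_info = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            new_info.compress_type = zipfile.ZIP_STORED
            new_info.external_attr = info.external_attr
            dst.writestr(new_info, src.read(info))
```

Calling repack_stored("report.odt", "report-stored.odt") before a commit would then give the SCM an uncompressed container in which a small edit to the document only changes a small part of the file. The resulting file is larger on disk, which is the trade-off for getting better deltas in the repository.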
