Lately I’ve been obsessed with learning and adopting “best practices” for writing code. For some time now I’ve felt very inefficient. I tend to forget that I already solved a problem, which leads me to constantly rediscover the same little tricks. The code and data for my various projects are terribly organized, and this adds to my inefficiency. I know there are people who are much worse than me, but it still bothers me to know that I waste time retracing my steps.
So I’m trying to change this. Software Carpentry is a great resource for learning how to write code. They heavily advocate the use of version control software, even for a solo researcher! I like this quote from their site:
Good record keeping is essential to good science; without the kind of record-keeping that version control provides, there’s no way to know exactly who did what, when.
This is really meant to apply to people working in teams. For a solo researcher, the “who” is always the same, but the “what” and “when” are still important things to keep track of!
Currently I’m using Git, one of the most popular version control programs, and I love it! I have to admit it has a steep learning curve, and at first it seems daunting and a waste of time. But so far it has been worth the effort.
For my workflow, I needed a way to synchronize the same set of files across many different machines, both Linux and Mac. This requires a central repository that all the machines can pull from. Git provides a nice way to do this because each repository can be linked with a remote repository hosted by an online service.
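Here is a minimal sketch of that workflow in Python, driving the standard `git` command line through `subprocess`. It is only an illustration under some assumptions: a local bare repository called `central.git` stands in for the hosted remote, the two clones `machine_a` and `machine_b` stand in for two different computers, and the file name `analysis.py` and the commit identity are made up for the demo. It also assumes `git` is installed and on the path.

```python
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run a git command in `cwd` and return its stdout; raise if it fails."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def sync_demo():
    """Push a file from one clone, pull it into another, return what arrived."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Hypothetical stand-in for the remote repository on a hosting service.
        git("init", "--bare", "central.git", cwd=root)

        # Two clones play the role of two different machines.
        git("clone", "central.git", "machine_a", cwd=root)
        git("clone", "central.git", "machine_b", cwd=root)
        machine_a = root / "machine_a"
        machine_b = root / "machine_b"

        # Machine A: record a change and publish it to the central repository.
        (machine_a / "analysis.py").write_text("print('hello')\n")
        git("add", "analysis.py", cwd=machine_a)
        git(
            "-c", "user.name=Demo",  # made-up identity, only for this sketch
            "-c", "user.email=demo@example.com",
            "commit", "-m", "Add analysis script",
            cwd=machine_a,
        )
        # Ask git for the branch name instead of assuming "master" or "main".
        branch = git("symbolic-ref", "--short", "HEAD", cwd=machine_a)
        git("push", "origin", branch, cwd=machine_a)

        # Machine B: bring the working copy up to date from the same remote.
        git("pull", "origin", branch, cwd=machine_b)
        return (machine_b / "analysis.py").read_text()


if __name__ == "__main__":
    print(sync_demo())
```

With a hosting service, the address the service gives you for the repository would take the place of the local `central.git` path in the `clone` step; the add, commit, push, and pull steps stay the same on every machine.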
At first I went with github.com, and it seemed perfect. It has a huge community, a great user interface, and lots of tutorials. Everything was going along fine until I realized that I could not make any of my code private without paying. Wanting privacy may seem stupid, because as a scientist, who is gonna “steal” from me? Shouldn’t science be “open access”?
After a lot of thought, I think that this doesn’t matter 99% of the time. However, there are a couple of projects that I do worry about. The chance of someone taking my code, understanding what I’m doing, redoing my research, and publishing the result before me is extremely small, but I still want the security of keeping a few repositories private.
This is where BitBucket.com comes in! They have a very different pricing philosophy. Instead of charging for private repositories, they charge you when you want to add more users. BitBucket allows you to have unlimited private repositories! The community is a little smaller, but the website is just as good as GitHub’s, if not better.