It has been three months since I started contributing to an Open Source project. This blog post describes my experience with Open Source Development: what I have learnt so far, how an aspiring developer can get started, and what the benefits of Open Source Development are (to the best of my knowledge so far). I feel this is relevant because the hardest part of getting started is just that - GETTING STARTED. Everything else begins to come naturally once you get through the initial phase, and this post aims to clear some of the hurdles that might hold someone back.
I have been contributing to the development version of one of the most commonly used Machine Learning libraries, scikit-learn. My contributions can be seen here.
So, I started in late October, when I was just getting familiar with Machine Learning and its implementation in Python. I had been wanting to work on an Open Source project for a while, but I felt the same fears that any beginner does -
1. The INTIMIDATING codebase
2. Coding language restrictions (I was a mere beginner in Python)
3. Doubts whether you have enough knowledge to make a significant contribution
4. How to seek help??
5. How to make that first contribution??
I'll answer the 4th question first. After all this while, I realize the simplest answer is actually just reaching out. I've covered this in greater detail in the How to get started section below.
Now, how do we get past that INTIMIDATING codebase?
Firstly, not all Open Source projects have a very large codebase. There are many projects where you can work on your own module from scratch, separate from the development version. For projects that do have a large codebase, you don't need to know how ALL of it works. For the most part, when you work on improving something, you'll be tackling a very small portion of that codebase. So, every time I worked on fixing an issue, I took plenty of time to read the specific files that needed to change and to understand how they work. As you work on different issues, you become more and more familiar with the codebase. This is a gradual process, and it's best learnt by doing. Thinking of it this way certainly lowers the barrier a great deal.
To be honest, I didn't have a lot of experience with coding in Python prior to this, and that was one of the doubts I had before starting. But then again, I wanted to learn ML (Machine Learning), and I am not really comfortable with development in C++ (I prefer Java), which is what most ML libraries use since it's fast. So, scikit-learn seemed the best place to start. The best part is this - I became much more fluent in Python and its various features (including object-oriented programming in Python) by spending long hours reading the codebase, googling whatever seemed new and implementing some of the functionality myself - all because of Open Source. Also, whenever I have to work on a feature or algorithm that I don't know about, I go and read up on it - so the goal of learning ML is met as well.
So, this has been a really motivating start for me and a great learning experience.
Next up - the BIG question.
How to get started
Firstly, there are tons of articles that guide you through making your first contribution much better than mine does. This is just my take on how I got started, in the hope that it helps someone too.
As a starting point, I'd suggest that anyone go through this blog post thoroughly - Part I. It describes a major portion of what I used myself to get started and is pin-point accurate. So, I am not going to repeat what is already there, but rather cover some of the other things in greater detail.
To start off, the first question is - Which project??
Well, I found that the best answer to this question is that you should work on:
Once you have the project sorted, go through the Project Wiki Page, which contains the guidelines for setting up the project as well as for getting started with contributing. I can't emphasize enough how important this is. For example, scikit-learn has one of the best contributing guidelines that I have seen, and I keep referring to it whenever I try something new. For most Open Source projects, the core developers are really busy people. It's generally a good idea to go through the contributing guidelines before talking to them, since that shows you have done your homework. But if you are stuck on something, don't ever hesitate to ask for help. This is one of the benefits of Open Source Development: no one is trying to compete with you, and hence most projects have a very helpful community where you can easily seek help.
Now, the best way to get started with contributing is to help the organization with documentation. Through this, you get familiar with the contributing process and the coding guidelines of the organization, and you get acquainted with the community. Now the big part - How to make your first Pull request?
Well, I'll link the blog post that I used myself. It's a continuation of the post linked above - Part 2. GitHub's guide also describes the entire process in detail.
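To give a feel for the documentation route mentioned above: a first pull request can be as small as improving a single docstring. Here is a minimal sketch of what a well-documented function might look like, written in the numpydoc layout that scikit-learn's docstrings follow. Note that `scale_to_unit_range` is a hypothetical function I made up for illustration; it is not part of scikit-learn.

```python
def scale_to_unit_range(values):
    """Rescale a sequence of numbers to the range [0, 1].

    NOTE: hypothetical example function, not part of scikit-learn.

    Parameters
    ----------
    values : list of float
        Input numbers. Must contain at least one element.

    Returns
    -------
    scaled : list of float
        The inputs mapped linearly so that the minimum becomes 0.0 and
        the maximum becomes 1.0. If all inputs are equal, every output
        is 0.0.

    Examples
    --------
    >>> scale_to_unit_range([2.0, 4.0, 6.0])
    [0.0, 0.5, 1.0]
    """
    lowest = min(values)
    span = max(values) - lowest
    if span == 0:
        # Avoid dividing by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]
```

If this were saved in a file, running `python -m doctest` on that file would check that the example in the docstring still matches what the code does, which is a handy habit when you are editing documentation.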
Benefits of Open Source
Of course, knowing why you should work on Open Source Development is essential to getting started. I am listing the reasons that I know of:
Finally, this video series gives an insight into what a hiring manager thinks while looking at your GitHub profile and how you can actually use your profile as a portfolio:
In closing, I'd suggest that everyone get their hands dirty with Open Source. It's a great learning experience: you get to contribute to something big, and you get to interact with different kinds of people who are (mostly) more knowledgeable than you, which is essential to your growth, both personally and skill-wise.
I hope this post gave someone enough motivation or helped someone to tackle the roadblocks to get started with Open Source.
Cheers! To a better tomorrow :)