Data Science Capstone Projects and Why They Are Important

Data science, like machine learning, is one of those fields where ‘learning’ happens best through ‘doing’. You can attend as many courses or classes on R or Python as you like, but there is no substitute for actually working in the R console or writing Python code to manipulate and analyze data. Similarly, it is important to get experience of working on ‘unclean’ data sets: data sets that have missing values, wrong values, incomplete information and other inconsistencies. This is what real-life data looks like, so it’s better to get used to it.

Your portfolio is your secret weapon for showing off what you can really do. At its best, your portfolio will give your prospective team and manager insight into how you think, how you ask questions, how you code, and how you present your results to a non-technical audience. That is why a capstone project is an important part of any long-term data science course.

Let’s put it this way:

  • Your resume (plus maybe a connection through your college roommate’s third cousin) will get you that first interview;
  • your sick knowledge of pandas, sklearn and Python will get you past a technical screen;
  • but it’s your portfolio of data science projects that will get you that sweet, sweet data science job offer.

In a project you carry out several activities, such as:

  • Choosing the data
  • Cleaning the data (data wrangling)
  • Exploring and visualising the data
  • Applying a machine learning algorithm to the data
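
To make these activities concrete, here is a minimal sketch of all four in Python, using pandas and scikit-learn. Everything in it is an assumption made for illustration: the tiny table, the column names area_sqft and price, and the choice of linear regression are all made up, and a real project would use a real data set and a model suited to the business problem.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# 1. Choose the data: a tiny made-up table (hypothetical values, illustration only).
#    It is deliberately 'unclean': one missing area and one impossible negative area.
raw = pd.DataFrame({
    "area_sqft": [500, 750, None, 1000, 1250, -1, 1500],
    "price": [50, 75, 90, 100, 125, 130, 150],
})

# 2. Clean the data: mark negative areas as missing, then drop incomplete rows.
clean = raw.copy()
clean["area_sqft"] = clean["area_sqft"].mask(clean["area_sqft"] < 0)
clean = clean.dropna().reset_index(drop=True)

# 3. Explore the data: summary statistics and the correlation between the columns.
summary = clean.describe()
correlation = clean["area_sqft"].corr(clean["price"])
print(summary)
print("correlation:", correlation)

# 4. Apply a machine learning algorithm: fit a simple linear regression
#    and predict the price for a new, unseen area.
model = LinearRegression()
model.fit(clean[["area_sqft"]], clean["price"])
predicted = model.predict(pd.DataFrame({"area_sqft": [1100]}))[0]
print("predicted price:", predicted)
```

In a real capstone project each step is much larger (and step 3 would normally include plots), but the overall shape stays the same.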

I hope you now understand the importance of capstone projects.

Here I am sharing some of my capstone projects, or, if you prefer, some of my recently completed projects. They will give you a fair idea of how to build your own machine learning projects.

There are a few things I have done in my project tutorials that I want to highlight:

  • First, I have explained clearly what the business problem is and what the business expects from us as data scientists.
  • I have explained the available data in detail.
  • Lots (and LOTS) of good comments. The best part about Jupyter notebooks is that they let code, figures, and writing live side by side, so I took advantage of this! As you read through the notebooks, take a look at where and how I explained the work.
  • I have commented most of my code, which will give you a fair idea of what each section of code is doing.

You can click on any project title in the widget area on the left to get the details of that particular project.

You can also download these projects and my other work from my GitHub account. My GitHub account URL is: