Data Science – An Introduction

What is Data Science?

Data science is a field of study that deals with identifying and extracting meaningful information from data sources using scientific methods and algorithms. Businesses and other organizations use it to support better decision making, targeted promotional offers, and predictive analytics.

What skills are required to be a Data Scientist?

  • Programming Skills
    • Python
    • R
    • Database query languages
  • Statistics and Probability
  • BI Tools – Tableau, Power BI, Qlik Sense
  • Business Domain Knowledge

Data Science Life Cycle:

[Figure: Data Science Life Cycle diagram]

Data Scientist vs. Data Analyst vs. Data Engineer

Data Analyst:

Data analyst is often an entry-level role for professionals who want to move into data-related work. Organizations expect a data analyst to understand data handling, modeling, and reporting techniques and to have a strong understanding of the business. A data analyst needs good knowledge of visualization tools and databases. The two most popular tools among data analysts are SQL and Microsoft Excel.

A data analyst also needs good presentation skills. These help the analyst communicate results to the team and guide it toward sound solutions.
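To make the SQL side of the analyst's work concrete, here is a minimal sketch of a reporting query run from Python with the standard-library sqlite3 module. The `sales` table, its columns, and the sample rows are all hypothetical, invented only for illustration.

```python
import sqlite3

# Hypothetical sales records: (region, product, amount). Illustrative data only.
SALES = [
    ("North", "Widget", 120.0),
    ("North", "Gadget", 80.0),
    ("South", "Widget", 200.0),
    ("South", "Gadget", 50.0),
    ("South", "Widget", 150.0),
]


def revenue_by_region(rows):
    """Load rows into an in-memory SQLite table and total the revenue per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) AS total "
            "FROM sales GROUP BY region ORDER BY total DESC"
        )
        return cursor.fetchall()
    finally:
        conn.close()


report = revenue_by_region(SALES)
for region, total in report:
    print(f"{region}: {total:.2f}")
```

The same GROUP BY summary is what a pivot table would give in Microsoft Excel; the query form is easier to rerun when the data changes.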

Data Engineer:

A Data Engineer specializes in preparing data for analytical use. They have a good understanding of data pipelines and performance optimization. A Data Engineer needs a strong technical background and the ability to create and integrate APIs. Data engineering also involves developing platforms and architectures for data processing.

So what skills are required to be a Data Engineer?

  • Data Warehousing & ETL
  • Advanced programming knowledge
  • Knowledge of machine learning concepts
  • In-depth knowledge of SQL and databases
  • Hadoop-based Analytics
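
As a rough illustration of the ETL (extract, transform, load) item above, the sketch below moves a tiny CSV export into a SQLite table using only the Python standard library. The CSV contents, the `orders` table, and the function names are hypothetical; a production pipeline would read from real sources and load into a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: drop rows with no amount, tidy customer names, cast types."""
    cleaned = []
    for rec in records:
        amount = (rec["amount"] or "").strip()
        if not amount:
            continue  # skip incomplete rows rather than guessing a value
        customer = rec["customer"].strip().title()
        cleaned.append((int(rec["order_id"]), customer, float(amount)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions makes each one easy to test and replace, which is the main design idea behind most pipeline tooling.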
 
Data Scientist:

A data scientist applies knowledge of statistics and machine learning to build models that make predictions and answer key business questions. Data scientists usually deal with large, messy data sets and do a great deal of data wrangling. They apply their math, programming, and statistics skills to clean and organize the data.

Once the data is clean, the data scientist applies machine learning algorithms to find hidden insights in it and to draw a meaningful summary from them.
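
The clean-then-model workflow can be sketched in a few lines of plain Python. The example assumes a made-up data set of (advertising spend, sales) pairs with some missing values, and it uses ordinary least-squares linear regression as a stand-in for "a machine learning algorithm"; the data, the function names, and the choice of model are all illustrative assumptions. In practice a library such as pandas or statsmodels would normally do this work.

```python
# Hypothetical messy observations: (advertising spend, sales).
# None marks a missing value. The numbers are invented for illustration.
RAW = [(1.0, 3.0), (2.0, None), (3.0, 7.0), (None, 4.0), (4.0, 9.0), (5.0, 11.0)]


def clean(rows):
    """Cleaning step: keep only rows where both values are present."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(points):
    """Modeling step: fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Prediction step: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


data = clean(RAW)
model = fit_line(data)
print(f"rows kept: {len(data)}, slope: {model[0]:.2f}, intercept: {model[1]:.2f}")
print(f"predicted sales at spend 6.0: {predict(model, 6.0):.2f}")
```

Dropping incomplete rows is the simplest cleaning strategy; filling in missing values is another option, but it changes what the fitted model means and should be a deliberate choice.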

Skill set for a data scientist:

  • In-depth programming knowledge of SAS, R, or Python
  • Statistics and mathematics concepts
  • Machine learning algorithms
  • Python libraries such as pandas, NumPy, SciPy, Matplotlib, seaborn, and statsmodels