Python List Comprehensions

Python is famous for allowing you to write code that’s elegant, easy to write, and almost as easy to read as plain English. One of the language’s most distinctive features is the list comprehension, which you can use to create powerful functionality within a single line of code. However, many developers struggle to fully leverage the more advanced features of a list comprehension in Python. Some programmers even use them too much, which can lead to code that’s less efficient and harder to read.

By the end of this tutorial, you’ll understand the full power of Python list comprehensions and how to use their features comfortably.

Benefits of Using List Comprehensions

  • One main benefit of using a list comprehension in Python is that it’s a single tool that you can use in many different situations.
  • In addition to standard list creation, list comprehensions can also be used for mapping and filtering. You don’t have to use a different approach for each scenario.
  • List comprehensions are also more declarative than loops, which means they’re easier to read and understand.

Every list comprehension in Python includes three elements:

  1. expression is the member itself, a call to a method, or any other valid expression that returns a value. In the example above, the expression i * i is the square of the member value.
  2. member is the object or value in the list or iterable. In the example above, the member value is i.
  3. iterable is a list, set, sequence, generator, or any other object that can return its elements one at a time. In the example above, the iterable is range(10).

Let’s explore the concept through jupyter notebook.

NumPy–Introduction

NumPy  is the most basic and a powerful package for working with data in python. It stands for ‘Numerical Python’. It is a library consisting of multidimensional array objects and a collection of routines for processing of array. It contains a collection of tools and technique that can be used to solve on a computer mathematical models of problem in science and engineering.

If you are going to work on data analysis or machine learning projects, then you should have solid understanding of NumPy . Because other packages for data analysis (like pandas) is built on top of NumPy  and the scikit-learn package which is used to build machine learning applications works heavily with NumPy  as well .

What is Array?

A array is basically nothing but a pointer. It is a combination of memory address, a data type, a shapes and strides.

  • The data pointer indicates the memory address of the first bytes in the array.
  • The data type or dtype pointer describes the kind of elements that are contained within the array.
  • The shape indicates the shape of array
  • The strides are the numbers of bytes that should be skipped in memory to go to the next element. If your strides are (10,1) you need to proceed one byte to get the next column and 10 bytes to locate the next row.

So in short we can say an array contains information about the raw data, how to locate an element and how to interpret an element.

Operations using NumPy:

Using NumPy, a developer can perform the following operations −

  • Mathematical and logical operations on arrays.
  • Operations related to linear algebra. NumPy has in-built functions for linear algebra and random number generation.

Installation Instruction:

It is highly recommended you install Python using the Anaconda distribution to make sure all underlying dependencies (such as Linear Algebra libraries) all sync up with the use of a conda install. If you have Anaconda, install NumPy by going to your terminal or command prompt and typing:

conda install numpy
or
pip install numpy

 If you do not have Anaconda and can not install it, please refer to following url http://www.datasciencelovers.com/python-for-data-science/python-environment-setup/

Pandas-Introduction

What is pandas?

Pandas is a python open source library which allow you to perform data manipulation, analysis and cleaning. It is build on top of NumPy . It is a most important library for data science.

According to Wikipedia “Pandas is derived from the term “panel data”, an econometrics term for data sets that include observations over multiple time periods for the same individuals.”

Why Pandas?

Following are the advantages of pandas for Data Scientist.

  • Easily handling missing data.
  • It provides an efficient way to slicing and data wrangling.
  • It is helpful to merge, concatenate or reshape the data.
  • It has includes a powerful time series tool to work with.

How to install Pandas?

To install python pandas go to command line/terminal and type “pip install pandas” or else if you have anaconda install in the system just type in “conda install pandas”. Once the installation is completed, go to your IDE(Jupyter) and simply import it by typing “import pandas as pd”.

In next chapter we will learn about pandas Series.