NumPy–Introduction

NumPy  is the most basic and a powerful package for working with data in python. It stands for ‘Numerical Python’. It is a library consisting of multidimensional array objects and a collection of routines for processing of array. It contains a collection of tools and technique that can be used to solve on a computer mathematical models of problem in science and engineering.

If you are going to work on data analysis or machine learning projects, then you should have solid understanding of NumPy . Because other packages for data analysis (like pandas) is built on top of NumPy  and the scikit-learn package which is used to build machine learning applications works heavily with NumPy  as well .

What is Array?

A array is basically nothing but a pointer. It is a combination of memory address, a data type, a shapes and strides.

  • The data pointer indicates the memory address of the first bytes in the array.
  • The data type or dtype pointer describes the kind of elements that are contained within the array.
  • The shape indicates the shape of array
  • The strides are the numbers of bytes that should be skipped in memory to go to the next element. If your strides are (10,1) you need to proceed one byte to get the next column and 10 bytes to locate the next row.

So in short we can say an array contains information about the raw data, how to locate an element and how to interpret an element.

Operations using NumPy:

Using NumPy, a developer can perform the following operations −

  • Mathematical and logical operations on arrays.
  • Operations related to linear algebra. NumPy has in-built functions for linear algebra and random number generation.

Installation Instruction:

It is highly recommended you install Python using the Anaconda distribution to make sure all underlying dependencies (such as Linear Algebra libraries) all sync up with the use of a conda install. If you have Anaconda, install NumPy by going to your terminal or command prompt and typing:

conda install numpy
or
pip install numpy

 If you do not have Anaconda and can not install it, please refer to following url http://www.datasciencelovers.com/python-for-data-science/python-environment-setup/

NumPy-Functions

NumPy has many built-in functions and capabilities. We won’t cover them all but instead we will focus on some of the most important aspects of NumPy such as vectors, arrays, matrices, and number generation. Let’s start by discussing arrays.

NumPy arrays are the main way we will use NumPy throughout the course. NumPy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (but you should note a matrix can still have only one row or one column).

To know more about numpy function check the official documentation https://docs.scipy.org/doc/numpy/user/quickstart.html

Let’s begin our introduction by exploring how to create NumPy arrays. Please go through the jupyter notebook code. I have explained the code with comment, hope it will help you to understand the important functions of NumPy.

NumPy-Indexing and Selection

Indexing and Slicing are the important operations that you need to be familiar with when working with Numpy arrays. You can use them when you would like to work with a subset of the array. This tutorial will take you through Indexing and Slicing on multi-dimensional arrays.

Please refer to following .ipynb file for numpy implementation through python.