Python NumPy | Python Library for Machine Learning and Data Science | AI & ML Programming | Python

In this video, we start with the NumPy library, which is used for data analysis and scientific computing in ML. Before learning NumPy, you may have to unlearn a few habits from programming in Python or any other language. In ML, we need to work with data in bulk rather than one value at a time. When a dataset has billions of values, walking through them one by one in a loop is computationally very expensive, which is why the traditional sequential, loop-based style does not work well in ML. Most operations in ML are instead written so that they apply to a whole array at once. For example, to square every element of a matrix, I don't loop through the elements; I apply the square to the whole matrix in a single operation.
So, as a principle, we write vectorized code in ML, so that operations on the data run in bulk, in optimized compiled code, rather than element by element in the Python interpreter. The underlying operations in NumPy are vectorized, which is why NumPy is so widely used for numerical analysis.
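
The matrix example above can be sketched as follows. This is a minimal illustration with made-up values, and "square" here means squaring each element, not a matrix product. The loop version is shown only for contrast.

```python
import numpy as np

# A small 2 x 3 matrix; the values are arbitrary illustration data.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Loop-based approach: visit every element one at a time.
squared_loop = np.empty_like(matrix)
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[1]):
        squared_loop[i, j] = matrix[i, j] ** 2

# Vectorized approach: a single expression squares every element.
squared_vectorized = matrix ** 2

print(squared_vectorized)
# Both approaches give the same result.
print(np.array_equal(squared_loop, squared_vectorized))
```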

In this video, we will understand the NumPy library, which is the fundamental package for scientific computing in Python. We will cover:
- Understand the advantages of vectorised code using NumPy (over standard Python ways)
- Create NumPy arrays
- Convert lists and tuples to NumPy arrays
- Subset, slice, index and iterate through arrays
- Compare computation times in NumPy and standard Python lists

NumPy is a library written for scientific computing and data analysis. The name stands for Numerical Python. The most basic object in NumPy is the ndarray, or simply an array, which is an n-dimensional, homogeneous array. By homogeneous, we mean that all the elements in a NumPy array have to be of the same data type, which is commonly numeric (float or integer).
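
As a small sketch of what homogeneous means in practice (the values are made up): when integers and a float are mixed, NumPy stores all of them as floats so that the array keeps a single data type.

```python
import numpy as np

# All integers: the array gets an integer dtype
# (the exact integer width depends on the platform).
ints = np.array([1, 2, 3])
print(ints.dtype)

# Integers mixed with a float: every element is stored as a float.
floats = np.array([1, 2.5, 3])
print(floats.dtype)  # float64
print(floats)
```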

Advantages of NumPy
- You can write vectorised code on NumPy arrays (but not on lists), which is concise and convenient to read and write.
- NumPy is much faster than the standard Python ways of doing computations.

Vectorised code typically does not contain explicit looping and indexing etc. (all of this happens behind the scenes, in precompiled C-code), and thus it is much more concise.
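
To illustrate why this works on arrays but not on lists, here is a minimal sketch with hypothetical `prices` and `quantities` data. With lists, element-wise multiplication needs an explicit loop, and the `*` operator repeats the list instead of multiplying its values.

```python
import numpy as np

# Hypothetical illustration data.
prices = [10.0, 20.0, 30.0]
quantities = [1, 2, 3]

# Plain lists: element-wise multiplication needs an explicit loop
# (written here as a list comprehension).
totals_list = [p * q for p, q in zip(prices, quantities)]

# NumPy arrays: the same computation with no explicit loop or indexing.
totals_array = np.array(prices) * np.array(quantities)
print(totals_array)

# On a list, * 2 repeats the list; on an array, it doubles each element.
print(prices * 2)
print(np.array(prices) * 2)
```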

There are multiple ways to create NumPy arrays, the most common ones being:
- Convert lists or tuples to arrays
- Initialise arrays of fixed size (when the size is known)

The other common way is to initialise arrays. You do this when you know the size of the array beforehand.
The following ways are commonly used:
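
This description does not list the specific functions, so the sketch below is an assumption about which ones are meant. It shows conversion with `np.array` and a few standard NumPy initialisers (`np.zeros`, `np.ones`, `np.arange`, `np.linspace`).

```python
import numpy as np

# Convert a list, a tuple, and a nested list to arrays.
from_list = np.array([1, 2, 3])
from_tuple = np.array((4, 5, 6))
table = np.array([[1, 2], [3, 4]])  # nested list -> 2-D array

# Initialise arrays when the size is known beforehand.
zeros = np.zeros((2, 3))        # 2 x 3 array filled with 0.0
ones = np.ones(4)               # 1-D array of four 1.0 values
counting = np.arange(0, 10, 2)  # 0, 2, 4, 6, 8 (the stop value is excluded)
spaced = np.linspace(0, 1, 5)   # 5 evenly spaced values from 0 to 1 inclusive

print(zeros)
print(counting)
print(spaced)
```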

Inspect the Structure and Content of Arrays
It is helpful to inspect the structure of numpy arrays, especially while working with large arrays.
Some attributes of numpy arrays are:
- shape: Shape of array (n x m)
- dtype: data type (int, float etc.)
- ndim: Number of dimensions (or axes)
- itemsize: Memory used by each array element in bytes
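
A quick sketch of these attributes on a small example array (the 3 x 4 shape and the float64 type are arbitrary choices for illustration):

```python
import numpy as np

# A 3 x 4 array of 64-bit floats.
arr = np.zeros((3, 4), dtype=np.float64)

print(arr.shape)     # (3, 4)
print(arr.dtype)     # float64
print(arr.ndim)      # 2
print(arr.itemsize)  # 8 (a float64 takes 8 bytes)
```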

Compare Computation Times in NumPy and Standard Python Lists
We mentioned that the key advantages of numpy are convenience and speed of computation.
You'll often work with extremely large datasets, so it is important to understand how much computation time (and memory) you can save by using NumPy instead of standard Python lists.

In the comparison in this video, NumPy is an order of magnitude faster than lists. That is with arrays of a few million elements; you may work with much larger arrays, with sizes in the order of billions, where the difference in computation time is even larger.
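
A comparison of this kind can be sketched as follows. This is not necessarily the exact experiment from the video: the size `n` and the element-wise product are assumptions chosen for illustration, and the measured times will vary from machine to machine.

```python
import time
import numpy as np

n = 1_000_000  # illustrative size; increase it to see a bigger gap

# The same data as Python lists and as NumPy arrays.
list_a = list(range(n))
list_b = list(range(n))
array_a = np.arange(n, dtype=np.int64)
array_b = np.arange(n, dtype=np.int64)

# Element-wise product with lists: an explicit Python-level loop.
start = time.perf_counter()
list_product = [x * y for x, y in zip(list_a, list_b)]
list_seconds = time.perf_counter() - start

# Element-wise product with NumPy: one vectorized operation.
start = time.perf_counter()
array_product = array_a * array_b
numpy_seconds = time.perf_counter() - start

print(f"lists: {list_seconds:.4f} s")
print(f"numpy: {numpy_seconds:.4f} s")
```

For steadier measurements, the standard `timeit` module repeats each run several times.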

Some reasons for such difference in speed are:
- NumPy's core is written in C, so the computations run as compiled C code behind the scenes
- NumPy arrays are more compact than lists, i.e. they take much less storage space than lists
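
The storage difference can be estimated with a rough sketch like the one below. The list figure is only an approximation, because it adds the size of the list's internal pointer array to the size of each integer object, and CPython may share some small integer objects between containers.

```python
import sys
import numpy as np

n = 1000  # illustrative size

py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# List: the list object itself (an array of pointers) plus each int object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# Array: the data buffer holds n values of 8 bytes each.
array_bytes = np_array.nbytes

print(f"list:  about {list_bytes} bytes")
print(f"array: {array_bytes} bytes of data")
```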

#numpy #python #datascience #machinelearning #pandas