NumPy arrays are a data analyst's best friend and can speed up many workflows. Let's briefly look at why.
The typical sequence for storing a collection of values in Python is the list. Lists are great at handling multiple elements of any data type; since Python is a dynamically typed programming language, it was pretty much built to work this way. But with that flexibility, we lose some power when operating on our lists. When we run an operation on a list, Python has to go element by element, first determining the data type of each element, and then performing the operation, if it's even possible for that type.
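To see what that flexibility looks like in practice, here's a small sketch. It shows a list happily mixing types, and how "multiplying" a list does something entirely different from elementwise math (all names here are just illustrative):

```python
# A Python list can hold any mix of data types.
mixed = [1, "two", 3.0, True]
print([type(item).__name__ for item in mixed])  # ['int', 'str', 'float', 'bool']

# Multiplying a list repeats it rather than scaling each element,
# so elementwise math needs an explicit loop or comprehension.
repeated = [1, 2, 3] * 2
print(repeated)  # [1, 2, 3, 1, 2, 3]

doubled = [x * 2 for x in [1, 2, 3]]
print(doubled)  # [2, 4, 6]
```

Inside that comprehension, Python repeats the full type-check-then-operate dance for every single element.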
NumPy helps by providing arrays. Arrays are homogeneously typed, often multidimensional data structures that speed up computations. The speed-up comes from vectorization, but that's a topic for a different post. In short, NumPy arrays take away the flexibility of a Python list, but they give us faster performance whenever all elements of our data share the same data type ("homogeneous").
The easiest way to create a NumPy array is to pass a list (or nested list) of elements to the numpy.array() function.
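A minimal sketch of that, for both a flat list and a nested one (the exact printed dtype depends on your platform, so treat `int64` as an example):

```python
import numpy as np

# Pass a flat list to get a one-dimensional array.
arr1d = np.array([1, 2, 3, 4])

# Pass a nested list to get a two-dimensional array.
arr2d = np.array([[1, 2], [3, 4]])

print(arr1d.dtype)   # a single shared integer type, e.g. int64
print(arr2d.shape)   # (2, 2)
```

Notice that NumPy infers one dtype for the whole array; that single shared type is exactly what makes the fast operations later possible.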
From the script above (and in the course video), we ask why we'd even use a NumPy array when we have Python lists, which can store all the values we want, even heterogeneous ones. So what's the point of limiting ourselves to a single data type for the sake of a NumPy array? The answer comes back to the core fundamentals of Python as a programming language.
Python is a "dynamically typed" programming language. This means that unlike variable creation in many other languages, in Python we don't need to explicitly state a variable's data type when we create it. This is an awesome feature and saves a lot of time when you're prototyping quick solutions. The downside? Python has to decipher each element's type, check whether the operation is even valid for that type, and only then perform it.
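A quick illustration of dynamic typing at work. The same name can be rebound to different types, and the same operator means different things depending on the types it meets at runtime:

```python
# The type lives with the value, not the variable name.
x = 42
print(type(x).__name__)   # int
x = "forty-two"
print(type(x).__name__)   # str

# So + must check types at runtime:
# int + int adds, str + str concatenates.
print(1 + 2)      # 3
print("1" + "2")  # 12
```

That per-operation type check is cheap once, but it adds up fast inside a loop over a million elements.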
There are a lot of steps happening here, and the reality is more complicated than this simple explanation. A deeper explanation would involve how objects in Python are really pointers to a memory space, but that's for a separate post. How does NumPy fix this? With a NumPy array, we no longer have to ask what each element's data type is; we can vectorize the operation across the entire array and compute much faster.
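As a rough illustration, we can time the two approaches side by side. Exact numbers will vary by machine, but on an array this size the vectorized version should come out well ahead:

```python
import timeit
import numpy as np

data = list(range(1_000_000))
arr = np.array(data)

# Pure-Python loop: type check and dispatch happen per element.
loop_time = timeit.timeit(lambda: [x * 2 for x in data], number=10)

# Vectorized NumPy: one typed operation over the whole array.
vec_time = timeit.timeit(lambda: arr * 2, number=10)

print(f"list comprehension: {loop_time:.3f}s")
print(f"numpy vectorized:   {vec_time:.3f}s")
```

Both produce the same doubled values; the difference is purely in how much per-element bookkeeping Python has to do along the way.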