If you use Python and work as a data scientist, you almost certainly use NumPy. If you are aspiring to become a data scientist, you need to learn it. NumPy is a well-established, free library for numerical computing. Learning it should be easy because NumPy has well-structured documentation and an active community on Stack Overflow.

Now let’s see why NumPy is so popular. You may think a Python list should be enough, since most operations available in NumPy can also be written with Python lists. But people still use NumPy because it handles data more efficiently: it works faster and uses much less memory. Your code will also be a lot more concise because you won’t have to write the functions yourself. Instead, just import NumPy and use its functions.
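To make the memory claim concrete, here is a rough sketch of my own rather than a rigorous benchmark. It assumes CPython, where `sys.getsizeof` reports the size of a single object, and it picks `n = 1000` and `dtype=np.int64` purely for illustration.

```python
import sys
import numpy as np

n = 1000  # illustrative size, chosen for this sketch
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores references to separate int objects,
# so count the list itself plus every int it points to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values in one contiguous buffer.
array_bytes = np_array.nbytes  # 8 bytes per int64 element

print(f"list:  about {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
```

The exact list figure depends on the Python version, but the array side is always `n` times the element size.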

Here are some basic NumPy operations. First, a list is created from a range of numbers. Its elements are summed with the regular Python sum function, and then the same data is summed as a NumPy array with np.sum. I used ‘%timeit’, an IPython/Jupyter magic command, to show the time difference between the two ways.

Then two arrays are created. You can see addition, multiplication, division and even exponentiation in NumPy, all applied element by element. The ‘ndim’ attribute gives the number of dimensions and the ‘shape’ attribute gives the shape of the array. Remember, shape is always a tuple with one entry per dimension. For a two-dimensional array, the first element of the tuple is the number of rows and the second is the number of columns. The ‘size’ attribute gives the total number of elements in the array. The next portion is very important. It shows slicing of arrays. There are many different ways to slice and extract data from arrays, which matters a great deal in data analysis.

```
import numpy as np
#create a list of one million integers
g = list(range(1000000))
```

`%timeit sum(g)`

32.6 ms ± 1.85 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

```
#make a numpy array out of g
g_array = np.array(g)
```

`%timeit np.sum(g_array)`

496 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
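`%timeit` only works inside IPython or Jupyter. In a plain Python script, a similar comparison can be sketched with the standard `timeit` module. The `number=5` repeat count and the explicit `dtype=np.int64` are my own choices here. The dtype matters because the total (499999500000) does not fit in a 32-bit integer, which is the default on some platforms.

```python
import timeit
import numpy as np

g = list(range(1000000))
# int64 is an assumption of this sketch so the sum cannot overflow
g_array = np.array(g, dtype=np.int64)

# Time 5 calls of each approach (the repeat count is arbitrary)
list_time = timeit.timeit(lambda: sum(g), number=5)
numpy_time = timeit.timeit(lambda: np.sum(g_array), number=5)

print(f"built-in sum: {list_time:.4f} s for 5 runs")
print(f"np.sum:       {numpy_time:.4f} s for 5 runs")
```

The absolute numbers will differ from machine to machine, so treat them as relative.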

```
#Now create two arrays a and b
a = np.array([1,2,3,4])
b = np.array([10,11,12,13])
a+b
```

array([11, 13, 15, 17])

```
a * b
```

array([10, 22, 36, 52])

```
a / b
```

array([0.1 , 0.18181818, 0.25 , 0.30769231])

```
a ** b
```

array([ 1, 2048, 531441, 67108864], dtype=int32)
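For contrast, here is a small sketch of what the same addition looks like with plain Python lists. The variable names are mine. Note that `+` on two lists concatenates them, so element-by-element arithmetic needs an explicit loop or comprehension.

```python
import numpy as np

a_list = [1, 2, 3, 4]
b_list = [10, 11, 12, 13]

# With lists, + joins the two lists end to end
concatenated = a_list + b_list

# Element-wise addition has to be written by hand
added = [x + y for x, y in zip(a_list, b_list)]

# With arrays, + already works element by element
added_np = np.array(a_list) + np.array(b_list)

print(concatenated)
print(added)
print(added_np)
```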

```
a
```

array([1, 2, 3, 4])

```
a.ndim #get the number of dimensions
```

1

```
a.shape #returns a tuple that says the number of elements in each dimension
```

(4,)

```
a * 10
```

array([10, 20, 30, 40])

```
np.sin(a)
```

array([ 0.84147098, 0.90929743, 0.14112001, -0.7568025 ])

```
np.exp(a)
```

array([ 2.71828183, 7.3890561 , 20.08553692, 54.59815003])

```
np.log(a)
```

array([0. , 0.69314718, 1.09861229, 1.38629436])

```
a.dtype
```

dtype('int32')

```
a = np.array([1,2,3,4.0])
a.dtype
```

dtype('float64')

```
a = np.array([1,2,3,4.0+1j])
a.dtype
```

dtype('complex128')
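The three dtype results above follow one pattern: NumPy picks a single type that can hold every element, so one float or one complex value upgrades the whole array. The sketch below repeats that and also shows the `dtype` argument of `np.array`, which lets you pick the type yourself. The variable names are mine, and the default integer type (int32 or int64) depends on your platform and NumPy version.

```python
import numpy as np

ints = np.array([1, 2, 3, 4])               # all integers
floats = np.array([1, 2, 3, 4.0])           # one float upgrades everything
complexes = np.array([1, 2, 3, 4.0 + 1j])   # one complex upgrades everything

# Ask for a specific type instead of relying on inference
explicit = np.array([1, 2, 3, 4], dtype=np.float64)

print(ints.dtype, floats.dtype, complexes.dtype, explicit.dtype)
```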

```
c = np.array([[10,11,12], [20,21,22]])
c
```

array([[10, 11, 12], [20, 21, 22]])

```
c.ndim
```

2

```
c.shape
```

(2, 3)

```
c.size
```

6

```
c.T
```

array([[10, 20], [11, 21], [12, 22]])

```
c.nbytes
```

24
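The value 24 is the number of elements times the bytes per element. A minimal sketch, with `dtype=np.int32` set explicitly so the numbers match the output above on any platform:

```python
import numpy as np

# int32 is forced here to reproduce the 24 bytes shown above
c = np.array([[10, 11, 12], [20, 21, 22]], dtype=np.int32)

print(c.size)      # number of elements
print(c.itemsize)  # bytes per element
print(c.nbytes)    # size * itemsize

# The same data stored as int64 takes twice the space
c64 = c.astype(np.int64)
print(c64.nbytes)
```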

```
c[0,0] #single-step indexing, the recommended form in numpy
```

10

```
c[0][0]
```

10
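Both forms return 10 here, but they work differently. `c[0][0]` first builds an intermediate row and then indexes it, while `c[0,0]` goes straight to the element. The difference becomes visible once slices are involved, as this small sketch (my own example) shows:

```python
import numpy as np

c = np.array([[10, 11, 12], [20, 21, 22]])

one_step = c[0, 0]   # direct lookup
row = c[0]           # intermediate 1-D array holding the first row
two_step = row[0]    # second lookup on that row

# With slices the two styles no longer agree
first_column = c[:, 0]   # all rows, column 0
first_row = c[:][0]      # c[:] is the whole array, so [0] picks row 0

print(one_step, two_step)
print(first_column)
print(first_row)
```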

```
b[-2:] #selects the last two elements of the array
```

array([12, 13])

```
a = np.arange(25).reshape(5,5)
```

```
a
```

array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]])

```
c = np.arange(40).reshape(5, 8)
c
```

array([[ 0, 1, 2, 3, 4, 5, 6, 7], [ 8, 9, 10, 11, 12, 13, 14, 15], [16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39]])

```
c[:, 1] # Selects the second column
```

array([ 1, 9, 17, 25, 33])

```
c[:, 4] #Selects fifth column
```

array([ 4, 12, 20, 28, 36])

```
c[4, :] #Selects 5th row
```

array([32, 33, 34, 35, 36, 37, 38, 39])

```
c[1:4, 2:5] #Selects rows 1 to 3 and columns 2 to 4
```

array([[10, 11, 12], [18, 19, 20], [26, 27, 28]])

```
c[:, 1::3] #Selects every third column, starting from column 1
```

array([[ 1, 4, 7], [ 9, 12, 15], [17, 20, 23], [25, 28, 31], [33, 36, 39]])

```
c[:, 1:-1:3] #Same as above, but stops before the last column
```

array([[ 1, 4], [ 9, 12], [17, 20], [25, 28], [33, 36]])

```
c[1::2, :-5:2] #Every second row from row 1; every second column among the first three
```

array([[ 8, 10], [24, 26]])
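If a slice such as `c[1::2, :-5:2]` is hard to read, one way to decode it is to apply the same `start:stop:step` pattern to a plain list of indices and see which rows and columns survive. The sketch below does that and then uses `np.ix_` to pick exactly those rows and columns. The helper variables `rows` and `cols` are mine.

```python
import numpy as np

c = np.arange(40).reshape(5, 8)

# Apply the same slices to plain index lists
rows = list(range(5))[1::2]    # row indices kept by 1::2
cols = list(range(8))[:-5:2]   # column indices kept by :-5:2

# np.ix_ selects every combination of the given rows and columns
picked = c[np.ix_(rows, cols)]
sliced = c[1::2, :-5:2]

print(rows, cols)
print(picked)
```

Both approaches return the same values, which confirms how the slice was interpreted.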