Array indexing and slicing are important parts of data analysis and of many kinds of mathematical operations. We do not always work with a whole array, matrix or DataFrame, and indexing and slicing matter most when we need a subset of an array. This article starts with the basics and works up to some more advanced techniques for slicing and indexing 1D, 2D and 3D arrays. Even if you have used array slicing and indexing before, you may find something to learn in this tutorial.
1D Array Slicing And Indexing
First, import NumPy in your notebook and make a one-dimensional array. I am using a Jupyter Notebook here, but any other Python environment works just as well.
import numpy as np
x = np.array([2,5,1,9,0,3,8,11,-4,-3,-8,6,10])
Basic Indexing
Let’s start with some simple indexing. Just a reminder: arrays are zero-indexed, so counting starts from zero. x[0] returns the first element of the array and x[1] returns the second.
x[0]
output: 2
x[3]
output: 9
x[4]
output: 0
Basic Slicing
Now let’s move on to slicing one-dimensional arrays:
x[1:7]
output: array([5, 1, 9, 0, 3, 8])
Here 1 is the lower limit and 7 is the upper limit. The output array runs from the element at index 1 up to index 7, with the lower limit included and the upper limit excluded. That means it includes the element at index 1 but not the element at index 7.
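A small sketch of the included/excluded rule, using the same array x as above. The variable name part is only for illustration.

```python
import numpy as np

x = np.array([2, 5, 1, 9, 0, 3, 8, 11, -4, -3, -8, 6, 10])

part = x[1:7]        # takes indices 1, 2, 3, 4, 5 and 6
print(part)          # [5 1 9 0 3 8]
print(len(part))     # 6, which is upper limit minus lower limit (7 - 1)
print(x[7])          # 11, the element at the excluded upper limit
```

The length of a simple slice is the upper limit minus the lower limit, which is a quick way to check that a slice covers what you expect.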
Slicing With Interval
x[2::3]
output: array([ 1, 3, -4, 6])
In this case, 2 is the starting point and 3 is the interval. So the returned array starts from the element at index 2 and then takes every third element until the end of the array.
Say we don’t need everything until the end and only want the output up to -4. In that case we can slice the result further.
x[2::3][0:3]
array([ 1, 3, -4])
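The chained slice above can probably also be written as one slice with an explicit upper limit. The sketch below assumes the same array x and compares the two forms; the names chained and single are illustrative.

```python
import numpy as np

x = np.array([2, 5, 1, 9, 0, 3, 8, 11, -4, -3, -8, 6, 10])

chained = x[2::3][0:3]   # slice with an interval, then keep the first three results
single = x[2:9:3]        # indices 2, 5 and 8; the next one (11) is past the limit 9

print(chained)                           # [ 1  3 -4]
print(np.array_equal(chained, single))   # True
```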
Default Beginning and Ending
Next, here is the syntax that is used most commonly. x[0:4] returns the first four elements, but x[:4] does the same, because if we leave out the lower limit the slice starts from the beginning by default. In the same way, if we leave out the upper limit, the slice runs until the end. When we leave out both limits, we get the whole array, as shown below.
x[:4]
output: array([2, 5, 1, 9])
x[3:]
output: array([ 9, 0, 3, 8, 11, -4, -3, -8, 6, 10])
x[:]
output: array([2,5,1,9,0,3,8,11,-4,-3,-8,6,10])
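As a quick check of these defaults, the sketch below compares each shortened form with its fully written-out version on the same array x.

```python
import numpy as np

x = np.array([2, 5, 1, 9, 0, 3, 8, 11, -4, -3, -8, 6, 10])

# A missing lower limit means "start at index 0"
print(np.array_equal(x[:4], x[0:4]))         # True

# A missing upper limit means "run to the end", that is, up to len(x)
print(np.array_equal(x[3:], x[3:len(x)]))    # True

# Leaving out both limits selects every element
print(np.array_equal(x[:], x))               # True
```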
Slicing With Interval and Both Upper and Lower Limit
x[1:7:2]
output: array([5, 9, 3])
In x[1:7:2], 1 is the lower limit, 7 is the upper limit and 2 is the interval. The output starts at the element at index 1 and stops before index 7, but instead of returning each element in between, it returns every second element because the interval is 2.
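One way to see which positions a slice visits is to build the same sequence with Python's range, which takes the same lower limit, upper limit and interval. This is only an illustration; the name indices is made up for the example.

```python
import numpy as np

x = np.array([2, 5, 1, 9, 0, 3, 8, 11, -4, -3, -8, 6, 10])

indices = list(range(1, 7, 2))   # same lower limit, upper limit and interval
print(indices)                   # [1, 3, 5]

# Picking those positions one by one gives the same values as the slice
print(x[indices])                # [5 9 3]
print(x[1:7:2])                  # [5 9 3]
```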
Interval Starting From The End
x[-7::2]
array([ 8, -4, -8, 10])
Here, -7 means the seventh element from the end and 2 is the interval. The output starts from the seventh element from the end and moves forward to the end of the array, taking every second element.
x[-7::-2]
array([8, 0, 1, 2])
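The last example uses a negative interval, which makes the slice walk backward toward the beginning of the array. The sketch below spells out the positions it visits, assuming the same array x; start and indices are illustrative names.

```python
import numpy as np

x = np.array([2, 5, 1, 9, 0, 3, 8, 11, -4, -3, -8, 6, 10])

start = len(x) - 7                     # index -7 is the same position as index 6
indices = list(range(start, -1, -2))   # step backward by 2 until index 0
print(indices)                         # [6, 4, 2, 0]

print(x[indices])                      # [8 0 1 2]
print(x[-7::-2])                       # [8 0 1 2]
```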
2D Array Slicing And Indexing
Now we will practice the same with a two-dimensional array. Generate one using the arange and reshape functions. I made a 6×7 matrix for this tutorial because it is big enough to show these operations well.
y = np.arange(42).reshape(6,7)
Output the Rows
The easiest thing is to return rows from a two-dimensional array. Simply index with the row number.
y[0]
array([0, 1, 2, 3, 4, 5, 6])
y[1]
array([ 7, 8, 9, 10, 11, 12, 13])
y[3]
array([21, 22, 23, 24, 25, 26, 27])
Output the Columns
Returning columns can be a bit trickier. Use ‘:’ to select every row, followed by the column index.
y[:, 0]
array([ 0, 7, 14, 21, 28, 35])
y[:, 3]
array([ 3, 10, 17, 24, 31, 38])
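To make the column selection less mysterious, the sketch below builds the same column by hand, taking the element at column index 3 from every row. The names col and by_hand are only for illustration.

```python
import numpy as np

y = np.arange(42).reshape(6, 7)

col = y[:, 3]          # every row, column index 3
print(col.shape)       # (6,) because there is one value per row

# The same values, collected row by row
by_hand = [int(y[i, 3]) for i in range(y.shape[0])]
print(by_hand)         # [3, 10, 17, 24, 31, 38]

print(y[3].shape)      # (7,) because a row has one value per column
```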
Output One Element Only
Let’s see how to return a single number from the matrix, for example 17. Start by finding which row it is in. It is in the third row, which means the row index is 2 because counting starts from 0. Next, look at the column. The number 17 is in the fourth column, so the column index is 3.
y[2, 3]
17
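If counting rows and columns by eye feels error-prone, NumPy can report the position for you. The sketch below uses np.argwhere, which returns the indices where a condition is true; this is offered as a convenience and is not part of the walkthrough above.

```python
import numpy as np

y = np.arange(42).reshape(6, 7)

positions = np.argwhere(y == 17)   # one [row, column] pair per match
row, col = positions[0]
print(row, col)                    # 2 3

print(y[row, col])                 # 17
print(y[2][3])                     # 17, chained indexing reaches the same element
```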
Get the first three elements of the second column.
In the matrix below, the target elements are shown in bold. They are in the first three rows, whose indices are 0, 1 and 2, so the row slice to use is 0:3. Next, figure out the column. All three elements are in the second column, which is column index 1.
y[0:3, 1]
array([ 1, 8, 15])
0 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 32 | 33 | 34 |
35 | 36 | 37 | 38 | 39 | 40 | 41 |
Output a portion of the elements from the first two columns, as shown in the matrix below
All the elements are in the rows with indices 1, 2 and 3, so the row slice to use is 1:4. The corresponding column indices are 0 and 1, so the column slice is 0:2.
y[1:4, 0:2]
array([[ 7, 8], [14, 15], [21, 22]])
0 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 32 | 33 | 34 |
35 | 36 | 37 | 38 | 39 | 40 | 41 |
Output this three-by-three subarray (bold elements in the matrix) from the matrix
The solution uses the same idea as before. The row indices of the numbers are 2, 3 and 4, so we can slice the rows with 2:5. The column indices are also 2, 3 and 4, so the columns can be sliced with 2:5 as well.
y[2:5, 2:5]
array([[16, 17, 18], [23, 24, 25], [30, 31, 32]])
0 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 32 | 33 | 34 |
35 | 36 | 37 | 38 | 39 | 40 | 41 |
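A useful habit with the three examples above is to check the shape of the result. In this sketch, which assumes the same matrix y, a single integer index removes that dimension, while a slice keeps it with a length of upper limit minus lower limit.

```python
import numpy as np

y = np.arange(42).reshape(6, 7)

print(y[0:3, 1].shape)      # (3,)   three rows, one fixed column
print(y[1:4, 0:2].shape)    # (3, 2) three rows, two columns
print(y[2:5, 2:5].shape)    # (3, 3) three rows, three columns
```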
Print every second row, starting from the first row
y[0::2]
array([[ 0, 1, 2, 3, 4, 5, 6], [14, 15, 16, 17, 18, 19, 20], [28, 29, 30, 31, 32, 33, 34]])
0 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 32 | 33 | 34 |
35 | 36 | 37 | 38 | 39 | 40 | 41 |
Here 0 is the lower limit and 2 is the interval. The output will start at index 0 and keep going till the end with an interval of 2. That means every second row.
Print every other column starting from the first column.
In the code below, ‘:’ means selecting all the indexes. Here ‘:’ is selecting all the rows. As the column input we put 0::2. I already mentioned the functionality of this above.
y[:, 0::2]
array([[ 0, 2, 4, 6], [ 7, 9, 11, 13], [14, 16, 18, 20], [21, 23, 25, 27], [28, 30, 32, 34], [35, 37, 39, 41]])
0 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 32 | 33 | 34 |
35 | 36 | 37 | 38 | 39 | 40 | 41 |
This is another way of doing the same. In the code below 0 is the lower limit, 7 is the upper limit and 2 is the interval. The code snippet below will output the same matrix as above.
y[:, 0:7:2]
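The claim that both forms give the same matrix can be checked directly. The sketch below also includes y[:, ::2], which relies on the default lower limit as well.

```python
import numpy as np

y = np.arange(42).reshape(6, 7)

a = y[:, 0::2]     # lower limit and interval, default upper limit
b = y[:, 0:7:2]    # everything written out
c = y[:, ::2]      # default lower and upper limit, interval only

print(np.array_equal(a, b))   # True
print(np.array_equal(a, c))   # True
print(a.shape)                # (6, 4)
```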
As an exercise, try to print the pattern shown in the matrix below.
Here is my answer: First grab the rows. We need every second row starting from row index 1 till the end: lower limit 1, upper limit 6 and interval 2. Similarly, for the columns, the lower limit is 1, the upper limit is 6 and the interval is 2.
y[1:6:2, 1:6:2]
array([[ 8, 10, 12], [22, 24, 26], [36, 38, 40]])
0 | 1 | 2 | 3 | 4 | 5 | 6 |
7 | 8 | 9 | 10 | 11 | 12 | 13 |
14 | 15 | 16 | 17 | 18 | 19 | 20 |
21 | 22 | 23 | 24 | 25 | 26 | 27 |
28 | 29 | 30 | 31 | 32 | 33 | 34 |
35 | 36 | 37 | 38 | 39 | 40 | 41 |
There is one more way to do this. Both slices run to the end of their axis, so the upper limit can be left out and the default is used. (In the version above, the rows only have indices 0 to 5, but the upper limit has to be 6 because the upper limit is excluded: a limit of 6 is what lets the slice reach index 5.) The same shortcut works for the column index.
y[1::2, 1::2]
array([[ 8, 10, 12], [22, 24, 26], [36, 38, 40]])
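To check which rows and columns a slice such as 1:6:2 selects, the sketch below lists the indices with range and rebuilds the result element by element. The names rows, cols and by_hand are made up for the example.

```python
import numpy as np

y = np.arange(42).reshape(6, 7)

rows = list(range(1, 6, 2))   # [1, 3, 5]
cols = list(range(1, 6, 2))   # [1, 3, 5]

# Pick each (row, column) combination one element at a time
by_hand = np.array([[y[r, c] for c in cols] for r in rows])
print(by_hand)

print(np.array_equal(by_hand, y[1:6:2, 1:6:2]))   # True
```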
3D Array Slicing And Indexing
Let’s make a three-dimensional array with the code below. It arranges the numbers from 0 to 44 into three two-dimensional arrays of shape 3×5. Note that this reuses the name x, so it replaces the one-dimensional array from earlier. The output looks like this:
x = np.arange(45).reshape(3,3,5)
x
array([[[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]], [[15, 16, 17, 18, 19], [20, 21, 22, 23, 24], [25, 26, 27, 28, 29]], [[30, 31, 32, 33, 34], [35, 36, 37, 38, 39], [40, 41, 42, 43, 44]]])
Selecting the Two-Dimensional Arrays
We can access each of the two-dimensional arrays in it with simple indexing, as follows:
x[0]
array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]])
x[1]
array([[15, 16, 17, 18, 19], [20, 21, 22, 23, 24], [25, 26, 27, 28, 29]])
x[2]
array([[30, 31, 32, 33, 34], [35, 36, 37, 38, 39], [40, 41, 42, 43, 44]])
Print the second row of the first two-dimensional array
Select the first two-dimensional array as shown before with x[0]. Then add a second index to select the second row: x[0][1].
x[0][1]
array([5, 6, 7, 8, 9])
[[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
[[15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]]
[[30 31 32 33 34]
[35 36 37 38 39]
[40 41 42 43 44]]]
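The chained form x[0][1] can also be written with both indices inside one pair of brackets, separated by a comma, the same way the 2D examples were written. A short sketch comparing the two, assuming the same 3D array x:

```python
import numpy as np

x = np.arange(45).reshape(3, 3, 5)

chained = x[0][1]     # select the first 2D array, then its second row
combined = x[0, 1]    # the same selection in a single indexing step

print(combined)                              # [5 6 7 8 9]
print(np.array_equal(chained, combined))     # True
```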
Get the element 22 from the array.
I will solve this problem in a few steps.
First, select the two-dimensional array that contains the element 22. That’s the second two-dimensional array, so select it with x[1].
Next, find the row index. Our target element is in the second row of the selected two-dimensional array, so the row index is 1. We can select the row with x[1][1].
Finally, the column index is 2, because the printed array above shows that 22 is the third element of that row. Combining all three:
x[1][1][2]
22
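The three steps can be double-checked in code. The sketch below uses np.unravel_index, which converts a flat position into per-dimension indices. It applies here only because this array was built with np.arange, so each value equals its own flat position; that would not hold for arbitrary data.

```python
import numpy as np

x = np.arange(45).reshape(3, 3, 5)

print(x[1, 1, 2])                   # 22, all three indices in one step
print(x[1][1][2] == x[1, 1, 2])     # True

# Value 22 sits at flat position 22 in this particular array
block, row, col = np.unravel_index(22, x.shape)
print(block, row, col)              # 1 1 2
```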
Return the first row of each of the last two two-dimensional arrays.
First, select the two-dimensional arrays these rows belong to. One row is in the second two-dimensional array and the other is in the third. We can select these two with x[1:]. As each target row is the first row of its two-dimensional array, the row index is 0.
x[1:, 0]
array([[15, 16, 17, 18, 19], [30, 31, 32, 33, 34]])
[[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
[[15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]]
[[30 31 32 33 34]
[35 36 37 38 39]
[40 41 42 43 44]]]
Slice through both rows and columns and print part of the first two rows of the last two two-dimensional arrays
Like the previous problem, all the target elements are in the second and third two-dimensional arrays, so we can select those as before with x[1:]. All the elements are in the first and second rows of both two-dimensional arrays, so the row slice is 0:2. The column slice is 1:4, as the elements are in the second, third and fourth columns (indices 1, 2 and 3). Combining it all together:
x[1:, 0:2, 1:4]
array([[[16, 17, 18], [21, 22, 23]], [[31, 32, 33], [36, 37, 38]]])
[[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
[[15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]]
[[30 31 32 33 34]
[35 36 37 38 39]
[40 41 42 43 44]]]
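Checking shapes helps with 3D slices too. In the sketch below, which assumes the same array x, an integer index such as 0 drops that dimension, while a slice such as 0:1 keeps it even when it selects a single row.

```python
import numpy as np

x = np.arange(45).reshape(3, 3, 5)

print(x[1:, 0].shape)           # (2, 5)    integer row index drops the row axis
print(x[1:, 0:1].shape)         # (2, 1, 5) a one-row slice keeps the row axis
print(x[1:, 0:2, 1:4].shape)    # (2, 2, 3) two arrays, two rows, three columns
```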
I hope this helps. Please try with different numbers and slices to learn more.
Cezary
3 Dec 2020
Dear Madam,
Thank you for this wonderful tutorial.
I have trouble with creating an array of one particular pixel [x,y] from a series of video frames
(10mframes)
source = cv2.VideoCapture('example.mp4')
# running the loop
while True:
# extracting the frames
ret, img = source.read()
# I do not know how to understand the img as an array.
j (0:9)
x=100
y=100
for i in j:
new_array = img[i,x,y]
What will be a proper way?
Thank you
Cezary
cezary4you@gmail.com
rashida048
5 Dec 2020
Try np.array(source.read()). Don't forget to import NumPy using "import numpy as np". You cannot index or slice a Python list so easily. You need to convert it to a NumPy array first.