R tutorial - Learn How to Subset Matrices in R

Discover how you can subset matrices using R.

Just as for vectors, there are situations in which you want to select single elements or entire parts of a matrix to continue your analysis with. Again, you can use square brackets for this, but the fact that you're now dealing with two dimensions complicates things a bit.

Have a look at this matrix containing some random numbers.
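The matrix shown in the video is not part of the transcript, so here is a hedged reconstruction from the values mentioned later on. The two entries in rows 1 and 2 of column 4 are never stated; 3 and 9 are placeholders, not the author's values.

```r
# Rebuild the example matrix; matrix() fills column by column.
# The values 3 and 9 (rows 1 and 2 of column 4) are placeholders:
# the transcript never states those two entries.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)
print(m)
#      [,1] [,2] [,3] [,4]
# [1,]    5   11   15    3
# [2,]   12   14    8    9
# [3,]    6    1    4    2
dim(m)  # 3 4
```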

If you want to select a single element from this matrix, you'll have to specify both the row and the column of the element of interest. Suppose we want to select the number 15, located at the first row and the third column. We type m, open brackets, 1, comma, 3, close brackets: `m[1, 3]`.

As you can probably tell, the first index refers to the row, the second one refers to the column. Likewise, to select the number 1, at row 3 and column 2, we write `m[3, 2]`.
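As a small sketch of the two single-element selections described above (the matrix is a reconstruction; its column 4 entries 3 and 9 are assumed placeholders):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

first <- m[1, 3]   # row 1, column 3 -> 15
second <- m[3, 2]  # row 3, column 2 -> 1
length(first)      # 1: a single value is a vector of length 1
```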

Works like a charm! Notice that the results are single values, so vectors of length 1.

Now, what if you want to select an entire row or column from this matrix? You can do this by leaving out some of the indices between the square brackets. Instead of writing 3, comma, 2 inside square brackets to select the element at row 3 and column 2, you can leave out the 2 and keep the 3, comma part: `m[3, ]`. Now, you select all elements that are in row 3, namely 6, 1, 4 and 2.
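A minimal sketch of the row selection, using the reconstructed matrix (column 4 entries 3 and 9 are assumed placeholders; row 3 is fully given in the transcript):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

row3 <- m[3, ]   # leave the column index empty to keep all columns
row3             # 6 1 4 2
is.matrix(row3)  # FALSE: the result is dropped to a plain vector
```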

Notice here that the result is not a matrix anymore! It's also a vector, but this time one that contains more than 1 element. You selected a single row from the matrix so a vector suffices to store this one-dimensional information. To select columns, you can work similarly, but this time the index that comes before the comma should be removed. To select the entire 3rd column, you should write m, open brackets, comma, 3, close brackets.

Again, a vector results, this time of length 3, corresponding to the third column of `m`.
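The column selection works the same way; a short sketch on the reconstructed matrix (its column 4 entries 3 and 9 are assumed placeholders and do not affect this result):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

col3 <- m[, 3]  # leave the row index empty to keep all rows
col3            # 15 8 4
length(col3)    # 3
```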

Now, what happens when you decide not to include a comma to clearly discern between column and row indices? Let's simply try it out and see if we can explain it. Suppose you simply type m and then 4 inside brackets.

The result is 11. How did R get to that? Well, when you pass a single index to subset a matrix, R simply goes through the matrix column by column from left to right. The first index is then 5, the second one 12, the third one 6 and the fourth one is 11, in the next column. This means that if we pass m[9], we should get 4, in the third row and third column.
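To make the column-by-column order visible, the sketch below flattens the reconstructed matrix with `as.vector()` and then uses single indices (the column 4 entries 3 and 9 are assumed placeholders):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

flat <- as.vector(m)  # elements in column-major order
flat                  # 5 12 6 11 14 1 15 8 4 3 9 2

fourth <- m[4]  # first element of column 2 -> 11
ninth <- m[9]   # row 3, column 3 -> 4
```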

Correct! There aren't a lot of cases in which using a single index without commas in a matrix is useful, but I just wanted to point out that the comma is really crucial here.

In vector subsetting, you also learned how to select multiple elements. In matrices, this is of course also possible and the principles are just the same. Say, for example, you want to select the values 14 and 8, in the middle of the matrix. The command `m[2, c(2, 3)]` will do that for you.
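A brief sketch of selecting several elements from one row, again on the reconstructed matrix (column 4 entries 3 and 9 are assumed placeholders):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

middle <- m[2, c(2, 3)]  # row 2, columns 2 and 3
middle                   # 14 8
```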

You select elements that are on the second row and in the second and third columns. Again, the result is a vector, because one dimension suffices. But you can't select elements that don't share a row or column index this way. If you want to select the 11, on row 1 and column 2, and the 8, on row 2 and column 3, the call `m[c(1, 2), c(2, 3)]` will not give the wanted result. Instead, a submatrix gets returned that spans the elements on rows 1 and 2 and columns 2 and 3.

These submatrices can also be built up from disjoint places in your matrix. Creating a submatrix that contains the elements on rows 1 and 3 and in columns 1, 3 and 4, for example, would look like this: `m[c(1, 3), c(1, 3, 4)]`.
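Both submatrix calls can be sketched as follows. The matrix is a reconstruction, and the 3 in the second result comes from an assumed placeholder in column 4, so that one entry may differ from the video.

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

# Rows 1-2 crossed with columns 2-3: a 2 x 2 submatrix, not just 11 and 8.
sub1 <- m[c(1, 2), c(2, 3)]
sub1
#      [,1] [,2]
# [1,]   11   15
# [2,]   14    8

# Disjoint rows and columns also work: rows 1 and 3, columns 1, 3 and 4.
sub2 <- m[c(1, 3), c(1, 3, 4)]
sub2
#      [,1] [,2] [,3]
# [1,]    5   15    3
# [2,]    6    4    2
```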

Now, remember the other ways of performing subsetting, by using names and with logical vectors? These work just as well for matrices. Let's have a look at subsetting by names first. First, though, we'll have to give the rows and columns of the matrix names.
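One way to name the matrix is with `rownames()` and `colnames()`. The transcript only confirms the row names r2 and r3 and the column name c; the remaining names (r1, a, b, d) are assumptions chosen to fit that pattern, as are the column 4 entries 3 and 9.

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

# Names r1, a, b and d are assumed; only r2, r3 and c appear in the transcript.
rownames(m) <- c("r1", "r2", "r3")
colnames(m) <- c("a", "b", "c", "d")
print(m)
#     a  b  c d
# r1  5 11 15 3
# r2 12 14  8 9
# r3  6  1  4 2
```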

In fact, subsetting by name works exactly the same as by index; you just replace the indices with the corresponding names. To select 8, you could use row index 2 and column index 3, or use the row name r2 and the column name c: `m["r2", "c"]`.

You can even use a combination of both, such as `m[2, "c"]`.
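The equivalent ways of reaching the 8 can be sketched like this. The names r1, a, b and d and the column 4 entries 3 and 9 are assumptions; r2 and c come from the transcript.

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)
rownames(m) <- c("r1", "r2", "r3")    # r1 is an assumed name
colnames(m) <- c("a", "b", "c", "d")  # a, b and d are assumed names

by_index <- m[2, 3]       # 8
by_name <- m["r2", "c"]   # 8
mixed_a <- m[2, "c"]      # 8: index for the row, name for the column
mixed_b <- m["r2", 3]     # 8: name for the row, index for the column
```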

Just remember to surround the row and column names with quotes. Selecting multiple elements and submatrices from a matrix is straightforward as well. To select the elements on row r3 and in the last two columns, you can pass the row name together with a character vector of the two column names.
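A sketch of that selection by name follows. The name d for the last column is an assumption (the transcript only names column c), as are r1, a, b and the column 4 entries 3 and 9.

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)
rownames(m) <- c("r1", "r2", "r3")    # r1 is an assumed name
colnames(m) <- c("a", "b", "c", "d")  # a, b and d are assumed names

last_two <- m["r3", c("c", "d")]  # row r3, last two columns
last_two                          # named vector: c = 4, d = 2
```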

Finally, you can also use logical vectors. Again, the same rules apply: rows and columns corresponding to a TRUE are kept, while those corresponding to FALSE are left out. To select the same elements as in the previous call, you can use `m[c(FALSE, FALSE, TRUE), c(FALSE, FALSE, TRUE, TRUE)]`.
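As a sketch, the logical selection of row 3 and the last two columns looks like this on the reconstructed, unnamed matrix (the column 4 entries 3 and 9 are assumed placeholders and are not selected here):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

keep_rows <- c(FALSE, FALSE, TRUE)        # keep only row 3
keep_cols <- c(FALSE, FALSE, TRUE, TRUE)  # keep columns 3 and 4
picked <- m[keep_rows, keep_cols]
picked                                    # 4 2
```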

The rules of vector recycling also apply here. Suppose that you pass a vector of length 2, `c(FALSE, TRUE)`, to perform the selection on the columns. The column selection vector gets recycled to FALSE, TRUE, FALSE, TRUE, so writing out the full-length vector gives the same result.
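The recycling can be checked with a short sketch that compares the length-2 column selector with its written-out form (the matrix is a reconstruction; the column 4 entries 3 and 9 are assumed placeholders, and row 3 is fully given in the transcript):

```r
# Example matrix; the 3 and 9 in column 4 are assumed placeholders.
m <- matrix(c(5, 12, 6, 11, 14, 1, 15, 8, 4, 3, 9, 2), nrow = 3)

keep_rows <- c(FALSE, FALSE, TRUE)

# A length-2 column selector is recycled across the four columns.
short <- m[keep_rows, c(FALSE, TRUE)]
# The same selection with the recycled vector written out in full.
full <- m[keep_rows, c(FALSE, TRUE, FALSE, TRUE)]

short                   # 1 2 (columns 2 and 4 of row 3)
identical(short, full)  # TRUE
```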
Comments

you are an integral part of economics, statistics and data science students' lives.

rook

thank you so much, man! it really helps me .

wais

Thank you for clarifying ... was kinda tricky

Rahul-fqkf

How would it work let's say given the same matrix that you use at timeline 5:10, that you wanted R to return value in the matrix that's greater than 5 that's in the first row, could you do it in a simple command? I know how to do it if you wanted to search the whole matrix like m <- m[m>5], but how would you do it if you wanted to result to be strictly for the 1st row?

MrJbauer

so at 3:39, how do we select, if we want to select elements 11 and 8

vanshikasinghal

How do you use logical operators to subset a matrix and NOT a dataframe? For example, if I have a 3X4 matrix with numeric values, how can I subset all the negative values in my matrix? or all the values greater than 2? etc

yeliangarcia

This is fantastic in content but I struggle to understand what he's saying and I'm a native English speaker. The automatic captions are also inaccurate. And this is an important, rather than pedantic, message because hearing instruction helps learning.

edongoogle