Move PyTorch Tensor Data To A Contiguous Chunk Of Memory

Use the PyTorch contiguous operation to move a PyTorch Tensor's data to a contiguous chunk of memory

Video Transcript


This video will show you how to use PyTorch contiguous operation to move a PyTorch tensor’s data to a contiguous chunk of memory.


First, we import PyTorch.

import torch


Then we print the PyTorch version we are using.

print(torch.__version__)

We are using PyTorch 0.4.0.


Next, let’s create an initial tensor without specifying what the values are.

pt_initial_matrix = torch.Tensor(5,3)


We print the pt_initial_matrix Python variable to see what we have.

print(pt_initial_matrix)

We see that it’s a tensor, and that it holds essentially garbage values.

These are just whatever happened to be sitting in the uninitialized memory.

We haven’t assigned any meaningful values to the matrix yet.
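As an aside, in PyTorch 0.4.0 and later the more idiomatic way to allocate an uninitialized tensor is torch.empty; here is a minimal sketch of that alternative (the pt_empty_matrix name is just for illustration):

# torch.empty allocates a 5x3 tensor without initializing its memory,
# so it also starts out holding garbage values, just like torch.Tensor(5,3)
pt_empty_matrix = torch.empty(5, 3)
print(pt_empty_matrix)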


Next, we use the in-place fill operation, so .fill_, to fill our tensor with the value 2.

pt_initial_matrix.fill_(2)

So we say pt_initial_matrix.fill_(2), then we evaluate.

Note that we don’t assign the result to a new Python variable because we’re using the in-place fill operation: it writes the values into the same memory that was originally allocated by the torch.Tensor call and assigned to the pt_initial_matrix Python variable.
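If you would rather not mutate a tensor in place, a non-in-place alternative is sketched below, assuming a PyTorch version that provides torch.full (the pt_filled_matrix name is just for illustration):

# torch.full returns a brand new 5x3 tensor pre-filled with 2.0
pt_filled_matrix = torch.full((5, 3), 2.0)
print(pt_filled_matrix)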


So we can print the pt_initial_matrix Python variable to see what we have.

print(pt_initial_matrix)

We see that it is our matrix, now filled in with twos.


Even though we know what size we asked for, a 5x3 matrix, let’s check the size of pt_initial_matrix using PyTorch’s .size() operation and print the result.

print(pt_initial_matrix.size())

We see that it’s a torch.Size object with the values 5 and 3.
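As a small aside, recent PyTorch versions also expose a NumPy-style .shape attribute, which returns the same torch.Size object:

# .shape is an alias for .size() and gives the same torch.Size([5, 3])
print(pt_initial_matrix.shape)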


Let’s now transpose our pt_initial_matrix Python variable to start exploring what happens to the memory where the tensor is stored.

pt_transposed_matrix = pt_initial_matrix.t()

So we say pt_initial_matrix.t().

So that’s the PyTorch transpose operation, and we’re going to assign the transposed matrix to the pt_transposed_matrix Python variable.


We print the pt_transposed_matrix Python variable to see what we have.

print(pt_transposed_matrix)

We see that we have one, two, three rows and one, two, three, four, five columns, whereas before, we had one, two, three, four, five rows and three columns.

So our matrix has been transposed.


The one thing to note, and what this video sets out to show you, is that to save memory, PyTorch has not created a new memory location for the pt_transposed_matrix.

It’s still referring to the initial memory location set aside for the pt_initial_matrix.

The way we can see this is when we try to access different parts of the tensor.
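Before we do that, one quick way to confirm the sharing directly, sketched here as an aside, is Tensor.data_ptr(), which returns the memory address of the first element of a tensor’s underlying storage:

# both tensors point at the same underlying storage, so this prints True
print(pt_initial_matrix.data_ptr() == pt_transposed_matrix.data_ptr())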


So first, let’s access the last element of the first column of the pt_initial_matrix.

pt_initial_matrix[4,0]

So 4,0.

Again, remember that Python uses zero-based indexing, so the 4 means the fifth row and the 0 means the first column.

When we do that, we get 2, which is expected.


Next, because we know that pt_transposed_matrix and pt_initial_matrix share the same memory, we can try to access the same element.

But this time, rather than using the pt_initial_matrix, we’re going to do pt_transposed_matrix[4,0].

pt_transposed_matrix[4,0]

When we evaluate that, we get the error IndexError: index is out of bounds for dimension.

So what that means is that PyTorch is checking the shape: this matrix has only three rows, so asking for a fifth row doesn’t make any sense, and it reports that the index is out of bounds.

That’s great.
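Another way to see the sharing, sketched here as an aside, is to write through one tensor and read the change back through the other (note the swapped indices):

# writing through the transpose is visible through the original,
# because both tensors share a single storage
pt_transposed_matrix[0, 4] = 99
print(pt_initial_matrix[4, 0])
# restore the original value so the rest of the walkthrough is unchanged
pt_transposed_matrix[0, 4] = 2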


What comes next is the surprising part. Let’s try to flatten the pt_initial_matrix by using the PyTorch view operation, reshaping the matrix into a vector with the argument -1.

pt_initial_matrix.view(-1)

So we say pt_initial_matrix.view(-1).

So we’re going to take this matrix, the initial matrix, and we’re going to flatten it.

When we do that, we see that it’s a tensor, and it’s one long vector of just twos.

That makes sense.

All right, we get a vector.


We can check our initial matrix to see if it’s still the same.

pt_initial_matrix

It’s still the same.

So the .view operation does not replace our initial matrix.

It just returns a new vector.


Let’s now try to flatten our transposed matrix.

pt_transposed_matrix.view(-1)

So our transposed matrix was three rows by five columns.

Given that we transposed it, one would think we would be able to flatten it into a vector.


When we do that, we run into a big error.

The error reads: Invalid argument.

View size is not compatible with input tensor’s size and stride.

Call .contiguous() before .view().

The reason this error comes up is the way PyTorch stores tensors in memory.

The pt_transposed_matrix still references the initial matrix memory locations.

So even though we’ve transposed it, when we go to flatten it, .view() can’t produce the result: .view() never copies data, it only reinterprets the existing layout, and the elements aren’t laid out contiguously in memory in the order the transposed shape requires.
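We can see the layout mismatch directly by printing the strides, which record how many elements apart consecutive entries are along each dimension:

# the 5x3 matrix is row-major: move 3 elements per row step, 1 per column step
print(pt_initial_matrix.stride())
# the transpose just swaps the strides to (1, 3) without moving any data
print(pt_transposed_matrix.stride())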


To fix this, we have to use the PyTorch contiguous operation, which copies the transposed matrix’s data into a new, contiguously laid out chunk of memory.

pt_contiguous_transposed_matrix = pt_transposed_matrix.contiguous()

So very much like what the error told us to do, we’re going to say pt_transposed_matrix and then we use the .contiguous() operation.

We’re going to assign that result to the Python variable pt_contiguous_transposed_matrix.


Now that we’ve gotten the memory to be contiguous, let’s check to see if we can flatten the PyTorch tensor.

pt_contiguous_transposed_matrix.view(-1)

So pt_contiguous_transposed_matrix.view(-1).

Perfect! We were able to flatten that transposed matrix into a vector.


Just to check, let’s see if we can flatten our pt_transposed_matrix.

pt_transposed_matrix.view(-1)

So pt_transposed_matrix.view(-1) gives us the same error.

So what that means is that the .contiguous operation does not modify in place the tensor we apply it to.

It takes that tensor and creates a new tensor, which we then assign to the Python variable.
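We can verify this with Tensor.is_contiguous(), and, comparing storage addresses again, confirm that the contiguous result lives in its own freshly allocated memory:

# the transpose is still non-contiguous; the copy is contiguous
print(pt_transposed_matrix.is_contiguous())
print(pt_contiguous_transposed_matrix.is_contiguous())
# the copy has its own storage, so the data pointers now differ
print(pt_transposed_matrix.data_ptr() == pt_contiguous_transposed_matrix.data_ptr())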


Perfect! We were able to use the PyTorch contiguous operation to move a PyTorch tensor’s data to a contiguous chunk of memory so that we could operate on it with the PyTorch view operation.
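As a closing aside, PyTorch 0.4.0 also introduced Tensor.reshape, which behaves like .view when the existing layout is compatible and falls back to making a contiguous copy internally when it is not:

# reshape flattens the non-contiguous transpose by copying internally
print(pt_transposed_matrix.reshape(-1))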
