2️⃣ Batched Operations

Learning Objectives
  • Learn about some important concepts related to batched operations, e.g. broadcasting and logical reductions
  • Understand and use the einops library
  • Apply this knowledge to create & work with a batch of rays

In this section, we'll move onto batched operations. First, it's necessary to cover some important tips for working effectively with PyTorch tensors. If you've gone through the prerequisite material then several of these should already be familiar to you.

Tensor Operations - 5 Tips

Tip (1/5) - Elementwise Logical Operations on Tensors

For regular booleans, the keywords and, or, and not perform logical operations, while the operators &, |, and ~ perform and, or, and not on each bit of the input numbers. Analogously, we use the operators &, |, and ~ on tensors to perform these operations on each element of the tensor, e.g. x & y returns the tensor whose i-th element is x[i] & y[i].

A few important gotchas here:

  • Don't try to use and, or, or not on tensors, since Python will try to coerce the tensors to booleans and you'll get an exception.
  • Remember operator precedence! For instance, v >= 0 & v <= 1 will throw an error, because it's evaluated as v >= (0 & v) <= 1 (& binds more tightly than the comparison operators), and 0 & v is not a valid operation. When in doubt, use parentheses to force the correct parsing: (v >= 0) & (v <= 1) - see the example below.
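
For example, here's how to check which elements of a tensor lie in the range [0, 1] (the code examples in this section assume the chapter's usual imports, i.e. import torch as t and import einops):

v = t.tensor([-0.5, 0.3, 1.2])
in_range = (v >= 0) & (v <= 1)  # parentheses force the elementwise comparisons to happen before &
print(in_range)  # tensor([False,  True, False])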

Tip (2/5) - einops

Einops is a useful library which we'll dive into more deeply tomorrow. For now, the only important function you'll need to know is einops.repeat. This takes as arguments a tensor and a string, and returns a new tensor which has been repeated along the specified dimensions. For example, the following code shows how we can repeat a 2D tensor along the last dimension:

x = t.randn(4, 3)
x_repeated = einops.repeat(x, 'a b -> a b c', c=2) # copies x along a new dimension at the end

assert x_repeated.shape == (4, 3, 2)
t.testing.assert_close(x_repeated[:, :, 0], x)
t.testing.assert_close(x_repeated[:, :, 1], x)

Tip (3/5) - Logical Reductions

In plain Python, if you have a list of lists and want to know if any element in a row is True, you could use a list comprehension like [any(row) for row in rows]. The efficient way to do this in PyTorch is with torch.any(), or equivalently the .any() method of a tensor. Similarly, torch.all() and the .all() method check whether all elements are True. Both accept a dim argument specifying the dimension to reduce over.
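
For example:

x = t.tensor([[True, False], [True, True]])
print(x.any(dim=1))  # tensor([True, True]) - does each row contain a True?
print(x.all(dim=1))  # tensor([False, True]) - is each row all True?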

Aside - tensor methods

Most functions like torch.any(tensor, ...) (which take a tensor as first argument) have an equivalent tensor method tensor.any(...). We'll see many more examples of functions like this as we go through the course.

Tip (4/5) - Broadcasting

Broadcasting is what happens when you perform an operation on two tensors of different shapes: the smaller one is (implicitly) copied along dimensions of the larger one so that the operation can be applied elementwise. There are some complicated broadcasting rules which we'll get into later in the course (you can also review the broadcasting section in the prerequisite material which comes before this chapter), but for our purposes the only thing you need to know is this: if you perform an operation on 2 tensors A, B where A.shape == B.shape[-A.ndim:] (i.e. A's shape matches the last dimensions of B's), then A gets copied along new leading dimensions until it has the same shape as B. For example:

B = t.ones(4, 3, 2)
A = t.ones(3, 2)
C = A + B
print(C.shape) # torch.Size([4, 3, 2]), the size of the broadcasted tensor
print(C.max(), C.min()) # tensor(2.) tensor(2.), because all elements are 2

Tip (5/5) - Indexing

Indexing is a pretty deep topic in PyTorch, and it takes a while to get fully comfortable with it. Here, we'll just cover a few specific features which will be useful for the following exercises.

Using ellipses. You can use an ellipsis ... in an indexing expression to avoid repeated : and to write indexing expressions that work on varying numbers of input dimensions. For example, x[..., 0] is equivalent to x[:, :, 0] if x is 3D, and equivalent to x[:, :, :, 0] if x is 4D.
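
Here's a quick check you can run to confirm the equivalence:

x = t.arange(24).reshape(2, 3, 4)
t.testing.assert_close(x[..., 0], x[:, :, 0])  # the ellipsis fills in the right number of colons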

Boolean indexing. Boolean indexing allows you to select elements of a tensor based on a corresponding boolean tensor. Only the elements where the boolean value is True are selected. The following example shows boolean indexing and broadcasting:

D = t.ones(2)  # Tensor with shape (2,)
E = t.zeros(3, 2)  # Tensor with shape (3, 2)
E[[True, False, True], :].shape  # Shape: (2, 2)
E[[True, False, True], :] = D  # Assign values from D to selected rows
print(E)
# output:
# tensor([[1., 1.],
#         [0., 0.],
#         [1., 1.]])

Note that E gets modified by the assignment E[[True, False, True], :] = D - indexed assignment writes directly into the original tensor's memory. Be careful though: reading E[[True, False, True], :] on its own returns a copy rather than a view, because boolean indexing is a form of advanced indexing - we'll look a bit more at views vs copies later.
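
Here's a minimal sketch of that distinction, using the E defined above:

row_copy = E[[True, False, True], :]  # reading with a boolean index returns a copy
row_copy[0, 0] = 99.0
print(E[0, 0])  # tensor(1.) - E is unchanged by modifying the copy
E[[True, False, True], :] = 99.0  # but indexed assignment writes into E directly
print(E[0, 0])  # tensor(99.)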

Multi-dimensional indexing. You can also apply boolean indexing to multiple dimensions simultaneously. This allows you to select elements across several dimensions based on a boolean condition. The following example shows multi-dimensional boolean indexing and broadcasting:

D = t.ones(2)  # Tensor with shape (2,)
F = t.zeros(2, 2, 2)  # Tensor with shape (2, 2, 2)
F[[[True, True], [False, True]], :].shape  # Shape: (3, 2)
F[[[True, True], [False, True]], :] = D
print(F)
# output:
# tensor([[[1., 1.],
#          [1., 1.]],

#         [[0., 0.],
#          [1., 1.]]])

This works because our indexing creates a tensor of shape (3, 2), formed by stacking the three tensors F[0, 0], F[0, 1] and F[1, 1] (those where the corresponding index value is True) along a new leading dimension.
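
You can verify this stacking behavior directly:

t.testing.assert_close(
    F[[[True, True], [False, True]], :],
    t.stack([F[0, 0], F[0, 1], F[1, 1]]),
)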

Summary of tips

  • Use ... to avoid repeated : in indexing expressions
  • Use &, |, and ~ for elementwise logical operations on tensors
  • Use parentheses to force the correct operator precedence
  • Use torch.any() or .any() to do logical reductions (you can do this over a single dimension, with the dim argument)
  • Tensors can broadcast along leading dimensions to operate together
      • e.g. t.ones(2) + t.ones(3, 2) works because the former tensor gets broadcasted to shape (3, 2)
  • We can index with boolean tensors to select multiple elements
      • e.g. if A.shape = (3, n1, n2, ...) then A[[True, False, True]] has shape (2, n1, n2, ...) and contains the first & third rows of A
  • We can index with multi-dimensional boolean tensors too
      • e.g. if A.shape = (2, 2, n1, n2, ...) then A[[[True, True], [False, True]], :] has shape (3, n1, n2, ...) and contains the elements A[0, 0], A[0, 1] and A[1, 1], stacked along a new leading dimension

Batched Ray-Segment Intersection

We'll now move on to implementing a batched version of our previous function: one which takes multiple rays and multiple line segments, and returns a boolean for each ray indicating whether any segment intersects that ray.

Note - in the batched version, we don't want the solver to throw an exception just because some of the equations don't have a solution - these should just return False.

Exercise - implement intersect_rays_1d

```yaml
Difficulty: 🔴🔴🔴🔴⚪
Importance: 🔵🔵🔵🔵⚪

You should spend up to 25-40 minutes on this exercise.

This will probably be one of the hardest exercises you'll complete today.
```

One part you might find difficult is dealing with the zero determinant cases. Previously we dealt with those by using try / except, but here we can't do that because we want to perform all the operations at once. Instead, we can use the following clever trick to find all the pairs of intersecting rays and segments:

  1. Figure out which matrices have zero determinant (e.g. with determinants.abs() < 1e-8)
  2. Replace those matrices with the identity matrix t.eye(2), since this will certainly not raise an error when solving
  3. Find the (ray, segment) pairs such that u, v are in the required range and the original matrix was non-singular

This way, we've identified all pairs of rays and segments where an intersection point exists and that intersection point is valid (i.e. it's actually on the positive side of the ray, and somewhere in the middle of the segment).

Once we have this 2D array of booleans representing whether each ray intersects with each segment, we can reduce using the torch function t.any to find the rays which intersect any segment.
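
Here's a minimal sketch of the singular-matrix trick on a toy batch of 2x2 matrices (a standalone illustration, not the exercise solution):

mats = t.stack([t.zeros(2, 2), t.tensor([[1.0, 2.0], [3.0, 4.0]])])  # first matrix is singular
dets = t.linalg.det(mats)
is_singular = dets.abs() < 1e-8  # tensor([True, False])
mats[is_singular] = t.eye(2)  # the identity broadcasts into every singular slot
sols = t.linalg.solve(mats, t.ones(2, 2))  # safe to solve now; ignore solutions where is_singular is True
print(sols[~is_singular])  # tensor([[-1.,  1.]])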

def intersect_rays_1d(
    rays: Float[Tensor, "nrays 2 3"], segments: Float[Tensor, "nsegments 2 3"]
) -> Bool[Tensor, "nrays"]:
    """
    For each ray, return True if it intersects any segment.
    """
    raise NotImplementedError()


tests.test_intersect_rays_1d(intersect_rays_1d)
tests.test_intersect_rays_1d_special_case(intersect_rays_1d)

Help - I don't know how to set all my matrices to the identity

You should have some variable mat which is a batch of matrices, i.e. it has shape (n_rays, n_segments, 2, 2). You can then define is_singular as a boolean tensor of shape (n_rays, n_segments) which is true wherever the matrix is singular. Now, indexing mat[is_singular] returns a tensor of shape (N, 2, 2) where the [i, :, :]-th element is the i-th singular matrix. Thanks to broadcasting rules, you can set mat[is_singular] = t.eye(2), and the identity matrix with shape (2, 2) will get broadcasted to the left hand shape (N, 2, 2).

Solution

def intersect_rays_1d(
    rays: Float[Tensor, "nrays 2 3"], segments: Float[Tensor, "nsegments 2 3"]
) -> Bool[Tensor, "nrays"]:
    """
    For each ray, return True if it intersects any segment.
    """
    NR = rays.size(0)
    NS = segments.size(0)

    # Get just the x and y coordinates
    rays = rays[..., :2]
    segments = segments[..., :2]

    # Repeat rays and segments so that we can compute the intersection of every (ray, segment) pair
    rays = einops.repeat(rays, "nrays p d -> nrays nsegments p d", nsegments=NS)
    segments = einops.repeat(segments, "nsegments p d -> nrays nsegments p d", nrays=NR)

    # Each element of rays is [[Ox, Oy], [Dx, Dy]]
    O = rays[:, :, 0]
    D = rays[:, :, 1]
    assert O.shape == (NR, NS, 2)

    # Each element of segments is [[L1x, L1y], [L2x, L2y]]
    L_1 = segments[:, :, 0]
    L_2 = segments[:, :, 1]
    assert L_1.shape == (NR, NS, 2)

    # Define matrix on left hand side of equation
    mat = t.stack([D, L_1 - L_2], dim=-1)

    # Get boolean of where matrix is singular, and replace it with the identity in these positions
    dets = t.linalg.det(mat)
    is_singular = dets.abs() < 1e-8
    assert is_singular.shape == (NR, NS)
    mat[is_singular] = t.eye(2)

    # Define vector on the right hand side of equation
    vec = L_1 - O

    # Solve equation, get results
    sol = t.linalg.solve(mat, vec)
    u = sol[..., 0]
    v = sol[..., 1]

    # Return boolean of (matrix is nonsingular, and soln is in correct range implying intersection)
    return ((u >= 0) & (v >= 0) & (v <= 1) & ~is_singular).any(dim=-1)

Using GPT to understand code

Note, the world of LLMs moves fast, so this section is likely to get out of date at some point!

Next week we'll start learning about transformers and how to build them, but it's not too early to start using them to accelerate your own learning!

We'll be discussing more advanced ways to use GPT-3 and GPT-4 as coding partners / research assistants in the coming weeks, but for now we'll look at a simple example: using GPT to understand code. We recommend reading the recent LessWrong post by Siddharth Hiregowdara in which he explains his process. This works best on GPT-4, but I've found GPT-3.5 works equally well for reasonably straightforward problems (see the section below).

First, you should get an account you can use GPT with, if you haven't already. Next, try asking GPT-3.5 / 4 for an explanation of the function above. You can do this e.g. via the following prompt:

Explain this Python function, line by line. You should break up your explanation by inserting sections of the code.

```python
def intersect_rays_1d(
    rays: Float[Tensor, "nrays 2 3"], segments: Float[Tensor, "nsegments 2 3"]
) -> Bool[Tensor, "nrays"]:
    NR = rays.size(0)
    NS = segments.size(0)
    rays = rays[..., :2]
    ...
```

I've found removing comments is often more helpful, because then GPT will answer in its own words rather than just repeating the comments (and the comments can sometimes confuse it).

Once you've got a response, here are a few more things you might want to consider asking:

  • Can you suggest ways to improve the code?
    • GPT-4 recommended using a longer docstring and more descriptive variable names, among other things.
  • Can you explain why the line mat[is_singular] = t.eye(2) works?
    • GPT-4 gave me a correct and very detailed explanation involving broadcasting and tensor shapes.

Is using GPT in this way cheating? It can be, if your first instinct is to jump to GPT rather than trying to understand the code yourself. But it's important here to bring up the distinction between playing in easy mode and playing in hard mode. There are situations where it's valuable for you to think about a problem for a while before moving forward, because that deliberation will directly lead to you becoming a better researcher or engineer (e.g. when you're thinking of a hypothesis for how a circuit works while doing mechanistic interpretability on a transformer, or you're pondering which data structure best fits your use case while implementing some RL algorithm). But there are also situations (like this one) where you'll get more value from speedrunning towards an understanding of certain code or concepts, and applying your understanding in subsequent exercises. It's important to find a balance!

When to use GPT-3.5 and GPT-4

GPT-3.5 and GPT-4 both have advantages and disadvantages in different situations. GPT-3.5 has a large advantage in speed over GPT-4, and works equally well on simple problems or functions. If it's anything that Copilot is capable of writing, then you're likely better off using GPT-3.5 rather than GPT-4.

On the other hand, GPT-4 has an advantage at generating coherent code (although we don't expect you to be using it for code generation much at this stage in the program), and is generally better at responding to complex tasks with less prompt engineering.

Additional notes on using GPT (from Joseph Bloom)

  • ChatGPT is overly friendly. If you give it bad code, it won't tell you it's shit - you need to encourage it to give feedback and/or show you examples of great code. Especially for beginner coders using it, it's important to realise how uncritical it is.
  • GPT is great at writing tests (asking it to write a test for a function is often better than asking it if a function is correct), refactoring code (identifying repeated tasks and extracting them) and naming variables well. These are specific things worth doing a few times to see how useful they can be.
  • GPT-4 does well with whole modules/scripts, so don't hesitate to paste those in. When you start managing repos on GitHub, use tracked files so that when you copy-paste edited code back, all the changes are highlighted for you as if you'd made them. (VSCode highlights changes with small blue bars next to the line numbers in your Python files.)

Here are some things you can play around with:

  • Ask GPT to write tests for the function. You can give more specific instructions (e.g. asking it to use / not to use the unittest library, or to print more informative error messages).
  • Ask GPT how to refactor the function above. (When I did this, it suggested splitting the function up into subfunctions, each performing a discrete task like computing the intersection points.)

2D Rays

Now we're going to make use of the z dimension, and have rays emitted from the origin in both the y and z dimensions.

Exercise - implement make_rays_2d

```yaml
Difficulty: 🔴🔴🔴⚪⚪
Importance: 🔵🔵⚪⚪⚪

You should spend up to 10-20 minutes on this exercise.
```

Implement make_rays_2d analogously to make_rays_1d. The result should look like a pyramid with the tip at the origin.

def make_rays_2d(
    num_pixels_y: int, num_pixels_z: int, y_limit: float, z_limit: float
) -> Float[Tensor, "nrays 2 3"]:
    """
    num_pixels_y: The number of pixels in the y dimension
    num_pixels_z: The number of pixels in the z dimension

    y_limit: At x=1, the rays should extend from -y_limit to +y_limit, inclusive of both.
    z_limit: At x=1, the rays should extend from -z_limit to +z_limit, inclusive of both.

    Returns: shape (num_rays=num_pixels_y * num_pixels_z, num_points=2, num_dims=3).
    """
    raise NotImplementedError()


rays_2d = make_rays_2d(10, 10, 0.3, 0.3)
render_lines_with_plotly(rays_2d)

Help - I'm not sure how to implement this function.

Don't write it as a function right away. The most efficient way is to write and test each line individually in the REPL to verify it does what you expect before proceeding.

You can either build up the output tensor using torch.stack, or you can initialize the output tensor to its final size and then assign to slices like rays[:, 1, 1] = .... It's good practice to be able to do it both ways.

Each y coordinate needs a ray with each corresponding z coordinate - in other words this is an outer product. The most elegant way to do this is with two calls to einops.repeat. You can also accomplish this with unsqueeze, expand, and reshape combined.
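
For instance, here's how einops.repeat implements the outer-product pattern (a small standalone demo, not the full solution):

ygrid = t.linspace(-1, 1, 3)
print(einops.repeat(ygrid, "y -> (y z)", z=2))  # tensor([-1., -1.,  0.,  0.,  1.,  1.]) - each y value repeated for every z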

Solution

def make_rays_2d(
    num_pixels_y: int, num_pixels_z: int, y_limit: float, z_limit: float
) -> Float[Tensor, "nrays 2 3"]:
    """
    num_pixels_y: The number of pixels in the y dimension
    num_pixels_z: The number of pixels in the z dimension
    y_limit: At x=1, the rays should extend from -y_limit to +y_limit, inclusive of both.
    z_limit: At x=1, the rays should extend from -z_limit to +z_limit, inclusive of both.
    Returns: shape (num_rays=num_pixels_y * num_pixels_z, num_points=2, num_dims=3).
    """
    n_pixels = num_pixels_y * num_pixels_z
    ygrid = t.linspace(-y_limit, y_limit, num_pixels_y)
    zgrid = t.linspace(-z_limit, z_limit, num_pixels_z)
    rays = t.zeros((n_pixels, 2, 3), dtype=t.float32)
    rays[:, 1, 0] = 1  # all rays have x-direction 1 (origins stay at zero)
    rays[:, 1, 1] = einops.repeat(ygrid, "y -> (y z)", z=num_pixels_z)  # y varies slowly
    rays[:, 1, 2] = einops.repeat(zgrid, "z -> (y z)", y=num_pixels_y)  # z varies quickly
    return rays