Hands-On RISC-V

Talking to Memory: Loads, Stores, and Arrays

Beginner · RISC-V · Lesson 4 of 4

[Interactive visualizer: execution trace of the array-sum loop, with panels for registers, memory / stack, and pc & flags]

Branches and loops gave a loop the power to repeat. But a loop that only churns registers runs out of things to do — registers are few. The interesting work lives in memory, and you reach it with just two instructions: one to read, one to write.

Load and store

Registers and memory are separate worlds. The CPU computes on registers, but it can only move data between them and memory — it never does arithmetic directly on memory. Two instructions bridge the gap:

lw  t1, 0(a0)        # LOAD word:  t1 = mem[a0 + 0]
sw  t1, 0(a0)        # STORE word: mem[a0 + 0] = t1

lw copies a word out of memory into a register; sw copies a register’s value into memory. (You met both briefly in the stack lesson; this is the lesson that owes you the full story.)
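Because arithmetic only happens in registers, changing a value that lives in memory always takes three steps: load, compute, store. A minimal sketch, assuming a0 already holds a valid, word-aligned address (nothing in this snippet sets it up):

```asm
# Increment the word that a0 points at: mem[a0] = mem[a0] + 1.
# Assumption: a0 was set to a valid, word-aligned address earlier.
lw   t1, 0(a0)       # load:    copy the word from memory into t1
addi t1, t1, 1       # compute: the arithmetic happens in the register
sw   t1, 0(a0)       # store:   write the updated value back to memory
```

Until the sw runs, memory still holds the old value; only the copy in t1 has changed.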

Addresses and offsets

Memory is one giant numbered array of bytes, and an address is just an index into it. The 0(a0) syntax is an address built from two parts:

lw  t1, 8(a0)        # address = a0 + 8
#       │  └─ base register: holds a starting address
#       └──── offset: a constant added to it

The base register holds where you are; the constant offset reaches a fixed distance from there. This is exactly how the stack lesson read 12(sp): sp is the base, 12 is the offset. The same pattern walks structs, arrays, and stack frames alike.
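One base register can reach several neighboring locations just by varying the offset. As a small sketch, here is a swap of two adjacent words, assuming a0 holds the word-aligned address of the first one:

```asm
# Swap two adjacent words using one base register and two offsets.
# Assumption: a0 holds the word-aligned address of the first word.
lw   t0, 0(a0)       # t0 = first word   (address a0 + 0)
lw   t1, 4(a0)       # t1 = second word  (address a0 + 4)
sw   t1, 0(a0)       # first slot  <- old second word
sw   t0, 4(a0)       # second slot <- old first word
```

The offset is a signed 12-bit constant encoded in the instruction, so it can range from -2048 to 2047. Anything farther away needs the base register itself to change.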

Bytes versus words

A word on RV32 is 4 bytes, so consecutive words sit 4 addresses apart: mem[0], mem[4], mem[8], and so on. When you don’t need a whole word, narrower loads and stores reach a single byte:

lb  t1, 0(a0)        # load byte, sign-extended  (e.g. for signed chars)
lbu t1, 0(a0)        # load byte, zero-extended  (e.g. for raw bytes / ASCII)
sb  t1, 0(a0)        # store the low byte of t1

lb copies the byte’s sign bit (bit 7) into the upper 24 bits of the register, so -1 stays -1; lbu fills them with zeros, which is what you want for text and unsigned data. Picking the wrong one is a classic source of “why is my character suddenly negative?” bugs.
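To see the difference concretely, load the same byte both ways. This is a sketch for an RV32 simulator or assembler that accepts the usual .data/.text directives and the la pseudo-instruction; the label raw is made up for the example:

```asm
        .data
raw:    .byte 0xFF           # hypothetical label: one byte with all bits set

        .text
        la   a0, raw         # a0 = address of the byte
        lb   t1, 0(a0)       # sign-extended: t1 = 0xFFFFFFFF (-1)
        lbu  t2, 0(a0)       # zero-extended: t2 = 0x000000FF (255)
```

Same byte in memory, two different register values: the load instruction, not the data, decides how the upper bits are filled.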

Walking an array

Put load and loop together and you can sweep an entire array. Keep a pointer in a register, read through it, then advance the pointer by one element each time around — exactly the loop shape from the previous lesson, now with memory:

        # assumes a0 = address of the first element, t2 = element count (> 0)
        addi t0, zero, 0     # sum = 0
loop:   lw   t1, 0(a0)       # read the element a0 points at
        add  t0, t0, t1      # accumulate
        addi a0, a0, 4       # march the pointer to the next word
        addi t2, t2, -1      # count down the remaining elements
        bne  t2, zero, loop  # go again until none are left

Notice the offset never changes — it stays 0(a0). The pointer does the moving. That’s the heart of array traversal: a fixed access pattern over a marching base address.
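The loop above relies on a0 and t2 being set up beforehand. One way to make it a complete program for a simulator is sketched below; the label arr, the label done, the two test values, and the empty-array guard are additions for illustration, not part of the lesson's listing:

```asm
        .data
arr:    .word 10, 32             # hypothetical test array of two words

        .text
        la   a0, arr             # a0 = address of the first element
        addi t2, zero, 2         # t2 = number of elements
        addi t0, zero, 0         # sum = 0
        beq  t2, zero, done      # guard: skip the loop if the array is empty
loop:   lw   t1, 0(a0)           # read the element a0 points at
        add  t0, t0, t1          # accumulate
        addi a0, a0, 4           # march the pointer to the next word
        addi t2, t2, -1          # count down the remaining elements
        bne  t2, zero, loop      # go again until none are left
done:   nop                      # t0 holds the sum here (42 for this data)
```

The guard matters because the loop tests its counter at the bottom: started with a count of 0, it would decrement to -1 and keep running far past the array. Where arr actually lands in memory depends on the simulator, so the addresses you see may differ from one tool to another.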

Step through it

Use the visualizer and watch two panels at once. The memory panel shows the array [10, 32] parked at 0x2000. Press step ▶ and follow a0 in the registers panel: each pass through the loop it climbs one word, 0x2000 → 0x2004 → 0x2008, while the very same lw t1, 0(a0) pulls a different value each time. The sum lands on 42 right as the counter hits zero and the branch falls through.

Try this

  • The array holds 4-byte words. What if you wrote addi a0, a0, 1 instead of 4? (You’d advance one byte, so the next lw starts in the middle of the first word: it reads a mix of bytes from two elements, or traps on hardware that rejects misaligned accesses. Element size and stride must match.)
  • How would you find the largest element instead of the sum? (Keep a “best so far” register; replace add with a blt/bge compare-and-update — same walk, different body.)
  • Why should lw use a word-aligned address like 0x2004 and not 0x2003? (Word loads expect natural alignment, meaning an address divisible by 4; depending on the platform, a misaligned word access may trap or run slowly.)
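For the largest-element variation, one possible shape is sketched here. It assumes a0 holds the address of the first word, t2 holds the element count (at least 1), and the elements are signed; the label skip is made up for the example:

```asm
# Assumptions: a0 = address of the first word, t2 = element count (>= 1).
        lw   t0, 0(a0)           # best so far = first element
loop:   lw   t1, 0(a0)           # read the current element
        bge  t0, t1, skip        # signed compare: keep best if best >= element
        addi t0, t1, 0           # otherwise the element becomes the new best
skip:   addi a0, a0, 4           # advance the pointer one word
        addi t2, t2, -1          # one fewer element to go
        bne  t2, zero, loop      # repeat until the count reaches 0
```

The pointer walk is identical to the sum loop; only the two lines in the middle changed. The first pass compares the first element against itself, which is harmless and keeps the loop simple. For unsigned data, bgeu would replace bge.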

Memory is where programs keep everything bigger than a handful of registers. With loads, stores, and a marching pointer, you can now touch all of it — and that’s the foundation the stack was quietly standing on the whole time.