Page number and offset

I think your primary and secondary confusions are due to general confusion on the subject :)

Let me talk around this a bit and hopefully I can be of some help. First, an analogy - imagine that you're trying to locate a house in a city. Imagine that each house was given a unique number - you can imagine that the number of houses would soon get very large and confusing. Now imagine that you introduce the concept of streets - the house numbers now become a bit more manageable as you've grouped them into nice chunks. So: street = page number, house number = offset address.

The whole point of having virtual memory pages is to allow the computer to carve memory up into manageable chunks and not waste too much of it. Carving it into chunks (pages) allows granular control of access, paging and other nice things like that. The smaller your pages, the less memory you're going to waste (if process A requires 32k of memory, and your page size is 64k, you're going to end up with 32k which isn't used), but the higher the overhead on the system, since there are more pages to keep track of.
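To put numbers on that trade-off, here is a minimal Python sketch. The helper names (pages_needed, wasted_bytes) and the 4k page size used for comparison are my own assumptions for illustration; only the 32k request and the 64k page size come from the example above.

```python
KIB = 1024

def pages_needed(request_bytes: int, page_size: int) -> int:
    """Number of whole pages required to hold request_bytes (rounded up)."""
    return (request_bytes + page_size - 1) // page_size

def wasted_bytes(request_bytes: int, page_size: int) -> int:
    """Bytes that are allocated but unused in the last page."""
    return pages_needed(request_bytes, page_size) * page_size - request_bytes

request = 32 * KIB  # process A from the example above
for page_size in (64 * KIB, 4 * KIB):  # 4 KiB is an assumed comparison size
    print(f"{page_size // KIB} KiB pages:",
          pages_needed(request, page_size), "page(s),",
          wasted_bytes(request, page_size) // KIB, "KiB wasted")
```

With the bigger pages there is one page to track but half of it is wasted; with the smaller pages nothing is wasted, but there are eight pages to track instead of one.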

As to why page sizes are powers of 2, this is down to not wasting space within the address. As computers are based on binary (at the moment), everything tends to boil down to powers of 2. Imagine if you have stuff based on factors of 10. 10 in binary is 1010 - you've got to use 4 bits to hold it, so why not go for the full range of values you can get out of 4 bits: 0000 - 1111 (0 to 15 = 16 values).
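The same argument as a rough Python sketch. The function names are hypothetical; the point is only that a size which isn't a power of 2 leaves some bit patterns unused.

```python
def bits_needed(num_values: int) -> int:
    """Smallest number of bits that can tell num_values (>= 1) values apart."""
    return max(1, (num_values - 1).bit_length())

def unused_codes(num_values: int) -> int:
    """Bit patterns that exist at that width but are never used."""
    return 2 ** bits_needed(num_values) - num_values

for size in (10, 16, 4096):
    print(size, "values:", bits_needed(size), "bits,",
          unused_codes(size), "patterns unused")
```

For 10 values you pay for 4 bits and leave 6 patterns unused; for 16 or 4096 values every pattern is used.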

Sorry I've waffled on a bit - I hope this nudges you in the right direction!


I have the same confusion, but if I have understood it right then it's like the following: the power of 2 case is slightly beside the general understanding of the topic. It's more like a convention, since we are dealing with binary values and need to divide the address bits cleanly, which a power of 2 allows.

E.g., if the memory has 64 words and there are 4 words per frame, then 2^x = 64 -> x = 6

Which means each physical address consists of 6 binary digits (each 0 or 1), of which the first 4 represent the frame number (64 / 4 = 16 frames = 2^4) and the last 2 denote the exact location of the word among the 4.

Note that here a frame cannot hold 5 words, or any other number that is not a power of 2, or the so-called convention fails.
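Here is how I would sketch that split in Python, assuming a memory of 64 words divided into frames of 4 words each (so a 6-bit address with 4 frame bits and 2 offset bits). The constant and function names are made up for illustration.

```python
ADDRESS_BITS = 6       # 2**6 = 64 words in total (assumed)
WORDS_PER_FRAME = 4    # 2**2, so 2 bits select the word inside a frame

OFFSET_BITS = WORDS_PER_FRAME.bit_length() - 1  # log2(4) = 2
FRAME_BITS = ADDRESS_BITS - OFFSET_BITS         # 6 - 2 = 4

def split(address: int) -> tuple[int, int]:
    """Split a 6-bit address into (frame number, word offset)."""
    frame = address >> OFFSET_BITS            # the upper 4 bits
    offset = address & (WORDS_PER_FRAME - 1)  # the lower 2 bits
    return frame, offset

frame, offset = split(0b101101)  # binary 1011 | 01
print(frame, offset)
```

Because 4 is a power of 2, the split is just a shift and a mask. With 5 words per frame you would need a real division and remainder, and the bits of the address would no longer divide cleanly into a frame part and an offset part.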


I like the GHC analogy with the streets and cities as to why we need paging. Also, grouping the bytes of memory into pages allows the CPU to address larger amounts of memory.

Suppose the following properties are given:

  • virtual address is 32 bits
  • page offset is 12 bits
  • physical address is 30 bits
  • the RAM is 1GiB
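The numbers used in the rest of this answer follow directly from these properties. A quick Python sketch of the arithmetic (the variable names are mine):

```python
VIRTUAL_ADDRESS_BITS = 32
PAGE_OFFSET_BITS = 12
PHYSICAL_ADDRESS_BITS = 30

# Bits that remain for the page number once the offset is taken out.
virtual_page_bits = VIRTUAL_ADDRESS_BITS - PAGE_OFFSET_BITS    # 20
physical_page_bits = PHYSICAL_ADDRESS_BITS - PAGE_OFFSET_BITS  # 18

page_size = 1 << PAGE_OFFSET_BITS        # 4096 bytes per page
virtual_pages = 1 << virtual_page_bits   # 2**20 virtual pages
physical_pages = 1 << physical_page_bits # 2**18 physical pages
ram_bytes = 1 << PHYSICAL_ADDRESS_BITS   # 2**30 bytes = 1 GiB

print(page_size, virtual_pages, physical_pages, ram_bytes)
```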

Here's a diagram that I made which shows how the page number and page offset are used to address a specific cell in the memory:

[image: diagram of the address translation]

There's a virtual address, which is generated by the CPU and consists of a virtual page number (20 bits) and a page offset (12 bits).

There's also a pagemap, which maps virtual page numbers to physical page numbers (in addition, the Dirty bit shows whether the page has been changed and the Resident bit shows whether the page is resident in memory). On the right is how the memory is partitioned into pages (in blue on the diagram).

The virtual page number is passed to the pagemap using 20 address bits. Since the page number is passed in binary, having 20 address bits means that the pagemap can have up to 2^20 records (since with 20 bits you can get 2^20 different numbers). This is also the reason why the number of pages is a power of 2.

So using the pagemap you can find which physical page number is mapped to the requested virtual page number; the page offset is not altered. Having the physical page number and page offset, you have the physical address. Using the page number you go to the specific page of the memory, and using the offset you go to the specific byte cell. (The page offset also defines the page size, since 12 bits for the offset means that we can address 2^12 = 4096 cells within a page, shown in orange on the diagram.)

In green you can see an example where we request virtual page number 2 with page offset 4095. According to the pagemap, virtual page number 2 maps to physical page 15, which gives us the physical address with physical page number 15 and offset 4095. (Normally virtual/physical page numbers and page offsets would be displayed in hexadecimal, but I used decimal just to simplify.)
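The same lookup as a minimal Python sketch. This is only an illustration under assumptions: the dict-based pagemap, the translate function, the Dirty bit value and the use of LookupError to stand in for a page fault are all mine. Only the mapping 2 -> 15, the offset 4095 and the 12-bit offset come from the example.

```python
PAGE_OFFSET_BITS = 12
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1  # 0xFFF, the low 12 bits

# Hypothetical pagemap with a single entry: virtual page 2 -> physical page 15.
# The dirty value here is made up; the example does not state it.
pagemap = {2: {"resident": True, "dirty": False, "ppn": 15}}

def translate(virtual_address: int, pagemap: dict) -> int:
    """Map a virtual address to a physical address via the pagemap."""
    vpn = virtual_address >> PAGE_OFFSET_BITS  # upper 20 bits
    offset = virtual_address & OFFSET_MASK     # lower 12 bits, passed through
    entry = pagemap.get(vpn)
    if entry is None or not entry["resident"]:
        # Assumption: a real system would raise a page fault here.
        raise LookupError(f"virtual page {vpn} is not resident")
    return (entry["ppn"] << PAGE_OFFSET_BITS) | offset

virtual_address = (2 << PAGE_OFFSET_BITS) | 4095  # page 2, offset 4095
physical_address = translate(virtual_address, pagemap)
print(hex(virtual_address), "->", hex(physical_address))
```

The physical address comes out as 15 * 4096 + 4095 = 65535, i.e. physical page 15 with the offset carried over unchanged.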

PS:

The example data is taken from this lecture - https://www.youtube.com/watch?v=3akTtCu_F_k - it gives a very good overview of virtual memory.