How do I find out at compile time how much of an STM32's Flash memory and dynamic memory (SRAM) is used up?

The information you need is all in the output from size (aka arm-none-eabi-size):

   text    data     bss     dec     hex filename
   2896      12    1588    4496    1190 STM32F103RB_Nucleo.elf
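  • text is the size of all code and read-only data (such as constants and the interrupt vector table) in your application. It is stored in flash memory only.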

  • data is the size of initialized global variables. It counts against both flash memory and RAM, as it's copied from flash to RAM during startup.

  • bss is the size of global variables which are initialized to zero (or are uninitialized, and hence default to zero). They're stored in RAM only.

  • dec and hex are the sum of text + data + bss in decimal and hexadecimal. This value doesn't really mean much on a microcontroller, so it should be ignored. (In environments where a program must be loaded into memory before running, it would be the total memory footprint of the program.)

To calculate the RAM usage of your program, add the data and bss columns together.

To calculate the FLASH usage of your program, add text and data.
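As a minimal sketch of those two sums, the snippet below parses the berkeley-format output shown above and adds the columns. The function name flash_and_ram is made up for illustration; it is not part of binutils.

```python
def flash_and_ram(size_output: str) -> dict:
    """Parse berkeley-format `size` output; return Flash and RAM usage in bytes."""
    lines = [ln for ln in size_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()   # text, data, bss, dec, hex, filename
    values = lines[1].split()   # the matching numbers for one ELF file
    row = dict(zip(header, values))
    text, data, bss = int(row["text"]), int(row["data"]), int(row["bss"])
    return {
        "flash": text + data,   # data is stored in flash, then copied to RAM
        "ram": data + bss,      # both end up in RAM at run time
    }


SIZE_OUTPUT = """
   text    data     bss     dec     hex filename
   2896      12    1588    4496    1190 STM32F103RB_Nucleo.elf
"""

usage = flash_and_ram(SIZE_OUTPUT)
print(f"Flash: {usage['flash']} bytes, RAM: {usage['ram']} bytes")
# prints "Flash: 2908 bytes, RAM: 1600 bytes"
```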


TLDR

Jump straight down to the "Summary" at the bottom.

Details:

@duskwuff -inactive- answered the crux of my question in their answer here, but I'd like to add some additional insight and also answer the follow-up questions I wrote in the comments under that answer.

First off, the "Basic Linker Script Concepts" section of the GNU Linker manual was key to learning this information. It is included in its entirety in the "References" section at the bottom of this answer.

For a more-detailed look into some of the aspects of this answer, refer also to my other answer to a related question I asked on Stack Overflow here.

It turns out that objdump -h STM32F103RB_Nucleo.elf lists the same fine-grained output sections as the arm-none-eabi-size -x --format=sysv "STM32F103RB_Nucleo.elf" command, which prints sizes in the sysv format instead of the default berkeley format.

Here again is the berkeley-format output from the size (arm-none-eabi-size for STM32 mcus) command:

arm-none-eabi-size "STM32F103RB_Nucleo.elf"
   text    data     bss     dec     hex filename
   2896      12    1588    4496    1190 STM32F103RB_Nucleo.elf

Note that arm-none-eabi-size "STM32F103RB_Nucleo.elf" is equivalent to arm-none-eabi-size --format=berkeley "STM32F103RB_Nucleo.elf", since --format=berkeley is the default.

The objdump -h STM32F103RB_Nucleo.elf output I posted in my question contains all the information needed to answer it, just in a much more detailed format.

As the GNU linker manual explains (see below), VMA means "Virtual Memory Address", and LMA means "Load Memory Address". The VMA addresses are where the data is located at run-time (which could be in volatile SRAM, since some data is copied from Flash to SRAM at boot), and the LMA is where the data is located when the device is off, and also prior to loading at boot (so it must be in non-volatile Flash memory only). Some data will be copied from Flash (LMA address) to SRAM (VMA address) at boot.

For this STM32 microcontroller, Flash memory begins at address 0x08000000, and SRAM begins at address 0x20000000. So, any output section in the objdump -h output with a VMA (runtime address) of 0 is never placed on the microcontroller and can be discarded immediately. That eliminates the latter half of the objdump -h output, most of which is debug information, from the .ARM.attributes output section through the .debug_frame output section, inclusive, leaving only the sysv output sections we care about:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .isr_vector   0000010c  08000000  08000000  00010000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         00000a1c  0800010c  0800010c  0001010c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .rodata       00000028  08000b28  08000b28  00010b28  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .init_array   00000004  08000b50  08000b50  00010b50  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  4 .fini_array   00000004  08000b54  08000b54  00010b54  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  5 .data         00000004  20000000  08000b58  00020000  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  6 .bss          00000030  20000004  08000b5c  00020004  2**2
                  ALLOC
  7 ._user_heap_stack 00000604  20000034  08000b5c  00020034  2**0
                  ALLOC
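To make the VMA/LMA reading of this listing concrete, here is a rough Python sketch that parses the rows above and reports where each section runs, where it is loaded from, and whether it must be copied at startup. The helper names (parse_sections, region) are hypothetical, and the region test is deliberately crude: it only knows the two base addresses mentioned above.

```python
import re

FLASH_BASE = 0x08000000  # Flash start address, from the text above
SRAM_BASE = 0x20000000   # SRAM start address, from the text above

LISTING = """\
Idx Name          Size      VMA       LMA       File off  Algn
  0 .isr_vector   0000010c  08000000  08000000  00010000  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         00000a1c  0800010c  0800010c  0001010c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .rodata       00000028  08000b28  08000b28  00010b28  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .init_array   00000004  08000b50  08000b50  00010b50  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  4 .fini_array   00000004  08000b54  08000b54  00010b54  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  5 .data         00000004  20000000  08000b58  00020000  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  6 .bss          00000030  20000004  08000b5c  00020004  2**2
                  ALLOC
  7 ._user_heap_stack 00000604  20000034  08000b5c  00020034  2**0
                  ALLOC
"""

# index, name, size, VMA, LMA (the remaining columns are ignored)
ROW = re.compile(r"^\s*\d+\s+(\S+)\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s")


def parse_sections(listing):
    """Return one dict per section with name, size, vma, lma, and flags."""
    sections = []
    for line in listing.splitlines():
        m = ROW.match(line)
        if m:
            name, size, vma, lma = m.groups()
            sections.append({"name": name, "size": int(size, 16),
                             "vma": int(vma, 16), "lma": int(lma, 16),
                             "flags": set()})
        elif sections and line.strip():
            # The line following each row lists that section's flags.
            sections[-1]["flags"] = {f.strip() for f in line.split(",")}
    return sections


def region(addr):
    """Crude classification based only on the two base addresses above."""
    if addr >= SRAM_BASE:
        return "SRAM"
    if addr >= FLASH_BASE:
        return "Flash"
    return "none"


sections = parse_sections(LISTING)
for s in sections:
    # Only sections with contents to load and differing addresses get copied.
    copied = s["vma"] != s["lma"] and "LOAD" in s["flags"]
    print(f'{s["name"]:<18} runs in {region(s["vma"]):<5} '
          f'loaded from {region(s["lma"]):<5} copied at startup: {copied}')
```

Note that .bss and ._user_heap_stack also have different VMA and LMA values, but since they carry only the ALLOC flag there is nothing stored in Flash to copy, which is why the sketch also checks for LOAD.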

Any section marked READONLY, you can see, is stored only in the Flash memory at the 0x08000000-level addresses, and has the same LMA and VMA address. This makes sense, as there's no need to copy it to SRAM if it's read-only. The sections marked READONLY make up the berkeley-format text section. They include:

.isr_vector
.text
.rodata

So, we know that:

.isr_vector + .text + .rodata = text

This can be verified by summing their hex sizes:

10c + a1c + 28 = b50

0xb50 is decimal 2896, which matches the berkeley size output for the text section! And, again, as shown by the 0x08000000-level addresses for both LMA and VMA, this means all of these sections are in Flash memory only!

Here's my description of what these sections are:

SYSV SECTIONS WHICH MAKE UP THE BERKELEY text SECTION AND WHICH ARE IN FLASH MEMORY ONLY:

  1. .isr_vector = the ISR (Interrupt Service Routine) vector table. It simply points to all ISR callback functions for all possible interrupts the microcontroller can handle.
  2. .text = program logic; ie: the actual code.
  3. .rodata = Read-Only data; ie: const and constexpr static and global variables which are read-only.

Next, we can see that the NON-READONLY sections which also are ALLOC and LOAD sections include:

.init_array
.fini_array
.data

So, we know that they make up the berkeley data section:

.init_array + .fini_array + .data = data

This can be verified by summing their hex sizes:

4 + 4 + 4 = c

0xc is decimal 12, which matches the berkeley size output for the data section.

I don't know what the .init_array or .fini_array sections mean (if you know, please answer or post a comment), and I'm confused about their location: their LMA and VMA addresses are identical, indicating they are in Flash and Flash only. However, it's clear from the addresses of the .data section that it takes up both Flash memory (at LMA [load address] = 0x08000b58) and SRAM (at VMA [run-time address] = 0x20000000). This means this data is copied from Flash to the beginning of SRAM during the startup routine. .data contains NON-zero-initialized (ie: initialized with something other than zero) static and global variables. In summary:

SYSV SECTIONS WHICH MAKE UP THE BERKELEY data SECTION AND WHICH ARE IN BOTH FLASH AND SRAM, AND WHICH ARE COPIED FROM FLASH TO SRAM DURING STARTUP:

  1. .data = NON-zero-initialized (ie: initialized with something other than zero) static and global variables

SYSV SECTIONS WHICH MAKE UP THE BERKELEY data SECTION AND WHICH APPARENTLY ARE IN FLASH MEMORY ONLY?:

  1. .init_array = unknown
  2. .fini_array = unknown

That leaves us with just these sections marked with ALLOC and nothing else remaining:

.bss
._user_heap_stack

So, we know they make up the berkeley bss section:

.bss + ._user_heap_stack = bss

This can be verified by summing their hex sizes:

30 + 604 = 634

0x634 is decimal 1588, which matches the berkeley size output for the bss section.

This is really interesting, as it shows that the berkeley bss section doesn't just include the .bss (zero-initialized static and global variables) output section, but also the ._user_heap_stack output section, which is perhaps (or rather, appears to me to be) the heap size we specify inside the STM32Cube configuration software. In either case, it appears to be the SRAM set aside for both the runtime stack (for local variables) and heap (for dynamically-allocated memory). In summary:

SYSV SECTIONS WHICH MAKE UP THE BERKELEY bss SECTION AND WHICH TAKE UP SPACE IN SRAM ONLY, BUT NOT FLASH:

  1. .bss = zero-initialized static and global variables; this SRAM is set to all zeros at program startup.
  2. ._user_heap_stack = (I think) completely uninitialized SRAM which is set aside for the runtime stack (for local variables) and heap (for dynamically-allocated memory).
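The three groupings worked out above can be checked in one pass. The sketch below applies the same flag-based rule used in this answer (READONLY goes to text, other ALLOC + LOAD sections go to data, ALLOC-only sections go to bss) to the sizes and flags from the objdump -h listing. This mirrors my reasoning here; I'm not claiming it is the exact rule the size tool implements internally, and berkeley_column is a made-up name.

```python
# (name, size in bytes, flags) copied by hand from the objdump -h listing
SECTIONS = [
    (".isr_vector", 0x10C, {"CONTENTS", "ALLOC", "LOAD", "READONLY", "DATA"}),
    (".text", 0xA1C, {"CONTENTS", "ALLOC", "LOAD", "READONLY", "CODE"}),
    (".rodata", 0x028, {"CONTENTS", "ALLOC", "LOAD", "READONLY", "DATA"}),
    (".init_array", 0x004, {"CONTENTS", "ALLOC", "LOAD", "DATA"}),
    (".fini_array", 0x004, {"CONTENTS", "ALLOC", "LOAD", "DATA"}),
    (".data", 0x004, {"CONTENTS", "ALLOC", "LOAD", "DATA"}),
    (".bss", 0x030, {"ALLOC"}),
    ("._user_heap_stack", 0x604, {"ALLOC"}),
]


def berkeley_column(flags):
    """Map a section's flags to a berkeley column, per the rule in this answer."""
    if "ALLOC" not in flags:
        return None      # e.g. debug sections: not part of any column
    if "READONLY" in flags:
        return "text"
    if "LOAD" in flags:
        return "data"
    return "bss"         # ALLOC only: memory is reserved, nothing is stored


totals = {"text": 0, "data": 0, "bss": 0}
for name, size, flags in SECTIONS:
    column = berkeley_column(flags)
    if column is not None:
        totals[column] += size

print(totals)  # prints {'text': 2896, 'data': 12, 'bss': 1588}
```

The totals match the text, data, and bss columns of the berkeley-format output for this ELF file.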

Summary:

Here is the breakdown of which sysv output sections from the objdump -h STM32F103RB_Nucleo.elf output (also shown with less detail in the arm-none-eabi-size -x --format=sysv "STM32F103RB_Nucleo.elf" output) make up which berkeley output sections.

In the image below, you can see all 3 berkeley output sections boxed in different colors:

  1. The berkeley-format text output sections (read-only, program logic and const static and global variables) are boxed in yellow.

     .isr_vector + .text + .rodata = text
    
    1. .isr_vector [IN FLASH ONLY] = the ISR (Interrupt Service Routine) vector table. It simply points to all ISR callback functions for all possible interrupts the microcontroller can handle.
    2. .text [IN FLASH ONLY] = program logic; ie: the actual code.
    3. .rodata [IN FLASH ONLY] = Read-Only data; ie: const and constexpr static and global variables which are read-only.
  2. The berkeley-format data output sections (non-zero-initialized [ie: initialized with values other than zero] static and global variables) are boxed in blue.

     .init_array + .fini_array + .data = data
    
    1. .init_array [FLASH ONLY it appears] = unknown.
    2. .fini_array [FLASH ONLY it appears] = unknown.
    3. .data [IN BOTH FLASH AND SRAM] = NON-zero-initialized (ie: initialized with something other than zero) static and global variables. These values must be copied from Flash to SRAM at startup, to initialize their corresponding static or global variables in SRAM.
  3. The berkeley-format bss output sections (zero-initialized static and global variables, and also, apparently, uninitialized stack and heap space) are boxed in red.

     .bss + ._user_heap_stack = bss
    
    1. .bss [SRAM ONLY] = zero-initialized static and global variables; this SRAM is set to all zeros at program startup.
    2. ._user_heap_stack [SRAM ONLY] = completely uninitialized (I think) SRAM which is set aside for the runtime stack (for local variables) and heap (for dynamically-allocated memory).
  4. Discarded sysv output sections which do not contribute to any of the 3 berkeley output sections are boxed in grey.

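[Image: the objdump -h output with the sections that make up the berkeley text, data, and bss columns boxed in yellow, blue, and red respectively, and the discarded sections boxed in grey.]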

Memory conclusions:

  1. Flash memory:
    1. Flash memory usage = berkeley text + berkeley data.
      1. Flash memory used by the ISR Function Vector Table only = .isr_vector.
      2. Flash memory used by the program logic only = .text.
      3. Flash memory used by the Read-Only static and global variables only = .rodata.
  2. SRAM memory:
    1. SRAM usage from static and global variables AND allocated for stack and heap usage = berkeley bss + berkeley data.
      1. SRAM used by static and global variables only = sysv (.bss + .data). Notice the dots (.) before each of these names here, as opposed to the lack of dots above.
      2. SRAM allocated specifically for stack (local variables) and heap (dynamic memory allocation) = (apparently) ._user_heap_stack.
      3. SRAM not allocated for anything = SRAM_total - (berkeley bss + berkeley data).
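Here is a small Python sketch of these conclusions applied to the numbers above. The totals of 128 KiB Flash and 20 KiB SRAM are my assumption for the STM32F103RB; check the datasheet for your exact part before relying on them.

```python
# Assumed totals for the STM32F103RB; verify against your part's datasheet.
FLASH_TOTAL = 128 * 1024
SRAM_TOTAL = 20 * 1024

# berkeley columns from the arm-none-eabi-size output above
text, data, bss = 2896, 12, 1588

# sysv section sizes from the objdump -h output above
sec_data, sec_bss, sec_heap_stack = 0x004, 0x030, 0x604

flash_used = text + data                    # Flash usage
sram_used = data + bss                      # SRAM usage (berkeley view)
sram_free = SRAM_TOTAL - sram_used          # SRAM not allocated for anything
sram_static = sec_data + sec_bss            # static and global variables only
sram_by_address = sram_static + sec_heap_stack  # sections whose VMA is in SRAM

print(f"Flash used: {flash_used} of {FLASH_TOTAL} bytes "
      f"({100 * flash_used / FLASH_TOTAL:.1f}%)")
print(f"SRAM used:  {sram_used} of {SRAM_TOTAL} bytes "
      f"({100 * sram_used / SRAM_TOTAL:.1f}%)")
print(f"SRAM free:  {sram_free} bytes")
print(f"Static/global variables: {sram_static} bytes, "
      f"stack + heap reservation: {sec_heap_stack} bytes")
```

One detail this makes visible: summing only the sections whose VMA is actually in SRAM gives 1592 bytes, which is 8 bytes less than berkeley data + bss (1600). The difference is .init_array + .fini_array, which are counted in the berkeley data column but, going by their addresses, live in Flash only. So the berkeley sum is a slightly conservative estimate of SRAM usage here.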

References:

  1. GNU Linker (ld) manual, section "3.1 Basic Linker Script Concepts": https://sourceware.org/binutils/docs/ld/Basic-Script-Concepts.html#Basic-Script-Concepts:

    3.1 Basic Linker Script Concepts

    We need to define some basic concepts and vocabulary in order to describe the linker script language.

    The linker combines input files into a single output file. The output file and each input file are in a special data format known as an object file format. Each file is called an object file. The output file is often called an executable, but for our purposes we will also call it an object file. Each object file has, among other things, a list of sections. We sometimes refer to a section in an input file as an input section; similarly, a section in the output file is an output section.

    Each section in an object file has a name and a size. Most sections also have an associated block of data, known as the section contents. A section may be marked as loadable, which means that the contents should be loaded into memory when the output file is run. A section with no contents may be allocatable, which means that an area in memory should be set aside, but nothing in particular should be loaded there (in some cases this memory must be zeroed out). A section which is neither loadable nor allocatable typically contains some sort of debugging information.

    Every loadable or allocatable output section has two addresses. The first is the VMA, or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA, or load memory address. This is the address at which the section will be loaded. In most cases the two addresses will be the same. An example of when they might be different is when a data section is loaded into ROM, and then copied into RAM when the program starts up (this technique is often used to initialize global variables in a ROM based system). In this case the ROM address would be the LMA, and the RAM address would be the VMA.

    You can see the sections in an object file by using the objdump program with the ‘-h’ option.

    Every object file also has a list of symbols, known as the symbol table. A symbol may be defined or undefined. Each symbol has a name, and each defined symbol has an address, among other information. If you compile a C or C++ program into an object file, you will get a defined symbol for every defined function and global or static variable. Every undefined function or global variable which is referenced in the input file will become an undefined symbol.

    You can see the symbols in an object file by using the nm program, or by using the objdump program with the ‘-t’ option.

  2. My own answer to my own question here: Convert binutils size output from “sysv” format (size --format=sysv my_executable) to “berkeley” format (size --format=berkeley my_executable)