LES instruction purpose?

It's too bad you are learning assembler for a microprocessor with a messy architecture. You get confusing concepts such as the LES instruction.

Conventional microprocessors have registers large enough to contain a full memory address. You can simply load the address of a memory location into a register, and then access that location (and usually those nearby with indexing) via the register.

Some machines (notably the Intel 286 in real mode, which seems to be what you are programming) had only 16-bit registers but could address 1MB of memory. In this case, a register doesn't have enough bits: you need 20 bits, but the registers are only 16 bits.

The solution is to have a second register that contains the missing bits. A simple scheme would have been to require 2 registers, one holding the lower 16 bits and one holding the upper 16 bits, to produce a 32-bit address. Then an instruction that references two registers makes sense: you need both to get a full memory address.

Intel chose a messier segment:offset scheme: the normal register (bx in your case) contains a 16-bit offset, and the special register (called ES) contains 16 bits which are left-shifted 4 bits and added to the offset to get the resulting linear address. ES is called a "segment" register, but this will make no sense unless you go read about the Multics operating system circa 1968.
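To make the arithmetic concrete, here is a minimal Python sketch of the real-mode calculation. The function name `linear_address` is made up for illustration, and the wrap to 20 bits is an assumption that matches the original 8086 (a 286 with the A20 line enabled can reach slightly past 1MB instead of wrapping).

```python
def linear_address(segment, offset):
    """Real-mode address: (segment << 4) + offset.

    Assumes 8086-style behaviour, where the result wraps to 20 bits.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


# ES = 0xB800, BX = 0x0010 refers to linear address 0xB8010.
print(hex(linear_address(0xB800, 0x0010)))

# Different segment:offset pairs can name the same byte of memory.
print(linear_address(0x1234, 0x0005) == linear_address(0x1000, 0x2345))
```

Note the second line: because segments overlap every 16 bytes, a given linear address has many segment:offset spellings.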

(x86 allows other addressing modes for the "effective address" or "offset" part of an address, like es:[bx + si + 1234], but always exactly one segment register for a memory address.)
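As a rough illustration of that parenthetical, the sketch below models an operand like es:[bx + si + 1234] in 16-bit real mode. The helper names are hypothetical. The offset part is summed first and wraps at 64K; only then is the shifted segment added.

```python
def effective_offset(base, index, displacement):
    """16-bit effective address: the sum wraps modulo 64K."""
    return (base + index + displacement) & 0xFFFF


def operand_address(segment, base, index, displacement):
    """Linear address of segment:[base + index + displacement] in real mode."""
    offset = effective_offset(base, index, displacement)
    return (segment << 4) + offset


# es = 0x2000, bx = 0x0100, si = 0x0020, displacement = 1234 (decimal)
print(hex(operand_address(0x2000, 0x0100, 0x0020, 1234)))

# The offset wraps inside the segment rather than spilling into the next one.
print(hex(effective_offset(0xFFFF, 0x0002, 0)))
```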

[Segments and segment registers really are an interesting idea when fully implemented the Multics way. If you don't know what this is, and you have any interest in computer and/or information architectures, find the Elliott Organick book on Multics and read it cover to cover. You will be dismayed at what we had in the late 60s and seem to have lost in 50 years of "progress". If you want a longer discussion of this, see my discussion on the purpose of FS and GS segment registers.]

What's left of the idea in the x86 is pretty much a joke, at least the way it is used in "modern" operating systems. You don't really care; when some hardware designer presents you with a machine, you have to live with it as it is.

For the Intel 286, you simply have to load a segment register and an index register to get a full address. Each machine instruction has to reference one index register and one segment register in order to form a full address. For the Intel 286, there are 4 such segment registers: DS, SS, ES, and CS. Each instruction type explicitly designates an index register and implicitly chooses one of the 4 segment registers unless you provide an explicit override that says which one to use. JMP instructions use CS unless you say otherwise. MOV instructions use DS unless you say otherwise. PUSH instructions use SS unless you say otherwise (and in this case you had better not). ES is the "extra" segment; you can only use it by explicitly referencing it in the instruction (except the block move [MOVSB] instruction, which uses both DS and ES implicitly).

Hope that helps.

Best to work with a more modern microprocessor, where segment register silliness isn't an issue. (For example, 32-bit mode x86, where mainstream OSes use a flat memory model with all segment bases = 0. So you can just ignore segmentation and have single registers as pointers, only caring about the "offset" part of an address.)


The 8086 segment registers cs, ds, es, and ss are the original mechanism by which 16-bit registers can address more than 64K of memory. In the 8086/8088, 20-bit addresses (1024 K) had to be generated. Subsequent versions of the x86 processors added new schemes to address even more, but generating 20+ bits of address from a pair of 16-bit values is the basic reason.

In so-called "real mode" (native to 8086/8088/80186), an address is computed by multiplying the contents of the segment register by 16 (or, equivalently, shifting it left by four places) and adding the offset.

In protected mode (available with the 80286 and later), the segment register selects a "descriptor" which contains a base physical address. The operand es:[bx], for example, adds bx to that physical address to generate the operand address.


p points to a 32-bit FAR pointer with segment and offset part (in contrast to a NEAR pointer, which is only the offset part). LES will load segment:offset into ES:BX.

Otherwise, you would have to use two instructions: one for loading BX with the offset word, and one for loading ES with the segment word that follows it. (Segment registers can be loaded from memory or from a general-purpose register, but not from an immediate value.)
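If it helps to see the data layout, here is a small Python model of what `les bx, [p]` reads. The function `les` and the sample bytes are invented for illustration. The far pointer occupies four bytes in memory, little-endian, with the offset word first and the segment word second.

```python
import struct


def les(memory, p):
    """Model of LES BX, [p]: return (es, bx) loaded from the far pointer at p.

    The offset is stored in the low word, the segment in the high word.
    """
    offset, segment = struct.unpack_from("<HH", memory, p)
    return segment, offset


# A far pointer B800:1234 as it would sit in memory.
memory = bytes([0x34, 0x12, 0x00, 0xB8])
es, bx = les(memory, 0)
print(hex(es), hex(bx))
```

After this, es:[bx] names the byte the far pointer points to.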

Oh, yeah, wallyk had a good point with mentioning protected mode (although that is beside the point of your question). Here, ES will be interpreted as a selector, not an actual segment.

A segment (address) in this context is a part of the physical address:
Shift the segment by 4 bits to the left (i.e. multiply it by 2^4 = 16) and add the offset to get the physical address from segment:offset.

In contrast, a selector is a pointer to an entry in a so-called descriptor table (i.e. a selector points to a descriptor) and is used in protected mode. A descriptor table (e.g. the GDT) contains entries describing chunks of memory, including the physical memory address, the chunk size, access rights etc. (there are a few other uses as well).
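For contrast with the real-mode shift-and-add, here is a heavily simplified Python sketch of a protected-mode lookup. The `Descriptor` class, `split_selector`, and `translate` are hypothetical names, access rights are ignored, and only the GDT is modeled. The selector layout used (index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0) is the standard x86 one.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int   # start address of the memory chunk
    limit: int  # highest valid offset within the chunk


def split_selector(selector):
    """Break a 16-bit selector into (index, table indicator, RPL)."""
    index = selector >> 3
    table = (selector >> 2) & 1  # 0 = GDT, 1 = LDT
    rpl = selector & 3
    return index, table, rpl


def translate(gdt, selector, offset):
    """Resolve selector:offset through the GDT (simplified model)."""
    index, table, _rpl = split_selector(selector)
    if table != 0:
        raise NotImplementedError("this sketch models only the GDT")
    descriptor = gdt[index]
    if descriptor is None:
        raise MemoryError("null descriptor")
    if offset > descriptor.limit:
        raise MemoryError("offset beyond segment limit")
    return descriptor.base + offset


# Entry 0 of the GDT is the null descriptor; entry 1 is a made-up 64K chunk.
gdt = [None, Descriptor(base=0x100000, limit=0xFFFF)]
print(hex(translate(gdt, 0x0008, 0x0010)))
```

The point of the comparison: in real mode the value in ES is part of the address itself, while in protected mode it is only an index used to look the base address up.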

Tags:

Assembly

X86

Masm