
I am new to assembly language and am trying out the code below.

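bits 64
global _start
section .text
_start:

        mov rcx, 1234567890     ; numeric constant: rcx = 0x499602D2
        xor rcx, rcx            ; clear rcx
        mov rcx, 'wxyz'         ; character constant: rcx = 0x7A797877

        mov rax, 60             ; syscall number for exit (x86-64 Linux)
        mov rdi, 0              ; exit status 0
        syscall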

I would like to know why numbers are stored in registers as big-endian, while characters are stored in registers as little-endian.

The screenshots below are from the debugger.

[Two debugger screenshots]

I thought data was stored as little-endian only in memory, so I don't understand why the characters are stored as little-endian in the register. Kindly let me know.

Thank you.

  • 4
    That depends on your assembler (that you didn't specify but looks like nasm). The idea is that you will get the string in the expected order if written to memory. Commented Feb 6, 2018 at 14:35
  • 1
    nasm.us/doc/nasmdoc3.html#section-3.4.3 Commented Feb 6, 2018 at 14:38
  • 3
    x86 will READ it as little-endian. But NASM will assemble it in such a way that 'abcd' lands in memory as the bytes 'a', 'b', 'c', 'd' ... there's not much point in reasoning about string endianness; this particular NASM feature for string constants is designed like that, as an "extra convenience" that diverges a bit from strict machine logic. So mov eax,'0123' and mov eax,0x30313233 load two different values (they differ by a bswap). You simply have to memorize that (like any other syntax "quirk"), or check the machine code in the listing file after assembling to see what you get. Commented Feb 6, 2018 at 14:40
  • 2
    It's more like NASM treats string constants as a stream of bytes (keeping their "string" order in the produced machine code), even when they are used in a context that expects a word or larger type, whereas numeric constants get the little-endian treatment and are "reversed" in memory. The CPU is not aware of your source, so it has no idea that the value originated from a string and represents characters; it will read it as a normal 32-bit number. If you do something like mov eax,'0123', you get eax = 0x33323130. Commented Feb 6, 2018 at 14:47
  • 1
    @vanquish I didn't fully understand your last comment here. It's a mild "yes": in memory it is little-endian, and in a register you can think of it as big-endian, but it's more accurate to say that in the register the value is simply a value (no endianness). Apart from left/right shifts it has pretty much no "spatial orientation"; the bits sit somewhere in the CPU, hard to tell where and in what order, and you can only count on the value being used in the usual sense, as a binary integer etc. It is split into LE bytes when stored to memory. Debuggers display register values in the natural ("BE"-like) order, but that is just formatting code. Commented Feb 6, 2018 at 23:34
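The NASM behavior described in the comments above can be modeled with a short sketch. This is Python rather than assembly, and the helper names (`imm_bytes_from_string`, `imm_bytes_from_number`, `register_value`) are hypothetical, introduced only for illustration. It assumes the immediate is 4 bytes wide and that the CPU is little-endian, as x86 is.

```python
# Model of how NASM encodes a character constant versus a numeric constant,
# and of the value a little-endian CPU ends up with in the register.
# All function names here are illustrative, not part of any real tool.

def imm_bytes_from_string(s: str) -> bytes:
    # NASM keeps the characters in source order in the emitted bytes.
    return s.encode("ascii")

def imm_bytes_from_number(n: int, size: int = 4) -> bytes:
    # A numeric constant is emitted least significant byte first.
    return n.to_bytes(size, "little")

def register_value(raw: bytes) -> int:
    # The CPU interprets the immediate bytes as a little-endian integer.
    return int.from_bytes(raw, "little")

wxyz = register_value(imm_bytes_from_string("wxyz"))
num = register_value(imm_bytes_from_number(1234567890))

print(hex(wxyz))  # 0x7a797877
print(hex(num))   # 0x499602d2
```

A debugger that prints 0x7a797877 as ASCII, most significant byte first, shows "zyxw", which is why the characters look reversed while the number does not.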

1 Answer


Speaking about the endianness of CPU registers doesn't make much sense, because no addresses are assigned to the individual bytes that make up a register; that is, there is no byte order to consider.

For example, al is the lowest byte of rax, and ah is the second lowest. With this in mind, what are the addresses of al and ah? Is ah's address higher or lower than al's? They have no (memory) addresses associated with them, and therefore there is no byte order at all to consider.

What is relevant is how those bytes are stored into memory (e.g.: by means of a mov instruction). The endianness determines that. For a little-endian machine, the lowest byte of the register will be placed at the lowest address of the destination operand, for a big-endian machine at the highest address. The endianness is similarly relevant for loading a memory operand into a register.
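As a rough illustration of the paragraph above, the following Python sketch models storing a 32-bit register value into memory under both byte orders. The `store` and `load` helpers and the `bytearray` standing in for memory are assumptions made for this example, not a description of any real API.

```python
# Minimal model of storing a register value to memory and loading it back.
# "memory" is a bytearray; index 0 is the lowest address.

def store(memory: bytearray, addr: int, value: int, size: int, byteorder: str) -> None:
    # Split the value into bytes and place them at consecutive addresses.
    memory[addr:addr + size] = value.to_bytes(size, byteorder)

def load(memory: bytearray, addr: int, size: int, byteorder: str) -> int:
    # Reassemble the value from consecutive bytes.
    return int.from_bytes(memory[addr:addr + size], byteorder)

value = 0x11223344

le = bytearray(4)
store(le, 0, value, 4, "little")  # lowest byte goes to the lowest address

be = bytearray(4)
store(be, 0, value, 4, "big")     # lowest byte goes to the highest address

print(le.hex())  # 44332211
print(be.hex())  # 11223344
```

In both cases loading with the same byte order returns the original value, so the register content itself never changes; only the layout in memory differs.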

In short, speaking about endianness requires a mapping between the significance of bytes and the ordering of their corresponding addresses.


1 Comment

What registers do have is left and right shifts. Right shifts bring bits from more significant to less significant positions. As you say, this is totally independent of how they're scrambled in memory, and isn't endianness. It's a bit-order, and is always MSB at the left, LSB at the right, by definition of left and right shift. How shifts work across element boundaries (pslld xmm0, 8 / pslldq xmm0, 1) or across sub-registers (shl eax, 8 sets AH=AL and AL=0) defines how those relate to each other.
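The sub-register relationship mentioned in this comment can be checked with a small Python model. The helpers `al`, `ah`, and `shl32` are hypothetical names that mimic the x86 sub-registers and a 32-bit `shl`; flags and the upper half of rax are ignored in this sketch.

```python
# Model of eax with its AL/AH sub-registers and a 32-bit left shift.

MASK32 = 0xFFFFFFFF

def al(eax: int) -> int:
    # AL is bits 0..7 of eax.
    return eax & 0xFF

def ah(eax: int) -> int:
    # AH is bits 8..15 of eax.
    return (eax >> 8) & 0xFF

def shl32(eax: int, count: int) -> int:
    # Shift left and discard bits that fall off the 32-bit register.
    return (eax << count) & MASK32

eax = 0x11223344
shifted = shl32(eax, 8)

print(hex(shifted))      # 0x22334400
print(hex(ah(shifted)))  # 0x44, the old AL
print(hex(al(shifted)))  # 0x0
```

No address appears anywhere in this model: the relation between AL and AH is defined purely by bit significance, which is the point made above.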
