Big and Little Endians [was: Re: Procedure How to Write a Manual!]

Bodvar Bjorgvinsson bodvar at gmail.com
Fri May 22 14:02:47 PDT 2009


Now I finally understand and, consequently, revoke my words of warning
against using 'endianness' in a manual. :D

What? I understand it, why shouldn't everyone else? :-/

Böðvar
-- Enlightened on a Friday

2009/5/22 Jeremy H. Griffith <jeremy at omsys.com>:
> On Fri, 22 May 2009 13:36:11 +0000, Bodvar Bjorgvinsson
> <bodvar at gmail.com> wrote:
>
>>Regarding the "endianness", I had a problem some 13 years ago with some
>>UNIX software that was supposed to work on Linux. It did not. I sent a
>>query to an Icelandic guy on the "Basic Linux Training" list I
>>subscribed to and he came up with a solution. Then he explained to me
>>that there was a difference between Linux and UNIX: one used big
>>endian and the other little endian for the same software code.
>
> In current computer systems, there are two kinds of
> "endianness", called "LSB (Least Significant Byte)
> first" and "MSB (Most Significant Byte) first".
> For any given system, what determines this is not
> the operating system (Linux, Windows, etc.), it's
> the processor (CPU).  Intel x86 CPUs are LSB first;
> others, like Sun SPARC and Motorola 68K, are MSB
> first.  So Linux on a Sun SPARC would be MSB first,
> but on an Intel box it would be LSB first.
>
> Technically, the difference is indeed *byte* order,
> not *bit* order (which is constant).  Suppose you
> have a hex number 0xABCD.  The most significant
> byte is 0xAB; the least significant byte is 0xCD.
> Now imagine that you store this number in memory
> at address 0.  ;-)  You will get:
>
> Location  SPARC  Intel
> 00000000  0xAB   0xCD
> 00000001  0xCD   0xAB
>
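
A minimal Python sketch of the table above, for anyone who wants to see the two layouts for themselves. It only assumes Python 3; the variable names are made up for illustration.

```python
import sys

value = 0xABCD

# Most significant byte first (the SPARC column in the table above).
msb_first = value.to_bytes(2, byteorder="big")
# Least significant byte first (the Intel column).
lsb_first = value.to_bytes(2, byteorder="little")

print(msb_first.hex())  # abcd
print(lsb_first.hex())  # cdab

# The byte order of the machine running this script: "little" or "big".
print(sys.byteorder)
```

Note that the last line reports the order of the CPU you happen to be on, not of the operating system.
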
> Well-designed programs where portability matters
> will work with *either* CPU.  They do this by not
> caring what the storage order in memory is, and
> always accessing multibyte numbers through a set
> of functions that work regardless of byte order.
> For example, Mif2Go was originally developed on
> a Sun SPARC system, then ported to Windows very
> easily because it followed those design rules.
>
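
One way such byte-order-independent access functions might look, sketched in Python. This is not Mif2Go's actual code, and the names put_u32_msb_first and get_u32_msb_first are hypothetical. The point is that shifts and masks operate on the numeric value, so the result is the same whatever the host CPU's storage order is.

```python
def put_u32_msb_first(value: int) -> bytes:
    """Serialize a 32-bit unsigned integer, most significant byte first."""
    return bytes([
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ])


def get_u32_msb_first(data: bytes, offset: int = 0) -> int:
    """Rebuild the integer from four bytes stored most significant byte first."""
    b0, b1, b2, b3 = data[offset:offset + 4]
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3


encoded = put_u32_msb_first(0x12345678)
print(encoded.hex())                    # 12345678
print(hex(get_u32_msb_first(encoded)))  # 0x12345678
```

A program that reads and writes its files only through functions like these never needs to know which kind of CPU it is running on.
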
> There's actually a third flavor, but it was used
> only on the DEC PDP-11.  Since the last of those
> is probably in the Smithsonian, you won't see it
> in current software.  It is the same as Intel
> for two-byte numbers (shorts) but switches the
> byte pairs for 4-byte numbers (longs).  So the
> number 0x12345678 is stored as the bytes
> 0x34, 0x12, 0x78, 0x56.
>
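
The PDP-11 layout described above can be reproduced with a short Python sketch. The function name to_pdp_endian is hypothetical; it simply stores the high 16-bit half first and writes each half LSB first, which is my reading of the description.

```python
def to_pdp_endian(value: int) -> bytes:
    """Lay out a 32-bit unsigned integer in the PDP-11 'mixed' byte order."""
    high_word = (value >> 16) & 0xFFFF
    low_word = value & 0xFFFF
    # The high 16-bit half comes first, but each half is stored LSB first.
    return high_word.to_bytes(2, "little") + low_word.to_bytes(2, "little")


print(to_pdp_endian(0x12345678).hex())  # 34127856
```
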
> Endianness also affects Unicode, in the UTF-16
> and UTF-32 encodings of it, but *not* in UTF-8.
> It is the reason for the UTF-16 BOM (Byte Order
> Mark), U+FEFF.  In UTF-16 Big-endian (MSB first),
> the bytes are 0xFE 0xFF.  In UTF-16 Little-endian
> (LSB first), they are 0xFF 0xFE.  UTF-32 adds two
> zero bytes: before them for Big-endian, after them
> for Little-endian.
>
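
The BOM bytes are easy to check in Python by encoding U+FEFF with each codec. This is only a sketch to confirm the byte sequences; the variable names are mine.

```python
import codecs

bom = "\ufeff"

utf16_be = bom.encode("utf-16-be")  # MSB first
utf16_le = bom.encode("utf-16-le")  # LSB first
utf32_be = bom.encode("utf-32-be")  # two zero bytes before
utf32_le = bom.encode("utf-32-le")  # two zero bytes after
utf8 = bom.encode("utf-8")          # one fixed sequence, no byte-order variants

print(utf16_be.hex())  # feff
print(utf16_le.hex())  # fffe
print(utf32_be.hex())  # 0000feff
print(utf32_le.hex())  # fffe0000
print(utf8.hex())      # efbbbf

# The standard library exposes the same sequences as constants.
print(utf16_be == codecs.BOM_UTF16_BE)  # True
```
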
> The Unicode BOM may also be used as an encoding
> signature, but I digress...   ;-)  Good thing
> it's Friday, eh?
>
> HTH!
>
> -- Jeremy H. Griffith, at Omni Systems Inc.
>  <jeremy at omsys.com>  http://www.omsys.com/



-- 
"Life is not only a game--it is also a dance on roses."
	--Fleksnes (Rolv Wesenlund)


