Big and Little Endians [was: Re: Procedure How to Write a Manual!]

Jeff Coatsworth jeff.coatsworth at jonassoftware.com
Fri May 22 12:56:43 PDT 2009


Wow! What an education for a Friday ;>)

I remember playing with DEC PDP-11's when I was a kid visiting my
father's office. I used to play some pseudo-D&D command line game and
fool around with some graphics software that would draw overlapping
circles & fill them with a limited palette of colours (sort of early
Venn diagrams). Good times....

-----Original Message-----
From: framers-bounces at lists.frameusers.com
[mailto:framers-bounces at lists.frameusers.com] On Behalf Of Jeremy H.
Griffith
Sent: Friday, May 22, 2009 3:14 PM
To: framers at lists.frameusers.com
Subject: Big and Little Endians [was: Re: Procedure How to Write a
Manual!]

On Fri, 22 May 2009 13:36:11 +0000, Bodvar Bjorgvinsson
<bodvar at gmail.com> wrote:

>Regarding the "endianness", I had a problem some 13 years ago with some
>UNIX software that was supposed to work on Linux. It did not. I sent a
>query to an Icelandic guy on the "Basic Linux Training" list I
>subscribed to and he came up with a solution. Then he explained to me
>that there was a difference between Linux and UNIX, in that one used big
>endian and the other little endian for the same software code.

In current computer systems, there are two kinds of "endianness", called
"LSB (Least Significant Byte) first" and "MSB (Most Significant Byte)
first".  For any given system, what determines this is not the operating
system (Linux, Windows, etc.); it's the processor (CPU).  Intel x86 CPUs
are LSB first; others, like Sun SPARC and Motorola 68K, are MSB first.  So
Linux on a Sun SPARC would be MSB first, but on an Intel box it would be
LSB first.
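
If you want to check which kind of box you are sitting at, here is a
minimal sketch in Python (my choice of language for illustration; the
function name host_order_label is made up for this example):

```python
import sys

def host_order_label():
    """Return 'LSB first' or 'MSB first' for the CPU running this code."""
    # sys.byteorder is 'little' or 'big' and reflects the native byte order
    # of the machine, not the operating system.
    return "LSB first" if sys.byteorder == "little" else "MSB first"

print(host_order_label())
```

On an ordinary Intel PC this should report LSB first, whatever OS is
installed.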

Technically, the difference is indeed *byte* order, not *bit* order
(which is constant).  Suppose you have a hex number 0xABCD.  The most
significant byte is 0xAB; the least significant byte is 0xCD.
Now imagine that you store this number in memory at address 0.  ;-)  You
will get:

Location  SPARC  Intel
00000000  0xAB   0xCD
00000001  0xCD   0xAB
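
The same table can be reproduced on any machine, as a rough sketch in
Python, by asking for the bytes in each order explicitly (the variable
names are only for illustration):

```python
value = 0xABCD

# Ask for both layouts explicitly, so the result does not depend on the host.
msb_first = value.to_bytes(2, byteorder="big")     # SPARC-style layout
lsb_first = value.to_bytes(2, byteorder="little")  # Intel-style layout

print("Location  SPARC  Intel")
for address in range(2):
    print(f"{address:08d}  0x{msb_first[address]:02X}   0x{lsb_first[address]:02X}")
```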

Well-designed programs where portability matters will work with *either*
CPU.  They do this by not caring what the storage order in memory is,
and always accessing multibyte numbers through a set of functions that
work regardless of byte order.
For example, Mif2Go was originally developed on a Sun SPARC system, then
ported to Windows very easily because it followed those design rules.
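
To give a feel for what such functions look like, here is a minimal
sketch in Python.  It is *not* the Mif2Go code, and the names
put_u32_msb_first and get_u32_msb_first are hypothetical.  The point is
that shifts and masks work on the numeric value, so the result is the
same on any CPU:

```python
def put_u32_msb_first(value):
    """Encode an unsigned 32-bit integer as 4 bytes, most significant first."""
    return bytes([
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    ])

def get_u32_msb_first(data):
    """Decode 4 bytes, most significant first, into an unsigned integer."""
    return (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]

stored = put_u32_msb_first(0x12345678)
print([f"0x{b:02X}" for b in stored])
print(hex(get_u32_msb_first(stored)))
```

As long as every multibyte number in a file goes through a pair like
this, the file format has one fixed byte order and the host's own order
never enters into it.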

There's actually a third flavor, but it was used only on the DEC PDP-11.
Since the last of those is probably in the Smithsonian, you won't see it
in current software.  It is the same as Intel for two-byte numbers
(shorts) but switches the byte pairs for 4-byte numbers (longs).  So the
number 0x12345678 is stored as the bytes 0x34, 0x12, 0x78, 0x56.
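
As a sketch of that PDP-11 layout in Python (pdp11_bytes is a made-up
helper name), the high 16-bit word goes first, and each word is written
LSB first:

```python
def pdp11_bytes(value):
    """Lay out a 32-bit value PDP-11 style: high word first, each word LSB first."""
    high_word = (value >> 16) & 0xFFFF
    low_word = value & 0xFFFF
    return high_word.to_bytes(2, "little") + low_word.to_bytes(2, "little")

print([f"0x{b:02X}" for b in pdp11_bytes(0x12345678)])
# prints ['0x34', '0x12', '0x78', '0x56']
```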

Endianness also affects Unicode, in the UTF-16 and UTF-32 encodings of
it, but *not* in UTF-8.  It is the reason for the UTF-16 BOM (Byte Order
Mark), U+FEFF.  In UTF-16 Big-endian (MSB first), the bytes are 0xFE 0xFF.
In UTF-16 Little-endian (LSB first), they are 0xFF 0xFE.  UTF-32 adds two
zero bytes, before them for Big and after them for Little.
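
Those byte sequences can be checked with a short Python sketch.  It
encodes U+FEFF with each explicit-order codec and compares the result
with the BOM constants in the standard codecs module:

```python
import codecs

bom = "\ufeff"  # the Byte Order Mark as a single character

encodings = {
    "utf-16-be": codecs.BOM_UTF16_BE,
    "utf-16-le": codecs.BOM_UTF16_LE,
    "utf-32-be": codecs.BOM_UTF32_BE,
    "utf-32-le": codecs.BOM_UTF32_LE,
}

for name, expected in encodings.items():
    encoded = bom.encode(name)
    # Each explicit-order codec writes U+FEFF in its own byte order.
    print(name, [f"0x{b:02X}" for b in encoded], encoded == expected)
```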

The Unicode BOM may also be used as an encoding
signature, but I digress...   ;-)  Good thing
it's Friday, eh?
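
For the curious, a minimal sketch of the signature idea in Python
(sniff_bom is a hypothetical name, and this is only a heuristic: data
with no BOM returns None, and UTF-16 Little-endian text that happens to
start with a BOM followed by U+0000 would be taken for UTF-32):

```python
import codecs

# Longer signatures are listed first, because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_bom(data):
    """Guess an encoding name from a leading BOM, or return None if there is none."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None

print(sniff_bom("\ufeffHi".encode("utf-16-le")))
print(sniff_bom(b"no BOM here"))
```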

HTH!

-- Jeremy H. Griffith, at Omni Systems Inc.
  <jeremy at omsys.com>  http://www.omsys.com/