Zophar's Message Domain

Old 02-11-2005, 12:16 PM   #1
SpaceTiger
Senior Member
 
Join Date: Feb 2003
Posts: 4,548
Default Argh! Will the endian madness ever...

So I spent the past few hours just trying to read this output file from a simulation that one of my colleagues had run. "Why would that take a few hours?", you may ask. Well, I'll tell you.

What I had was a 1.5 GB file that had been written in fortran (unknown version) and the following block of code (sent by email) for reference:

NPART = 512*512*512
NPART16 = NPART/16
WRITE(1)ZR
C
WRITE(1)(XV(1,N),N= 1, NPART16)
WRITE(1)(XV(1,N),N= NPART16+1, 2*NPART16)
WRITE(1)(XV(1,N),N= 2*NPART16+1, 3*NPART16)...

It went on from there in basically the same fashion as the last three lines. There were several problems with this, some of which didn't become apparent until later, but right away it was clear that I wasn't told what the data types were. This would mean a process of trial and error, the scope of which would depend on the set of data types available in fortran.

So, I did a little reading on fortran. It turns out that there are many data types (of varying sizes) that can be declared, as well as a variety of output formats. The block of code seemed to indicate that the data was "unformatted" (presumably to save space), but it didn't specify whether the data was stored in "sequential" or "direct" mode (this would have been given in the "open" statement). I tried all possible combinations of data types and io modes, but nothing seemed to be giving me understandable results.

Eventually, I resorted to a byte-by-byte analysis of the file. First, I calculated the exact number of bytes I expected the output file to have based on the snippet of code (including the 8 bytes of record markers that fortran adds for each call of WRITE: 4 before the data and 4 after). After some temporary confusion resulting from the ridiculous definition of "kilobyte" (1,024 bytes instead of the reasonable 1,000), I was able to match the file size to the data types. The result came out exactly right if all variables were 4-byte data types -- reals or integers, presumably.
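For anyone repeating that byte count, here is a minimal Python sketch of the arithmetic. It assumes 4-byte values and a 4-byte record marker on each side of every record, which is a common compiler convention rather than something the snippet confirms. The function name and the 17-record layout are hypothetical; the real file has more records than the emailed code shows.

```python
# Sketch: expected size of a Fortran sequential unformatted file.
# Assumptions (not confirmed by the emailed snippet): every value is
# 4 bytes, and each WRITE produces one record framed by a 4-byte marker
# before and after the data (8 bytes of overhead per record).

NPART = 512 * 512 * 512
NPART16 = NPART // 16


def expected_file_size(record_lengths, bytes_per_value=4, marker_bytes=4):
    """Total bytes for records holding the given numbers of values."""
    return sum(n * bytes_per_value + 2 * marker_bytes for n in record_lengths)


# Hypothetical layout: one scalar record (ZR) followed by 16 chunk records.
# Extend the list to match the full sequence of WRITE statements.
records = [1] + [NPART16] * 16
print(expected_file_size(records))
```

Trying bytes_per_value=8 or a different marker size and comparing against the real file size is a quick way to rule combinations in or out.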

Despite seemingly knowing the exact format of the file, however, I still couldn't get reasonable results. I was about to give up when I remembered something my roommate once told me about byte organization. A google search revealed that some machines read and write multi-byte data in the "big endian" system and some in the "little endian" system. Most PCs are little endian, while mainframes are generally big endian. The data had been written by a cluster, so I figured there was some chance I'd have to convert.
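A small Python illustration of what the byte order does to a 4-byte real, using only the standard struct module. The value 1.0 is just an example, not something taken from the simulation file.

```python
import struct

value = 1.0

# The same 4-byte real, serialized in each byte order.
big = struct.pack(">f", value)     # big endian: most significant byte first
little = struct.pack("<f", value)  # little endian: least significant byte first

print(big.hex())
print(little.hex())

# Reading big-endian bytes as if they were little endian gives a
# meaningless number, which is what unconverted data looks like.
wrong = struct.unpack("<f", big)[0]
right = struct.unpack(">f", big)[0]
print(wrong, right)
```

The two byte strings are mirror images of each other, so the "wrong" reading is not random garbage, just the same bytes taken in the opposite order.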

How did I do this? Well, it turns out that some fortran compilers give you the option of converting between the two systems (at compile time) for input and output. Of course, the one I was using didn't have that option, so I had to scour the documentation on our network for alternative compilers. I finally found one called "ifort" (i for Intel) and I set the relevant flag. Finally, my data came out!
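If no compiler with a conversion option is at hand, the same thing can be done by reading the records directly. The following is a rough Python sketch, not the method used above: it assumes sequential unformatted records framed by 4-byte length markers and 4-byte reals as payload. The function names are made up for illustration, and the demo reads an in-memory fake record rather than the real file.

```python
import io
import struct


def read_record(stream, byte_order=">"):
    """Read one record and return its payload bytes.

    Assumes a 4-byte length marker before and after the data, a common
    compiler convention rather than a language guarantee.
    """
    marker_format = byte_order + "i"
    marker_size = struct.calcsize(marker_format)
    header = stream.read(marker_size)
    if len(header) < marker_size:
        raise EOFError("no more records")
    (length,) = struct.unpack(marker_format, header)
    payload = stream.read(length)
    trailer = stream.read(marker_size)
    if len(payload) != length or len(trailer) != marker_size:
        raise ValueError("truncated record; wrong byte order or layout?")
    if struct.unpack(marker_format, trailer)[0] != length:
        raise ValueError("record markers disagree; wrong byte order or layout?")
    return payload


def floats_from(payload, byte_order=">"):
    """Interpret the payload as 4-byte reals (an assumption about the data)."""
    count = len(payload) // 4
    return struct.unpack(f"{byte_order}{count}f", payload)


# Demo: build a fake big-endian record in memory and read it back.
values = (1.0, 2.5, -3.0)
body = struct.pack(">3f", *values)
marker = struct.pack(">i", len(body))
fake_file = io.BytesIO(marker + body + marker)
print(floats_from(read_record(fake_file)))
```

The marker check doubles as a byte-order test: with the wrong order the leading length is nonsense, and the function raises instead of returning garbage.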

What a nuisance. I understand that my advisors expect a little ingenuity from me, but this is ridiculous! A little more information would have been nice.

Bah, well, at least I learned a thing or two.

----
"And dreams may come
That are everlasting
Though all just plastic too..."
Old 02-11-2005, 07:57 PM   #2
Isildur
Senior Member
 
Join Date: Nov 2004
Posts: 1,339
Default Re: Argh! Will the endian madness ever...

> After some temporary confusion resulting
> from the ridiculous definition of "kilobyte" (1,024 bytes
> instead of the reasonable 1,000)

While defining a kilobyte as 1,000 bytes might make more sense etymologically speaking, it would make very little sense from a programming perspective, since that isn't a round number in binary or hexadecimal.

1k1IN: A Dark Comedy About 2 Roomates (http://1001insomniacnights.com)
__________________
Holding out for Hostess Snack Cakes...
Old 02-11-2005, 08:01 PM   #3
SwampGas
Senior Member
 
Join Date: Apr 2000
Posts: 6,915
Default Re: Argh! Will the endian madness ever...

> While defining a kilobyte as 1,000 bytes might make more
> sense etymologically speaking, it would make very little
> sense from a programming perspective, since that isn't a
> round number in binary or hexadecimal.

Computer values are calculated by 2^N. 2^10 = 1024, not 1000.

Old 02-11-2005, 09:19 PM   #4
Isildur
Senior Member
 
Join Date: Nov 2004
Posts: 1,339
Default Re: Argh! Will the endian madness ever...

> > While defining a kilobyte as 1,000 bytes might make more
> > sense etymologically speaking, it would make very little
> > sense from a programming perspective, since that isn't a
> > round number in binary or hexadecimal.
>
> Computer values are calculated by 2^N. 2^10 = 1024, not
> 1000.
>

I know that, obviously. Did you even read my post? I was pointing out that 1000 = 0x3E8 = %1111101000

...as opposed to 1024 = 0x400 = %10000000000
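The two representations are easy to check in Python. This is only a quick illustration; the power-of-two test using n & (n - 1) is a standard bit trick added here, not something from the posts above.

```python
for n in (1000, 1024):
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    is_power_of_two = n & (n - 1) == 0
    print(n, hex(n), bin(n), is_power_of_two)
```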


Edit: Oh, wait, I guess you meant to reply to SpaceTiger.

Edited by Isildur on 02/11/05 04:20 PM.
Old 02-12-2005, 12:49 AM   #5
SpaceTiger
Senior Member
 
Join Date: Feb 2003
Posts: 4,548
Default Re: Argh! Will the endian madness ever...

> While defining a kilobyte as 1,000 bytes might make more
> sense etymologically speaking, it would make very little
> sense from a programming perspective, since that isn't a
> round number in binary or hexadecimal.

But the etymological problem was exactly the one I was talking about. The prefix "kilo" is supposed to imply "1000", not "1024". If they wanted to define it another way, they should have used a different prefix, especially since I've seen both definitions used as standard.

Old 02-12-2005, 12:54 AM   #6
Ugly Joe
Senior Member
 
Join Date: Dec 2003
Posts: 1,461
Default Re: Argh! Will the endian madness ever...

> If they wanted to define it another
> way, they should have used a different prefix, especially
> since I've seen both definitions used as standard.

There is (http://mathworld.wolfram.com/Kibibyte.html). I've only seen the term used in one program, though (a BT client).

Old 02-12-2005, 01:10 AM   #7
SpaceTiger
Senior Member
 
Join Date: Feb 2003
Posts: 4,548
Default Re: Argh! Will the endian madness ever...

> There is. I've only seen the term used in one program,
> though (a BT client).

Nice, I'm officially using this term from now on.

Old 02-12-2005, 03:26 AM   #8
MegaManJuno
Senior Member
 
Join Date: Jan 2003
Location: WV
Posts: 626
Default Re: Argh! Will the endian madness ever...

While I agree that there is a good place for this, it's too bad they came up with a prefix that makes it sound like baby-talk when you speak it.

"Oh.. wook at all de wittle kibibytes..."

Old 02-12-2005, 05:01 AM   #9
Reaper man
Member
 
Join Date: Apr 2002
Location: Austin, TX
Posts: 5,409
Default Re: Argh! Will the endian madness ever...

to clear everything up:

a kilobyte is 1024 bytes when you're talking about data stored on a computer (i.e. 345 KB = 353,280 bytes)
a kilobyte is 1000 bytes when you're talking about network speed/bandwidth, but that's usually measured in kilobits
(i.e. 56 KB/s = 56,000 B/s = 448 kb/s = 448,000 b/s)
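The two conversions above, written out as a short Python sketch. The constant and function names are made up for illustration, and the 1024/1000 split is the rule of thumb from this post rather than a universal standard.

```python
KIB = 1024         # bytes per "kilobyte" in the storage sense (a kibibyte)
KB = 1000          # bytes per kilobyte in the networking sense
BITS_PER_BYTE = 8


def storage_bytes(kilobytes):
    """Bytes in a file size quoted in binary kilobytes."""
    return kilobytes * KIB


def rate_bits_per_second(kilobytes_per_second):
    """Bits per second for a transfer rate quoted in decimal kilobytes/s."""
    return kilobytes_per_second * KB * BITS_PER_BYTE


print(storage_bytes(345))
print(rate_bits_per_second(56))
```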

Old 02-12-2005, 05:09 AM   #10
SpaceTiger
Senior Member
 
Join Date: Feb 2003
Posts: 4,548
Default Re: Argh! Will the endian madness ever...

> a kilobyte is 1024 when you're talking about data stored on
> a computer (ie 345KB = 353,280 bytes)
> a kilobyte is 1000 when you're talking about network
> speed/bandwidth, but it's usually measured in kilobits

Eh, these are good rules of thumb, but there are definitely exceptions. I found this link (http://www.t1shopper.com/tools/calculate/) to be quite informative.

All times are GMT.
