Talk:Orders of magnitude (data)

Uninformative Entries

This article seems to have a very large number of entries describing the names of the numbers (e.g. 8,000,000 = 8 kilobits) and relatively few entries that actually give you an idea of how large that number is. For example, under 10^15 bits there are five entries, four of which are names. Are all these names really relevant? Furthermore, there is notable redundancy in the names: for example, there is the heading "10^18 bits – One exabit", and then the first entry is "1,000,000,000,000,000,000 bits (10^18 bits, 125 petaoctets) – One exabit", which is really not very informative. Yes, there is some information there, but most readers will find it irrelevant. Cornflake pirate

Powers of ten or powers of two

I created this with a list of powers of ten to match the other order of magnitude charts. But perhaps it would make more sense to go with powers of two (and sectionalizing accordingly by prefixes)? Fredrik 16:01, 19 May 2004 (UTC)

Stick with the powers of 10. Wikipedia defines an order of magnitude as a power of 10 and you don't want to generalize that (at least I wouldn't). Sectionalize by SI prefixes (kilobit, megabit etc.) but mention binary prefixes (kibibit, mebibit etc.).
Because the byte is (more?) frequently used as a unit of data, kibibyte, mebibyte etc. should also be included. And of course, byte should really be octet… This should yield some delightfully complicated fun!
Herbee 22:45, 2004 May 22 (UTC)
Seems good to me then. Thanks for contributing :) -- Fredrik 12:33, 23 May 2004 (UTC)
Why would you not want to generalise that? To me generalisation would seem a perfectly reasonable thing to do especially in a case like this. Indeed it seems that the Order of magnitude page has been changed ... yeah, in three years what do you expect? Currently it reads
An order of magnitude is the class of scale or magnitude of any amount, where each class contains values of a fixed ratio to the class preceding it. The ratio most commonly used is 10.
Anyhow, if it had been I who created the article, I'd have used powers of two ... and/or four/thirty-two/1024. But, then on the other hand, who says we can't have our cake and eat it too? Bring on the delightfully complicated fun. Jimp 02:42, 19 April 2007 (UTC)
It's done ... enjoy. Jimp 07:36, 20 April 2007 (UTC)
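For anyone weighing the two conventions discussed in this thread, here is a minimal Python sketch of how far the SI prefixes (powers of ten) and the binary prefixes (powers of two) drift apart. The dictionary names and the `excess_percent` helper are hypothetical, chosen only for illustration.

```python
# SI prefixes are powers of 1000; binary prefixes are powers of 1024.
SI_PREFIXES = {"kilo": 10**3, "mega": 10**6, "giga": 10**9, "tera": 10**12}
BINARY_PREFIXES = {"kibi": 2**10, "mebi": 2**20, "gibi": 2**30, "tebi": 2**40}


def excess_percent(binary_value: int, si_value: int) -> float:
    """Return how much larger the binary multiple is than the SI one, in percent."""
    return (binary_value / si_value - 1) * 100


if __name__ == "__main__":
    # Pair each SI prefix with its binary counterpart (dicts keep insertion order).
    for (si_name, si_val), (bin_name, bin_val) in zip(
        SI_PREFIXES.items(), BINARY_PREFIXES.items()
    ):
        gap = excess_percent(bin_val, si_val)
        print(f"1 {bin_name}bit is {gap:.1f}% larger than 1 {si_name}bit")
```

The gap starts at 2.4% for kibi versus kilo and grows with each step, which is one reason a list that mixes both systems needs to label every entry.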

Kilobyte could refer to either, and many other sizes as well; it is therefore not used in this list.

The point has already been made (in the article) that kilo always means 1000, and never 1024. Seen in that light, the above statement might actually reinforce the confusion we're trying to avoid.
Herbee 16:45, 2004 May 23 (UTC)

I guess I should've added "In casual use...". But I'm fine with it removed too. Fredrik 16:48, 23 May 2004 (UTC)

.4 × 10^9 bits – Size of the human genome, 3.2 billion base pairs

and later:

1.28 × 10^10 bits – Capacity of the human genome, 3.2 billion base pairs

At 3.2 billion base pairs, wouldn't the potential information content be 1.28 × 10^10 bits? Transcription from DNA runs in one direction down one strand and can see any of four states at each basepair (A,T,C,G); not just that the pair was one of two (AT,GC) possible pairs. Nor do complementary Codons code for the same amino acid.

Ah, but binary numbers have the wonderful ability to encode four states in just two bits. Explicitly, one might encode the AT base pair as 00, CG as 01, GC as 10 and TA as 11. So it's actually 2 bits per base pair. Accordingly, I'm changing the size of the human genome to 6.4 gigabits.
Herbee 22:23, 2005 Apr 21 (UTC)
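The two-bits-per-base-pair argument can be checked with a short Python sketch. It assumes the 3.2 billion base pair figure quoted above and four equally likely states per base pair; the variable names are illustrative only.

```python
import math

BASE_PAIRS = 3_200_000_000   # figure quoted in this thread
STATES_PER_BASE_PAIR = 4     # AT, CG, GC, TA

# Bits needed to distinguish four states: log2(4) = 2
bits_per_base_pair = math.log2(STATES_PER_BASE_PAIR)
genome_bits = BASE_PAIRS * bits_per_base_pair

print(f"{bits_per_base_pair:.0f} bits per base pair")
print(f"{genome_bits:.2e} bits = {genome_bits / 1e9:.1f} gigabits")
```

Under these assumptions the total comes to 6.4 gigabits, half of the 1.28 × 10^10 figure, which implicitly used 4 bits per base pair.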

That said, the amino acid translation for codons is more-than-one-to-one (in fact it's about four-to-one), thus reducing that potential back to around 3 × 10^9 post-translation. But even after that, the human genome doesn't come close to saturating its potential information content, since about 97% is Junk DNA. And that brings us to a content of about 9 × 10^7 useful bits -- a small stack of floppies. Mimirzero 07:25, 7 Jan 2005 (UTC)

I changed the page to use "Capacity" instead of "Size of Genome" to satisfy my second complaint without getting extra obscure on the page.

The meaning of size is perfectly clear while capacity is not. Moreover, the article is about size and not about 'amount of content'. For instance, the 'total amount of printed material in the world' is listed, and that certainly includes a lot of junk and redundancy. Accordingly, I'm changing back to the size of the human genome.
Herbee 22:23, 2005 Apr 21 (UTC)

At a glance?

The article claims that 150 megabits is the approximate amount of data the human eye can "capture at a glance". That sounds pretty cool, but what does it mean? I'm pretty sure nowhere near that much actually makes its way to any useful bit of the brain on a brief glance at a scene. It's not too different from the number of photoreceptors in the human retina; is that what's meant? I'm willing (just barely!) to believe that there may be a useful sense in which the figure is right, but stated so baldly it seems more like a factoid than a fact. Gareth McCaughan 21:56, 2005 Apr 21 (UTC)

The retina article claims an information rate of 0.6 megabits/sec through each optic nerve, so each "glance" would take about two minutes. The 150 megabits seem a factor 100 too high, at least.
Herbee 22:46, 2005 Apr 21 (UTC)
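A rough Python check of the timing estimate above. It assumes both optic nerves carry data in parallel at the quoted 0.6 megabits/sec each, which is my reading of how the two-minute figure was reached; with a single nerve the time doubles.

```python
CLAIMED_BITS = 150e6            # "at a glance" figure from the article
NERVE_RATE_BITS_PER_S = 0.6e6   # per optic nerve, as quoted from the retina article
NUM_OPTIC_NERVES = 2            # assumption: both eyes contribute in parallel

seconds = CLAIMED_BITS / (NERVE_RATE_BITS_PER_S * NUM_OPTIC_NERVES)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
```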

Phonebook

This is so stoopid:

10^4 bits

  • 22,500 bits – Amount of information in a typical non-fiction book.
  • 27,000 bits – Amount of information in a typical phone book.
  • 42,000 bits – Amount of information in a typical reference book.

Reality check: a phone book has on the order of 1000 pages, which would amount to 27 bits or less than 4 bytes per page. Stoopid! My own phone book (2005/2006 edition for the Arnhem-Zevenaar region, the Netherlands) has 704 pages, 4 columns per page, 129 lines per column, and on average 35 characters per line, for a total of about 100 megabits. Ads have a lower information density, so I estimate that my phone book contains 80 ± 20 megabits of information. The other two entries make no sense either—what is a "typical non-fiction book"?

As a fix, I removed these three entries and created a new one for the phone book.
Herbee July 3, 2005 23:05 (UTC)
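The phone book estimate above can be reproduced with a few lines of Python. The page, column, line and character counts are the ones given in the comment; the 8 bits per character is an assumption that matches the stated total of about 100 megabits.

```python
PAGES = 704
COLUMNS_PER_PAGE = 4
LINES_PER_COLUMN = 129
CHARS_PER_LINE = 35
BITS_PER_CHAR = 8  # assumption: one 8-bit byte per character

characters = PAGES * COLUMNS_PER_PAGE * LINES_PER_COLUMN * CHARS_PER_LINE
total_bits = characters * BITS_PER_CHAR
print(f"{characters:,} characters, about {total_bits / 1e6:.0f} megabits")

# Reality check on the removed entry: 27,000 bits spread over roughly 1000 pages
bits_per_page = 27_000 / 1000
print(f"{bits_per_page:.0f} bits, i.e. {bits_per_page / 8:.2f} bytes per page")
```

The raw total is just over 100 megabits before any allowance for the lower information density of ads.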

Binary prefixes like kibibyte

A vote has been started on whether Wikipedia should use these prefixes all the time, only in highly technical contexts, or never. - Omegatron 14:57, July 12, 2005 (UTC)

A year & a half later & the debate continues. Jimp 01:26, 19 April 2007 (UTC)

Octet vs Byte

Byte is a far more common term... shouldn't we use that instead of octet? SigmaEpsilonΣΕ 03:31, 15 May 2006 (UTC)

I tend to agree, Bytes would be useful on this page as they are more common and much easier for most people to understand.--Hibernian 19:32, 16 August 2006 (UTC)
Yes. — Omegatron 15:58, 20 May 2007 (UTC)
I disagree, octet is the correct and accurate term, byte can be ambiguous. Sarenne 17:36, 21 May 2007 (UTC)
Yes - octet is used in a number of fields, notably music. Replacing a word that is used solely in the computer field with one that is used in many is more confusing, regardless of any historical ambiguity (byte = 8 bits since I was a kid, and that's a long time ago). Maury 12:24, 23 May 2007 (UTC)
I find the layout is poor, I'm going to try to look into a more efficient way of displaying the information. Tyler 17:50, 25 May 2007 (UTC)
IMHO the layout is fine. However, it might be worthwhile to split it into two tables? One for the "kibibits" and one for the SI standard "kilobits". It's less confusing to only have to deal with one measuring system at a time. (Or not; doesn't matter.) - Theaveng 12:20, 28 September 2007 (UTC)


I think octet is more appropriate here, as it provides an accurate, numerical definition that doesn't rely on anything else. There can be absolutely no confusion about what an octet is, and I think that goes well with other "orders of magnitude" lists on Wikipedia where absolute values are provided. — Northgrove 09:12, 14 September 2007 (UTC)
Wikipedia is supposed to reflect ACTUAL usage by the common people, not try to redefine how people are supposed to talk. Descriptive of the language, not prescriptive. - Theaveng 12:12, 28 September 2007 (UTC) - P.S. My external hard drive is 300 gigaBYTES, not 300 gigaoctets (sounds unusual, doesn't it; that's because it is). Be *descriptive* of the language as it's actually used.
P.P.S. I apologize if I sounded a little hostile. It's just that I, as an engineer, get a little annoyed when some English major comes along and tells me I should be saying "gigaoctets" because "it's more proper". (And they say I'm not allowed to use the phrase "can't disagree". Chaucer and Shakespeare used double negatives; why can't I?) ----- I've been using "bytes" for the last thirty years, as have all the engineering & programming colleagues around me. We are not going to change to "octets" just because you tell us we should.
My Atari 2600, Commodore 64, Amiga, and PowerMac were not filled with octets. They were filled with 128 bytes, 64 kilobytes, 2 megabytes, and 1 gigabyte respectively. BYTES, not octets. The article should reflect that common usage. - Theaveng 12:44, 28 September 2007 (UTC)
The term "octet" has been often used in protocol specifications and the like when it is necessary to unambiguously define an 8-bit data word. See Octet (computing) and (to pick an IETF RFC at random) RFC1122. Letdorf 14:04, 28 September 2007 (UTC).
Well that does make sense because specifications are a lot like laws: Filled with lots of jargon only understood by lawyers and politicians. These words have very precise definitions, but are also very confusing to the layman. "Byte" is still the more common term that is used virtually everywhere you look, and IMHO "octet" should be treated the same way as "crumb"... a term that is used, but only rarely, and only with select groups (like specification writers or engineers).
Also an article using the terminology that people know best (bytes) is far more useful to those perusing the encyclopedia, whereas an article using unfamiliar terminology (octets) will just leave the average Joe Smith scratching his head in confusion. He won't get any use out of it. IMHO. - Theaveng 15:44, 28 September 2007 (UTC)

Semioctet vs. "nibble, rarely used"

Say what? I've been using the term nibble since the 1970s. As do many of my engineering colleagues. Hardly rare.

And what the heck is a semioctet??? Where the heck did that come from? I've never heard that terminology, not even once, in the last thirty years. (IMHO it seems this article has its terminology backwards, using rare terms as "common", and common terms as "rare".) (Wikipedia's supposed to reflect ACTUAL usage by the common people, not try to redefine how people are supposed to talk.) - Theaveng 12:12, 28 September 2007 (UTC) P.S. I back up my argument by pointing out that the word "semioctet" has no article of its own. It redirects to the common-usage "nibble" article.

Fractional byte lengths

The phrase "10 bits is the minimum byte length" is inherently misleading. It is correct to say that 10 bits is the minimum bit length for a certain purpose. Byte has already been defined as 8 bits and introducing the non-standard notion of fractional byte lengths will only confuse the reader (e.g. what is the byte length of 3 bits?). I've edited the text accordingly.

This points to a more general semantic confusion in the article between the actual information conveyed in a string of bits/nibbles/bytes/whatever and the information carrying capacity of such information. I'm not sure how best to clear up this confusion. —Preceding unsigned comment added by Ross Fraser (talk • contribs) 00:48, 15 November 2007 (UTC)

Wikipedia

Anybody know how much data (in Bytes) is contained in the entirety of en.wikipedia.org? —Preceding unsigned comment added by 68.41.142.17 (talk) 03:40, 26 February 2008 (UTC)

Useful orders of magnitude

I would imagine that one of the more useful orders of magnitude that most non-computer proficient laymen (and yes, I do include myself in that group) need is the ol' byte->kilobyte->megabyte->gigabyte, etc, etc. It is how I ended up here in the first place, looking to confirm my knowledge of it. SteveCoppock (talk) 03:01, 1 May 2008 (UTC)

Reference with Obsolete Comparison

http://www.uplink.freeuk.com/data.html

The page was last modified on:

Last-Modified: Sun, 12 Mar 2000 00:12:24 GMT

Their reference to the "entire Internet being roughly 100 Terrabytes" seems a tad dated. It's not uncommon for people to have hard drives 1% of that size in their desktop PC. —Preceding unsigned comment added by 86.155.191.111 (talk) 11:47, 5 May 2008 (UTC)