Data
From Wikipedia, the free encyclopedia
Data (singular: datum) are collections of descriptors of natural phenomena, including the results of experience, observation, or experiment, or a set of premises. Data may consist of numbers, words, or images, particularly as measurements or observations of a set of variables.
Etymology
The word data is the plural of Latin datum, the neuter past participle of dare, "to give", hence "something given". The past participle of "to give" has been used for millennia in the sense of a statement accepted at face value; one of the works of Euclid, circa 300 BC, was the Dedomena (in Latin, Data). In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. Such usage is the origin of data as a concept in computer science: data are numbers, words, images, etc., accepted as they stand. The word is pronounced dey-tuh, dat-uh, or dah-tuh.
Experimental data are data generated within the context of a scientific investigation. Mathematically, data can be grouped in many ways.
Usage in English
In English, the word datum is still used in the general sense of "something given", and more specifically in cartography, geography, geology, NMR and drafting to mean a reference point, reference line, or reference surface. More generally, any measurement or result can be called a (single) datum, but data point is more common.[3]
Both datums (see usage in datum article) and the originally Latin plural data are used as the plural of datum in English, but data is more commonly treated as a mass noun and used in the singular, especially in day-to-day usage. For example, "This is all the data from the experiment". This usage is inconsistent with the rules of Latin grammar and traditional English, which would instead suggest "These are all the data from the experiment". Many British and UN academic, scientific, and professional style guides (e.g., see page 43 of the World Health Organization Style Guide) request that authors treat data as a plural noun.
Nevertheless, data is now usually treated as a singular mass noun in informal usage, while usage in scientific publications shows a strong UK/U.S. divide. U.S. usage tends to treat data in the singular, including in serious and academic publishing, although some major newspapers (such as the New York Times) regularly use it in the plural.[1] UK usage now widely accepts treating data as singular in standard English,[2] including everyday newspaper usage,[3] at least in non-scientific use.[4] UK scientific publishing usually still prefers treating it as a plural.[5] Some UK university style guides recommend using data for both singular and plural use,[6] and some recommend treating it as a singular only in connection with computers.[7]
Uses of data in science and computing
Raw data are numbers, characters, images or other outputs from devices that convert physical quantities into symbols, in a very broad sense. Such data are typically further processed by a human or input into a computer, stored and processed there, or transmitted (output) to another human or computer. Raw data is a relative term; data processing commonly occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the next.
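The staged view of processing can be illustrated with a minimal Python sketch. The function names, the 0.5-volt step size, and the sample readings are assumptions made up for this illustration; they do not come from any particular instrument or library.

```python
def digitize(voltages, step=0.5):
    """Stage 1: convert physical readings (volts) into integer counts."""
    return [round(v / step) for v in voltages]


def smooth(counts):
    """Stage 2: average each pair of neighbouring counts.

    The "processed data" produced by stage 1 are the "raw data" here.
    """
    return [(a + b) / 2 for a, b in zip(counts, counts[1:])]


raw_voltages = [0.4, 1.1, 1.6, 2.4]   # hypothetical sensor output
counts = digitize(raw_voltages)       # [1, 2, 3, 5]
smoothed = smooth(counts)             # [1.5, 2.5, 4.0]
print(counts, smoothed)
```

Whether `counts` is called raw or processed depends only on which stage is being described.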
Mechanical computing devices are classified according to the means by which they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a datum as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.
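As a rough illustration of how familiar representations are built from a binary alphabet, the following Python sketch writes a number and a short text as strings of "0" and "1". The helper names and the 8-symbol width are assumptions chosen for the example, and ASCII is only one of many possible character encodings.

```python
def to_bits(n, width=8):
    """Write a non-negative integer as a fixed-width string of 0s and 1s."""
    return format(n, f"0{width}b")


def from_bits(bits):
    """Recover the integer denoted by a string of 0s and 1s."""
    return int(bits, 2)


def text_to_bits(text):
    """Encode ASCII text as one 8-symbol binary group per character."""
    return " ".join(to_bits(byte) for byte in text.encode("ascii"))


print(to_bits(5))              # 00000101
print(from_bits("00000101"))   # 5
print(text_to_bits("Hi"))      # 01001000 01101001
```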
Some special forms of data are distinguished. A computer program is a collection of data, which can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. A similar yet earlier term for metadata is "ancillary data." The prototypical example of metadata is the library catalog, which is a description of the contents of books.
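The idea that a program can be treated as ordinary data can be sketched in Python by storing a small Lisp-style expression as a nested list and then either inspecting it or interpreting it. The `evaluate` function and the `OPS` table are hypothetical names written for this illustration; this is not how Lisp itself is implemented.

```python
import operator

# Assumed mapping from operator symbols to Python functions.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Interpret a nested list such as ["+", 1, ["*", 2, 3]] as instructions."""
    if isinstance(expr, (int, float)):
        return expr                       # a number evaluates to itself
    op, *args = expr                      # first element names the operation
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


program = ["+", 1, ["*", 2, 3]]           # the program is just a list
result = evaluate(program)                # 1 + (2 * 3) = 7

# Because the program is data, another program can rewrite it.
rewritten = ["*"] + program[1:]           # ["*", 1, ["*", 2, 3]]
rewritten_result = evaluate(rewritten)    # 1 * (2 * 3) = 6
print(result, rewritten_result)
```

The same list is handled as data when it is sliced and rebuilt, and as a program when it is passed to `evaluate`.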
Meaning of data, information and knowledge
The terms data, information and knowledge are frequently used for overlapping concepts. The main difference is in the level of abstraction being considered. Data are at the lowest level of abstraction, information is at the next level, and knowledge is at the highest level of the three. In other words, both information and knowledge can be regarded as data, but not vice versa. However, in recent interdisciplinary research a few independent specializations of these terms have been proposed....
Information as a concept bears a diversity of meanings, from everyday usage to technical settings. Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation.
Beynon-Davies [4] uses the concept of a sign to distinguish between data and information. Data are symbols. Information occurs when symbols are used to refer to something.
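One loose way to picture this distinction in Python: the same two symbols carry no particular meaning until a convention says what they refer to. The byte string and the two interpretations below are assumptions chosen for illustration and are not taken from Beynon-Davies.

```python
symbols = b"AB"                              # data: two symbols, no meaning attached

as_text = symbols.decode("ascii")            # read as letters: "AB"
as_number = int.from_bytes(symbols, "big")   # read as one big-endian integer: 16706

print(as_text, as_number)
```

The symbols are identical in both cases; what differs is the thing they are taken to refer to.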
See also
- Biological data
- Data acquisition
- Data analysis
- Data domain
- Data element
- Data farming
- Data integrity
- Data maintenance
- Data management
- Data mining
- Data modeling
- Data processing
- Data recovery
- Data remanence and data destruction techniques
- Data set
- Data warehouse
- Database
- Datasheet
- Drylabbing (creating false data)
- Environmental data rescue
- Metadata
- Scientific data archiving
- Statistics
References
- "The plural usage is still common, as this headline from the New York Times attests: “Data Are Elusive on the Homeless.” Sometimes scientists think of data as plural, as in These data do not support the conclusions. But more often scientists and researchers think of data as a singular mass entity like information, and most people now follow this in general usage."
- New Oxford Dictionary of English, 1999.
- "...in educated everyday usage as represented by the Guardian newspaper, it is nowadays most often used as a singular."
- Beynon-Davies, P. (2002). Information Systems: An Introduction to Informatics in Organisations. Palgrave, Basingstoke, UK. ISBN 0-333-96390-3.
This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.

