Talk:Controlled vocabulary
From Wikipedia, the free encyclopedia
If the below is the definition of controlled vocabulary:
- The terms are chosen and organized by trained professionals (including librarians and information scientists) who possess expertise in the subject area.
Are the automatically generated lists at Amazon.com, CAPs (Capitalized Phrases) and SIPs (Statistically Improbable Phrases), controlled vocabularies?
--Jahsonic 23:47, 19 November 2005 (UTC)
- Interesting questions. I'd say that the Amazon lists are examples of folksonomy; SIPs are something else entirely (quite innovative, I think). But neither of these is consciously organized or bureaucratically implemented, or beset by the inherent conservatism and inflexibility of top-down classification schemes. Bryan 00:14, 20 November 2005 (UTC)
Controlled vocabulary = taxonomy?
The first sentence of this article implies that a controlled vocabulary is synonymous with a taxonomy. But is that true? Can someone verify this? - 212.187.26.113 10:39, 30 January 2006 (UTC)
- A taxonomy is a list of permitted words, but it suggests broader and narrower subjects for each term. A controlled vocabulary is also a list of permitted words, but it lists the synonyms and other closely related terms that are not permitted. The person who wrote that is probably thinking of the Library of Congress Subject Headings, which includes both features, but you could have a list that is just one or the other. GUllman 22:54, 30 January 2006 (UTC)
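The distinction drawn above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the terms, the `USE` and `BROADER` tables, and the `resolve` and `narrower` helpers are all hypothetical and are not taken from LCSH or any real scheme.

```python
# Hypothetical example data; none of these terms come from a real scheme.
PERMITTED = {"Automobiles", "Vehicles", "Trucks"}

# Controlled-vocabulary feature: synonyms that are NOT permitted
# point to the permitted term that should be used instead.
USE = {"cars": "Automobiles", "motorcars": "Automobiles", "lorries": "Trucks"}

# Taxonomy feature: a permitted term may name one broader permitted term.
BROADER = {"Automobiles": "Vehicles", "Trucks": "Vehicles"}


def resolve(term):
    """Return the permitted term for `term`, or None if it is unknown."""
    if term in PERMITTED:
        return term
    # Non-permitted synonyms are stored in lower case in this sketch.
    return USE.get(term.lower())


def narrower(term):
    """Return the permitted terms whose broader term is `term`, sorted."""
    return sorted(t for t, b in BROADER.items() if b == term)
```

A list that has only the `USE` table would be a controlled vocabulary in the sense described above, one that has only the `BROADER` table would be a taxonomy, and a scheme can carry both.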
missing definition in the first place
The whole first paragraph of the article doesn't make clear what a controlled vocabulary actually is. It mentions what it is used for ("to tag units of information"), who chooses and organizes the terms ("trained professionals"), what they are for ("can accurately describe..."), where they are published ("controlled vocabulary...are often published in..."), and what they are part of (CVs "form part of a larger universe of nomenclatural approaches...").
The half sentence "A controlled vocabulary is a carefully selected list of words and phrases," is probably not an adequate definition of CVs. One important property of CVs is that their entries are unambiguously assigned to terms and vice versa, so that they contain neither homonyms nor synonyms.
Also, because the possible uses of CVs are mentioned twice ("to tag units of information", "can accurately describe..."), I vote for a clean-up of the introductory paragraph. --80.135.172.224 14:57, 25 February 2006 (UTC) (gneer, not logged in; password lost, and WP doesn't send it...)
Some random thoughts
- A comparison should be made between controlled vocabulary and natural language. Free text search is natural language, yes, but it has more to do with indexing exhaustivity (you index almost everything as opposed to a few terms) than with being the polar opposite of controlled vocabulary.
- If we are talking about indexing schemes (which may not be the case here), there are probably three kinds: controlled language (index terms taken from predefined terms), natural language (index terms taken from the text only), and free indexing (index terms taken either from the text or from anywhere else).
- Strengths and weaknesses of controlled vocabulary vs natural language should probably include things like control of synonyms and polysemes and the use of scope notes to control homographs (all strengths), and lack of specificity and exhaustivity, slow updating, high input costs, difficulty of use by ordinary users and a few other weaknesses.
- Both thesauri and subject headings should be mentioned as the 2 major examples of controlled vocabulary. Technically there's a slight difference between thesauri and subject heading schemes. Subject headings like LCSH evolved from the library environment, while thesauri were developed more for indexing documents. As a result there are a few (minor?) differences: some LCSH terms are still displayed in indirect order, and subject heading terms tend to cover multiple concepts with phrase headings (pre-coordinate indexing), as opposed to the typically one-word terms (descriptors) in thesauri.
It used to be that equivalence, associated, broader and narrower terms were found only in thesauri, while subject headings at best had equivalence terms. I believe LCSH added BT, NT and RT only fairly recently (in the last few decades)? Nowadays the difference has narrowed quite a bit. Thesauri also tend to be more specialized for a narrow subject field, describing documents, while LCSH is more for describing library catalogs (books), which are wider in description.
- The taxonomy thing is covered on Corporate taxonomy, but currently it is a mess. There doesn't appear to be a really strong consensus on the definition, and I'm currently reading a paper that tries to disentangle the differences between thesauri, classification systems and taxonomies. In brief, taxonomies are based both on thesauri, for labeling semantics, and on classification systems (either facets or hierarchies), for structure. The main point here is that taxonomies are generally organization specific and more user focused, as opposed to being based on literary warrant like most thesauri and classification schemes. They focus not just on books and catalogs, or on "connecting people to documents but also people to people". While taxonomies can play roles such as filtering search results and conveying the context of a search, their primary role is that of supporting browsing. There is a line about how taxonomies differ not in foundations but in deployment. But I suppose that view comes from the recently fashionable knowledge management/organization field?
Aarontay 10:45, 30 January 2007 (UTC)

