User:Wvbailey/Function definitions

From Wikipedia, the free encyclopedia

This is an expansion of the article Function (mathematics).

Contents

[edit] Mathematical Function as either a "rule" or a table of values

The informal idea of a function as a rule is used as the definition of a function in informal contexts. For example, Minsky gives these two definitions:

"What is a function? Mathematicians have several more or less equivalent ways of definiting this. Perhaps the most usual definition is something like this:
1 A function is a rule whereby, given a number (called the argument), one is told how to compute another number (called the value of the function for that argument) ...
Another way mathematicians may define a function is:
2 A function is a set of ordered pairs <x,y> such that there are no two pairs with the same first number, but for each x, there is always one pair with that x as its first number.
So we will think of a function as an association between arguments and values -- as a set of ordered pairs <x,y> such that there is just one pair for each x. For each function there may be many definitions or rules that tell how to find the value y, given the argument x" (numbers added, Minsky 1967:132-134).

Suppes in his Introduction to Logic (1957) stated the definition of function this way:

1 "A function is a rule which assigns to each element of a given set a unique element of some other, not necessarily distinct set ... this definition is intuitively satisfactory.
2 "On the other hand... More formally, a function R is a binary relation such that if x R y and x R z then y = z. ... A function is sometimes said to map its domain onto its range. ... In logic a function is often called a many-one relation." (numbers added, Suppes 1957:230)

[edit] A Function as a Turing machine

As an theory alternative to other mathematical theories, computation requires only the positive integers. Turing discussed this in his 1936 and proposed the use use of the Cauchy sequences to compute every possible real number that is possible to compute, to some decimal precision,[1]

A technical problem with the definition of function:

Of course, to make sense of the informal definitions of function in the previous section -- written as it is in the English language -- the reader would need be able to read English and know a bit of set theory: the notions of "element" and "set", "binary", "relation", "domain" and "range", and "ordered pair".

Herbert Breger gives an egregious example of the tacit knowledge that the student brings to symbols and their interpretations. From symbols common to most mathematicians -- i.e. ⊗ ℵ ↑ ∆ N M ∩ * π -- asks us to guess what is written here.

"Mathematics as a purely formal system of symbols without a human being possessing the know-how for dealing with the symbols is impossible.... Look, for example, at the following lines:
Ω * π
N * π ⊗ FN * π
N ∩ M * π ℵ N ↑ M ⊗ FN ↑ FM
N * π ⊗ FN ∆ N
"These lines are a piece of mathematics. Even if you succeed in finding the meaning of the symbols, or, to be more percise, in finding how to deal with the symbols, your painstaking efforts in order to understand will be sufficient to prove my thesis." (Breger in Grosholz and Breger 2000:227)[2]

Thus these informal definitions of "function" may be sufficient for casual purposes, but as the example above demonstrates they are relying on an undefined concept of a "rule" ... and an entire body of interpretive knowledge. Minsky criticized his own definition:

"an effective procedure is a set of rules which tell us, from moment to moment, precisely how to behave:
"This attempt at a definition is subject to the criticism that the interpretation of the rules is left to depend on some person or agent" (Minsky 1967:106)

He proposed a solution:

"We could avoid the problems of interpretation -- of understanding -- if we could specify, along with the statement of the rules, the details of the mechanism that is to interpret them." (ibid)

He then proposed a candidate-machine -- a Turing machine:

Now in general, what is on [the machine's] tape when the machine stops will depend in some complicated way on what was on the tape at the start. So we can say that the tape result of the computation is a function of the input. Exactly what function it is depends, of course, on what Turing machine was used. Hence we can think of a Turing machine as defining, or computing, or even as being a function. (Minsky 1967:134)

[edit] A function as an arithmetic operator defined by formal number theory

-- a work in progress -- ¬⊃≡⋍≃↔Є ∞ ⇒⋀⊗ℵ↑∆←⊆∉∈∸→⊂∀∃ℕ∩∪ǎăℬ⇔

This form of number theory extends for all the real numbers: -∞, 0, +∞. Kleene 1952 starts at chapter IV A FORMAL SYSTEM then skips to Chapter VIII FORMAL NUMBER THEORY.

To develop this theory he immediately defines what he calls three "function symbols" + (plus), * (times), ' (successor). But the development is worth repeating to see what is going on. In summary:

His entire collection of symbols is Logical symbols: ⊃ (implies), & (and), V (or), ¬(not), ∀ (for all), ∃ (there exists). Predicate symbols: = (equals). Function symbols: + (plus), * (times), ' (successor). Individual symbols: 0 (zero). Variables: a, b, c, .... Parentheses: (, ).
From these symbols and the notion of juxtaposition (or concatenation) he defines term. From term he defines formula. He develops the notions of "scope of a variable" and "free variable". Then he introduces the notion of substitution. Finally he develops the notion of transformation rules, in particular the three deductive schema. Here the symbol ⇒ is being used in place of a long line under the expression, and indicates "yields" in the tautological or derivational sense. The symbols A, B are formulas, A(x), A(t) indicate a formula with a free variable:
  • A & (A ⊃ B )⇒ B
  • C ⊃ A(x)⇒ C ⊃ ∀xA(x)
  • A(x) ⊃ A(t)⇒ ∃xA(x) ⊃ C
These transformation rules are three of 21 postulates that he divides into three categories:
  • GROUP A1: Postulates for the propositional calculus (formulas with no free variables)
  • GROUP A2: (Additional) Postulates for the predicate calculus (incorporating formulas A(x) with a free variable x)
  • GROUP B: (Additional) Postulates for number theory

It is in the last group B that we see the "function symbols" + and * appear. As these are axioms, they are worth repeating:

  • 13. A(0) & ∀x(A(x)⊃A(x') ⊃ A(x). (axiom of induction, cf Peano axioms)
  • 14. a'=b' ⊃ a=b. (Peano axiom)
  • 15. ¬a' = 0. (Peano axiom)
  • 16. a=b ⊃ (a=c ⊃ b=c). (Peano axiom)
  • 17. a=b ⊃ a'=b' (Peano axiom)
  • 18. a+0=a (additive identity defined)
  • 19. a+b'=(a+b)' (addition function-symbol defined in a recursive sense, cf Kleene 1952:186)
  • 20. a*0=0 (multiplicative identity defined, an axiom)
  • 21. a*b'=a*b+a (multiplication function-symbol defined in a recursive sense, cf Kleene 1952:186)

Finally, he defines the notion of immediate consequence of the premise(s) by the rules.

In the final chapter he introduces the INDUCTION RULE over formulas with variables i.e. A(x). From all of this he deduces (proves) the familiar "arithmetic laws", including "association", "distribution", and "commutation" for + and *, the notions of additive identity and multiplicative identity, and the notions of multiplicative and additive inverses, the order properties (<, ≤, >, ≥), other more unusual properties such as "connexity, irreflexiveness, asymmetry, inequalities under addition and multiplication), but in particular interest the existence and uniqueness of quotient and remainder.

From this follows the notion of formally provable and we have, in a nutshell the same development used by Kurt Gödel in his Incompleteness theorems (1931) (which is fact where Kleene's development stops until he introduces the notion of primitive recursive functions.

[edit] A function as defined by the notion of the 5 primitive recursion schema (operators, equations)

-- a work in progress --

Because the most common way a function is defined -- and the way most students know a function to be from algebra and the rules of arithmetic they use to divide and multiply -- is by what is formally known as the primitive recursive functions. Addition, subtraction, multiplication, division, exponentiation, factorials, etc. can all be described in terms of 5 primitive "function schema" that form the "basis" of primitive recursion.[3] From these schema (almost) every number that can possibly be calculated can be calculated.

"A function is primitive recursive, if it is definable by a series of applications of these five operations of definition."[4]

Thus the function is defined by these equations, in the sense developed above of symbols and formulas related by the (=) equals symbol.

  1. Successor (add one to a number e.g. f(5) = 6)
  2. Constant (just establish the constant value e.g. f(x) = 5 to plug into the next equation)
  3. Identity or projection (plucks one formula or constant or successor from a list so we can plug it into the next formula)
  4. Substitution (plug the result of a previous formula into the next one)
  5. Recursion (plug the result of previous formula back into the formula itself)

[edit] A function as a restricted relation: a set theoretic definition

Unlike computability (Turing machines) and calculatibility (recursion theory), set theory is fully-axiomatized, and in its rigorous form it has been highly successful in confirming that what one does in arithmetic, algebra, calculus, abstract algebra, real and complex analysis, etc is okay to do. For example, it (like number theory but more convincingly) confirms that one can in fact express any expressible number as a Cauchy sequence. A number of notions (a good example is the "inverse") can be defined very precisely, built as they are upon a firm axiomatic foundation.

[edit] Axiomatized rules

In the mid-1960's Kurt Gödel agreed with the above stance of Minsky, to the extent that he (twice) eschewed the very recursion theory he himself was largely responsible for, in favor of Turing machines. [footnote and citation here] Some 30 years earlier, In 1933-4 [?] he had proposed to Church, in a strongly-worded negative opinion of Church's proposed use of Church's λ calculus to define the notion of "effective calculability", that a set of axioms might be employed instead. But about the same time he realize that his own recursion theory (nowadays called primitive recursion) that he had used in his 1931 paper, would also be inadequate to the task. Together with a suggestion of Herbrand he proposed to Church what became known as "general recursion" (nowadays known as μ-recursion). Shortly thereafter Church's students Kleene and Rosser proved that "general recursion" was equivalent to the λ-calculus, and in 1936-7 Turing proved that his a-machines were equivalent to the λ-calculus.

An axiomatization of "effective calculability" still does not exist. About as close as computation theory comes is the Church-Turing thesis -- an hypothesis that almost all accept as true, but more on heuristic arguments than on any axiomatization.

Problems with an axiomatizatic approach include tacit assumptions (cf Breger 2000, Minsky 1967 quoted above). Be that as it may, however, in the late 1800's another set of axiomitized rules were in development, but from a different angle -- that began with Cantor (1897), proceeded forward with Zermelo's 1908 and evolved through Fraenkel and Godel and von Neumann into what is now known as "Frankel-Zermelo" set theory "Godel-Bernays" set theory, "von Neumann" set theory (cf Manin 1977:95).

The consensus of modern mathematicians is that the word "rule" should be interpreted in the most general sense possible: as an arbitrary "set theoretic notion" of binary relation with certain restrictions. The following is a formal definition of those restrictions:[5] [6] [7]

[edit] Formal restrictions on the notion of "function"

The following is Manin's contingencies (constraints) on the definition[8]. Here f is a function (mapping), u is a set made of elements ui, v is a set made of elements vi, u x v is the Cartesian product, the set of all possible ordered pairs <ui, vi>:

  • f is a subset of u x v
  • the projection of f onto u concides with all of u
  • each element of u corresponds to exactly one element of v
∀z(z ∈ f ⇒ (∃u1 ∃v1 (u1∈u ⋀ v1∈v ⋀ "z = <u1, v1>")))
⋀ ∀u1 (u1∈u ⇒ ∃z(v1∈v ⋀ "z=<u1,v1" ⋀ z∈f))
⋀ ∀y1 ∀v1 ∀v2 (∃z1 ∃z2 (z1 ∈f ⋀ z2 ∈f ⋀ "z1 = <u1, v1>" ⋀ "z2 ⋀ <u1, v2>" ) ⇒ v1 ⇒ v2

In the following we will use set X in place of Manin's set u and set Y in place of Manin's v and write individual elements as x of X and y of Y (in set-theory parlance, x∈X and y∈Y):

The condition for a binary relation ƒ from X to Y to be a function can be split into three conditions:

"f is a mapping from the set X to the set Y. First of all, mappings, or functions, are identified with their graphs; otherwise, we would not be able to consider them as elements of the universe [of discourse][9]. The following ... successively imposes three conditions on f:
1. "f is a subset of X x Y;
X x Y is the Cartesian product or "cross-product" of sets X and Y that generates every potential "ordered pair" <x,y> that a function could "be" when producing its output y's from its input x's. Intuitively, in algebra the Cartesian product (defined over a domain and a range of the positive integers) corresponds to the collection of "dots" of the X-Y plane <x,y> = { <0,0>, <0,1>, <0,2>, <0,n>, ... <1,0>, <1,1>, ... <m,n> } only a few of which will actually be "be" the function as represented by its "graph".
2. "the projection of f onto X coincides with all of X;
The "projection of f onto X" corresponds to the notion of the so-called first coordinates, those x's of the function's ordered pairs. Intuitively, this projection corresponds to the X-axis of algebra. By definition, the function must use every possible x-value along the axis and "effectively" produce one and only one y-value. Thus the notion of y = "square-root of x" requires two functions, one to, for example, produce +3 = sqrt(9), the other to produce -3 = sqrt(9). If, given a particular x, f fails to produce any y at all (i.e. it is "ineffective" for a given x) we say the function f is a "partial" function; a good example is y = 3/x when domain X = { 0, 1, 2, 3 }.
3. "and, each element of Y corresponds to exactly one element of X...."
The set Y will be the "effective range", the collection of those y (it could be as few as none at all) that the function f "effectively" produces (i.e is capable of producing in a finite time) given any of the x's in its domain of definition. Intuitively, the set Y will be the y-values "projected on" the "Y-axis" of algebra. Not every y-value need be used, and the same y may be used more than once, but for every x there had better be one and only one y."(quoted from Manin 1977:12 with "u" changed to "X" and "v" changed to "Y" and numbers and explanatory notes added.)[10]

Example:

Notice that this example uses the algebra function f(x) = x2. But it does not say exactly what this "function" (aka "rule") really is. Is it a string of typographical symbols f, (, x, ), = x, 2 arranged in a certain manner on the page? Is it the machine or person "that computes/calculates" when it sees the symbols -- i.e. an active "operator" or an agent "programmed" (designed, instructed) to take (or waiting to receive as input) an x from the domain X and produce only one y? Or is the function a specification (e.g. a program of instructions, a body of learning) that an agent (i.e. a student, a computer) follows to produce y from x? All these notions, and combinations thereof, are prevalent in the literature.
"For each function there may be many definitions or rules that tell how to find the value y, given the argument x. Two definitions or rules are equivalent if they define the same function. It may be very difficult, given two different rules, to tell if they are in fact equivalent. In fact, may, in a sense, be impossible."[11]
See more at Church-Turing thesis and algorithm.
Given a "domain of discourse" (the "universe") of the natural numbers 0 to infinity ∞ and a similar "codomain"[12] of 0 to ∞, we might choose (e.g. to keep our example tiny) to restrict our "domain of definition" to X = { 0, 1, 2 }. What will be our "effective" range? We need a function first: Plug these numbers into f(x) = y = x2. The "effective" output set will be Y = { 0 = 02, 1 = 12, 4 = 22 }.
We have a couple ways of presenting our results. One is as a table: x-value in, y-value out. Or, our function f(x) = y could be represented by the collection of ordered pairs (the x-value of the ordered pair is sometimes called the "first coordinate"; the y-value is called the "second coordinate"):
f(x) = { <0,1>, <1,1>, <2,4> }
Finally, we might plot these as a "graph". To do so we first create the collection of every possible ordered pair, i.e. the Cartesian product, from the two sets X and Y: X = {0, 1, 2} x Y = {0, 1, 2, 3, 4, 5}, i.e. a collection of 3*6 = 18 ordered pairs:
X x Y = {0, 1, 2} x {0, 1, 2, 3, 4, 5} =
{ <0,0>, <0,1>, <0,2>, <0,3>, <0,4>, <0,5>, <1,0>, <1,1>, <1,2>, <1,3>, <1,4>, <1,5>, <2,0>, <2,1>, <2,2>, <2,3>, <2,4>, <2,5> }.
The three ordered pairs in bold-face are, in fact, our function and will be three dots in the X-Y plane represented by the Cartesian product.

In some contexts, a relation that is total, but not necessarily single-valued, may be called a multivalued function. As shown in the middle image, a relation that is single-valued but not necessarily total may be called a partial function.[13].

[edit] Notes

  1. ^ Computable convergence in Turing 1936 in Davis 1965:142ff Also The Computable Real Numbers in Minsky 1967:157ff)
  2. ^ Peano's axioms: 1 ∈ N; n ∈ N ⇒ n' ∈ N; n,m ∈ N ⋀ n = m ⇒ n' = m'; n ∈ N ⇒ n' ≠ n (semicolons added)
  3. ^ (cf Kleene 1952:219)
  4. ^ (Kleene 1952:219)
  5. ^ From Suppes 1960, Axiomatic Set Theory: "If R is a relation then the domain or R (in symbols: DR) is the set of all things x such that, for some y, <x,y> ∈ R. ... The range of R ... is the set of all things y such that, for some x, <x,y> ∈ R. ... The range of a relation is also called the counterdomain or converse domain ... The field of a relation ... is the union of its domain and range" (Suppes 1960:59).
  6. ^ From Halmos 1970, Naive Set Theory: "If X and Y are sets, a function from (or on) X to (or into) Y is a relation f such that dom f = X and such that for each x in X there is a unique element y in Y with (x, y) ∈ f. The uniqueness condition can be formulated explicitly as follows: if (x,y) ∈ f and (x,z) ∈ f, then y=z. ... The words map or mapping, transformation, correspondence, and operator are among some of the many that are sometimes used as synonyms for function ... For relations in general, and hence for functions in partiular, we have defined the conepts of domain and range" (Halmos 1970:30-31).
  7. ^ Expanding the notion of a "function" as a type of "restricted relation": Enderton makes this comment: "At one time it was popular to distinguish between the function and the relation (which was called the graph of the function). Current set-thoretic usage takes a function to be the same thing as its graph. But we still have the two ways of looking at the function". He wants to reserve the word "decidablility" for relations and "the analogous concept" (p. 208-209) "computability" for functions. He thus defines the notion of "computable": "...iff there is an effective procedure that, given any k-tuple a of natural numbers, will produce f(a)" [a replaces "a" with an arrow over it]. He then provides a proof that "The following three conditions on a function... are equivalent: (a) f is computable, (b) when viewed as a relation, f is a decidable relation. (c) When viewed as a relation, f is an effectively enumberable relation" (Enderton 2001:208).
  8. ^ Manin 1972:12
  9. ^ "Domain of discourse, "universe of discourse": cf index p. 355 in Boolos-Burgess-Jeffrey 2002 also page 103. Also Table VII in Enderton 2001:68.
  10. ^ Manin offers formal formulas (expressions) for these three conditions (Manin 1977:12) He later offers this definition: "A binary relation (or correspondence) r is a set (or class) of all of whose elements are ordered pairs. If r ∈ V is a relation, then its domain of definition dom(r) is the class of all first terms in the elements of r, and the range of values rang(r) is the class of all second terms.... A function is a binary relation in which each element is uniquely determined by its first term" (Manin 1977:101).
  11. ^ Minsky 1967:134
  12. ^ Suppes 1960 uses "counter-domain" and "converse domain".
  13. ^ Kleene 1952:326 is perhaps the first use of this distinction. "...let us now call a function from any subset (proper or improper) of the n-tuples of the natural numbers to the natural numbers a partial function. In other words, a partial function φ is a function which for each n-tuple x1, ..., xn of natural numbers as arguments takes at most one natural number φ(x1, ..., xn) as value. For [such an n-tuple] we say φ (or φ(x1, ..., xn) is defined. He then defines "completely defined", "incompletely defined" and "completely undefined" in the expected way. In essence, for Kleene an "incompletely defined" function would be one that would never terminate with an answer (p. 325). Also see Boolos-Burgess-Jeffrey 2002:7 and p. 14 problem 1.1. They then define "semirecursive relations" and "semidecidable sets" in a similar manner, that is, "if there is an effective procedure that, applied to any number, will if the number is in the set in a finite amount of time give the answer 'yes', but will if the number is not in the set never give an answer. For instance, the domain of an effectively computable partial function f is always effectively semidecidable ... if and when we succeed [in a computation] we know that n is in the domain; but if n is not in the domain, we never succeed (p. 80). ¶ The notion of "partial" function can be very confusing. Notice how broad Kleene's definition is: it seems to gather into its fold every possible "function" imaginable. This is correct: a "total" function is just a (lucky, preferred form of) restricted "partial" function. The keywords in Kleene's quote above are "at most" : "a function which ... takes [i.e. produces] at most one natural number" (italics added to the quote). In other words, Kleene's "at most one" includes "none at all". This means that, for a particular x input, the (partial) function could "take" no natural numbers whatever -- the computation results in an endless loop. An example is division by 0. Depending on the division algorithm, the function either produces all the natural numbers simultaneously or none at all -- just a "division by zero" error message. Per Kleene's proposal: To "produce no number at all" means that the computational/calculational "rule" never terminates with an output "number".

[edit] References

  • George Boolos, John P. Burgess, Richard C. Jeffrey, 2002, Computability and Logic: Fourth Edition, Cambridge University Press, Cambridge UK, ISBN: 0521 00758 5.
  • Enderton, Herbert B. (2001), A Mathematical Introduction to Logic: Second Edition, Harcourt/ Academic Press 
  • Paul R. Halmos, 1960, 1974, Naive Set Theory, Springer-Verlag, New York, ISBN 0-387-90092-6.
  • Husch, Lawrence S. (2001), Visual Calculus, University of Tennessee, <http://archives.math.utk.edu/visual.calculus/>. Retrieved on 27 September 2007 
  • Stephen C. Kleene, 1952, 10th impression, 6th reprint 1971, Introduction to Metamathematics, North-Holland Publishing Company, Amsterdam NY, ISBN: 0 444 10088 1.
  • Manin, Yuri I. (1977), A Course in Mathematical Logic, Springer-Verlag New York 
  • Marvin Minsky, 1967, Computation: Finite and Infinite Machines, Prentice-Hall, Inc., Englewood Cliffs, N.J. ISBN: none.
  • Patrick Suppes, 1960, 1972, Axiomatic Set Theory, Dover Publications, Inc., New York, ISBN: 0-486-61630-4.
  • Thomas, George B. & Finney, Ross L. (1995), Calculus and Analytic Geometry (9th ed.), Addison-Wesley, ISBN 978-0-201-53174-9 

Function Function Function