Talk:Generalized linear model
This text should be read with care because of the errors in the text and formulas.
Cleanup
There are no errors here, but this article needs to be cleaned up and fleshed out a bit. I will try to work on it soon... --shaile 00:33, 14 April 2006 (UTC)
- I don't like: "In statistics the generalized linear model (GLM) generalizes the ordinary least squares regression." Surely OLS is an estimation method, not a model? In my view, it should say "In statistics the generalized linear model (GLM) generalizes the linear model." Blaise 21:45, 23 September 2006 (UTC)
-
- You better believe OLS implies a model. Granted, a model that most would agree is wrong for their data, but some models are useful, so we use them. I'd suggest merging those two; they cover the exact same subject. Pdbailey 13:39, 26 September 2006 (UTC)
The second sentence of the introduction is terribly muddled. It should be clarified. I would do so, but I am currently struggling to understand this subject, so I shan't try just yet. Thomas Tvileren 08:44, 15 October 2007 (UTC)
proposed changes
The edit I made (which was reverted) did trim the overall size of the page and reduce the number of examples, but as it stands it is very difficult to understand what a GLM is. The basic question is: what is a GLM? After we have said that, we need to say why you might want to use one, and then I think we should get a little into how parameters are estimated. I'm going to revert it back and expand substantially on what I wrote last time.
It would be great if we could expand on the part about using any CDF. Here we have the alternative example where Y = 1 if η > 0 and is zero otherwise. Pdbailey
- Would you explain what you meant by this sentence: "Because the variance portion is not constant, there is no sense to least squares, instead the parameters must be estimated with maximum likelihood or quasi maximum likelihood." It is not clear at all... Thanks! (Also, PLEASE sign your comments! All it takes is 4 ~'s) --shaile 22:21, 20 April 2006 (UTC)
- Sorry, I don't have time to fix it up right now. The general idea is that OLS won't work because none of the assumptions hold (independence, constant variance). Pdbailey 00:11, 21 April 2006 (UTC)
on using η
So the reason that I like the η terminology is that it separates out the linear part of the equation (Xβ) from the random part (Y). It makes it clear that there is a linear model in there. Also, the way the second equation is now (
so that
) it is a bit of a garbled mess. Pdbailey 02:21, 21 April 2006 (UTC)
- I guess I see your point. On the other hand, I think maybe it's better to keep a similar notation to that of Linear model, thus dropping the η. As for the second equation, I think it should be either
or
You're right, the inverse g function is a bit confusing, and I think
is the more standard notation. I need to find the paper for this, I have it around somewhere... --shaile 04:21, 21 April 2006 (UTC)
-
- I'm working off lecture notes from McCullagh (author of one of the References) and I'll have to admit that my notes are not an example of clarity (hence g instead of g⁻¹). However, I really like the separation that McCullagh emphasizes between the linear part of the model (which describes the mean behavior with a linear equation) and the variance part of the model, which describes the dispersion of the Y values. In fact, he argues that it is always clearer to write a model in this fashion -- to separate out the expected value from the dispersion. In a GLM, when they are separated it is clear that the link, well, links the two portions of the model. It gives some perspective into the relevance of the link function and separates out clearly the three components that are present in all GLMs. Some people might think that probit and logit are worlds apart, but in this framework, it is clear that they are minor variants of each other.
-
- This form is also echoed in programs such as Stata, which has a linear model, a link function, and a variance function. But I'm afraid that we just disagree on this one -- you like the simplicity of having it all in one formula, I like seeing all three components separated. Pdbailey 05:19, 21 April 2006 (UTC)
-
- I just changed the page in light of this discussion, not to revert but to state the model more clearly in the way I'm arguing for. Pdbailey 05:27, 21 April 2006 (UTC)
-
- Actually, this is fine. It was just more confusing the way you had it before. :) --shaile 13:21, 21 April 2006 (UTC)
-
-
- Based on the objections that the two of you raised, I changed it back to look more like it used to. I think that it is easier to get a handle on quickly this way. Pdbailey 16:10, 27 April 2006 (UTC)
-
-
- Would someone clarify this:
? It's not at all clear, and I don't see how the error term is a function of the other parts; actually, I don't think it should be stated this way. Also, we need more details in there; it was better when the three parts of the GLM were clarified separately. Any objections? --shaile 19:22, 27 April 2006 (UTC)
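For reference, here is a sketch of the three-part formulation argued for in this thread. It uses η for the linear predictor and g for the link, as above. The symbols V (variance function) and φ (dispersion parameter) are notational assumptions for illustration, not quotations from the article.

```latex
\begin{align*}
\eta &= X\beta && \text{(linear predictor)} \\
g(\mu) &= \eta, \qquad \mu = \operatorname{E}(Y) = g^{-1}(\eta) && \text{(link function)} \\
\operatorname{Var}(Y) &= \phi \, V(\mu) && \text{(variance function)}
\end{align*}
% Logit and probit differ only in the second line; both use V(\mu) = \mu(1 - \mu):
\begin{align*}
\text{logit:}  \quad g(\mu) &= \log\frac{\mu}{1 - \mu} \\
\text{probit:} \quad g(\mu) &= \Phi^{-1}(\mu)
\end{align*}
```

Written this way, swapping logit for probit changes only the link line, which is the sense in which the two are minor variants of each other.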
Reorganization and exponential family detail
I have been reorganizing this article a bit, starting at the top. I wish, however, to point out a small but important change I made to the definition of exponential family here. Where before it contained a term a(y)b(θ), McCullagh & Nelder (p. 28) clearly show a(y)θ. I have made this change and eliminated the reference to the b function. If there is a more reputable source than M&N (hard to imagine) which has the more general a(y)b(θ) form, feel free to revert but please include the source. Baccyak4H 18:50, 27 October 2006 (UTC)
- I think that's a good change you made to the definition of the exponential family. Do you think we could include how this relates to the link function? (I know how to do that, but I'd have to do a few to remember exactly where the link function comes from...) --shaile 20:57, 27 October 2006 (UTC)
Thanks. I was planning to address some more advantages of this form (M&N's form) including sufficient statistics, variance as function of a, c and d, and the canonical parameter. But that must wait; my copy is elsewhere now ;-). I certainly will be continuing my reorganization (mostly to make different editors' contributions sound like they came from one editor -- my pet peeve). While I may do a lot of tweaking, please feel free to improve my efforts. Baccyak4H 03:06, 28 October 2006 (UTC)
- While M&N use that unusual exponential family form in their book, the two forms are equivalent (in the sense that you point out) and I think it makes sense for Wikipedia to use the broader definition. Which is to say that the notational convenience that it affords one book may not be as nice in an encyclopedia. Also, this article is for a much broader audience than the book and covers the topic in a lot less detail. Pdbailey 03:51, 28 October 2006 (UTC)
I suppose in the end a more general formula would be better. Is there a version like the current version which also contains the so-called dispersion parameter? M&N has that (φ), and given it appears in an overdispersion context twice (one of which I just added), I see some merit in including it in the formula: one of the great merits as I see it is the unification that the theory provides - another reason I may describe the canonical parameter some more (Note: I removed it from the link table title, not because it was wrong (it was not), but because it was unexplained. I plan on returning to fix that). Baccyak4H 03:00, 29 October 2006 (UTC)
Well, here is a possible generalization of the exponential family formula which includes a dispersion parameter φ. The inclusion of φ is in the spirit of M&N's definition, but the rest of the formula is the same as the current one.
- Old:
.
- New:
.
I tried to reword the discussion there to apply to this. Baccyak4H 14:58, 30 October 2006 (UTC)
It just dawned on me that for the b(θ) formula, if b is invertible then it is exactly equivalent to the version without b; they merely represent a reparameterization of each other. Baccyak4H 16:34, 31 October 2006 (UTC)
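The equivalence can be sketched as follows. The density form below is an assumption based on the terms named in this thread (a(y)b(θ), plus functions c and d), and the starred symbols are introduced here only for illustration.

```latex
% Assumed general form, with b invertible:
f(y;\theta) = \exp\{\, a(y)\, b(\theta) + c(\theta) + d(y) \,\}

% Reparameterize with \theta^{*} = b(\theta), so that \theta = b^{-1}(\theta^{*}):
f(y;\theta^{*}) = \exp\{\, a(y)\, \theta^{*} + c^{*}(\theta^{*}) + d(y) \,\},
\qquad c^{*}(\theta^{*}) = c\bigl(b^{-1}(\theta^{*})\bigr)
```

The second line has the a(y)θ form, so under this assumption the two definitions describe the same family of distributions, indexed by different parameters.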
History and Motivation
1. There is no mention of the probit link. From a passage in McCullagh and Nelder, the probit work is historically important, in particular the presentation of the scoring algorithm in an appendix written by R.A. Fisher for a paper by the toxicologist Bliss.
2. Mention that in practice GLMs provide an important way to address heteroscedasticity.
My apologies in the event I have overlooked some passage that addresses these concerns.
Dfarrar 04:48, 20 February 2007 (UTC)
There is an article (stub) under development for probit. Dfarrar 04:51, 20 February 2007 (UTC)
Probit link?
Should the probit link be included in the table of canonical link functions, or is it not considered a canonical link function? Bill Jefferys (talk) 23:43, 20 December 2007 (UTC)
- It isn't a canonical link for any of the distributions there. I do suspect there is a distribution that makes it such, but don't know off the top of my head what it might be. Baccyak4H (Yak!) 02:23, 21 December 2007 (UTC)
Boldface vectors
Shouldn't beta be in boldface throughout? At present some of the betas are bold and others aren't. This is particularly disturbing when it happens in Xβ.
--84.9.83.26 (talk) 11:46, 19 December 2007 (UTC)
- Yes, consistency is a good thing. Nice catch. Baccyak4H (Yak!) 14:47, 19 December 2007 (UTC)
- Update: I did that fix; sorry if I missed any. I used wikimarkup (multiple single quotes) rather than HTML markup (<tags>) for consistency with the rest of the article. Baccyak4H (Yak!) 15:00, 19 December 2007 (UTC)
Thanks! :-)
Another one. In a vector context, shouldn't the linear predictor η be boldface too? η = X β
--84.9.73.5 (talk) 13:03, 21 December 2007 (UTC) (formerly 84.9.83.26)
- This would depend on context. For one observation or data point, no, since X is a row vector. For the entire data, yes. I am not sure what you mean by "vector context", but I hope those two scenarios clarify things. Baccyak4H (Yak!) 16:26, 21 December 2007 (UTC)
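A small NumPy sketch of the two scenarios, with made-up numbers and variable names chosen only for illustration: for a single observation the linear predictor is a scalar, while for the entire data it is a vector with one entry per observation.

```python
import numpy as np

# Design matrix for the entire data: n = 3 observations, p = 2 columns
# (values are made up for illustration).
X = np.array([[1.0, 0.2],
              [1.0, 0.5],
              [1.0, 0.9]])
beta = np.array([0.5, -1.0])  # coefficient vector, bold in either scenario

eta_i = X[0] @ beta  # one observation: row vector times beta gives a scalar
eta = X @ beta       # entire data: matrix times beta gives a length-n vector

print(np.ndim(eta_i), eta.shape)
```

So β is a vector in both cases, while η would be set in boldface only in the whole-data form.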
See also
Several types of regression which fall under this topic have been added to the see also section. However, links to these already appear where they are discussed higher up in the article. I propose to remove them in the see also section. Anyone with me here? Baccyak4H (Yak!) 19:12, 2 February 2008 (UTC)
A couple of points
I think the technical term for the family of distributions is the Exponential Dispersion Family, though unfortunately I have no sources to confirm it other than my hazy memory. Can anyone confirm this?
Technically one doesn't even need to specify a distribution to fit a GLM; only a variance function is required (though specifying a distribution means one can estimate the dispersion parameter by maximum likelihood). However, I don't know if that can be worked into the article without making it more confusing.
There is no mention in the article of iteratively re-weighted least squares (IRLS or IWLS depending on who you talk to), the method used for estimating the parameters, and the current article in that location doesn't seem relevant to GLMs.
Thoughts? -3mta3 (talk) 00:17, 8 April 2008 (UTC)
- 3mta3, good points. One does need to specify an exponential family to fit with ML, but you are right that there are pseudo-ML options as well. I think this could be a new section. There also really should be a section on IRLS and Fisher steps; this is a big part of what brings the GLM together, since they can all be fit with the same general solver. Pdbailey (talk) 02:48, 8 April 2008 (UTC)
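To make the "same general solver" point concrete, here is a minimal IRLS sketch in Python with NumPy. The function `irls`, its arguments, and the simulated data are all hypothetical names and values chosen for illustration, not taken from the article or from any particular package. The solver takes only an inverse link, its derivative, and a variance function, which also illustrates the point above that no full distribution needs to be specified.

```python
import numpy as np

def irls(X, y, inv_link, dmu_deta, variance, n_iter=50, tol=1e-10):
    """Hypothetical helper: estimate GLM coefficients by IRLS (Fisher scoring).

    inv_link(eta) -> mu, dmu_deta(eta) -> d mu / d eta,
    variance(mu) -> variance function V(mu).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                 # linear predictor
        mu = inv_link(eta)             # mean on the response scale
        d = dmu_deta(eta)
        w = d ** 2 / variance(mu)      # working weights
        z = eta + (y - mu) / d         # working response
        XtW = X.T * w                  # X^T W without forming diag(w)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)  # weighted least squares step
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])

# Poisson response with log link: mu = exp(eta), V(mu) = mu.
y_pois = rng.poisson(np.exp(0.5 + 0.3 * x))
beta_pois = irls(X, y_pois, np.exp, np.exp, lambda mu: mu)

# Binary response with logit link: V(mu) = mu (1 - mu).
y_bin = rng.binomial(1, sigmoid(-0.5 + 1.0 * x))
beta_logit = irls(X, y_bin, sigmoid,
                  lambda eta: sigmoid(eta) * (1.0 - sigmoid(eta)),
                  lambda mu: mu * (1.0 - mu))

print(beta_pois, beta_logit)
```

Only the three function arguments change between the Poisson and logistic fits; the loop itself is identical. This sketch omits safeguards such as step halving and checks for fitted means at the boundary, which a production routine would likely need.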

