Talk:Latent semantic analysis
From Wikipedia, the free encyclopedia
Removed link to 'Google Uses LSA in Keyword Algorithms: Discussion' (which pointed to www.singularmarketing.com/latent_semantic_analysis ), as this seems to be an out-of-date link.
The section on the dimension reduction ("rank") seems to underemphasize the importance of this step. The dimension reduction isn't just some noise-reduction or cleanup step -- it is critical to the induction of meaning from the text. The full explanation is given in Landauer & Dumais (1997): A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge, published in Psychological Review. Any thoughts? --Adono 02:54, 3 May 2006 (UTC)
Derivation
I added a section describing LSA using SVD. If LSA is viewed as something broader, and does not necessarily use SVD, I can move this elsewhere. -- Nils Grimsmo 18:16, 5 June 2006 (UTC)
SEO LIES
This link is straight-up misleading. It should not be included on this page; SEO LIES do not deserve a place on Wikipedia. http://clasione.blogspot.com/2007/05/lsi-what-is-google-looking-for-on-your.html See http://dpn.name if you need to contact me. Removing the link for now. 121.50.194.2 04:37, 4 June 2007 (UTC)
Intuitive interpretation of transformed document-term space
Many online and printed text resources describe the mechanics of LSI via SVD, and the computational benefits of dimension reduction (reducing the rank of the transformed document-term space).
What these resources do not do well is explain the intuitive interpretation of the transformed document-term space and, furthermore, of the dimension-reduced space.
It is not good enough to say that the procedure produces better results than pure term-matching information retrieval; it would be helpful to understand why.
Perhaps the article needs a section called "intuitive interpretation". Bing Liu is the author who comes closest, but he doesn't succeed in my opinion [1]. —Preceding unsigned comment added by 82.68.244.150 (talk) 03:22, 4 October 2007 (UTC)
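One way to make the "why" concrete is a toy example. The sketch below is only an illustration under stated assumptions: the five terms, the seven tiny documents, and the choice k = 2 are all hypothetical, and the helper `cosine` is introduced here for illustration. Two documents that share no terms ("car" only and "automobile" only) have zero similarity under plain term matching, yet they land on the same direction in the rank-2 space because both terms co-occur with "engine" elsewhere in the collection.

```python
import numpy as np

# Hypothetical toy data. Rows are terms, columns are documents.
# Terms: car, automobile, engine, flower, petal
# Docs:  d1={car, engine}  d2={automobile, engine}  d3={car}  d4={automobile}
#        d5={flower, petal}  d6={flower}  d7={petal}
X = np.array([
    [1, 0, 1, 0, 0, 0, 0],  # car
    [0, 1, 0, 1, 0, 0, 0],  # automobile
    [1, 1, 0, 0, 0, 0, 0],  # engine
    [0, 0, 0, 0, 1, 1, 0],  # flower
    [0, 0, 0, 0, 1, 0, 1],  # petal
], dtype=float)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Plain term matching: compare raw document columns.
raw_sim = cosine(X[:, 2], X[:, 3])  # d3 vs d4 share no terms

# LSA-style comparison: keep the k largest singular triplets and compare
# documents by their coordinates in that k-dimensional space.
k = 2  # assumption: chosen by hand for this toy collection
U, s, Vt = np.linalg.svd(X, full_matrices=False)
doc_coords = (np.diag(s[:k]) @ Vt[:k, :]).T  # one row per document

lsa_sim_same_topic = cosine(doc_coords[2], doc_coords[3])   # d3 vs d4
lsa_sim_other_topic = cosine(doc_coords[2], doc_coords[5])  # d3 vs d6

print("raw d3 vs d4:", raw_sim)
print("rank-2 d3 vs d4:", lsa_sim_same_topic)
print("rank-2 d3 vs d6:", lsa_sim_other_topic)
```

The effect depends on the data and on k. In this toy collection, keeping a third dimension brings back the direction that separates "car" from "automobile", so the two documents no longer coincide.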
"After the construction of the occurrence matrix, LSA finds a low-rank approximation"
The current article states "After the construction of the occurrence matrix, LSA finds a low-rank approximation".
I'm not sure this is true. LSA simply transforms the document-term matrix to a new set of basis axes. The SVD of the original matrix into three matrices is the LSA. The rank lowering is done afterwards by truncating the three SVD matrices. LSA does not "find a low-rank approximation". The rank to which the matrices are lowered is set by the user, not by the LSA algorithm. —Preceding unsigned comment added by 82.68.244.150 (talk) 03:54, 11 October 2007 (UTC)
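The two steps described in this comment can be sketched separately. This is a minimal illustration using NumPy, not the article's derivation: the small matrix `X`, the helper `truncate`, and the value k = 2 are assumptions made for the example. The full SVD reproduces the original matrix, and the low-rank approximation appears only once a user-chosen k is applied.

```python
import numpy as np

# Hypothetical toy term-document matrix (rows: terms, columns: documents).
X = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])

# Step 1: the decomposition itself. No rank is chosen here;
# X is only re-expressed as the product U * diag(s) * Vt.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_full = U @ np.diag(s) @ Vt  # equals X up to rounding error

def truncate(U, s, Vt, k):
    """Keep only the k largest singular values and their vectors."""
    return U[:, :k], s[:k], Vt[:k, :]

# Step 2: truncation, with k supplied by the user.
k = 2  # assumption: picked by hand for this example
U_k, s_k, Vt_k = truncate(U, s, Vt, k)
X_k = U_k @ np.diag(s_k) @ Vt_k  # rank-k approximation of X

# The Frobenius-norm error of the truncation is determined by the
# discarded singular values (Eckart-Young theorem).
error = np.linalg.norm(X - X_k)
expected_error = np.sqrt(np.sum(s[k:] ** 2))

print("full reconstruction matches X:", np.allclose(X_full, X))
print("rank of truncated matrix:", np.linalg.matrix_rank(X_k))
print("truncation error:", error)
```

Whether "LSA" names only step 1 or both steps together is a terminology question. The sketch only shows that k enters at step 2.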
Further Readings
I suggest listing the following book for further reading, as it is a very complete reference for the whole theory behind LSA:
"Handbook of Latent Semantic Analysis" by Thomas K. Landauer, Danielle S. McNamara, Simon Dennis, Walter Kintsch
Description at books.google.com: http://books.google.com/books?id=jgVWCuFXePEC&dq
The "Handbook of Latent Semantic Analysis" is the authoritative reference for the theory behind Latent Semantic Analysis (LSA), a burgeoning mathematical method used to analyze how words make meaning, with the desired outcome to program machines to understand human commands via natural language rather than strict programming protocols. The first book of its kind to deliver such a comprehensive analysis, this volume explores every area of the method and combines theoretical implications as well as practical matters of LSA. Readers will be introduced to a powerful new way of understanding language phenomena, as well as innovative ways to perform tasks that depend on language or other complex systems. The "Handbook" clarifies misunderstandings and pre-formed objections to LSA, and provides examples of exciting new educational technologies made possible by LSA and similar techniques. It raises issues in philosophy, artificial intelligence, and linguistics, while describing how LSA has underwritten a range of educational technologies and information systems. Alternate approaches to language understanding are addressed and compared to LSA. This work is essential reading for anyone, newcomers to this area and experts alike, interested in how human language works or interested in computational analysis and uses of text. Educational technologists, cognitive scientists, philosophers, and information technologists in particular will consider this volume especially useful.

