Talk:Hill climbing
Local search
What is the difference from local search? Stern 08:28, 13 Jun 2005 (UTC)
- They are the same thing. In Talk:Local search, there's an ongoing discussion of that article's relation to this one. The idea of merging this article with Local search is being raised. --Anthony Liekens 21:43, 11 October 2005 (UTC)
Need a reference on Probabilistic Hill Climbing
The header/title says it (mostly). I still think we need a plain Metropolis Algorithm entry -- not just Metropolis-Hastings. Another user said the same thing (Metropolis is "more commonly used" than Metropolis-Hastings, according to that user). In any case the Metropolis Algorithm certainly merits an entry. 199.196.144.11 22:16, 21 March 2007 (UTC)
- Hi, can you explain to me how Metropolis-Hastings is related to hill climbing? Thanks, Sander123 08:22, 22 March 2007 (UTC)
- Hill-climbing can be reduced to: (1) pick a new sample point, (2) measure the goodness of the new sample point, (3) keep the best point found so far and go to step 1. Suppose you have some distribution that represents your belief about the goodness of points in your search space. It would be nice if you could draw your samples according to your believed-goodness distribution; that way you would be fully utilizing your belief to climb the hill as fast as possible. (Note that your believed-goodness distribution will change as you gather more samples.)

Metropolis is an algorithm that draws samples from an arbitrary target distribution. (It requires that you supply a symmetric candidate distribution that is hopefully at least somewhat close to your believed-goodness distribution; almost everyone just uses a Gaussian for the candidate distribution.) Metropolis-Hastings is a generalization of Metropolis that can use any candidate distribution, but most people stick to Metropolis with a Gaussian because they don't know how to pick a better candidate distribution.

Metropolis is commonly used in conjunction with Gibbs sampling to evaluate Bayesian belief networks. It is effective for hill-climbing in low-dimensional spaces (i.e., with univariate distributions), but in high-dimensional spaces it suffers severely from the curse of dimensionality and converges extremely slowly. The problem in high dimensions is that it tries to sample from your believed-goodness distribution rather than sampling points that are likely to improve that distribution.

Gradient algorithms, on the other hand, are efficient for hill-climbing in high-dimensional spaces. For example, Empirical Gradient Ascent measures the gradient by stepping a small distance out in each dimension to compute the optimal direction in which to ascend the hill (assuming local linearity). Stochastic Gradient Ascent tends to be even faster: it picks a random direction, measures the goodness in that direction, backs up, steps forward in the best direction yet found, and repeats. Gradient algorithms that add a momentum term to each dimension tend to perform better still.
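To make the Metropolis-for-hill-climbing idea above concrete, here is a minimal Python sketch. It treats a positive, unnormalized goodness function as the target density, uses the symmetric Gaussian candidate distribution mentioned above, and tracks the best point seen as the hill-climbing result. The names (metropolis_climb, sigma, steps) and the toy objective are illustrative assumptions, not from any particular library.

 import math
 import random

 def metropolis_climb(goodness, x0, sigma=1.0, steps=10000):
     # goodness: assumed to be a positive, unnormalized
     # believed-goodness density over the search space.
     x, gx = x0, goodness(x0)
     best_x, best_g = x, gx
     for _ in range(steps):
         # Symmetric Gaussian candidate distribution, the common default.
         cand = x + random.gauss(0.0, sigma)
         gc = goodness(cand)
         # Metropolis acceptance: always move uphill, sometimes downhill.
         if random.random() < min(1.0, gc / gx):
             x, gx = cand, gc
             if gx > best_g:
                 best_x, best_g = x, gx
     return best_x, best_g

 # Toy example: a one-dimensional bump centred at 3.
 best, _ = metropolis_climb(lambda x: math.exp(-(x - 3.0) ** 2), x0=0.0)

In one dimension a loop like this converges quickly; as noted above, the same loop slows down badly as the dimensionality grows.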
I'm actually implementing a stochastic hill climbing algorithm and I have no earthly idea what the heck this page is on about.
This page could use an informal, intuitive description of the hill climbing concept. It would be useful for many people who are new to optimization. Eternallyoptimistic 14:19, 26 April 2007 (UTC)
- I've added a description that avoids the mathematical notation. Sander123 09:52, 21 May 2007 (UTC)
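Since the comments above mention stochastic hill climbing and ask for an informal description, here is a bare-bones sketch of the intuition: nudge the current point by a random amount and keep the move only when it improves the goodness. The name stochastic_hill_climb, the step size, and the iteration count are illustrative assumptions.

 import random

 def stochastic_hill_climb(goodness, x0, step=0.1, iterations=1000):
     # Nudge the current point randomly; keep the move only if it
     # improves the goodness (i.e., only ever step uphill).
     x, gx = x0, goodness(x0)
     for _ in range(iterations):
         cand = x + random.uniform(-step, step)
         gc = goodness(cand)
         if gc > gx:
             x, gx = cand, gc
     return x, gx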

