Talk:Boltzmann machine

Global Energy

Why does the global energy function have:

E = -(\sum_{i<j} w_{ij} s_i s_j + \sum_i \theta_i s_i)

Shouldn't this be:

E = -(\tfrac{1}{2}\sum_{i,j} w_{ij} s_i s_j + \sum_i \theta_i s_i)

But I could be misunderstanding... 129.215.26.79 (talk) 15:31, 13 May 2014 (UTC)

Never mind I see it just saves having to divide by two to account for double counting. 129.215.26.79 (talk) 12:43, 15 May 2014 (UTC)
Looking at it from the point of view of a programmer, it says "Don't do all the work twice". ;-) 92.0.230.198 (talk) 17:25, 27 June 2015 (UTC)
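The equivalence the two comments above describe can be checked directly: with symmetric weights and no self-connections, the sum over distinct pairs i<j equals half of the full double sum. A minimal sketch in Python, with made-up random parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
w = rng.normal(size=(n, n))
w = (w + w.T) / 2               # symmetric weights
np.fill_diagonal(w, 0)          # no self-connections
s = rng.integers(0, 2, size=n)  # unit states in {0, 1}

# Sum over distinct pairs i < j (each connection counted once)
pairs = sum(w[i, j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))

# Full double sum, divided by two to undo the double counting
full = 0.5 * s @ w @ s

assert np.isclose(pairs, full)
```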

Training sign

I removed the minus sign from the RHS of this:

 

If p+ is clamped and p- is unclamped, then we want to make the weights MORE like the correlation of the clamped and less like unclamped, I think ... please check this! Charles Fox

You're incorrect, the minus sign is needed —Preceding unsigned comment added by 72.137.60.77 (talk) 17:36, 5 April 2009 (UTC)
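For reference, the learning rule in Ackley, Hinton and Sejnowski (1985) is Δw_ij ∝ p+_ij − p−_ij: the minus sign appears in the derivative ∂G/∂w_ij = −(1/T)[p+_ij − p−_ij] and then cancels in the gradient-descent step. The direction is consistent with the fact that raising w_ij raises the model's own on-on statistic p−_ij, so the update pushes the free-running correlations toward the clamped ones. A brute-force sketch on a hypothetical two-unit machine (toy parameters, T = 1):

```python
import itertools
import math

def model_corr(w, T=1.0):
    """<s1*s2> under the Boltzmann distribution of a 2-unit machine, E = -w*s1*s2."""
    states = list(itertools.product([0, 1], repeat=2))
    boltz = [math.exp(w * s1 * s2 / T) for s1, s2 in states]  # exp(-E/T)
    Z = sum(boltz)
    return sum(b * s1 * s2 for b, (s1, s2) in zip(boltz, states)) / Z

# Increasing the weight increases the model's own pairwise statistic p-_ij,
# so Delta w ~ (p+ - p-) moves p- toward a larger clamped p+.
low, high = model_corr(0.5), model_corr(0.6)
assert high > low
```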

What does "marginalize" mean in the following?

"We denote the converged distribution, after we marginalize it over the visible units V, as P − (V)." There is no other instance of this word in the article. Even a technically-minded reader wouldn't understand this article if the word isn't defined anywhere. - Will
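As far as I can tell, the sentence means: the equilibrium distribution is a joint distribution over visible and hidden units together, and P−(V) is obtained by summing that joint distribution over the hidden units, leaving a distribution on the visible units alone. A minimal numeric sketch with a made-up joint table:

```python
import numpy as np

# Hypothetical joint equilibrium distribution P-(V, H) over
# 4 visible configurations (rows) and 2 hidden configurations (columns).
joint = np.array([[0.10, 0.05],
                  [0.20, 0.15],
                  [0.05, 0.25],
                  [0.10, 0.10]])
assert np.isclose(joint.sum(), 1.0)

# Marginalizing: sum out the hidden units, leaving P-(V).
p_visible = joint.sum(axis=1)
assert np.allclose(p_visible, [0.15, 0.35, 0.30, 0.20])
```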

CRF

Is the Boltzmann machine the same as a Conditional Random Field? If so that should be mentioned somewhere!

No, it isn't. A CRF can however be viewed as convexified Boltzmann machine with hand-picked features. - DaveWF 06:10, 19 April 2007 (UTC)

The threshold

What is the importance of the threshold parameter? How is it set?

Learned like any other parameter. Just have a connection wired to '+1' all the time instead of another unit. I should add this. - DaveWF 06:10, 19 April 2007 (UTC)

Can threshold be referred to as bias? Also, the link on threshold takes you to the disambiguation page, which has no articles describing threshold in this context.

The term threshold is wrong in this context. As Boltzmann machines are described in this article, there is no threshold function, and thus no threshold. Instead, Theta is a bias here: if a unit is activated, the bias Theta of that unit will be added to the total energy function. I have updated the text accordingly. I assume this is a copy-and-paste error, from carrying the term threshold over from the description of Hopfield networks, where it actually makes sense, since Hopfield networks have a threshold function. - sebastian.stueker 23:30, 17 May 2013 (UTC)
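The "+1 unit" trick mentioned earlier in this thread is easy to verify: a unit permanently clamped to 1, whose connection weights equal the biases, reproduces the biased energy exactly. A sketch with made-up random parameters and states in {0, 1}:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
w = rng.normal(size=(n, n))
w = (w + w.T) / 2
np.fill_diagonal(w, 0)
theta = rng.normal(size=n)
s = rng.integers(0, 2, size=n)

def energy(w, theta, s):
    """Global energy with explicit biases (0.5*s@w@s equals the sum over i<j)."""
    return -(0.5 * s @ w @ s + theta @ s)

# Augment with a unit permanently clamped to 1; its connection
# weights to the other units play the role of the biases.
w_aug = np.zeros((n + 1, n + 1))
w_aug[:n, :n] = w
w_aug[:n, n] = theta
w_aug[n, :n] = theta
s_aug = np.append(s, 1)

e_bias = energy(w, theta, s)
e_aug = -(0.5 * s_aug @ w_aug @ s_aug)  # no explicit bias term needed
assert np.isclose(e_bias, e_aug)
```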

The Training Section

I have a problem understanding what is P+(Vα). P+ is the distribution of the states after the values for Vα are fixed. So P+(Vα) should be 1 for those fixed values and 0 for any other values of Vα.

Also, what does α iterate over in the summation for G?

The cost function

What is the cost function? What cost does it measure? How do we train the network if we have more than one input?
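For reference, the cost function G in Ackley, Hinton and Sejnowski (1985) is the Kullback-Leibler divergence between the clamped distribution P+ over visible states and the free-running distribution P−: G = Σ_v P+(v) ln(P+(v)/P−(v)). It measures how far the network's equilibrium distribution is from the data distribution, and is zero exactly when they match; multiple inputs are handled by taking P+ to be the empirical distribution of the whole training set. A minimal sketch with made-up distributions:

```python
import math

def G(p_plus, p_minus):
    """KL divergence between clamped (P+) and free-running (P-) visible distributions."""
    return sum(p * math.log(p / q) for p, q in zip(p_plus, p_minus) if p > 0)

p_plus  = [0.5, 0.25, 0.25, 0.0]  # empirical distribution over visible states
p_minus = [0.4, 0.3, 0.2, 0.1]    # model's equilibrium distribution

assert G(p_plus, p_minus) > 0                 # mismatched distributions cost something
assert math.isclose(G(p_plus, p_plus), 0.0)   # a perfect match costs nothing
```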

{-1,1} or {0,1}?

In the definition of s the article claims that si is either -1 or 1. Five lines below, it says that the nodes are in state 0 or 1, which is also what I found in (admittedly older) literature on the subject. Is the {-1,1} simply wrong or am I missing something? —Preceding unsigned comment added by Drivehonor (talkcontribs) 13:56, 7 August 2007

I think either representation should work. But I'm not sure. Can anyone confirm this? —Preceding unsigned comment added by Zholyte (talkcontribs) 19:47, 10 November 2007 (UTC)



You probably just don't understand, because it doesn't matter at all. —Preceding unsigned comment added by 130.15.15.193 (talk) 23:12, 2 December 2009 (UTC)

I think si should be in {-1, 1}. If {0, 1}, then the contribution to energy, as per the energy function, of a positively weighted pair of off nodes would be the same as that of a mismatched pair (1 off, 1 on). I believe off-off and on-on should have equal energy contributions, but don't see how this can be achieved given the current statement of the global energy equation, and the state domain {0, 1}. Onejgordon (talk) 08:12, 22 November 2018 (UTC)
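Both conventions can represent the same family of distributions, but only after re-mapping the parameters: substituting s = (s'+1)/2 with s' in {-1, 1} into the {0, 1} energy gives w'_ij = w_ij/4 and shifted biases, plus an additive constant that cancels in the Boltzmann distribution. So the representations are interchangeable even though the raw energies of off-off and on-on pairs differ in {0, 1}. A sketch verifying the two energies agree up to a state-independent constant (random parameters):

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 4
w = rng.normal(size=(n, n))
w = (w + w.T) / 2
np.fill_diagonal(w, 0)
theta = rng.normal(size=n)

def energy(w, theta, s):
    return -(0.5 * s @ w @ s + theta @ s)

# Re-mapped parameters for states t = 2s - 1 in {-1, +1}
w2 = w / 4
theta2 = theta / 2 + w.sum(axis=1) / 4

diffs = []
for bits in itertools.product([0, 1], repeat=n):
    s = np.array(bits)
    t = 2 * s - 1
    diffs.append(energy(w, theta, s) - energy(w2, theta2, t))

# The difference is the same for every state, so it drops out
# of the Boltzmann distribution after normalization.
assert np.allclose(diffs, diffs[0])
```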

Etymology

WHY is it called a Boltzmann machine? Is it named after Ludwig Boltzmann? The Ludwig Boltzmann article references this one... but that can't be decisive. --Nehushtan (talk) 22:10, 12 January 2009 (UTC)

Yes, it's named for Ludwig Boltzmann. AmiDaniel (talk) 08:31, 26 September 2011 (UTC)
Because the underlying energy minimization strategy involves the Boltzmann Distribution p.r.newman (talk) 13:45, 19 May 2013 (UTC)

Question: the new phase of learning?

Question: "Later, the weights are updated to maximize the probability of the network producing the completed data." What does this mean? Is this a new phase of learning? Does this mean that   are set as constant in this phase, computed as a characteristic of the learning set? For example, if our data set is {{1,1,0},{1,0,1},{1,0,0}}, then : ,  : ,  :  in all later iterations? Peter 212.76.37.154 (talk) 16:04, 28 January 2009 (UTC)

Please improve the first paragraph

The first paragraph (the description) fails to describe what a Boltzmann machine is. It only talks about what it is not. Given how many things are not a Boltzmann machine, that is a bit wasteful... The class where this particular network belongs (and where this article links it to) is lacking a description (it is a stub / something automatically generated and not informative). Other explanations are given by counterexample, i.e. it says that this network is a counterpart of something else and that it can't be used for something. Neither statement is helpful in understanding what it is or where it can be useful. 79.181.224.222 (talk) 22:19, 16 December 2012 (UTC)

Tidy up of citations needed

I moved the Ackley et al. citation in the article to be an in-line reference but then got daunted by trying to bring the other citations and further readings into line with Wikipedia standards. I'll try and get back to it but hope others will feel free to take it on! p.r.newman (talk) 13:45, 19 May 2013 (UTC)

Incorrect statement about scalability?

"the time the machine must be run in order to collect equilibrium statistics grows exponentially with the machine's size". I've talked to ML researchers who have disputed this point. This "fact" has been here for years - do we have a reference on it?

Yes, but what does it do? What is it for? ;o)

The article tells us what it looks like and how to train it but I can't for the life of me see what it takes as input and what it gives as output. A bit more on that and especially an example or two would be a big improvement. 92.0.230.198 (talk) 17:31, 27 June 2015 (UTC)

Image

You might want to use the image

 

for the article. --MartinThoma (talk) 23:07, 12 February 2016 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Boltzmann machine. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 04:44, 23 July 2017 (UTC)

What does it actually do?

I read the article but there is scant indication of what it actually does. It would be helpful to list some examples, especially what it is doing for a user. The article is the "how" of the network. Jazzbox (talk) 23:27, 29 May 2020 (UTC) Jazzbox (talk) 23:28, 29 May 2020 (UTC)

The introduction and description is a mess

The introductory paragraph of this article is an epistemological train-wreck. Lots of related terms are being randomly thrown around with no connection or coherence. Hinton's own Scholarpedia article is a good start. To be fair, the Boltzmann machine was a LEARNING model that had nothing to do with spin glasses, which are COMBINATORIAL physics problems. It looks as if the article was hastily written by a graduate student writing a term paper, not by an expert. — Preceding unsigned comment added by 128.111.64.109 (talk) 16:03, 24 August 2020 (UTC)

the link between E_{i=off} and p_{i=off} is not correct

The probability that s_i=1 is const * \sum_{j\neq i} exp(-\beta E(s_j with s_i=1)). So the rather cumbersome calculation in the middle of the article should be revised, because the partial sums needed to correctly define p(s_i=1) or p(s_i=0) are omitted. 24.120.54.52 (talk) 17:18, 10 March 2023 (UTC)
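For comparison, the relation the article intends can be checked numerically: at equilibrium, the conditional probability of unit i being on, given the states of all other units, is the logistic function of the energy gap, p(s_i=1 | rest) = 1/(1 + exp(-ΔE_i/T)), where ΔE_i = E(s_i=0) - E(s_i=1). This follows from the Boltzmann distribution restricted to the two states that differ only in s_i. A brute-force check on a small machine with random parameters:

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(3)
n, T = 4, 1.0
w = rng.normal(size=(n, n))
w = (w + w.T) / 2
np.fill_diagonal(w, 0)
theta = rng.normal(size=n)

def energy(s):
    s = np.array(s)
    return -(0.5 * s @ w @ s + theta @ s)

# Full Boltzmann distribution by enumeration of all 2^n states
states = list(itertools.product([0, 1], repeat=n))
probs = np.array([math.exp(-energy(s) / T) for s in states])
probs /= probs.sum()

rest = (1, 0, 1)  # condition on the other three units (unit 0 is free)
on  = sum(p for p, s in zip(probs, states) if s[0] == 1 and s[1:] == rest)
off = sum(p for p, s in zip(probs, states) if s[0] == 0 and s[1:] == rest)
p_cond = on / (on + off)

# Logistic function of the energy gap for unit 0
dE = energy((0,) + rest) - energy((1,) + rest)
p_logistic = 1 / (1 + math.exp(-dE / T))

assert math.isclose(p_cond, p_logistic)
```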