Talk:Gated recurrent unit

Fully gated unit picture

Unless I am mistaken, the picture given for the fully gated recurrent unit does not match the equation in the article for the hidden state. The "1-" node should feed the product with the output of tanh, not the product with the previous hidden state. In other words, instead of sitting on the arrow above z[t], the "1-" node should sit on the arrow to the right.

--ZaneDurante (talk) 18:21, 2 June 2020 (UTC)

Yes, you are right! I noticed this back in 2016 when I prepared lecture slides based on the formulas and this picture. They do not match. 193.174.205.82 (talk) 14:56, 18 January 2023 (UTC)

Article requires clarification

It is not clear from the article how a cell connects to other cells, to its own layer, or to anything else.

Remove CARU section?

Lots of publicity for a paper by Macao authors from a Macao IP address, with limited relevance for the GRU article. 194.57.247.3 (talk) 11:45, 28 October 2022 (UTC)

Please describe what y_hat(t) is in the figure (it does not appear in the equations). — Preceding unsigned comment added by Geofo (talk • contribs) 11:15, 29 August 2023 (UTC)

$z$ or $1-z$?

Why does this article have $h_t = (1-z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$? The original paper (reference [1]) has $h_t = z_t \odot h_{t-1} + (1-z_t) \odot \hat{h}_t$, which is also the convention used by PyTorch (see this page) and TensorFlow (not documented in the obvious place, but clear if you write some code to test it). — Preceding unsigned comment added by Neil Strickland (talk • contribs) 23:19, 28 January 2024 (UTC)
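
For anyone comparing the two forms quoted above, here is a minimal sketch of how they relate. It assumes only that the update gate is a sigmoid of some pre-activation (called `a` here). The function names and the numbers are made up for illustration and are not taken from the article, the paper, or any library. Since $1 - \sigma(a) = \sigma(-a)$, the two conventions give the same hidden state once the sign of the gate's pre-activation is flipped, and different hidden states if the same gate value is plugged into both.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def update_article(z, h_prev, h_cand):
    # Form quoted from the article: z weights the candidate state.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

def update_paper(z, h_prev, h_cand):
    # Form quoted from the original paper: z weights the previous state.
    return [zi * hp + (1.0 - zi) * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

def close(u, v, tol=1e-12):
    # Element-wise comparison with a small tolerance for rounding.
    return all(abs(x - y) <= tol for x, y in zip(u, v))

# Hypothetical values, chosen only for illustration.
a = [-2.0, -0.5, 0.3, 1.7]                               # gate pre-activations
h_prev = [0.4, -1.2, 0.9, 0.1]                           # previous hidden state
h_cand = [math.tanh(x) for x in [1.0, 0.2, -0.7, 2.5]]   # candidate state

z = [sigmoid(x) for x in a]
z_flipped = [sigmoid(-x) for x in a]                     # equals 1 - z

# Same result once the gate's pre-activation is negated.
same_after_flip = close(update_article(z, h_prev, h_cand),
                        update_paper(z_flipped, h_prev, h_cand))

# Different result if the same gate value is used in both forms.
differ_without_flip = not close(update_article(z, h_prev, h_cand),
                                update_paper(z, h_prev, h_cand))

print(same_after_flip, differ_without_flip)
```

So the choice only changes what the learned gate means, not what the unit can represent. It does matter when the article's formula is compared with a diagram or with weights exported from a framework, because the same gate value then weights the opposite term.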