Closed
Description
Are there plans for a maxout layer? For example:
class Maxout(nn.Module):
    def __init__(self, d_in, d_out, pool_size):
        super().__init__()
        self.d_in, self.d_out, self.pool_size = d_in, d_out, pool_size
        self.lin = Linear(d_in, d_out * pool_size)

    def forward(self, inputs):
        shape = list(inputs.size())
        shape[-1] = self.d_out
        shape.append(self.pool_size)
        last_dim = len(shape) - 1
        out = self.lin(inputs)
        m, i = out.view(*shape).max(last_dim)
        return m.squeeze(last_dim)
EDIT: it seems people are still requesting Maxout in 2021. Please do not use this implementation as-is; it was written for a very old version of PyTorch and (probably) no longer works.
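For anyone looking for a version against the current PyTorch API, here is a minimal sketch of the same idea (an illustrative re-implementation, not an official layer):

    import torch
    import torch.nn as nn

    class Maxout(nn.Module):
        """Maxout over groups of pool_size units produced by one Linear layer."""
        def __init__(self, d_in, d_out, pool_size):
            super().__init__()
            self.d_out, self.pool_size = d_out, pool_size
            self.lin = nn.Linear(d_in, d_out * pool_size)

        def forward(self, x):
            out = self.lin(x)                                        # (..., d_out * pool_size)
            out = out.view(*x.shape[:-1], self.d_out, self.pool_size)
            return out.max(dim=-1).values                            # (..., d_out)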
Activity
soumith commented on Feb 21, 2017
As your example itself shows, it's simple for the user to write it themselves.
We're not sure it's common enough to warrant a reference layer in the PyTorch core.
For now, we're probably not going to add it, unless a lot of folks think otherwise.
erogol commented on Jun 28, 2017
For those who need Maxout, I changed the above code to make it work.
kkorovesis commented on Dec 17, 2017
Can you please explain how this works? I have a 2D tensor of size [128, 600] and I want to pass it through a Maxout layer.
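As a sketch of what happens with a [128, 600] input (the output size and pool size below are just example choices): the Linear layer produces d_out * pool_size features, the result is reshaped so each group of pool_size candidates gets its own dimension, and the max is taken over that dimension.

    import torch
    import torch.nn as nn

    x = torch.randn(128, 600)                  # batch of 128 vectors, 600 features each
    d_out, pool_size = 300, 2                  # example sizes
    lin = nn.Linear(600, d_out * pool_size)    # 600 -> 600 (= 300 * 2) features
    out = lin(x)                               # (128, 600)
    out = out.view(128, d_out, pool_size)      # (128, 300, 2): pool_size candidates per unit
    y = out.max(dim=-1).values                 # (128, 300): keep the max of each group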
lucasb-eyer commented on May 16, 2018
And here's a version that is a pure non-linearity applied on the last dimension, so it can be used with more than just Linears: convs of any dimension, time series, etc. But it's almost trivial, honestly, so I agree with not including it in the lib to avoid bloat.
Example use is sketched below.
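A minimal sketch of such a last-dimension maxout (the function name and example sizes are illustrative, not the author's original snippet):

    import torch

    def maxout(x, pool_size):
        """Maxout over the last dimension; its size must be divisible by pool_size."""
        assert x.shape[-1] % pool_size == 0
        return x.view(*x.shape[:-1], x.shape[-1] // pool_size, pool_size).max(dim=-1).values

    # example use: after a Linear, on a time series with features last, etc.
    h = torch.randn(128, 600)
    y = maxout(h, pool_size=4)                 # (128, 150)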
davidtvs commented on Feb 4, 2019
Actually, Maxout is applied to the channel dimension, which in PyTorch is dimension 1, not the last; see the sketch below:
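A minimal sketch of that channel-wise variant (the helper name is illustrative), e.g. for conv feature maps in (N, C, H, W) layout:

    import torch

    def channel_maxout(x, pool_size):
        """Maxout over dim 1 (channels); works for (N, C), (N, C, L), (N, C, H, W), ..."""
        n, c, *rest = x.shape
        assert c % pool_size == 0
        return x.view(n, c // pool_size, pool_size, *rest).max(dim=2).values

    x = torch.randn(8, 64, 32, 32)             # a conv feature map (N, C, H, W)
    y = channel_maxout(x, pool_size=2)         # (8, 32, 32, 32)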
shamoons commented on Mar 24, 2020
If ReLU6 made it into the core, I should think that Maxout would as well.
paniabhisek commented on May 29, 2020
Here you can find both variants (Linear and Conv). The Linear version is more efficient than the one presented above.
cowwoc commented on Oct 12, 2021
Honestly, this is a mess. TensorFlow provides a very simple API: https://www.tensorflow.org/addons/api_docs/python/tfa/layers/Maxout
Some of the implementations I see for PyTorch insert Linear layers under the hood, others apply an activation against preexisting layers. Some of them invoke Tensor.view() with 3 components, others with 4. It's hard to make heads or tails of which is correct, and why.
I am looking for a drop-in replacement for torch.nn.ReLU and torch.nn.functional.relu(). Can anyone provide that?
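For what it's worth, here is a sketch of a module that can be used roughly like torch.nn.ReLU (the names and defaults are illustrative). Note that maxout cannot be an exact drop-in replacement: it shrinks the chosen dimension by a factor of pool_size, so the following layer's input size has to account for that.

    import torch
    import torch.nn as nn

    class Maxout(nn.Module):
        """Activation-style maxout: reduces `dim` by a factor of `pool_size`."""
        def __init__(self, pool_size=2, dim=-1):
            super().__init__()
            self.pool_size, self.dim = pool_size, dim

        def forward(self, x):
            d = self.dim if self.dim >= 0 else x.dim() + self.dim
            shape = list(x.shape)
            assert shape[d] % self.pool_size == 0
            shape[d] //= self.pool_size
            shape.insert(d + 1, self.pool_size)
            return x.reshape(shape).max(dim=d + 1).values

    # e.g. nn.Sequential(nn.Linear(600, 600), Maxout(pool_size=2))   # 600 -> 300 features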