Closed
Description
Hi,
Thank you for sharing the great work.
I have a question about the data transformer.
self.transforms_x = Compose([
    RandomStretch(),
    CenterCrop(instance_sz - 8),
    RandomCrop(instance_sz - 2 * 8),
    ToTensor()])
Why is the size applied to the search image "instance_sz - 2 * 8", which would be 255 - 16 = 239?
Why not just 255?
If you know the answer, please share it with me.
Thank you in advance.
Best Regards,
Lai
Activity
amoskalev commented on Sep 20, 2019
I have the same question; the original paper uses a 255x255 resolution.
huanglianghua commented on Sep 21, 2019
"instance_sz - 2 * 8" is the size of a random crop taken from the 255x255 images during training, which reduces overfitting to the training data.
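As a rough illustration of the crop arithmetic in the transform quoted above, the sketch below computes where the final window lands inside a 255x255 search image. It is not the repository's code: the helper names (center_crop_box, random_crop_box, search_crop_window) are hypothetical, RandomStretch is ignored, and the center crop is assumed to use integer floor division for its offset.

```python
import random

def center_crop_box(size, out):
    """Top-left offset and side of a centered out x out window (assumes floor division)."""
    off = (size - out) // 2
    return off, off, out

def random_crop_box(size, out, rng=random):
    """Top-left offset and side of a uniformly placed out x out window."""
    top = rng.randint(0, size - out)   # randint is inclusive on both ends
    left = rng.randint(0, size - out)
    return top, left, out

def search_crop_window(instance_sz=255, margin=8, rng=random):
    """Compose CenterCrop(instance_sz - margin) with RandomCrop(instance_sz - 2 * margin).

    Returns (top, left, side) in the coordinates of the instance_sz image.
    """
    c_top, c_left, c_side = center_crop_box(instance_sz, instance_sz - margin)
    r_top, r_left, r_side = random_crop_box(c_side, instance_sz - 2 * margin, rng)
    return c_top + r_top, c_left + r_left, r_side

top, left, side = search_crop_window()
# Displacement of the crop center from the image center, in pixels.
shift = (top + side / 2 - 255 / 2, left + side / 2 - 255 / 2)
print(side, shift)
```

Under these assumptions the output is always 239x239 and the crop center moves by at most 4 pixels along each axis, so the target is no longer guaranteed to sit exactly at the center of the training sample.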
amoskalev commented on Sep 21, 2019
Is it okay to use a random crop if, for training, we expect the image to be centered on the object?
JyunYuLai commented on Sep 21, 2019
Hi,
Thank you for your reply.
As I understand it, a smaller input size is fine,
but we have to compute the corresponding embedding size and target size during training.
For example, the original paper's input size is 255, so the embedding size would be 22x22.
In the repo it is 255 - 16 = 239, so the embedding size would be 20x20.
Am I right? Please correct me if I have misunderstood anything.
The other question, like @ferumchrome's: random crop and center crop are not mentioned in the paper, so are they features specific to this repo?
How does the performance compare with and without this augmentation?
Thank you in advance.
Best Regards,
Lai
huanglianghua commented on Sep 23, 2019
@ferumchrome The offsets can be used to simulate the target's movement at test time. (During testing, the target does not always stay at the center.)
@JyunYuLai It does not matter much that the embedding sizes differ between training and test. The AlexNet has no padding, so cropping the image has approximately the same effect as cropping the features (ignoring the differences caused by the 3x3 max-poolings), and likewise the score map.
In effect, cropping a 239x239 image from a 255x255 one is (approximately) the same as cropping a 15x15 score map from the 17x17 one.
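The sizes quoted in this thread (22x22 and 20x20 embeddings, 17x17 and 15x15 score maps) can be checked with a few lines of arithmetic. The sketch below assumes a padding-free, SiamFC-style AlexNet with the layer layout listed in LAYERS; that layout is an assumption taken from the original paper's architecture, not read from this repository, and the function names are hypothetical.

```python
# Assumed (kernel, stride) per layer, no padding anywhere:
# conv 11x11/2, pool 3x3/2, conv 5x5/1, pool 3x3/2, then three 3x3/1 convs.
LAYERS = [(11, 2), (3, 2), (5, 1), (3, 2), (3, 1), (3, 1), (3, 1)]

def feature_size(input_sz, layers=LAYERS):
    """Spatial side length of the embedding for a square input of side input_sz."""
    size = input_sz
    for kernel, stride in layers:
        size = (size - kernel) // stride + 1
    return size

def score_map_size(search_sz, exemplar_sz=127):
    """Side length of the cross-correlation of the search and exemplar embeddings."""
    return feature_size(search_sz) - feature_size(exemplar_sz) + 1

for sz in (255, 239):
    print(sz, feature_size(sz), score_map_size(sz))
# 255 22 17
# 239 20 15
```

With a total stride of 8, removing 16 pixels from the search image removes 2 cells from the embedding and 2 from the score map, which matches the 17x17 versus 15x15 comparison above.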
Accept-AI commented on Feb 23, 2020
May I ask, in
z = self._crop(z, box_z, self.instance_sz)
x = self._crop(x, box_x, self.instance_sz)
what does this _crop() do?
huanglianghua commented on Feb 23, 2020
@xzy123456611 The _crop method extracts a square area centered at box_z / box_x, with reasonable context. The side length of the output image is self.instance_sz.
Accept-AI commented on Feb 24, 2020
Thank you for your reply.
Accept-AI commented on Feb 24, 2020
z = self._crop(z, box_z, self.instance_sz)
x = self._crop(x, box_x, self.instance_sz)
Hello, why do both of these lines use self.instance_sz as the third argument?
huanglianghua commented on Feb 24, 2020
We do this for further data augmentation. The transforms_z further crops image z around its center to size (127, 127).
Accept-AI commented on Feb 24, 2020
Thank you.