SIFT Reading Notes

Overview

SIFT (Scale-Invariant Feature Transform) has plenty of good reference resources, e.g. good concept and good details. Two of its key properties are (1) scale invariance and (2) rotation invariance. Property one is achieved with a Gaussian pyramid of convolution responses at multiple scales, which are normalized by the standard deviation sigma. Although not explicitly mentioned in the good details, I think this normalization is required because responses at different levels are compared pixel to pixel to enumerate keypoint candidates, and a Gaussian with a larger sigma would otherwise simply give a smaller response. (Lowe's paper notes that a difference of Gaussians whose scales differ by a constant factor already incorporates the sigma^2 normalization.) Property two is achieved by finding the dominant local gradient orientation: gradients around the keypoint are accumulated into an orientation histogram, weighted by gradient magnitude and by distance to the keypoint center, and the highest peak gives the keypoint orientation. Note that if another orientation reaches 80% of the maximum peak, a new keypoint is generated with that orientation. Some illustrative pictures can be found here and in good details.
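To make the orientation step concrete, here is a minimal NumPy sketch of the histogram and the 80% rule. It is not a reference implementation: the function name dominant_orientations, the patch argument, and the sigma of the weighting window are assumptions made for illustration, and the parabolic peak interpolation used in the paper is left out.

```python
import numpy as np

def dominant_orientations(patch, sigma, num_bins=36, peak_ratio=0.8):
    """Return bin-center orientations (degrees) of histogram peaks that
    reach peak_ratio of the highest peak. Hypothetical helper."""
    patch = np.asarray(patch, dtype=float)

    # np.gradient returns derivatives along rows (y) first, then columns (x).
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0

    # Gaussian window centered on the patch (assumed to be centered on the
    # keypoint): pixels far from the center contribute less.
    rows, cols = patch.shape
    yy, xx = np.mgrid[0:rows, 0:cols]
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    window = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))

    # Accumulate magnitude * window into orientation bins.
    bin_width = 360.0 / num_bins
    bins = (angle // bin_width).astype(int) % num_bins
    hist = np.bincount(bins.ravel(),
                       weights=(magnitude * window).ravel(),
                       minlength=num_bins)

    # A bin is kept if it is a local peak (circular neighbors) and reaches
    # peak_ratio of the global maximum; each kept bin would spawn a keypoint.
    threshold = peak_ratio * hist.max()
    left, right = np.roll(hist, 1), np.roll(hist, -1)
    is_peak = (hist > left) & (hist > right) & (hist >= threshold)
    return [float((b + 0.5) * bin_width) for b in np.flatnonzero(is_peak)]
```

For a patch whose intensity increases only along x, e.g. np.tile(np.arange(9.0), (9, 1)), every gradient points at 0 degrees, so the only peak is the first 10-degree bin and the function returns its center, [5.0]. A patch with two comparably strong gradient directions would return two orientations, which is where the extra keypoint comes from.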

Some Unresolved Questions

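The Gaussian pyramid downsamples the image after convolution (ref). To compute the difference of Gaussians, I assumed upsampling would be required so that the two images being subtracted have the same size. In Lowe's formulation, however, differences are taken only between adjacent blur levels within one octave, which share a resolution, so no upsampling should be needed there.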
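A small sketch helps check the resolution question. It assumes the octave layout from Lowe's paper (scales_per_octave + 3 Gaussian images per octave, next octave seeded from the image with twice the base sigma) and treats the input as having negligible initial blur. The names gaussian_blur and dog_pyramid, the edge padding, and the 3-sigma kernel radius are choices made here for illustration only.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding (NumPy only)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()

    def blur_1d(v):
        # Pad so that "valid" convolution returns the original length.
        return np.convolve(np.pad(v, radius, mode="edge"), kernel, mode="valid")

    out = np.apply_along_axis(blur_1d, 0, image)
    return np.apply_along_axis(blur_1d, 1, out)

def dog_pyramid(image, num_octaves=2, scales_per_octave=3, sigma0=1.6):
    """Return a list of octaves, each a list of difference-of-Gaussian images."""
    k = 2.0 ** (1.0 / scales_per_octave)
    # Assumption: the input has negligible blur, so one blur gives sigma0.
    base = gaussian_blur(np.asarray(image, dtype=float), sigma0)
    pyramid = []
    for _ in range(num_octaves):
        gaussians = [base]
        for i in range(1, scales_per_octave + 3):
            sigma_prev = sigma0 * k ** (i - 1)
            sigma_next = sigma_prev * k
            # Blurring by sigma_inc on top of sigma_prev yields sigma_next.
            sigma_inc = np.sqrt(sigma_next ** 2 - sigma_prev ** 2)
            gaussians.append(gaussian_blur(gaussians[-1], sigma_inc))
        # All images in this octave have the same shape: subtract directly.
        pyramid.append([b - a for a, b in zip(gaussians, gaussians[1:])])
        # The image with blur 2 * sigma0 seeds the next octave at half size.
        base = gaussians[scales_per_octave][::2, ::2]
    return pyramid
```

Every subtraction happens between two arrays of identical shape, and the downsampling only happens between octaves, after the differences for that octave are already computed. Extrema are then searched within each octave separately, so images from different octaves never need to be compared pixel to pixel.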