Notes for: 2018 - Lesson 3: Understanding convolutions

Sometimes freezing just the Batch Normalization layers can be useful; we will learn why in later classes. Roughly: if our data is similar to ImageNet, where the objects take up most of the image, we might want to freeze these layers.

Tip: You can index a NumPy array with None to add a new dimension, e.g. image_array[None].
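A quick sketch of the tip above (the array shape here is just an example):

```python
import numpy as np

# A 2-D grayscale "image"
image_array = np.zeros((28, 28))

# Indexing with None (an alias for np.newaxis) inserts a new axis,
# e.g. the batch dimension many frameworks expect in front.
batched = image_array[None]
print(batched.shape)        # (1, 28, 28)

# It also works at the end, e.g. to add a channel axis.
channeled = image_array[..., None]
print(channeled.shape)      # (28, 28, 1)
```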

Understanding CNNs

Looking again at this demo we can get an intuition for how kernels modify an image. Another great visual explanation is in this video (screenshots from there).

Kernel: [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
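This kernel is a horizontal edge detector: it responds where pixel values increase from top to bottom. A minimal sketch, with a naive "valid" convolution written out by hand (the 6x6 toy image is an assumption for illustration):

```python
import numpy as np

# Kernel from the notes: responds to horizontal edges.
kernel = np.array([[-1, -1, -1],
                   [ 0,  0,  0],
                   [ 1,  1,  1]])

# Tiny image: dark upper half (0), bright lower half (1).
image = np.zeros((6, 6))
image[3:] = 1.0

def conv2d_valid(img, k):
    """Naive 'valid' cross-correlation, like the Excel sheet does cell by cell."""
    h, w = k.shape
    out = np.zeros((img.shape[0] - h + 1, img.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i+h, j:j+w] * k).sum()
    return out

out = conv2d_valid(image, kernel)
print(out)  # non-zero rows only where the dark/bright boundary is
```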

The Excel conv-example file is also very helpful for understanding the concepts.

In the Excel sheet we have:

For an image with 3 channels instead of 1, we have to add that channel dimension to the kernels as well: instead of being 3x3, they become 3x3x3.
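A minimal sketch of this: with 3 input channels the kernel has one 3x3 slab per channel, and all 27 products are summed into a single output value per location (the random data is just for shape-checking):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((3, 6, 6))    # channels-first RGB image
kernel = rng.random((3, 3, 3))   # 3x3x3: one 3x3 slab per channel

out_h, out_w = 6 - 3 + 1, 6 - 3 + 1
out = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        # multiply element-wise across ALL channels, then sum to one number
        out[i, j] = (image[:, i:i+3, j:j+3] * kernel).sum()

print(out.shape)  # (4, 4) -- still a single-channel output per kernel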

We want to turn the output of the fully connected layer into probabilities: each value should be between 0 and 1, and they should sum to 1. The activation function for this is softmax. Softmax first applies exp(x), which removes the negatives, and then divides each value by the sum.
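The two steps above can be sketched directly (the logits are an invented example):

```python
import numpy as np

def softmax(x):
    # exp() makes every value positive; dividing by the sum makes the
    # outputs add up to 1, so they behave like probabilities.
    e = np.exp(x - x.max())     # subtracting the max is for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, -3.0])   # raw fully-connected outputs
probs = softmax(logits)
print(probs, probs.sum())   # largest logit gets the largest probability
```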

Softmax works great for classifying one thing; picking a single winner is its personality :)

Multi-label classification: what if we want to classify an image that contains both a cat and a dog? Change the activation function to a sigmoid, so the values don't have to sum to one and multiple labels can be predicted at once. A sigmoid mostly cares about inputs roughly between -1 and 1; beyond that range it saturates, so it won't matter how much you increase the value.
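A minimal sketch of the cat-and-dog case (the logits and the 0.5 threshold are assumptions for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Each class gets its own independent sigmoid output; the values
# need not sum to 1, so several classes can be "on" at once.
logits = np.array([3.0, 2.5, -4.0])   # e.g. [cat, dog, airplane]
probs = sigmoid(logits)
labels = probs > 0.5                  # threshold each class separately
print(probs, labels)                  # cat AND dog both predicted
```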

When fine-tuning models pretrained on ImageNet, be careful about the input size. If we don't use a similar size, we can largely destroy the weights the model learned. How much this matters depends on what we want to predict: if our data is close to ImageNet, don't change the size a lot.

When changing the learning rate every epoch, it's important to think about which layers we want to change more than others.
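One common pattern (discriminative learning rates) gives early layers, which hold generic features, smaller steps than later, task-specific layers. A sketch of building such a schedule with geometric spacing, an assumption rather than fastai's exact internals:

```python
import numpy as np

# Spread one learning-rate range across 3 layer groups:
# earliest layers get the smallest LR, latest the largest.
n_groups = 3
lr_max = 1e-3
lrs = np.geomspace(lr_max / 100, lr_max, n_groups)
print(lrs)   # e.g. [1e-05, 1e-04, 1e-03]
```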

Other readings