Feature extraction: redness and elongation

In this worksheet, you will implement the extraction of the redness and elongation features of the images.

from PIL import Image
import matplotlib.pyplot as plt

%load_ext autoreload
%autoreload 2

from utilities import *
import os.path

from intro_science_donnees import data_dir
dataset_dir = os.path.join(data_dir, 'apples_and_bananas_simple')
images = load_images(dataset_dir, "*.png")

Exercise 1: Extraction of the image foreground

To compute features of our images, we first need to extract the foreground of the picture by separating the object from its background. For most images in this simple dataset, the object lies on a light background. So a simple strategy is to choose a threshold theta and decide that any pixel with at least one of its red, green or blue values below the threshold belongs to the foreground.

Let’s take an apple:

img = images['a10.png']

We compute, for each pixel, the minimum of its red, green and blue values:

M = np.array(img)
G = np.min(M[:,:,0:3], axis=2)
fig = Figure()
imgg = fig.add_subplot().imshow(G, cmap='Greys_r')

and derive a boolean array F (or black and white image) where F[i,j] is True (white) whenever the pixel of coordinates i, j is in the foreground:

theta = 150
F = G < theta

Try again with other values for the threshold theta.
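To see the effect of the threshold without loading an image, here is a minimal sketch on a hand-made 2x2 picture. The name foreground_mask and the toy pixel values are assumptions made for this illustration only; they are not part of utilities.py.

```python
import numpy as np

def foreground_mask(M, theta=150):
    """Hypothetical helper: True where at least one of R, G, B is below theta."""
    M = np.asarray(M)
    # Per-pixel minimum over the red, green and blue channels (alpha is ignored)
    G = np.min(M[:, :, 0:3], axis=2)
    return G < theta

# Toy 2x2 RGB image: one red pixel and three light background pixels
toy = np.array([[[200, 30, 30], [250, 250, 250]],
                [[240, 240, 240], [255, 255, 255]]], dtype=np.uint8)

mask_150 = foreground_mask(toy, theta=150)  # only the red pixel is foreground
mask_245 = foreground_mask(toy, theta=245)  # the light grey pixel (240) now counts too
```

A higher threshold lets more light pixels into the foreground, which is why the right value depends on how bright the background of the dataset is.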

Using the above as inspiration, implement in utilities.py the function foreground_filter(img, theta = 150) that takes a numpy array or PIL image img as argument together with a threshold theta, and returns a thresholded image. Check it on our image:

plt.imshow(foreground_filter(img, 150));

Now, apply the filter with a threshold of 200 to all images in the dataset and display the result.

  • Use a comprehension [f(x) for x in ...] to apply the filter to all images

  • Use image_grid to display the result

image_grid([foreground_filter(img, theta=200) for img in images])

utilities.py provides a filter transparent_background_filter that calls foreground_filter and makes all pixels in the background transparent. Apply it to all images in the dataset, and try different thresholds theta:

image_grid([transparent_background_filter(img, theta=130) for img in images])

Exercise 2: Extraction of the redness feature

We now want to extract the redness as the average (mean) of the foreground pixels of the red channel (those that are True in F) minus the average of the foreground pixels in the green channel.

Implement the function redness(img) in utilities.py.


  • To compute the mean, it’s best to work with floating point numbers. Make sure to extract, for example, the green channel with G = M[:, :, 1] * 1.0.

  • Recall that np.mean(R) computes the mean of all values of an array R;

  • Given an array R and a boolean array (such as F) of same dimensions, R[F] returns an array of all values R[i,j] such that F[i,j] is True. For example:

R = np.array([[1,2], [3,4]])
F = np.array([[True, False], [True, True]])
R[F]  # array([1, 3, 4])
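The hints above can be combined as in the following sketch, which works on a toy array with an explicit mask. The name redness_sketch and the toy values are assumptions for illustration; the redness function requested in the exercise takes an image and computes the foreground itself.

```python
import numpy as np

def redness_sketch(M, F):
    """Hypothetical helper: mean red minus mean green over the foreground mask F."""
    R = M[:, :, 0] * 1.0  # red channel as floats
    G = M[:, :, 1] * 1.0  # green channel as floats
    return np.mean(R[F]) - np.mean(G[F])

# Toy 2x2 RGB image: left column is the "fruit", right column is white background
toy = np.array([[[200, 100, 0], [255, 255, 255]],
                [[100, 50, 0], [255, 255, 255]]], dtype=np.uint8)
mask = np.array([[True, False], [True, False]])

value = redness_sketch(toy, mask)  # mean(200, 100) - mean(100, 50)
```

Converting to floats before subtracting matters here: the channels of an image are usually stored as unsigned 8-bit integers, which cannot represent negative differences.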

Visually check your redness function on the images of the dataset:

image_grid(images,
           titles=["{0:.2f}".format(redness(img)) for img in images])

Check your redness function with these automated tests:

assert abs(redness(images['b01.png']) -  0   ) < 0.1
assert abs(redness(images['a01.png']) - 41.48) < 0.1
assert abs(redness(images['a09.png']) - -3.66) < 0.1

Exercise 3: Extraction of the elongation feature

As a second feature to distinguish apples from bananas, we extract the elongation of the fruit: the ratio of the length to the width of the object. But how can we measure these in the first place, when the fruits can have any orientation and there can be noise in the picture?

We will use the occasion to show a nifty trick, implemented in the elongation function. Display the elongation for all the fruits in the data set as computed by this function, and check visually that it’s plausible. You may want to use a ruler!

image_grid(images,
           titles=["{0:.2f}".format(elongation(img)) for img in images])

So, how does this work?

We convert the black and white image into a cloud of points: each point represents the coordinates of one of the foreground pixels (similar to a matrix in sparse format). Then we find the principal axes of the cloud of points, using a well-known algorithm called singular value decomposition. The first principal axis is the direction of largest variance of the cloud of points. The second one is the direction orthogonal to the first one. The elongation is then simply defined as the ratio of the standard deviations in the two principal directions.

Let’s illustrate the process on the synthetic banana:

img = images['b01.png']
# Build the cloud of points defined by the foreground image pixels
F = foreground_filter(img)
xy = np.argwhere(F)
# Build the picture
fig = Figure(figsize=(20, 5))
# Original image
subplot = fig.add_subplot(1, 3, 1)
subplot.imshow(img)
subplot.set_title("Original image", fontsize=18)
# The foreground as a black and white picture
subplot = fig.add_subplot(1, 3, 2)
subplot.imshow(F, cmap='Greys_r')
subplot.set_title("Foreground", fontsize=18)
# The cloud of points, as a scatter plot, together with the principal axes
subplot = fig.add_subplot(1, 3, 3)
subplot.scatter(xy[:,1], xy[:,0])
elongation_plot(img, subplot)
subplot.set_xlim(0, 31)
subplot.set_ylim(31, 0)
subplot.set_aspect('equal', adjustable='box')
subplot.set_title("Cloud of points and principal axes",  fontsize=18)

Try again with other pictures!

The trick we just used to extract features from a cloud of points is a mainstream method in machine learning called PCA (Principal Component Analysis). See the appendix for another example.

You will learn the mathematics behind PCA in later linear algebra courses. However, thanks to the existing libraries, you can apply it right now in just a few lines:
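As an illustration, here is a minimal sketch using NumPy's singular value decomposition on a synthetic rectangle. The name elongation_sketch and the test shape are assumptions for this example; the elongation function provided in utilities.py may be implemented differently.

```python
import numpy as np

def elongation_sketch(F):
    """Hypothetical helper: ratio of the spreads of the True pixels of F
    along their first and second principal axes."""
    xy = np.argwhere(F).astype(float)  # one row (i, j) per foreground pixel
    xy -= xy.mean(axis=0)              # center the cloud of points
    # Singular values come sorted in decreasing order and are proportional
    # to the standard deviations along the principal axes
    s = np.linalg.svd(xy, compute_uv=False)
    return s[0] / s[1]

# Synthetic foreground: a 4 x 24 rectangle in a 32 x 32 picture
F = np.zeros((32, 32), dtype=bool)
F[14:18, 4:28] = True

ratio = elongation_sketch(F)  # a little above 6 for this rectangle
```

Because the ratio is computed from standard deviations rather than from the extreme points of the object, a few isolated noisy pixels change it only slightly. The sketch assumes the cloud is not perfectly aligned on a single line, in which case the second singular value would be zero.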



We now have utilities to compute two features for our images: redness and elongation. Let’s come back to the data analysis and see if these features are sufficient to distinguish between apples and bananas!