Pre-processing is the first phase of image processing. Its goal is to clean up the image and extract only the information that later phases can use. Since this phase often has to go through every pixel of the image, only relatively simple operations can be performed. The result is a list of “features” (you will see what this word means in a little bit) useful for more detailed processing. Ideally, the number of features is much smaller than the number of pixels.
In our case, the goal is to identify the white rectangle. A good way to start is by identifying the pixels lying on the edge of the white area. Those pixels are the “features” in this particular context. They can be extracted by the following process.
- Threshold each pixel by color (a white pixel must have sufficiently high red, green, and blue components).
- Identify all the connected areas (blobs).
- Pick the biggest blob (assuming the marker paper is the dominant white object in the image) and throw away all the others. This removes stray artifacts.
- If the biggest blob doesn’t have enough area (that is, number of pixels), then exit.
- Identify the contour points of the blob. These are the white pixels that have a black pixel next to them.
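The steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the actual implementation: the function names, the threshold of 200, the minimum area, and the choice of 4-connectivity are all hypothetical, and the image is assumed to be a list of rows of `(r, g, b)` tuples.

```python
from collections import deque


def is_white(pixel, threshold=200):
    """A pixel counts as white when all three channels are high enough.
    The threshold value is an assumption for this sketch."""
    r, g, b = pixel
    return r >= threshold and g >= threshold and b >= threshold


def neighbours(r, c):
    """The four direct neighbours (4-connectivity, assumed here)."""
    return ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))


def largest_blob(mask):
    """Return the set of (row, col) pixels of the biggest connected blob."""
    if not mask or not mask[0]:
        return set()
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    best = set()
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill collects one blob.
            blob = set()
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                blob.add((r, c))
                for nr, nc in neighbours(r, c):
                    inside = 0 <= nr < height and 0 <= nc < width
                    if inside and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(blob) > len(best):
                best = blob
    return best


def contour_points(blob):
    """Blob pixels with at least one neighbour outside the blob.
    Pixels on the image border are treated as contour points too,
    which is a simplification made for this sketch."""
    return sorted(
        (r, c) for r, c in blob
        if any(n not in blob for n in neighbours(r, c))
    )


def preprocess(image, threshold=200, min_area=4):
    """Threshold, keep the biggest blob, and return its contour points.
    Returns an empty list when the biggest blob is too small."""
    mask = [[is_white(p, threshold) for p in row] for row in image]
    blob = largest_blob(mask)
    if len(blob) < min_area:
        return []
    return contour_points(blob)
```

Calling `preprocess(image)` returns a sorted list of `(row, col)` contour points, or an empty list when no sufficiently large white blob is found. A real implementation would likely use an optimized library routine for the blob labelling, since a pure Python loop over every pixel is slow.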
The result is a set of contour points, illustrated in the image below. To give you rough numbers: the input image has 640×480 pixels, which is slightly over 300,000 pixels in total. Given reasonable input, pre-processing selects fewer than 3,000 of them. This reduces the amount of data by a factor of at least 100.