AR By Hand – Part 1 – Introduction

Building up the AR system from scratch.

Chances are you have at least heard the term augmented reality. A very generic definition says that augmented reality is the real world enhanced by computer-generated information. This mini-series is about enhancing a captured image or video by adding a 3D model to it, like in this video.

If you want to know how this works, keep reading. I will walk you through all the steps. The following image is a rough summary of them.

Now in simple words. The final goal is to display a cube on top of a white A4 sheet of paper. The process starts by thresholding the pixels to find the white ones. This produces several blobs. The A4 paper is assumed to be the biggest of them, and the others are suppressed. The next step is to identify the contour, edge lines and corners of the paper. Then the planar transformation is computed. The planar transformation is used to compute a projection matrix, which allows drawing 3D objects into the original image. Finally, the projection matrix is transformed into an OpenGL-compatible form. Then you can do anything with that.
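
To make the first two steps more concrete, here is a minimal sketch of thresholding and keeping the biggest blob. It is not the code from the example project. The class and method names are made up for illustration, the image is assumed to be an array of packed 0xRRGGBB integers, and the brightness limit is an arbitrary parameter you would have to tune.

```java
import java.util.ArrayDeque;

final class LargestWhiteBlob {

    /** Marks pixels whose R, G and B values are all at least minValue (0-255). */
    static boolean[][] thresholdWhite(int[][] rgb, int minValue) {
        int h = rgb.length, w = rgb[0].length;
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int p = rgb[y][x]; // assumed packed as 0xRRGGBB
                int r = (p >> 16) & 0xFF, g = (p >> 8) & 0xFF, b = p & 0xFF;
                mask[y][x] = r >= minValue && g >= minValue && b >= minValue;
            }
        }
        return mask;
    }

    /** Returns a mask containing only the largest 4-connected blob. */
    static boolean[][] keepLargestBlob(boolean[][] mask) {
        int h = mask.length, w = mask[0].length;
        int[][] label = new int[h][w]; // 0 means "not visited yet"
        int nextLabel = 0, bestLabel = 0, bestSize = 0;
        int[] dx = {1, -1, 0, 0};
        int[] dy = {0, 0, 1, -1};
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!mask[y][x] || label[y][x] != 0) {
                    continue;
                }
                // Breadth-first flood fill of one blob, counting its pixels.
                nextLabel++;
                int size = 0;
                label[y][x] = nextLabel;
                queue.add(new int[] {x, y});
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    size++;
                    for (int i = 0; i < 4; i++) {
                        int nx = p[0] + dx[i], ny = p[1] + dy[i];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && mask[ny][nx] && label[ny][nx] == 0) {
                            label[ny][nx] = nextLabel;
                            queue.add(new int[] {nx, ny});
                        }
                    }
                }
                if (size > bestSize) {
                    bestSize = size;
                    bestLabel = nextLabel;
                }
            }
        }
        // Suppress every blob except the biggest one.
        boolean[][] result = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                result[y][x] = bestLabel != 0 && label[y][x] == bestLabel;
            }
        }
        return result;
    }
}
```

Calling `keepLargestBlob(thresholdWhite(image, 200))` gives a mask that is true only on the biggest white region, which is the input the contour and corner steps need.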

Sounds trivial, right? Maybe. Still, it takes some effort to go through the details, therefore I have prepared the following chapters.

In addition, there is an example project accompanying this series. You can download it right here.

The code is written in Java (plus a few OpenGL shaders) and built with Maven. As long as you understand these, you should be able to build the project and run the test applications. The test applications are executable classes within the test sources. The main classes are CameraPoseVideoTestApp and CameraPoseJoglTestApp.

Regarding the expected level of knowledge: it will be very helpful if you have some knowledge of linear algebra, homogeneous coordinates, RGB image representation, the pinhole camera model and perspective projection. Although I will try to keep the required level to a minimum, it is too much to explain every little thing in detail.
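
As a quick self-check of that background, here is a minimal sketch of a pinhole projection written with homogeneous coordinates. It is only an illustration and not part of the example project. The class name, the method names and the camera convention (camera at the origin looking along +Z, focal length and principal point in pixels) are assumptions, and the series may use different conventions.

```java
final class PinholeProjection {

    /** Builds P = K [I | 0] for a camera at the origin looking along +Z. */
    static double[][] simpleCamera(double focal, double cx, double cy) {
        return new double[][] {
            {focal, 0, cx, 0},
            {0, focal, cy, 0},
            {0, 0, 1, 0}
        };
    }

    /**
     * Multiplies the 3x4 projection matrix by the homogeneous point
     * (x, y, z, 1) and divides by the third component to get pixels.
     */
    static double[] project(double[][] p, double x, double y, double z) {
        double[] in = {x, y, z, 1.0};
        double[] out = new double[3];
        for (int r = 0; r < 3; r++) {
            for (int c = 0; c < 4; c++) {
                out[r] += p[r][c] * in[c];
            }
        }
        return new double[] {out[0] / out[2], out[1] / out[2]};
    }
}
```

For example, with `simpleCamera(800, 320, 240)` the point (0.1, -0.05, 2) lands at roughly pixel (360, 220). The division by the third component is the perspective part: the same point twice as far away ends up half as far from the principal point.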

Now let me make a note about the quality of the result. There are two main factors that affect quality: implementation and environment. I will cover one type of implementation and let you judge how good it is. Please leave me comments, especially if you have a concrete idea for improvement. The second factor is the environment. This includes everything from camera quality, noise, distractions in the scene, lighting and occlusion to the time you can spend on processing each frame. Even today's state-of-the-art algorithms will fail in a bad environment. Please keep this in mind when you do your own experiments.


This chapter gave you an overall idea of the project. The next chapter will tell you how to track the plane.
