Programming a Robot to follow an Object

Making a computer recognize objects in a video is quite an interesting topic. Programming a robot to move is quite interesting as well, so we are putting both ideas together to get our robot to follow a tennis ball!
This is not new, it has been done lots of times, but it is a cool project that touches many programming concepts like Computer Vision, threads and controlling motors! So go ahead and try it yourself!!
You don't need any special/expensive hardware: anything able to compile C/C++, with eyes and legs, will do ;)
Actually, we are using wheels for legs and a camera for eyes. Yes, this is C/C++, not Java, and I developed and tested all this with a Raspberry Pi 1 Model B+ with its camera module plus a bit more hardware.

The Robot

Computer Vision

Let's start with the eyes: how to see? How to recognize objects?
An object will have some features, something that distinguishes it from the rest; its color could be one of them. So one way is to filter the object's color from the rest of the image.
Starting from a color image:

original tennis ball

we produce a monochrome one where the pixels from the object are white and the rest are black:

filtered tennis ball

As you can see, this image has noise: white pixels that don't belong to the ball.
To get rid of the noise we use what in Computer Vision is known as Erode: this will help us turn into black all the little isolated white areas that are likely to be noise; after eroding we get:

tennis ball filtered and eroded

We also have some black pixels that should be white. This is because the tennis ball actually has more than one color, so we would like to fill the gaps; for this we use an operation called Dilate and we get:

tennis ball filtered, eroded and dilated

Now what we can do is look for the biggest white area, and that will be the ball we are after!
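To make Erode and Dilate concrete, here is a minimal 3×3 sketch of both operations over a binary image. This is just to illustrate the idea: in practice the robot would typically rely on OpenCV's cv::erode and cv::dilate, which are faster and configurable.

```cpp
#include <cassert>
#include <vector>

// A binary image: 1 = white (object), 0 = black (background).
using Image = std::vector<std::vector<int>>;

// Erode: a pixel stays white only if all pixels in its 3x3 neighbourhood
// are white. Isolated white specks (noise) disappear.
Image erode(const Image& in) {
    int h = in.size(), w = in[0].size();
    Image out(h, std::vector<int>(w, 0));
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x) {
            int allWhite = 1;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    if (in[y + dy][x + dx] == 0) allWhite = 0;
            out[y][x] = allWhite;
        }
    return out;
}

// Dilate: a pixel becomes white if any pixel in its 3x3 neighbourhood
// is white. Small black gaps inside the ball get filled.
Image dilate(const Image& in) {
    int h = in.size(), w = in[0].size();
    Image out(h, std::vector<int>(w, 0));
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x) {
            int anyWhite = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    if (in[y + dy][x + dx] == 1) anyWhite = 1;
            out[y][x] = anyWhite;
        }
    return out;
}
```

Note how the two operations are duals of each other: eroding shrinks white areas (killing noise), dilating grows them (filling gaps), which is why they are applied in that order.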

The steps so far are:

	getBallPosition(camera, ballColor) {

		frame = captureFrame(camera);

		// every pixel with color==ballColor is set to white, black elsewhere.
		frame = filter(frame, ballColor);

		frame = erode(frame);
		frame = dilate(frame);

		// get the (x,y) center coordinates of the white area
		position = findCenter(frame);

		return position;
	}

Filtering the image can be done in many ways. It's important to know that an image is a matrix of color dots, and each of these dots is a pixel; a BGR pixel (for example) is a dot with 3 values: one for blue, one for green and one for red. Knowing this, to filter a given color is to select the dots with the right BGR values and replace the rest with black (blue=0; green=0; red=0), or any other color. Since objects in real life don't have just one color, this approach lacks something: we must be a bit more flexible and select not only pixels of the exact color but any other color within a certain threshold.
A well known way to achieve this is to convert the BGR image to HSV (HSV is a cylindrical-coordinate representation of points) and then filter the pixels within a range of HSV values.
Another technique is to work directly with the BGR coordinate system and filter the pixels inside a cube (or a sphere: slower) of a given size and a given center. The center is the color of the object to detect (the exact BGR pixel) and the size (side length or radius) represents the similarity we want to work with. Think of colors as dots in space: we keep those that are in the vicinity of a given one.
Whichever algorithm we use, it must be time efficient: this has to be done frame after frame, so it must be fast!
The robot implements both approaches and lets the user select which algorithm to use!!
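The cube and sphere tests of the BGR technique can be sketched like this. The Pixel type and the threshold values below are hypothetical, the point is just the point-in-cube versus point-in-sphere check:

```cpp
#include <cassert>
#include <cstdlib>

struct Pixel { int b, g, r; };

// Cube filter in BGR space: a pixel is kept (turned white) when every
// channel is within `halfSide` of the target colour, i.e. the pixel lies
// inside a cube centred on `target`. Bigger cubes mean looser matches.
bool insideCube(Pixel p, Pixel target, int halfSide) {
    return std::abs(p.b - target.b) <= halfSide &&
           std::abs(p.g - target.g) <= halfSide &&
           std::abs(p.r - target.r) <= halfSide;
}

// Sphere variant: Euclidean distance in BGR space. Slower because of the
// multiplications, but the vicinity it defines is more uniform.
bool insideSphere(Pixel p, Pixel target, int radius) {
    int db = p.b - target.b, dg = p.g - target.g, dr = p.r - target.r;
    return db * db + dg * dg + dr * dr <= radius * radius;
}
```

The corners of the cube lie outside the sphere of the same nominal size, which is one reason the two filters can give slightly different results for the same threshold.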

Moving the Robot towards the Object

Now that we know the object position we have to translate it into a movement.
After experimenting for a while I found this simple approach quite useful for our purposes:
Divide the image into 9 squares and assign a movement to each of them:

moves assignment

Depending on where the center of the object is, the robot moves in a straight line or rotates. Curved movements were tried but the results were not good, perhaps because the motors are a bit inaccurate. BTW: make sure your robot is well balanced and can move in a straight line before attempting to have it move by itself.
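A sketch of how the 9-square assignment might translate a ball position into a move. The move names and the exact table below are hypothetical (the real assignment depends on your robot); the idea is just indexing a 3×3 table by the square the ball's center falls into:

```cpp
#include <cassert>

enum Move { TURN_LEFT, FORWARD, TURN_RIGHT, STOP };

// Map the ball centre (x, y) to one of the 9 squares of a width x height
// frame and look the move up in a table. Assumes 0 <= x < width and
// 0 <= y < height. One plausible table: turn toward the ball when it
// drifts to a side column, drive forward while it sits in the centre,
// stop when it reaches the bottom centre (i.e. it's close).
Move moveFor(int x, int y, int width, int height) {
    int col = (x * 3) / width;   // 0 = left, 1 = centre, 2 = right
    int row = (y * 3) / height;  // 0 = top,  1 = middle, 2 = bottom
    static const Move table[3][3] = {
        { TURN_LEFT, FORWARD, TURN_RIGHT },
        { TURN_LEFT, FORWARD, TURN_RIGHT },
        { TURN_LEFT, STOP,    TURN_RIGHT },
    };
    return table[row][col];
}
```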

Putting it all together:

We can process images to detect a tennis ball and we know how to move the robot toward it.
The bottleneck here is detecting the object; moving the robot should not be an expensive operation, but it can still consume some valuable time, so we are dedicating a thread only to image processing and another one to moving the robot (and a third one to keep an eye on the user input and stop execution when required).
With this approach we have the robot simultaneously looking for the ball while moving towards it. Similar, perhaps, to what a dog would do: run towards the ball while watching for changes in its position. Actually we could do way better: try to predict changes in the ball's movement and calculate a route to intercept it... but that's too ambitious for now...

The algorithm is:

bool keepGoing;
move_t currentMove;

int main(void) {
	initialize camera and motors;

	keepGoing = true;
	// thread to analyse images and detect the ball:
	start calculateMovement thread;
	// thread to control the robot movement:
	start moveCloserToTheBall thread;
	wait for user command to finish;

	// stop all the threads:
	keepGoing = false;
	wake up any waiting thread;
	wait for the threads to finish;

	release resources like camera and motors;
	return 0;
}

calculateMovement {
	while (keepGoing) {
		retrieve current ball position;
		move = calculate move from current position;
		if (move != previous move) {
			changeMovement(move);
		}
	}
}

changeMovement(newMove) {
	set currentMove to newMove in a thread safe way (synchronizing with moveCloserToTheBall function);
	wake up moveCloserToTheBall if it is waiting for a new move;
}

moveCloserToTheBall {
	move_t move;

	while (keepGoing) {

		run this block synchronized with changeMovement function {
			while (there is no new move && keepGoing) {
				wait for a new move;
			}
			move = currentMove;
		}

		// perhaps the "while" above stopped because it's time to leave so don't move the robot
		if (keepGoing) {
			// the moveRobot call should return fast but even if it doesn't, it's outside the synchronized code so we can
			// simultaneously process the next image.
			moveRobot(move);
		}
	}
}

moveRobot(move) {
	sets the speed of the motors to the given value;
}
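The hand-off between changeMovement and moveCloserToTheBall can be sketched with C++11 primitives as below. The original project may well use pthreads instead, and the Move values and the `applied` vector (standing in for driving the motors) are illustrative:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

enum Move { FORWARD, TURN_LEFT, TURN_RIGHT, STOP };

std::mutex mtx;
std::condition_variable newMoveReady;
Move currentMove = STOP;
bool haveNewMove = false;
bool keepGoing = true;

// Called by the image-processing thread whenever the computed move changes.
void changeMovement(Move newMove) {
    {
        std::lock_guard<std::mutex> lock(mtx);
        currentMove = newMove;
        haveNewMove = true;
    }
    newMoveReady.notify_one(); // wake up the movement thread
}

// Body of the movement thread: sleep until a new move arrives, apply it,
// repeat until asked to stop. `applied` stands in for driving the motors.
void moveCloserToTheBall(std::vector<Move>& applied) {
    while (true) {
        Move move;
        {
            std::unique_lock<std::mutex> lock(mtx);
            newMoveReady.wait(lock, [] { return haveNewMove || !keepGoing; });
            if (!keepGoing) return; // time to leave: don't move the robot
            move = currentMove;
            haveNewMove = false;
        }
        // Outside the lock: applying the move does not block the
        // image-processing thread, so both run simultaneously.
        applied.push_back(move);
    }
}

// Ask the movement thread to finish and wake it if it is waiting.
void stopThreads() {
    {
        std::lock_guard<std::mutex> lock(mtx);
        keepGoing = false;
    }
    newMoveReady.notify_all();
}
```

The condition variable is what lets the movement thread sleep instead of busy-waiting, and taking the move out of the locked block before applying it is what keeps the two threads from stalling each other.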

Final Notes

Object Recognition and Image Processing are huge Computer Science topics! We are just scratching the surface here!

Getting a robot to follow a tennis ball is a very interesting project with lots of variables that can impact the final result.
On the hardware side: make sure the robot moves in a straight line and turns left and right properly.
It's recommended that the object to track has a uniform color, different from the rest of the image.
On the software side: make sure you are working with the right configuration parameters: small changes in the lighting can have a big impact on detecting the object. Different types of floor will probably require different speeds.
Make sure as well that the code is efficient enough.

The source code provided below includes some tools to help you find the right configuration values.
A C/C++ compiler and the OpenCV library are required.
The code will work with very few changes; the only piece that needs to be rewritten is the motors' driver: miniDriver.c
When running it you might want to start with the BGR approach first, since it requires fewer configuration parameters, but try them both to see which one works better for you!

source code. Have fun!!