Novel Image Processing Using Hybrid Algorithm In Matlab For Robotics

  • Introduction

A robotic arm is a robot manipulator that can perform functions similar to those of a human arm. Robotic arms are a vital part of almost every industry, where they perform tasks such as welding, trimming, and picking and placing. Their biggest advantage is that they can work in hazardous areas and in areas that humans cannot access.

Gestures are meaningful body movements capable of expressing something in communication. Although gesture is classed as non-verbal communication, it conveys meaning effectively to the other party. The main purpose of gesture recognition research is to identify a particular human gesture and convey the information associated with that gesture to the user. From a corpus of gestures, a specific gesture of interest can be identified, and on that basis a specific command can be issued to a robotic system.

The overall aim is to make the computer understand human body language, thereby bridging the gap between machine and human. Hand gesture recognition can enhance human-computer interaction without depending on traditional input devices such as the keyboard and mouse. Hand gestures are used extensively in telerobotic control applications, where they allow robotic systems to be controlled naturally and intuitively. A prominent benefit of such a system is that it offers a natural way to send geometrical information to the robot, such as left or right. A robotic hand can thus be controlled remotely by hand gestures.

A hand gesture is captured using the Image Acquisition Toolbox in MATLAB. Skin color detection is used to detect the hand in the acquired frame. The various gestures are captured in the same way. A database of 20 images is created by capturing 5 different images for each of the 4 gestures. The next step is the training section, where the SIFT algorithm is used for feature extraction and an artificial neural network is trained on the extracted features. This is followed by testing, in which the hand gesture shown is recognized. Finally, a single character corresponding to the identified gesture is sent over serial communication to the main controller, which actuates the robotic arm.
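The last step of this pipeline, sending one character per recognized gesture, can be sketched as a simple lookup. This is an illustration in Python rather than the MATLAB code used in the project; the gesture names and the command characters below are hypothetical, since the post does not say which character is sent for which gesture.

```python
# Hypothetical gesture-to-command table: the names and characters are
# assumptions for illustration, not the values used in the project.
GESTURE_COMMANDS = {
    "gesture_1": "A",
    "gesture_2": "B",
    "gesture_3": "C",
    "gesture_4": "D",
}


def command_byte(gesture):
    """Return the single ASCII byte to send for a recognized gesture."""
    char = GESTURE_COMMANDS[gesture]
    return char.encode("ascii")


payload = command_byte("gesture_2")
```

In a real setup `payload` would be written to the serial port connected to the main controller; that part depends on the hardware and is not shown here.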


A database is necessary for recognizing a given hand gesture. The database consists of twenty images: five different images captured for each of the four gestures. The three basic steps in database creation are explained below.

  • Image acquisition

We use a camera and its device drivers as peripherals. The image acquisition hardware information function, imaqhwinfo(), returns basic information about the installed adaptors along with the toolbox name and version. The parameters used to create the video object are the adaptor name, the device ID, and the format and resolution of the video being captured. The created object acts as the interface between the adaptor and MATLAB. YUY2 is a common format in the MATLAB Image Acquisition Toolbox, and a commonly preferred resolution is 640×480. Video captured in YUY2 format gives a YCbCr image, where Y is the luminance and Cb and Cr are the color components. The image data is converted to 8-bit unsigned integers. Because skin detection uses only the Cb and Cr components, operations need to be performed on only two matrices.

  • Segmentation by skin color detection

The basic principle is to create a matrix of 1s and 0s: positions where skin color is seen are set to one and the rest are set to zero. A new matrix is created with the resolution of the captured image (here 640×480). We then scan through the 'Cb' and 'Cr' values of the image, and wherever both values fall within the skin color range we assign the value 1 to the corresponding location of the new matrix. Minimum and maximum thresholds are set for the Cb and Cr values; these can be changed using the set color option, where a slider varies the Cb and Cr values. The range has to be adjusted to suit your skin color. The captured image may contain several skin-colored regions, so we use bounding boxes to detect the different regions. Regions smaller than 170 × 150 are neglected, and regions larger than this are cropped and shown in the output.
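The thresholding step can be sketched as follows, in Python rather than MATLAB and on plain nested lists so that it runs without any toolbox. The Cb and Cr ranges are assumed starting values, not the ones used in the project, and must be tuned as described above. For brevity the sketch computes a single bounding box around all skin pixels instead of one box per region, and it reads the 170 × 150 limit as 170 wide by 150 high, which is also an assumption.

```python
# Assumed starting thresholds; the project lets the user adjust these.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def skin_mask(cb, cr, cb_range=CB_RANGE, cr_range=CR_RANGE):
    """Return a matrix of 1s and 0s, 1 where both Cb and Cr are in range."""
    rows, cols = len(cb), len(cb[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            cb_ok = cb_range[0] <= cb[r][c] <= cb_range[1]
            cr_ok = cr_range[0] <= cr[r][c] <= cr_range[1]
            if cb_ok and cr_ok:
                mask[r][c] = 1
    return mask


def bounding_box(mask):
    """Return (top, left, height, width) enclosing all 1s, or None."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        return None
    row_ids = [r for r, _ in coords]
    col_ids = [c for _, c in coords]
    top, left = min(row_ids), min(col_ids)
    return top, left, max(row_ids) - top + 1, max(col_ids) - left + 1


def large_enough(box, min_width=170, min_height=150):
    """Keep only boxes at least min_width wide and min_height high."""
    _, _, height, width = box
    return width >= min_width and height >= min_height


# Tiny 2x3 example standing in for the 640x480 Cb and Cr planes.
cb_plane = [[100, 100, 0], [100, 100, 0]]
cr_plane = [[150, 150, 150], [0, 150, 150]]
mask = skin_mask(cb_plane, cr_plane)
box = bounding_box(mask)
```

On a real frame the box that passes `large_enough` would be used to crop the hand region out of the image.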

  • Storing the images

The captured hand gestures are stored in local storage for the subsequent steps, using a local index variable to name each image created.


Training is done to optimize the performance of the image processing system. Feature extraction on the database is done in the training section, and the trained network is then used for testing and real-time hand gesture recognition. The SIFT and ANN algorithms are implemented in the training section.

  • SIFT algorithm

For each image in the database, the SIFT algorithm is used to extract keypoints, and a keypoint matrix is formed. Each image is passed through cascaded Gaussian blurring filters to obtain the scale spaces. Keypoints are the local maxima or minima obtained from the scale spaces. Once the keypoints are obtained, descriptors for groups of keypoints are computed. The descriptors include the x gradient, y gradient, and orientation of the keypoint pixel as vectors.
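The gradient quantities that go into a descriptor can be illustrated on a single pixel. The sketch below is in Python rather than MATLAB and shows only the x gradient, y gradient, magnitude, and orientation at one interior pixel, using central differences; it is not a full SIFT implementation, and the function name and the 3×3 patch are made up for the example.

```python
import math


def gradient_at(image, r, c):
    """Return (dx, dy, magnitude, orientation in degrees) at pixel (r, c).

    Uses central differences, so (r, c) must not lie on the image border.
    """
    dx = image[r][c + 1] - image[r][c - 1]
    dy = image[r + 1][c] - image[r - 1][c]
    magnitude = math.hypot(dx, dy)
    orientation = math.degrees(math.atan2(dy, dx))
    return dx, dy, magnitude, orientation


# Hypothetical 3x3 grayscale patch with brighter pixels to the right
# of and below the centre.
patch = [
    [0, 0, 0],
    [0, 0, 10],
    [0, 10, 0],
]
dx, dy, magnitude, orientation = gradient_at(patch, 1, 1)
```

In SIFT these per-pixel gradients are collected over a neighbourhood of each keypoint to build the descriptor vector.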

  • Artificial neural networks

The ideal output matrix for the artificial neural network is set first.

The first row of the output matrix has high values for the images of the first hand gesture, and the rest of its values are zero. Similar steps are followed for the other gestures, so that a particular sequence in each row is high for a different hand gesture. The input to the ANN is the feature matrix extracted by the SIFT algorithm. The input matrix has a similar pattern, in which a particular image has more features in a given row.
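The layout of the ideal output matrix can be sketched as follows, in Python rather than MATLAB. It assumes that the 20 database images are ordered gesture by gesture (images 1 to 5 for the first gesture, 6 to 10 for the second, and so on) and that "high" means 1; neither detail is stated explicitly in the post.

```python
def target_matrix(n_gestures=4, images_per_gesture=5):
    """Build the ideal output matrix: one row per gesture, one column per image.

    Row g is 1 in the columns belonging to gesture g and 0 elsewhere.
    """
    n_images = n_gestures * images_per_gesture
    return [
        [1 if col // images_per_gesture == g else 0 for col in range(n_images)]
        for g in range(n_gestures)
    ]


targets = target_matrix()
for row in targets:
    print(row)
```

Each column then contains exactly one 1, marking the gesture that the corresponding image belongs to.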

  • Training

The artificial neural network is trained by supplying the input matrix and the ideal output matrix. The network's hidden layers are specified in the training function. A suitable number of hidden layers must be chosen to get maximum performance and to avoid execution delays in live testing. The network is trained on the input and output, and the feedback during training optimizes it for maximum performance. The trained network can then be called in live testing with real-time input, and its output is classified to the nearest ideal value.
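Classifying a network output to the nearest ideal value can be sketched as below, in Python rather than MATLAB. It assumes the ideal value for each gesture is a vector with a single 1 in that gesture's position and that "nearest" means smallest Euclidean distance; the post does not specify the distance measure, and the example output vector is invented.

```python
import math


def nearest_gesture(output, n_gestures=4):
    """Return the index of the ideal one-hot vector closest to `output`."""
    best_index, best_distance = None, math.inf
    for g in range(n_gestures):
        ideal = [1.0 if i == g else 0.0 for i in range(n_gestures)]
        distance = math.dist(output, ideal)
        if distance < best_distance:
            best_index, best_distance = g, distance
    return best_index


# Hypothetical raw network output for one live frame.
raw_output = [0.1, 0.8, 0.2, 0.0]
predicted = nearest_gesture(raw_output)
```

With one-hot ideal vectors this choice coincides with picking the largest output, but writing it as a distance keeps the sketch close to the description above.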

  • Testing

Testing is the final stage in hand gesture recognition. MATLAB is interfaced to a camera and live video is taken. Frames are captured from the video at fixed intervals. In each frame the hand is detected by skin color detection, and the detected hand is cropped out of the image. The SIFT algorithm is used to extract keypoints from each frame, and a matrix is formed from the descriptors of these keypoints. This matrix is passed to the neural network, where classification takes place. The descriptor matrix from the database is compared with the present matrix, and the database image showing the maximum match is chosen as the recognized gesture.
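The comparison against the database can be sketched as a match count, in Python rather than MATLAB. The matching rule here (a query descriptor counts as matched when some database descriptor lies within a fixed Euclidean distance) and the threshold value are assumptions, and the 2-D descriptors are toy values standing in for real SIFT descriptors, which are much longer vectors.

```python
import math


def count_matches(query, reference, max_distance=0.5):
    """Count query descriptors that have a reference descriptor nearby."""
    count = 0
    for q in query:
        if any(math.dist(q, r) <= max_distance for r in reference):
            count += 1
    return count


def best_match(query, database):
    """Return the database entry whose descriptors match the query most often."""
    return max(database, key=lambda name: count_matches(query, database[name]))


# Hypothetical database: image name -> list of descriptors.
database = {
    "gesture_1": [[0.0, 0.0], [1.0, 1.0]],
    "gesture_2": [[5.0, 5.0], [6.0, 6.0]],
}
live_descriptors = [[5.1, 5.0], [6.0, 6.2], [0.0, 0.0]]
recognized = best_match(live_descriptors, database)
```

Here two of the three live descriptors fall close to entries of `gesture_2` and only one to `gesture_1`, so `gesture_2` is selected.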

