1. Introduction

The way we interact with computers has evolved along with the computers themselves. A decade ago we mostly used desktop computers that we had to sit in front of and control with a mouse and keyboard. Later came more portable computers, laptops, which can be used mouse-free with a touchpad. Then smartphones introduced a new way of interaction: touch. Touch-based interaction lets us get rid of the mouse and keyboard entirely. But we now live in the era of smartwatches, virtual reality, and the Internet of Things (IoT), and touch-based interaction is not necessarily the best way to interact with these new technologies. Consider smartwatches: having a powerful computer on your wrist is amazing, but the small screen makes it hard to interact with. Or consider Google Cardboard, where you put your phone inside the headset and it becomes inaccessible. One possible solution is to wear a glove fitted with sensors that lets you control your device with gestures. This can solve the problem to some extent, but not all users are comfortable wearing a glove for daily activities. So what could the next technology for interacting with these kinds of devices be? The answer is a device-free technology.

GestFi is a device-free gesture recognition system that lets you interact with your devices by performing gestures. It exploits the effect of a user's body movements on WiFi signals to distinguish among different gestures. For this project we used a pair of off-the-shelf WiFi adapter cards to build a system that can distinguish among 3 different gestures in an experimental environment.

The rest of this report is structured as follows: in section 2 we review related work on device-free gesture recognition systems. In section 3 we describe our implementation, and in section 4 our system evaluation. Section 5 contains the discussion and conclusion. In appendix A we describe our other attempts that did not work properly, and in appendix B you can find links to our code.

2. Related Works

WiSee was among the first works to use wireless signals to recognize different gestures, but its researchers did not use a commodity wireless card. You can read more about it here:

There was a follow-up work after WiSee, which you can find here:

In this work the authors used a commodity wireless card, but they hardcoded the features of different gestures into the classifier. This approach prevents users from defining new gestures.

In our work we also use commodity wireless cards, but we do not hardcode any features into the classifier: the classifier learns different gestures directly from the CSI values.

3. Implementation

In this section we describe what you need to build this system and how to build it:

3.1 Prerequisites


A pair of “Intel WiFi Link 5300” wireless adapters. You can buy them from here:

This is a normal wireless adapter, but a modified driver exposes per-packet information from the physical layer, letting you extract information about the signals in the air.

Figure 1 shows how to install this wireless card in your laptop.


Figure 1

To install this driver you can use this link:



Operating System: We have tested this system on Ubuntu 14.04 with kernel version 3.16.0. If you have another kernel version, you can download 3.16 from here:

You have to download these 3 files and put them in the same directory:




After downloading, cd into that directory and run the following command to install the new kernel:

sudo dpkg -i *.deb

Anaconda: Anaconda is a free Python distribution. It includes more than 400 of the most popular Python packages for science, math, engineering, and data analysis. We need it for classifying the different signals. You can download Anaconda from here:

Matlab: We need Matlab because the researchers who developed the driver provide Matlab scripts to read and parse the signal information in user space.

Torch 7: Torch is a scientific computing framework with wide support for machine learning algorithms. We used this framework to develop our neural-network classifier. You can use this link to download Torch 7:


3.2 System Architecture

Before discussing the system architecture, I have to explain how WiFi communication works at the physical layer. Every WiFi communication has a transmitter and a receiver. Each of them can have one or multiple antennas, and the data is sent over different subcarriers. For every received packet, the receiver calculates a matrix called Channel State Information (CSI). Every element of this matrix is a complex number that describes the current state of the channel between a pair of transmitter and receiver antennas, and its magnitude can be affected by the movements of nearby humans. In our experiments we had 3 antennas on the receiver and 1 antenna on the transmitter, and the data was sent over 30 different subcarriers, so for every packet we have 3 × 1 × 30 = 90 different CSI values. We want to use these CSI values for gesture recognition. When creating feature vectors we kept only 10 of the subcarriers, because the 30 subcarriers are largely redundant and dropping some makes computation faster. So for every sample we have 3 × 10 = 30 CSI values.
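The feature-vector construction above can be sketched as follows. This is a minimal illustration, not our actual script: it assumes the parsed CSI for one gesture arrives as a complex NumPy array of shape (packets, 3 receive antennas, 30 subcarriers), and the choice of which 10 subcarriers to keep (every third one here) is hypothetical.

```python
import numpy as np

def build_feature_vector(csi, subcarrier_idx=range(0, 30, 3)):
    """Flatten CSI magnitudes into one feature vector.

    csi: complex array of shape (num_packets, 3, 30)
         (3 receive antennas x 30 subcarriers per packet).
    subcarrier_idx: the 10 subcarriers kept (hypothetical choice:
         every third one), giving 3 x 10 = 30 values per packet.
    """
    idx = list(subcarrier_idx)
    magnitudes = np.abs(csi[:, :, idx])   # (num_packets, 3, 10)
    return magnitudes.reshape(-1)         # flatten to a 1-D vector

# A 2-second gesture at 2500 packets/s -> 5000 packets,
# so the vector has 5000 * 3 * 10 = 150,000 entries.
fake_csi = np.ones((5000, 3, 30), dtype=complex)
print(build_feature_vector(fake_csi).shape)  # (150000,)
```

Only the magnitudes are kept because, as noted above, magnitude is what human movement visibly affects.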


Figure 2 describes the building blocks of our system:


Figure 2

The system consists of a transmitter, a receiver, and a user standing between them. Whenever the user wants to perform a gesture, he clicks the mouse, which tells the receiver to record the CSI values from the received packets. In our system every gesture lasts 2 seconds, and the receiver receives 2500 packets per second, so for every gesture we have 2500 × 2 = 5000 samples, each with 30 different CSI values. The size of a feature vector is therefore 150,000. This feature vector is sent to an SVM classifier (we also examined a Convolutional Neural Network (CNN) and a K-Nearest-Neighbour (KNN) classifier, but the SVM worked better), and the classifier tells us which gesture was performed; we can then map that gesture to any task we want.
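The classification step can be sketched with scikit-learn's linear SVM (which ships with Anaconda). This is a toy illustration of the pipeline, not our trained model: the gesture names and the short random feature vectors are stand-ins (real vectors are 150,000-long CSI magnitudes).

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Hypothetical training data: 30 samples per gesture, 3 gestures,
# shortened to 300 features to keep the demo fast.
X_train = rng.normal(size=(90, 300))
y_train = np.repeat(["push", "pull", "wave"], 30)  # hypothetical labels

clf = LinearSVC()          # linear SVM, the classifier we settled on
clf.fit(X_train, y_train)

# At run time, each recorded gesture becomes the same kind of feature
# vector; the predicted label can then be mapped to any task we want.
sample = rng.normal(size=(1, 300))
print(clf.predict(sample))
```

We preferred the SVM here because, with only tens of samples per gesture, it was less prone to overfitting than the CNN we also tried.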

In the following we describe how to set up the transmitter and receiver to get CSI values from the physical layer.

The updated driver has an injection mode that lets us generate packets at any rate we want. We send this huge number of packets every second to get precise channel state information during a gesture. We ended up using injection mode after many failures with different approaches (see appendix A).

To set up the transmitter in injection mode, first cd into the linux-80211n-csitool-supplementary/injection/ directory. Then run “” with root privileges.

Then you can generate 2500 packets per second with the following command on the transmitter:

while true; do sudo ./random_packets 2500 100 1 350; done

After that we have to configure the receiver to use monitor mode. For this purpose, cd into the linux-80211n-csitool-supplementary/injection/ directory on the receiver machine and run the “” script with root privileges.

To run the actual program, run the main Python script. In this script we used a Matlab API to invoke the Matlab script written for reading the CSI values and conditioning the signals from the received packets. The Python script tells the driver to record CSI values and then runs the Matlab script to create the CSI waveforms. These waveforms are too noisy and cause problems for the classifier, so we apply a low-pass filter to reduce the noise level before sending them on. In figures 3 and 4 you can see a CSI waveform before and after applying the low-pass filter. The Matlab script then returns the smoothed signal to the Python script, which creates a feature vector and sends it to the SVM classifier.
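Our actual filtering is done in Matlab, but the same low-pass step can be sketched in Python with SciPy. The cutoff frequency below is a hypothetical value chosen for illustration: body movement is slow, so the gesture information lives well below the noise frequencies.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def smooth_csi(waveform, fs=2500, cutoff=60, order=4):
    """Low-pass filter one CSI magnitude waveform.

    fs: sample rate in Hz (2500 packets/s in our setup).
    cutoff: hypothetical cutoff in Hz; gestures are slow, so
        frequencies well below this carry the useful signal.
    """
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, waveform)  # zero-phase: no time shift

# Toy waveform: a slow "gesture" plus high-frequency noise
t = np.linspace(0, 2, 5000)
noisy = np.sin(2 * np.pi * 1.5 * t) + 0.3 * np.random.randn(5000)
smoothed = smooth_csi(noisy)
print(smoothed.shape)  # (5000,)
```

Using filtfilt (forward-backward filtering) keeps the smoothed waveform aligned in time with the original, which matters when the 2-second gesture window is fixed.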


Figure 3: Before applying low pass filter


Figure 4: After applying low pass filter

4. Evaluation

To use this system, the first thing we have to do is train it. We trained it in an environment without any interference. The first time, we decided to classify four different gestures and created 30 samples for every gesture. Then we used a cross-validation approach to see how well our classifier works: we trained the system with 80% of the data and tested it on the other 20%. On average it gave us 87% accuracy.

The second time, we tested our system with 3 different gestures and 100 samples per gesture. Again we used the same cross-validation approach. For this scenario the classifier gave us 100% accuracy. Below you can see the waveforms for the different gestures. Notice that different gestures produce different patterns in the CSI waveforms. Each slideshow shows different samples of a specific gesture:
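The 80/20 evaluation procedure can be sketched as follows. The data here is a synthetic, easily separable stand-in (real inputs are the 150,000-long CSI feature vectors), so the accuracy printed is only a demonstration of the procedure, not our reported result.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
# Stand-in data: 100 samples per gesture, 3 gestures, with the three
# classes placed at different means so they are cleanly separable.
X = np.vstack([rng.normal(loc=i * 5.0, size=(100, 20)) for i in range(3)])
y = np.repeat([0, 1, 2], 100)

# 80% train / 20% test split, as in our evaluation
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = LinearSVC().fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
print(f"accuracy: {accuracy:.2f}")
```

Stratifying the split keeps each gesture equally represented in the held-out 20%, which matters with only tens of samples per class.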


5. Discussion and Conclusion

Device-free gesture recognition is a new approach that lets users interact with their computers without wearing anything. One such approach exploits the WiFi signals around a user to recognize their body movements and gestures. In this project we built GestFi, which uses a pair of commodity wireless cards to recognize different gestures. Our system can detect 4 different gestures with a linear SVM classifier with 87% accuracy, and 3 different gestures with 100% accuracy when the training set is enlarged. The system was tested in a controlled environment without any interference. Using WiFi signals has its own limitations. One is the presence of other people near the user. Another is that, based on our experiments, you have to retrain the system in every new environment in which you want to use it. A third is that the positions of the transmitter and receiver must be fixed: even a very small change in the position of these devices makes the system useless and forces the user to train it from scratch.

Appendix A:

In this appendix we describe some of our failed attempts.

Before using injection mode, we tried connecting the Intel chipset to a normal access point and sending a ping flood to the access point to collect CSI values from the reply packets. This approach does not work because normal access points are not designed to receive 2500 packets per second; after a while this huge number of packets fills the access point's buffer and makes it crash. You can reduce the number of packets per second, but this results in a low-resolution waveform, the classifier cannot distinguish between different signals very well, and accuracy suffers.

We also tried sending the ping flood directly from the access point to the Intel 5300 chipset, but this approach had the same problem as the previous one: the receiver buffer fills up and the system crashes.

We also tried different normalization methods on the feature vectors (normalizing every subcarrier, every sample, and every feature) to make the solution general enough to train once and use in different environments, but it did not work. Even a small change in the orientation of the transmitter or receiver completely changes the channels, and the CSI waveform shapes come out different.
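The per-subcarrier variant of the normalizations we tried can be sketched as a z-score of each subcarrier stream. This is an illustrative reconstruction of the idea, not our exact code.

```python
import numpy as np

def normalize_per_subcarrier(magnitudes):
    """Z-score every subcarrier stream independently.

    magnitudes: array of shape (num_packets, num_subcarriers) of CSI
    magnitudes. The hope was that removing each subcarrier's mean and
    scale would make waveforms comparable across environments.
    """
    mean = magnitudes.mean(axis=0, keepdims=True)
    std = magnitudes.std(axis=0, keepdims=True) + 1e-9  # avoid /0
    return (magnitudes - mean) / std

data = np.random.default_rng(2).normal(loc=10.0, scale=3.0, size=(5000, 30))
norm = normalize_per_subcarrier(data)
print(norm.mean(), norm.std())
```

The reason this kind of normalization was not enough is visible in the failure mode above: moving the antennas does not just rescale the waveforms, it changes their shapes, which no per-channel scaling can undo.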

Appendix B:

You can find all of our code on our GitHub for future use:

The link is:

