Gesture-based recognition systems are becoming more popular. People use many different sensors and approaches to detect gestures and control the devices around them. In this project we built a glove from five LightBlue Beans (small Arduino boards with a built-in accelerometer). The glove controls a music player that is connected to the Spotify server, so you can control your music while driving, riding, or even running. In the second week of the project we developed our own music player. In the third week we programmed the beans to send their acceleration data to the host machine, which runs Linux. We then collected a data set of 10 different gestures (100 samples per gesture) and trained a linear SVM classifier on it. The classifier reached an accuracy of 96.5%, and we use its output to trigger the different tasks in our music player.
Figure 1 shows the building blocks of our system:
One of the beans (bean #3) has a push button. To save energy, the beans send data to the host device only while this button is held, so the user must hold the button before starting any gesture. The gestures are designed so that the user can easily keep the button held while performing them.
Below we describe the steps from the user performing a gesture to the music being controlled.
Step 1: The user holds the push button. We programmed bean #3 to send its acceleration data every 100 ms, for as long as the button is held, to Node-Red, which runs on the host device. After pressing the button, the user starts the gesture.
Step 2: When Node-Red receives acceleration data from bean #3, it sends a pull request to the other beans, which are attached to the fingertips, to get their acceleration data as well.
Step 3: The other beans (bean #1, #2, #4 and #5) send their acceleration data to Node-Red as well.
At the end of the gesture the user releases the button, and bean #3 then sends an end signal to Node-Red.
Figure 2 shows the actual Node-Red flow diagram. A serial connection to bean #3 receives acceleration data whenever the button is held. To pull the acceleration data from bean #1, #2, #4 and #5, we connected the output of bean #3 to the “bb1 accel”, “bb2 accel”, “bb4 accel” and “bb5 accel” nodes. The output of all of them is sent to the Music Player through a TCP connection on port 8887.
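As a rough illustration of the receiving end of this TCP connection, here is a minimal Python sketch. The wire format is an assumption: the write-up does not say how Node-Red encodes the data, so the sketch supposes newline-delimited text in which each line is either comma-separated acceleration values or the literal text "end". The names read_gesture and serve_once are hypothetical and are not taken from the project code.

```python
import socket

END_MARKER = "end"  # assumed text of the end signal


def read_gesture(lines):
    """Collect acceleration values from an iterable of text lines.

    Reading stops at the first line equal to END_MARKER.
    Each data line is assumed to hold comma-separated numbers.
    """
    values = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line == END_MARKER:
            break
        values.extend(float(v) for v in line.split(","))
    return values


def serve_once(host="127.0.0.1", port=8887):
    """Accept one connection on the given port and read one gesture from it."""
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn, conn.makefile("r", encoding="utf-8") as stream:
            return read_gesture(stream)
```

Keeping the parsing in read_gesture, separate from the socket handling, means it can be exercised with a plain list of strings instead of a live connection.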
Step 4: Node-Red continuously sends the acceleration data of all beans, followed by the “end” signal, to the receiver socket of the Music Player over a TCP connection. When the receiver socket gets the “end” signal, it builds a feature vector from all the acceleration data and passes it to the SVM classifier. The feature vector has 250 elements. If the receiver socket does not receive enough acceleration data from the user, it fills the remaining elements of the vector with 0.
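The zero-padding described in this step can be sketched in a few lines of Python. The vector length of 250 comes from the text; the function name build_feature_vector is hypothetical, and cutting off gestures that produce more than 250 values is an assumption, since the write-up only describes the case of too little data.

```python
FEATURE_SIZE = 250  # length of the feature vector, as stated in the text


def build_feature_vector(values, size=FEATURE_SIZE):
    """Return a list of exactly `size` numbers.

    Short gestures are padded with zeros at the end.
    Long gestures are truncated (an assumption, not described in the text).
    """
    vector = [float(v) for v in values[:size]]
    vector.extend([0.0] * (size - len(vector)))
    return vector
```

For example, build_feature_vector([0.5, -0.25]) returns a list of 250 elements whose first two entries are 0.5 and -0.25 and whose remaining 248 entries are 0.0.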
Step 5: The classifier is a linear SVM trained on 80 samples of each of the 10 gestures (800 in total). We found that a linear classifier works better than polynomial ones in this case. We tested the classifier on a test set of 20 samples per gesture (200 in total), where it reached an accuracy of 96.5%.
The output of the classifier is a number from 0 to 9, each of which is mapped to a specific task in the music player.
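To make the evaluation setup concrete, the following Python sketch shows one way to split 100 samples per gesture into 80 for training and 20 for testing, and how accuracy is computed from predictions. It does not train the SVM itself. The function names are hypothetical, and taking the first 80 samples of each gesture for training (without shuffling) is an assumption, since the write-up does not say how the split was made.

```python
def split_per_gesture(samples_by_gesture, n_train=80):
    """Split each gesture's samples into a training part and a test part.

    `samples_by_gesture` maps a gesture label to its list of feature vectors.
    Returns two lists of (feature_vector, label) pairs.
    """
    train, test = [], []
    for label, samples in sorted(samples_by_gesture.items()):
        train.extend((s, label) for s in samples[:n_train])
        test.extend((s, label) for s in samples[n_train:])
    return train, test


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

With 10 gestures and 100 samples each, this split yields 800 training and 200 test samples. An accuracy of 96.5% on 200 test samples corresponds to 193 correct predictions.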
In week 3 we added two more tasks to the music player: like and unlike. Our app has a list of favorite songs, and the user can add songs to this list or remove them from it with the “like” and “unlike” gestures. The gesture for “like” is thumbs up and the gesture for “unlike” is thumbs down. The full set of gestures is therefore:
(1) play-pause, (2) stop, (3) next track, (4) previous track, (5) fast forward 10 secs, (6) fast forward 20 secs, (7) rewind 10 secs, (8) rewind 20 secs, (9) like and (10) unlike.
The slideshow shows the gesture that corresponds to each of these tasks.
Step 6: In the last step the command is sent to the Spotify controller method, which performs the task that corresponds to the predicted class. This component is much like the one we use for handling text-based commands from the UI.
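The mapping from predicted class to task can be pictured as a simple lookup followed by a call. The Python sketch below is only an illustration: it assumes that class 0 corresponds to task (1), class 1 to task (2), and so on, which the write-up does not state, and the dispatch function and handlers dictionary are hypothetical stand-ins for the real Spotify controller method.

```python
# Task names in the order of the gesture list above.
# Assumption: class index i maps to item (i + 1) of that list.
TASKS = [
    "play-pause",
    "stop",
    "next track",
    "previous track",
    "fast forward 10 secs",
    "fast forward 20 secs",
    "rewind 10 secs",
    "rewind 20 secs",
    "like",
    "unlike",
]


def dispatch(predicted_class, handlers):
    """Run the handler registered for the predicted class.

    `handlers` maps a task name to a zero-argument callable.
    Returns the name of the task that was run.
    """
    if not 0 <= predicted_class < len(TASKS):
        raise ValueError(f"unknown class: {predicted_class}")
    task = TASKS[predicted_class]
    handlers[task]()
    return task
```

Because the handlers are passed in, the same dispatch function could serve both gesture input and text-based commands, as long as both resolve to one of the task names.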
All the related files can be found on my GitHub at: