After years of hard work by the Simon team, Simon 0.4.0 is out, the first major release in a long time.
This new version of the open source speech recognition system Simon features a whole new recognition layer, context-awareness for improved accuracy and performance, a dialog system able to hold whole conversations with the user and more.
You can read about the new release in detail here.
It feels amazing when something you implemented during Google Summer of Code 2012 is part of the release. Also, this was the first time I was part of any release.
I know it is very late to post about my experience during GSoC 2012, but I do not want to miss this opportunity to share it with you all before the year ends.
Let me briefly describe my project again. It is based on multimodal accessibility: I use computer vision to improve speech recognition in Simon. A major obstacle for command-and-control speech recognition systems is telling commands apart from background noise, so my project uses computer vision to decide when to activate or deactivate sound recognition, based on visual cues such as whether the user is actively looking at the screen/robot and speaking.
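To make the idea concrete, here is a minimal Python sketch of such a gate. It is not Simon's actual implementation: the class name `VisionGate`, its parameters and the frame thresholds are all hypothetical, and the per-frame `face_visible` flag is assumed to come from some face detector running on the webcam image. The sketch only shows the gating logic: recognition is switched on after the user has been seen for a few consecutive frames, and switched off after the user has been absent for a while, so that a single missed detection does not cut off a command.

```python
from dataclasses import dataclass


@dataclass
class VisionGate:
    """Hypothetical gate that turns speech recognition on or off based on
    per-frame "is the user facing the screen?" observations."""

    on_after: int = 3    # consecutive frames with a face before activating
    off_after: int = 10  # consecutive frames without a face before deactivating
    active: bool = False
    _streak: int = 0     # frames in a row that disagree with the current state

    def update(self, face_visible: bool) -> bool:
        """Feed one frame's observation; return whether recognition is active."""
        if face_visible == self.active:
            # The observation agrees with the current state: reset the counter.
            self._streak = 0
            return self.active
        # The observation disagrees: count it, and flip once the streak is long enough.
        self._streak += 1
        needed = self.on_after if face_visible else self.off_after
        if self._streak >= needed:
            self.active = face_visible
            self._streak = 0
        return self.active
```

In use, one would call `gate.update(face_visible)` once per camera frame and only pass audio to the recognizer while it returns `True`. Using a longer `off_after` than `on_after` is a design choice of this sketch: it keeps the microphone open through brief glances away from the screen.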
Why did I pick this project in particular?
I came to know about GSoC from our seniors, and the learning experiences they got out of it really motivated me to get into it. So I started searching for projects before the organisation list was even announced. Fortunately, I saw this project idea and really liked it. I had also been working on face detection lately, I knew from seniors that KDE is an awesome community, and computer vision is one of my favourite fields. I love working on things that replace the normal way of using computers and the physical mouse and keyboard. All these factors together drove me towards this project and kept me motivated. So I contacted Peter and we discussed the ideas. It was the first time I stepped onto IRC. I kept discussing the project on IRC and the mailing list, and that’s how it all started.
It has been an incredible learning experience and I’m very happy with the final results. I am more positive and confident than ever. I learnt a lot of basics such as git, makefiles, and building, executing and debugging code, and much more. This project also served as the starting point for my contributions to the open source community. It was my first “serious” project with such a big codebase, but Simon is nicely documented and, with Peter’s guidance, I was able to adjust very quickly.
I also faced many challenges during this period. It was really tough to adhere to the timeline. The CMake build system and git were very new to me. There were many unexpected bugs that surprised me, but it was a lot of fun figuring them out and fixing them. Working on the UI was also time-consuming.
KDE also invited me to Akademy 2012, their annual conference, in Tallinn, Estonia, and sponsored my trip. It was my first international journey. It gave me the opportunity to meet in person people whom I only knew by their IRC nicks. The opportunity to interact and share thoughts with highly intelligent and experienced minds was a life-changing experience and my biggest takeaway, and it would not have been possible without the support of Google, KDE and my mentor.
I also participated in the Randa meeting 2012 in Switzerland as part of the KDE Accessibility team. It was my first sprint ever and it was really productive. I implemented the vision configuration and fixed many bugs there. I would again like to thank Mario Fux for organizing this fantastic event, and all the sponsors and donors who made it possible.
Peter has recorded a great video on context awareness which covers most of the things I implemented during GSoC 2012.
As you can clearly see in the video, Simon has turned into a multimodal speech recognition system. Simon will deactivate the input devices in the absence of the user. This is strikingly similar to day-to-day communication between humans!
I owe a big part of this success to my mentor Peter Grasch, who was always there to answer my questions, offer advice and review the code. I have learnt a lot from him and I am sure I have improved a lot as a programmer. The best thing about working with him was that he never simply disclosed the solution; instead he gently guided me towards it, so I never lost a learning opportunity.
Thanks as well to the many other people in the community whose names I am forgetting. While I am at it, I would also like to thank my friends for putting up with me when I slept during the day and worked at night.
And more than anything else, I am very happy to make my parents proud after so many years of constant hard work they have put in and the sacrifices they have made so that I could chase my dream of becoming a computer engineer. I hope it’s the first of many more proud moments that I will be giving them in the future.
I would like to maintain this project after GSoC and continue contributing to Simon/KDE. So stay tuned, there is much more to come.
To Future GSoC Aspirants:
I would suggest maintaining good communication with the community and trying to be involved with the project as much as possible.
Peter advised me to draw the class diagrams before starting my project, and it really helped me later on. I know we all have a habit of jumping straight into coding, but I would highly recommend having a proper structure diagram ready before you start to code. It will give you a clear idea of the implementation.
Try to budget a lot of extra time in your project application – most of us are not experienced developers and cannot correctly estimate the amount of work something needs. Plus, when additional problems arise (and they will), it’s always better to have time set aside to deal with them. I would highly recommend discussing this with your mentor before submitting your proposal.
Finally, nothing is too hard to accomplish if you love what you do.