Stories: My project -- Chatbot with gesture-based user input
So this was the topic (given in the title) of my "paper", or "project", for my final year in college. Initially the topic I selected was just "gesture recognition", and I went forward with that, working on it whenever I got the time. At first I just watched some of the tutorials that naturally come up when you search for gesture recognition, and I also looked up the OpenCV library (one of my friends had told me about it). That is how I landed on an interesting article whose author had gone through several tutorials on hand sign gesture recognition and briefly summarised them in a list.
I suppose it was a good article in the sense that it gave me a lot of information about the different paradigms for approaching an implementation, without getting bogged down in the details. I naturally read up on several of the concepts involved. The one I liked the most, and thought was simple enough, was the convexity defect approach. There were two reasons for this. One, the code was in Python, so it didn't require me to learn some complex C or C++ API for whatever libraries were used or whatever logic might be needed in those low-level but powerful languages. Not that I had any problem with some complexity, but for my purposes, for the college reviews (for grading), I wanted something that could be implemented easily. And perhaps I also just needed a backup in case I was not able to complete the project in time... in that case the code would be available from the website. (Thankfully, that didn't happen.) Two, the code was also small, in that it used only very high-level functions; and besides, Python programs are super easy to implement.
Regardless, I followed the tutorial from the website referred to above for the convexity defect method.
So it was not a bad tutorial, although it had outdated references to OpenCV and may have used an older version of Python. But it did introduce me to many of the OpenCV functions for manipulating images, such as conversion to grayscale and to binary, as well as for finding geometrical properties of the objects captured in the images, such as contours. It also introduced me to a special algorithm called the convexity defect algorithm, which finds the cavity points in the image of the captured object (for example, if we capture a gesture of a hand holding up four fingers, then the cavities would be the points between the fingers). The website hosting the tutorial was a nice little blog, which seemed to contain personal stuff as well. But it did seem like the author definitely knew what he was doing.
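To make that concrete, here is a minimal sketch of how those pieces fit together, as I understand it. This is not the tutorial's actual code; it assumes OpenCV 4's Python bindings, and "hand.jpg" is just a placeholder file name:

```python
import cv2

# Read a frame containing the hand ("hand.jpg" is a placeholder).
frame = cv2.imread("hand.jpg")

# Grayscale, blur, then threshold to a binary image so the hand
# stands out from the background (Otsu picks the threshold for us).
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
_, binary = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Find contours and keep the largest one, assuming it is the hand.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
hand = max(contours, key=cv2.contourArea)

# Convexity defects are the "cavities" between the convex hull and
# the contour; for a hand these are the gaps between the fingers.
hull = cv2.convexHull(hand, returnPoints=False)
defects = cv2.convexityDefects(hand, hull)

gaps = 0
if defects is not None:
    for i in range(defects.shape[0]):
        start, end, far, depth = defects[i, 0]
        # depth is a fixed-point distance (scaled by 256); keep only
        # deep defects, which correspond to gaps between fingers.
        if depth > 256 * 40:
            gaps += 1

print("finger gaps found:", gaps)
```

Counting the deep defects like this is how the number of raised fingers can be estimated from a single frame.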
One of the other benefits I got from this project was that it introduced me to OpenCV, a powerful computer vision library available in various languages. To understand the code from the tutorial I looked up many functions in the documentation, where brief explanations were given, and perhaps due to my background in C++ and Java, and the many hours spent reading up on the classes of different APIs, I found it easy to know what to look for. Of course, in hindsight I realize there is one more aspect I need to focus on, which is data and the different ways in which it is represented and accepted, in this case in Python, but really in any language. All in all, it was a good experience.
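For example (my own snippet, not from the tutorial): in the Python bindings, an OpenCV image is simply a NumPy array, so its representation boils down to a shape and a dtype, which is exactly the kind of data detail I mean:

```python
import cv2

img = cv2.imread("hand.jpg")      # "hand.jpg" is a placeholder
print(type(img))                  # <class 'numpy.ndarray'>
print(img.shape, img.dtype)       # e.g. (480, 640, 3) uint8, BGR order

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(gray.shape)                 # e.g. (480, 640): a single channel
```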
I haven't talked about chatbots yet. Chatbots I wanted to learn from the ground up, so I decided to learn in Java. I found an article on codeproject.com. It had what I wanted: it explained chatbots from the ground up, from the basics. What surprised me was that it didn't use artificial intelligence at all; I was expecting a heavy dependence on AI. It was based on the keyword concept: simply storing keywords and the responses associated with them, then checking each word of the input sentence against the keywords in the database. I suppose, now that I think about it, it seems absurdly simple, and it is perhaps very limited.
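Roughly, the idea looks like this. This is a sketch in Python rather than the tutorial's Java, and the keywords and replies here are my own made-up stand-ins for its database:

```python
import random

# A toy keyword -> responses table, standing in for the database.
KNOWLEDGE_BASE = {
    "hello": ["Hi there!", "Hello!", "Hey, how are you?"],
    "weather": ["I can't see outside, sorry.",
                "No idea, I live in a terminal."],
    "bye": ["Goodbye!", "See you later."],
}

DEFAULT = ["I'm not sure I follow.", "Could you rephrase that?"]

def respond(sentence):
    # Check each word of the input against the keyword table,
    # which is the essence of the keyword concept.
    for word in sentence.lower().split():
        word = word.strip(".,!?")
        if word in KNOWLEDGE_BASE:
            return random.choice(KNOWLEDGE_BASE[word])
    return random.choice(DEFAULT)

while True:
    line = input("you> ")
    if line.strip().lower() == "quit":
        break
    print("bot>", respond(line))
```

Picking one of several responses per keyword at random is what gives the bot a little variety between turns.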
For instance, designing the database of keywords and responses is based on your subjective knowledge of how to respond in certain situations, not to mention how time-consuming it is just to design a few sentences. And I think it doesn't really allow for the open-ended conversations that often happen with people in the real world, where everyone has their own point of view. Perhaps the thing you can easily design would be fact-based responses... One more negative aspect I noticed in the said tutorial was that in the knowledge base (database), all the responses the author had put up were negative in nature; although there were three responses for each keyword, all of them were negative, with just different wordings. So he may have rushed this part, as not a lot of thought may have been put into it. But when I tested the chatbot in Java, I found it was not that bad... if a user converses with the bot for, say, five or six exchanges, he may not figure out that it is actually a chatbot he is conversing with. (Although if he sets out to test it, he could probably figure it out within ten tries.)
I suppose the original aim I wanted to accomplish with the chatbot was not to hold a real conversation, but just to provide rich processing capabilities, so that it can accept different sentences as input and give appropriate responses. What I have in mind when I say a "response" is based on my experience with Siri on the iPad and Google Assistant on Android, where they are able to gather what you want from your input even if you make spelling mistakes or semantic mistakes, or even if you use some kind of short form like an acronym. And there is also the way this software is able to interact with the different hardware components attached: they are able to perform the function the user wanted them to do.
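As one small illustration of that kind of robustness (my own sketch, nothing from the tutorial): Python's standard difflib can forgive simple spelling mistakes when matching an input word against the keyword table:

```python
import difflib

KEYWORDS = ["hello", "weather", "goodbye"]

def match_keyword(word):
    # get_close_matches does approximate string matching, so a typo
    # like "wether" still resolves to the "weather" keyword.
    matches = difflib.get_close_matches(word.lower(), KEYWORDS,
                                        n=1, cutoff=0.8)
    return matches[0] if matches else None

print(match_keyword("wether"))   # -> weather
print(match_keyword("helo"))     # -> hello
print(match_keyword("xyz"))      # -> None
```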
Basically, what I think is important is that when users sit down to chat with a bot, they shouldn't feel restricted; otherwise it becomes a turn-off, and many users won't continue to converse further. This matters even more when the person is conversing with the use of gestures. What I wanted with the "gesture" part was to give each gesture a meaning corresponding to one English word, in the hope that the gestures could then be used to form sentences, so that a deaf person could communicate with a computerized chatbot assistant the same way a hearing person would with their voice. So I also assumed that users should be able to produce gestures rapidly and without much training, though I suppose that might only be possible for people in the deaf community.
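As a sketch of what I mean (entirely hypothetical; the gesture labels and the word mapping here are invented for illustration), suppose the recognizer reports each gesture as a number, say the count of finger gaps from the convexity defects:

```python
# Hypothetical mapping from recognized gestures to English words;
# in the project, the numbers would come from the convexity-defect
# recognizer (e.g. the count of finger gaps in a frame).
GESTURE_WORDS = {
    1: "hello",
    2: "what",
    3: "weather",
    4: "today",
}

def build_sentence(gesture_stream):
    # Accumulate one word per recognized gesture until a sentinel
    # (here 0, an imagined "end of sentence" gesture) is seen.
    words = []
    for g in gesture_stream:
        if g == 0:
            break
        if g in GESTURE_WORDS:
            words.append(GESTURE_WORDS[g])
    return " ".join(words)

# e.g. the recognizer emits gestures 2, 3, 4, then the end gesture:
print(build_sentence([2, 3, 4, 0]))   # "what weather today"
```

The resulting sentence could then be handed straight to the keyword matcher above, which is how the two halves of the project were meant to connect.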