Teaching computers to converse

When the Amazon Echo launched in 2014, it was pitched primarily as a smart speaker that let you control your music by voice. The next big phase of the conversational revolution began in 2016 with the advent of chatbots [1]. Facebook announced a developer-friendly platform for building chatbots on Messenger, and soon there were dozens of toolkits promising a working bot in minutes. Chatbots are essentially natural-language text interfaces built from rules and navigated through predefined flows [2].

The bot hype was fueled by app fatigue: companies saw bots as a new way to generate traffic, usage, and engagement without asking users to install yet another app. However, the technology only worked for conversations with predefined flows, like ordering flowers or making a restaurant reservation. The language of interaction is not so “natural” after all – try asking a bot a complex question with irregular starts and stops, shifting tones, pauses, or implied meanings. As a result, chatbots did not deliver on expectations and quickly fell out of use.
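To make the "rules and predefined flows" point concrete, here is a toy sketch of how such a bot works under the hood. The triggers and replies are entirely hypothetical (no real platform's API is shown); the point is that anything off-script falls straight through to a canned fallback.

```python
# Toy rule-based chatbot: hard-coded triggers mapped to canned replies.
# All intents and responses here are illustrative, not from any real product.

RULES = {
    "order flowers": "Great! Roses, tulips, or lilies?",
    "roses": "A dozen roses, coming right up. What delivery address?",
    "book a table": "Sure! For how many people, and at what time?",
}

FALLBACK = "Sorry, I didn't understand that. Try: " + ", ".join(RULES)

def reply(message: str) -> str:
    """Match the user's message against the hard-coded triggers, in order."""
    text = message.lower().strip()
    for trigger, response in RULES.items():
        if trigger in text:
            return response
    # No trigger matched: the bot has no real language understanding
    # or memory, so everything off-script gets the same fallback.
    return FALLBACK
```

Within the flow, `reply("I'd like to order flowers")` works fine; a perfectly natural request like "same as last week, but cheaper" hits the fallback, which is exactly the failure mode described above.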

Real conversations require a level of comprehension and cognition that goes far beyond predefined flows. They are conceptually as well as emotionally complex; it is not just an input-output mapping. First, many parameters, such as what we say, how we say it, where, when, and why, together provide the context of the conversation [3]. In the truly aspirational AI world, computers would capture these nuances. Second, human conversations are full of imperfections, including slang, mixed-language sentences, abbreviations, and accented pronunciation, which the computer needs to parse, ideally generating a response in the same “language”. Third, a conversation is not an isolated exchange of messages: it often builds upon previous exchanges, not necessarily consecutive ones. A true conversational AI system would identify the related messages in the history and build upon that context to make the exchange more relevant and useful to the user. Finally, human conversations are flavored by unique personality traits, quirks, emotions, and surprises, and computers would need to incorporate these to make the interaction fun and enjoyable. Only then would smart assistants become truly smart and truly personal!
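The "builds upon previous exchanges" idea can be sketched with a minimal dialogue state. This is a deliberately simplified illustration under assumed inputs (a real conversational AI would use learned models, not keyword checks); it only shows why remembering earlier turns matters for resolving a follow-up like "What about tomorrow?".

```python
# Illustrative sketch: a follow-up question is only answerable
# because the bot remembers slots filled in an earlier turn.

class Dialogue:
    def __init__(self):
        self.state = {}  # slots remembered across turns

    def handle(self, message: str) -> str:
        text = message.lower()
        if "weather" in text:
            # Remember the topic and city mentioned in this turn.
            self.state["topic"] = "weather"
            if "paris" in text:
                self.state["city"] = "Paris"
            return f"Checking the weather in {self.state.get('city', 'your area')}."
        if "tomorrow" in text and self.state.get("topic") == "weather":
            # "What about tomorrow?" only makes sense given the earlier turn.
            return f"Checking tomorrow's weather in {self.state['city']}."
        return "Sorry, I lost the thread."

d = Dialogue()
print(d.handle("What's the weather in Paris?"))  # Checking the weather in Paris.
print(d.handle("What about tomorrow?"))          # Checking tomorrow's weather in Paris.
```

Without the stored state, the second message would be unintelligible on its own, which is precisely the gap between a stateless bot and a conversational system.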

Many companies, including Amazon, IBM [4], Google, Facebook, and Intel [5], are investing in this area. It is clear that this is the next big revolution in human-computer interaction: conversation feels far more natural than clicks, hovers, or typing, and it is accessible to everyone because there is practically no learning curve.

1. https://venturebeat.com/2016/08/15/a-short-history-of-chatbots-and-artificial-intelligence/
2. https://www.forbes.com/sites/forbestechcouncil/2017/12/04/the-rise-of-conversational-ai/#1e34d65b3b91
3. https://developer.amazon.com/alexa-skills-kit/conversational-ai
4. https://www.ibm.com/watson/advantage-reports/future-of-artificial-intelligence/ai-conversation.html
5. https://www.intel.co.uk/content/www/uk/en/it-managers/conversational-ai.html


2 comments on “Teaching computers to converse”

  1. Hey – great post! I think we’re definitely getting better at NLP and contextual/conversational AI and chatbots. However, for the system to effectively work like we would want it to – understand slang, build upon previous exchanges, pick up on our quirks/traits/emotions, etc. – we would need to allow such devices to constantly monitor us around the clock in order to “chime in” or instantly understand the context of any requests we may have. It would be cool if there was a way to ensure that this could be done while maintaining privacy – and I think companies need to get better at providing that level of trust to their customers: that their data isn’t being sold on and is being used solely for the purpose of making the product or service better for the customer. To truly leverage the full potential of these technologies, and have them integrated into our lives in a positive manner, we first need to tackle data and privacy issues. Transparency will play a huge part in all things privacy going forward – there needs to be clear traceability of where, what, how, why, and when our data is being used – and only then will customers fully embrace integrated technologies. All we’re missing is good data – and a way to harness/capture that data while ensuring privacy – sounds simple enough!

  2. I think it is interesting to look back at the evolution of these voice assistants. Apple’s Siri and Amazon’s Alexa have improved a lot since they were first launched, but there is still definitely a long way to go. Sundar Pichai talked about Google Assistant, hoping it will become “naturally conversational and visually assistive.”1 It was also interesting to see the demo of Google Duplex, where the voice assistant was able to adopt some unexpected human behaviors.

    1. Google Developers (2018), Keynote (Google I/O ’18). Retrieved from YouTube: https://www.youtube.com/watch?v=ogfYd705cRs.
