The Power of Thought
As humans, our ability to think surpasses that of any other creature, and our brains are extremely complex. However, with advances in deep learning using neural networks, aspects of this ability can now be replicated in computers as well. The algorithms behind artificial and recurrent neural networks are loosely modelled on the structure of the human brain.
Google’s TensorFlow [ https://www.tensorflow.org ], Facebook’s PyTorch [ https://pytorch.org ], H2O.ai [ https://www.h2o.ai ] and many other companies are creating groundbreaking deep neural network architectures. By training a neural net on past data, or by letting it learn on its own through reinforcement learning, we can effectively create a brain in a computer that can think better and faster than us humans, though for now only within a specific task.
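To make the brain analogy concrete, here is a minimal sketch of how an artificial neuron and a tiny two-layer network compute, written in plain Python rather than TensorFlow or PyTorch; all weights and biases here are illustrative, not trained values.

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: a weighted sum of its inputs passed
    through a sigmoid activation, loosely mirroring how a biological
    neuron fires once its inputs cross a threshold."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes output to (0, 1)

def tiny_network(x):
    """Two hidden neurons feed one output neuron (illustrative weights)."""
    h1 = neuron(x, [0.5, -0.6], 0.1)
    h2 = neuron(x, [-0.3, 0.8], 0.0)
    return neuron([h1, h2], [1.2, -0.7], 0.05)

print(tiny_network([1.0, 0.5]))
```

Training a real network means adjusting those weights automatically (for example, by backpropagation) until the outputs match past data, which is exactly what the frameworks above do at scale.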
Machine learning algorithms implemented with scikit-learn [ https://scikit-learn.org ] or KNIME [ https://www.knime.com ] are also used to make specific data-based decisions. Once trained, these algorithms make the kinds of predictive and grouping decisions that we humans perform inherently.
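As a sketch of such a grouping-based decision, here is a from-scratch nearest-centroid classifier in plain Python (scikit-learn provides a trained, production-grade equivalent; the data and labels below are toy examples):

```python
def centroid(points):
    """Mean position of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def classify(point, classes):
    """Assign `point` to the class whose centroid is closest
    (by squared Euclidean distance) -- the essence of a
    predictive, grouping-based decision."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    centroids = {label: centroid(pts) for label, pts in classes.items()}
    return min(centroids, key=lambda label: dist2(point, centroids[label]))

# Toy training data: two clusters of 2-D measurements.
training = {
    "small": [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9)],
    "large": [(5.0, 5.1), (4.8, 5.3), (5.2, 4.9)],
}
print(classify((1.0, 1.1), training))  # lands in the "small" cluster
```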
The Power of Sight
We as humans perceive the world around us through our power to see. When we see, we do two things: we recognize what objects are, and we determine where they are located.
We train robots to do both of these as well. Libraries such as OpenCV [ https://opencv.org ] and SimpleCV [ http://simplecv.org ] give us programmers the ability to transfer our power of sight to computers. We can point the computer at static images or video feeds from cameras and then build algorithms that allow robots to understand those images. Algorithms such as YOLO [ https://pjreddie.com/yolo ], Faster R-CNN (Shaoqing Ren et al., 2016), the Single Shot MultiBox Detector (Wei Liu et al., 2016) and more are excellent at detecting objects in a scene once trained.
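These detectors describe "where something is" with bounding boxes, and a standard measure of how well a predicted box matches the true object is Intersection over Union (IoU). A minimal implementation, assuming boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned bounding boxes,
    each given as (x1, y1, x2, y2). Returns a value in [0, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlapping rectangle (may be empty).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```

Detectors such as YOLO and SSD are typically scored as correct when the IoU between their prediction and the ground-truth box exceeds a threshold like 0.5.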
Another thing we humans can do is read: our brains easily turn written text into information for processing. OCR engines like Tesseract [ https://opensource.google.com/projects/tesseract ] or ABBYY FineReader [ https://www.abbyy.com/en-eu/finereader ], combined with the Efficient and Accurate Scene Text (EAST) detection algorithm (Zhou et al., 2017), let us detect text in a scene and convert it into string data.
The Power of Sound
Communicating with each other through sound is a very powerful medium of interaction for us humans. Computers, on the other hand, communicate with each other easily using binary data, and for inter-computer communication sound is not a very useful medium.
However, when robots want to communicate with us humans, or vice versa, we need sound and speech. Applications like Amazon’s Alexa, Apple’s Siri and Google’s Assistant are meant to solve that problem. Moreover, beyond converting sound from one form to another using speech-to-text and text-to-speech, they can understand intents and respond to requests with Natural Language Processing (NLP).
From a programmer’s standpoint, libraries such as CMUSphinx [ https://cmusphinx.github.io ] and Apache’s OpenNLP [ https://opennlp.apache.org ] help robots understand how we humans communicate.
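To illustrate what "understanding an intent" means at its simplest, here is a toy keyword-overlap intent matcher in plain Python. The intents and keyword sets are hypothetical; assistants like Alexa and toolkits like OpenNLP use trained statistical models rather than keyword lists.

```python
# Hypothetical intents and keyword sets, for illustration only.
INTENTS = {
    "weather": {"weather", "rain", "sunny", "forecast"},
    "music": {"play", "song", "music"},
    "time": {"time", "clock", "hour"},
}

def detect_intent(utterance):
    """Return the intent whose keyword set overlaps most with the
    words of the utterance, or None if nothing matches at all."""
    words = set(utterance.lower().split())
    scores = {name: len(words & keywords) for name, keywords in INTENTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(detect_intent("Will it rain tomorrow"))  # weather
```

A real pipeline would sit behind a speech-to-text stage, so the `utterance` here stands in for the transcript the recognizer produces.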
The Power of Touch
Humans can feel: we sense changes in temperature and pressure, and our bodies sometimes respond with pain and discomfort when those environmental parameters go out of bounds. Robots need some of these parameters as well, so that they can respond to changes in their environment.
Sensors can measure almost any environmental parameter and convert it quantitatively into numbers that robots can ingest and process. With sensors for pressure, temperature, humidity, position, toxicity and more, robots can gather gigabytes of data far more accurate than anything we humans could measure with our fingertips.
From a programmer’s perspective, credit-card-sized edge devices such as the Raspberry Pi [ https://www.raspberrypi.org ] and the Arduino [ https://www.arduino.cc ] are perfect for receiving, processing and communicating this data from the environment. Sensors connect to the GPIO pins on these boards, which then read and process the data.
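The "quantitative conversion" step usually means turning a raw analog-to-digital (ADC) reading into a physical unit. As a concrete sketch, here is the conversion for a TMP36 analog temperature sensor (chosen as an example; the TMP36 outputs 0.5 V at 0 °C and changes by 10 mV per °C) read through a 10-bit ADC with a 3.3 V reference:

```python
def adc_to_celsius(reading, v_ref=3.3, resolution=1023):
    """Convert a raw 10-bit ADC reading from a TMP36 temperature
    sensor into degrees Celsius. The TMP36 outputs 0.5 V at 0 degC
    and its output changes by 10 mV per degC."""
    voltage = reading / resolution * v_ref   # raw count -> volts
    return (voltage - 0.5) * 100.0           # volts -> degC

# A reading of 232 on a 3.3 V, 10-bit ADC is roughly 0.748 V,
# i.e. about 24.8 degC -- a comfortable room temperature.
print(round(adc_to_celsius(232), 1))
```

On a real board the `reading` would come from a GPIO/ADC library call; the arithmetic above is the sensor-specific part that stays the same.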
The Power of Action
After we receive information via sight, sound and touch, our brains process that data, together with our thoughts, into the decisions we make. Our muscles then transform those decisions into action. Those actions are executed either directly in the physical world, or fed into a computer through its input devices (the mouse and the keyboard) so that the computer can act on them.
In the virtual world, this is where Robotic Process Automation (RPA) comes in. Free tools like AutoIt [ http://autoitscript.com ] and Sikuli [ http://sikulix.com ], as well as licensed tools like UiPath [ http://uipath.com ], Automation Anywhere [ https://www.automationanywhere.com ], Blue Prism [ https://www.blueprism.com ] and WorkFusion [ https://www.workfusion.com ], enable RPA.
In the physical world, robots take information from the cognitive functions of the Deep Neural Net and act upon them. Take for example a computer vision-based sorting machine that removes defective parts based on visual inspection.
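The sorting machine's decision logic can be sketched in a few lines: assume a (hypothetical) vision model has already assigned each scanned part a defect probability, and the actuator simply routes parts based on a threshold.

```python
def sort_parts(parts, threshold=0.5):
    """Route each part to 'keep' or 'reject' based on the defect
    probability produced by a (hypothetical) vision model.
    `parts` is a list of (part_id, defect_probability) pairs."""
    keep, reject = [], []
    for part_id, defect_prob in parts:
        (reject if defect_prob >= threshold else keep).append(part_id)
    return keep, reject

# Scores as they might come from a trained classifier (illustrative).
scanned = [("A1", 0.02), ("A2", 0.91), ("A3", 0.10), ("A4", 0.77)]
keep, reject = sort_parts(scanned)
print(keep, reject)  # ['A1', 'A3'] ['A2', 'A4']
```

In a deployed system the threshold would be tuned against the cost of shipping a defective part versus discarding a good one.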
By studying each of the human abilities that K.I.T.T. portrayed, and by looking at the tools already prevalent in the market that replicate each of them in a computer-based algorithm, we can combine these pieces into an implementable map for General AI. Transferring the human abilities of Thought, Sight, Sound, Touch and Action into a robotic framework becomes an achievable goal.
The one ability that K.I.T.T. had that we have not been able to replicate is human emotion: the feelings of love, trust and respect for each other are still unique to humans, but even those may be algorithmically driven in the days to come!