The relationship between video and artificial intelligence is changing our experience of the world. Artificial Intelligence (AI) refers to “machines that respond to stimulation consistent with traditional responses from humans, given the human capacity for contemplation, judgement and intention” (Shubhendu and Vijay in West and Allen 2018). In layman’s terms, these are computer programs built to think, and often learn, in a similar way to the human brain. However, unlike humans, AIs can have access to continually expanding memories and can utilise and understand sets of data that span the globe. The reach of artificial intelligence is growing exponentially, and a key benefit is that AI can often replace the human input required by certain processes – at a much faster pace.
As AIs can be taught and built for any task, their potential is virtually endless; however, some of the most interesting uses arise when AI crosses paths with video.
One area of AI development that has grown substantially in recent years is facial recognition. This has wide-reaching benefits in terms of personal and corporate security – safeguarding and private access, for example. But with systems able to recognise the minute details of people’s faces, the science-fiction future that we have seen in so many movies is becoming an everyday reality. A famous example is Minority Report with Tom Cruise, where we see a futuristic interpretation of the power of AI and facial recognition used to predict potential criminal activity. In China, AI is being used to match live video feeds with a database of portraits collected from government-issued IDs. This information is used to target advertising, track criminals and even publicly shame jaywalkers, among various other applications.
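The details of these systems aren’t public, but the core matching step in most face-recognition pipelines is conceptually simple: a model converts each face into a numeric “embedding”, and a live capture is compared against the stored database by similarity. Here is a minimal, purely illustrative sketch of that lookup step – the tiny hand-made vectors stand in for real model output, and all IDs and thresholds are made up:

```python
import math

def cosine_similarity(a, b):
    """Similarity between two embeddings: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def match_face(live_embedding, id_database, threshold=0.8):
    """Return the best-matching ID, or None if nothing clears the threshold."""
    best_id, best_score = None, threshold
    for person_id, stored in id_database.items():
        score = cosine_similarity(live_embedding, stored)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id

# Toy 4-dimensional embeddings; a real system would use a face model's output.
database = {
    "ID-1001": [0.9, 0.1, 0.3, 0.2],
    "ID-1002": [-0.4, 0.8, -0.1, 0.5],
}
live = [0.88, 0.12, 0.31, 0.18]  # a slightly noisy capture of ID-1001's face
print(match_face(live, database))  # → ID-1001
```

At national scale, the same comparison runs against millions of stored portraits – the engineering challenge is speed and accuracy, not the basic idea.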
This video from The Wall Street Journal shows how the AI is used:
If this isn’t sinister enough already, the Chinese government has recently been condemned for its tracking and control of the Uighur Muslims. This recent article from the Guardian delves deeper into the nature of facial recognition in a political and ethical context.
The volume of cameras and video content being captured is so massive that it is becoming impossible for security operators to keep up. In this context, AI does the job of analysis, leaving the human free to focus on critical decision-making.
The question is, will the AI ever be able to make the critical decisions for us?
As far as critical decision-making goes, the use of AI in video editing is an area worth exploring. More and more tools are being developed that can replace the traditional role of a video editor.
One of the most well-known AI platforms is IBM’s Watson. Famous for winning the American quiz show Jeopardy! in 2011, Watson has grown into a huge AI platform. In the world of video, IBM used Watson to pick the best moments in the 2016 film Morgan to create one of the first AI-assisted movie trailers. By analysing other trailers, Watson learned what makes a trailer good, and how to select and curate similar scenes in order to make something that draws people into the cinema.
However, the AI didn’t create the trailer alone:
“After learning what keeps audiences on the edge of their seats, the AI system suggested the top 10 best candidate moments for a trailer from the movie Morgan, which an IBM filmmaker then edited and arranged together.” (https://www.ibm.com/blogs/think/2016/08/cognitive-movie-trailer/).
So, Watson is no Walter Murch; but it did reduce the length of the process from weeks to hours – showing the true power of AI. This is the trailer:
Jumping forward to today, a new Watson-based system has been in the news which could give video editors a run for their money. During this year’s Wimbledon tennis championships, the AI has been used to analyse live video footage of all the matches – including crowd reactions and the gestures of players – to create a highlight reel within minutes of a match finishing. The level of detail is formidable: Watson will listen for the sound of the racket hitting the ball to make sure that the cuts are precise.
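IBM hasn’t published the internals of that highlight pipeline, but the “listen for the racket” idea can be sketched simply: scan the audio track for sharp spikes in amplitude and treat them as candidate cut points. The toy detector below works on a hand-made list of amplitude samples; the window size, threshold and data are all illustrative, not Watson’s actual parameters:

```python
def detect_hits(samples, window=4, threshold=0.6):
    """Return sample indices where windowed average amplitude spikes."""
    hits = []
    for i in range(len(samples) - window + 1):
        energy = sum(abs(s) for s in samples[i:i + window]) / window
        # Register a hit, but skip spikes too close to the previous one.
        if energy > threshold and (not hits or i - hits[-1] > window):
            hits.append(i)
    return hits

# Mostly quiet audio with two sharp "racket hit" bursts.
audio = ([0.05] * 10 + [0.9, 0.95, 0.85, 0.8] +
         [0.05] * 10 + [0.88, 0.92, 0.9, 0.87] + [0.05] * 6)
print(detect_hits(audio))  # → [9, 23]
```

A real system would combine cues like this with crowd noise and player gestures, but the principle is the same: machine-detectable events anchor the edit points.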
In all these cases, AI is being used to analyse video and make tasks quicker and easier, spotting things that people may have missed.
At Nice we are working with video and AI in the context of learning and development. We have developed a solution called Total Recall. From a user perspective, Total Recall involves watching a video and then answering questions based on the video you have just seen. From a production perspective, the AI reads the transcript of the video and creates the questions in seconds.
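Total Recall’s actual pipeline isn’t described here, but one common, simple approach to generating questions from a transcript is the cloze (fill-in-the-blank) technique: take each sentence and blank out a known key term. This toy sketch shows the idea – the transcript, keyword list and output format are all invented for illustration:

```python
import re

def make_cloze_questions(transcript, keywords):
    """Turn transcript sentences into fill-in-the-blank questions."""
    questions = []
    for sentence in re.split(r"(?<=[.!?])\s+", transcript.strip()):
        for term in keywords:
            if term.lower() in sentence.lower():
                blanked = re.sub(re.escape(term), "_____",
                                 sentence, flags=re.IGNORECASE)
                questions.append({"question": blanked, "answer": term})
                break  # one question per sentence
    return questions

transcript = ("Watson was developed by IBM. "
              "It analysed trailers to learn what builds suspense.")
for q in make_cloze_questions(transcript, ["IBM", "trailers"]):
    print(q["question"], "->", q["answer"])
```

A production system would use language models rather than a keyword list to pick what to ask about, but the payoff is the same one described above: questions generated in seconds rather than authored by hand.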
The AI aspect of Total Recall is powerful because of the speed and efficiency of production. It’s also powerful because the AI algorithms have been built on solid learning theory. Using techniques such as open input and delayed feedback means that the style of learning that is produced by the AI is more likely to ‘stick’ with the learner.
Video as the medium is the icing on the cake, because everyone loves watching (great) video: the element of empathy helps us to identify with a situation and immerse ourselves in the learning experience. A great combination of video with AI creates memorable learning experiences at the push of a button.
Want to learn more? Click Here