Machine Vision and Big Data

I started my computing career with machine vision. It was my first large project in school a few years ago (or was it a few decades ago, but who is counting). Anyway, the project was to detect defects in computer chips with machine vision. We employed a simple neural network learning algorithm that tried to determine whether the image of a chip matched the image of a good specimen stored in the system. So the problem we were trying to solve was comparing the captured image with one fixed image and deciding whether there was a match within an acceptable level of deviation. It was not perfect, but it worked. Unfortunately, the company that provided us with the grant money decided not to deploy the solution in production, for reasons best known to the company.
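To make the "match within an acceptable level of deviation" idea concrete, here is a minimal sketch in Python. It is not the neural network we used back then. It swaps in a much simpler technique, the mean absolute pixel difference against the reference image, and the function names, the 0-to-1 grayscale pixel range, and the 0.05 tolerance are all assumptions made up for illustration.

```python
from typing import Sequence

# Assumption: an image is a grid of grayscale pixel values in the range 0.0-1.0.
Image = Sequence[Sequence[float]]


def mean_abs_deviation(captured: Image, reference: Image) -> float:
    """Average per-pixel absolute difference between two same-sized images."""
    if len(captured) != len(reference) or any(
        len(c_row) != len(r_row) for c_row, r_row in zip(captured, reference)
    ):
        raise ValueError("images must have the same dimensions")

    total = 0.0
    count = 0
    for c_row, r_row in zip(captured, reference):
        for c, r in zip(c_row, r_row):
            total += abs(c - r)
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def matches_reference(captured: Image, reference: Image,
                      tolerance: float = 0.05) -> bool:
    """True if the captured image is within the (hypothetical) tolerance."""
    return mean_abs_deviation(captured, reference) <= tolerance


# Tiny 2x2 "chips": one with slight sensor noise, one with an obvious defect.
good_chip = [[0.0, 1.0], [1.0, 0.0]]
noisy_chip = [[0.0, 0.9], [1.0, 0.0]]
defective_chip = [[1.0, 1.0], [1.0, 0.0]]

print(matches_reference(noisy_chip, good_chip))
print(matches_reference(defective_chip, good_chip))
```

A fixed threshold like this is brittle under changes in lighting or alignment, which is roughly why a learning algorithm was the more attractive choice for the real project.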

Anyway, today I came across a nicely written short editorial piece in the International Journal of Computer Vision titled simply Big Data. The author describes the new techniques that are now available in the area of machine/computer vision, mostly driven by advances in machine translation and speech-to-text processing. Both of these areas came into their own once large training data sets were used, and that was possible only because of general advances in storing, retrieving and processing big data sets. For more details, please refer to the article.

The current state of image recognition is best illustrated by the two examples below. In each of the two sets, the learning engine was asked to choose the image in either column 1 or 2 that looked like the images in the remaining columns (3-5). So, for the first set the learning engine needed to identify a horse, and for the second set a face. As you can see, the system was able to do so correctly. Of course, this does not mean that the same algorithm would be able to identify a tree if given a bunch of pictures with shrubs etc. But it's a promising start.

Machine Vision to identify a horse and a face in a set of images

  • Rubinstein, M., Liu, C. & Freeman, W.T. Int J Comput Vis (2016) 119: 23. doi:10.1007/s11263-016-0894-5