Computer Vision – How does it work and how can enterprises leverage this tech?

May 24, 2021

“Beauty lies in the eyes of the beholder” – This holds true for human beings, but does it hold true for computers as well? Let us find out.

Human eyes capture an image, and the human brain analyses it to extract meaning.

But have we ever thought about how the brain analyses and understands images? What algorithms run in our brain that let it correctly distinguish between an apple and an orange? How does it correctly identify a human being and put a name to the face?

The brain is one of the most complex organs of the human body. It has inspired scientists to reproduce the way the brain works in technology, and to use that technology to see things that are beyond human capabilities. In a way, this is making humans superhuman.

What is the brain made of?

A complex network of roughly 86 billion neurons (often rounded to 100 billion) makes up the human brain. Each neuron has many inputs called dendrites and one output called an axon.

[Image: human brain. Image courtesy: Google]

Each neuron acts like a logic gate: it filters its inputs and outputs a relevant signal, which then passes to many other neurons, creating a complex network.
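To make the "neuron as a logic gate" idea concrete, here is a minimal sketch of an artificial neuron. It is a heavy simplification of a biological neuron, and the weights and bias values are hypothetical, picked by hand purely for illustration so that the neuron behaves like an AND gate.

```python
def neuron(inputs, weights, bias):
    """Fire (return 1) when the weighted sum of the inputs is above zero."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Hypothetical hand-picked values: both inputs must be 1 for the sum to pass zero.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5


def and_gate(a, b):
    return neuron([a, b], AND_WEIGHTS, AND_BIAS)


def nand_gate(a, b):
    # A second neuron reads the first neuron's output and inverts it,
    # forming a tiny two-neuron network.
    return neuron([and_gate(a, b)], [-1.0], 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_gate(a, b), "NAND:", nand_gate(a, b))
```

Chaining the output of one neuron into the input of another, as nand_gate does, is the same principle that larger networks repeat at a much bigger scale.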

So how does it relate to Computer Vision?

Computer Vision deals with images. Before understanding what Computer Vision does, let us understand images and their structure.

Each image is a matrix of pixels. Each pixel holds an intensity value; a colour image holds three such values per pixel, one for each of the red, green and blue (RGB) channels. In short, an image is a matrix of numerical intensity values.

[Image: how images are structured. Image courtesy: Google]
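As a rough illustration of "an image is a matrix of numbers", the sketch below builds two tiny images by hand using plain Python lists. The pixel values and the helper names (shape, to_gray) are made up for this example, and the grayscale conversion is a simple channel average, which is only one of several possible conversions. In practice a library such as NumPy or OpenCV would hold the matrix.

```python
# A 2x3 grayscale image: one intensity per pixel (0 = black, 255 = white).
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

# A 2x2 colour image: each pixel is an (R, G, B) triple.
colour = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def shape(image):
    """Return (rows, columns, channels) of a list-of-lists image."""
    first_pixel = image[0][0]
    channels = len(first_pixel) if isinstance(first_pixel, tuple) else 1
    return len(image), len(image[0]), channels


def to_gray(image):
    """Collapse RGB to a single intensity by averaging the three channels."""
    return [[sum(pixel) // 3 for pixel in row] for row in image]


print(shape(gray))      # (2, 3, 1)
print(shape(colour))    # (2, 2, 3)
print(to_gray(colour))  # [[85, 85], [85, 255]]
```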

So just by changing the values of the pixels, you can transform the image. This is usually done with a specified image mask (also called a kernel or filter), a small matrix that is slid across the image:

[Image: transforming pixel values with an image mask. Image courtesy: Google]
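The sketch below shows one way a 3x3 mask could be applied to a grayscale image. The function name apply_mask and the sample image are assumptions made for illustration. The mask is applied without flipping it (strictly a correlation), which gives the same result as a convolution for the symmetric masks used here, and border pixels are simply left unchanged to keep the code short.

```python
def apply_mask(image, mask):
    """Slide a 3x3 mask over a grayscale image; border pixels stay unchanged."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += image[r + dr][c + dc] * mask[dr + 1][dc + 1]
            # Keep the result inside the valid 0..255 intensity range.
            out[r][c] = max(0, min(255, round(total)))
    return out


IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]      # leaves the image as it is
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]        # averages each neighbourhood
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]   # exaggerates local contrast

image = [
    [10, 10, 10, 10],
    [10, 100, 100, 10],
    [10, 100, 100, 10],
    [10, 10, 10, 10],
]

blurred = apply_mask(image, BOX_BLUR)
sharpened = apply_mask(image, SHARPEN)
print(blurred[1][1], sharpened[1][1])  # 50 255
```

The same loop produces very different images depending only on the nine numbers in the mask, which is why masks are such a common building block.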

How does this transformation help in Computer Vision?

In real life, images come with a lot of noise. Depending on how the image was captured (X-ray, digital camera, ultrasound, CT scan, etc.), different types of noise can hide important details. For example, an X-ray may contain a minute tumour that goes undetected by the human eye; transforming the image with suitable filters can make the tumour detectable. Other image filters are used to detect the edges and corners of objects.

The techniques mentioned above come under Classical Image Processing. They are still in use alongside modern image processing techniques based on AI/Deep Learning.
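As a small, hedged illustration of the two ideas above, the sketch below removes a single noisy pixel with a 3x3 median filter and then highlights a vertical edge with the horizontal Sobel mask. These are two common classical filters chosen for the example, not necessarily the ones used for any particular medical image, and the tiny test image is made up.

```python
from statistics import median


def median_filter(image):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]


def horizontal_gradient(image):
    """Sobel response to left-to-right intensity changes; borders are set to 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(image[r + dr][c + dc] * SOBEL_X[dr + 1][dc + 1]
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1))
    return out


# Dark left half, bright right half, plus one "salt" noise pixel at row 1, column 1.
noisy = [
    [0, 0, 0, 200, 200, 200],
    [0, 255, 0, 200, 200, 200],
    [0, 0, 0, 200, 200, 200],
    [0, 0, 0, 200, 200, 200],
]

clean = median_filter(noisy)
edges = horizontal_gradient(clean)
print(clean[1])  # [0, 0, 0, 200, 200, 200]
print(edges[1])  # [0, 0, 800, 800, 0, 0]
```

After filtering, the noise pixel is gone and the large gradient values mark exactly the columns where the dark region meets the bright one.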

What is AI/Deep Learning?

Artificial Intelligence is a big umbrella that covers Classical Machine Learning and Deep Learning (Neural Networks), along with related areas such as Data Science. The basic idea of AI is to derive the algorithm from data. This means that different data produce different algorithms, whereas in traditional programming we write the algorithm ourselves based on fixed rules.
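The difference between a fixed rule and an algorithm derived from data can be sketched with a toy problem: deciding whether a pixel counts as "bright". Everything below is hypothetical, including the sample intensities and the very simple learning method (the midpoint between the two class averages), which stands in for real machine learning only to show the principle.

```python
def rule_based_is_bright(intensity):
    """Traditional programming: the threshold is a fixed rule set by the programmer."""
    return intensity > 127


def learn_threshold(samples):
    """Derive a threshold from (intensity, is_bright) examples:
    the midpoint between the average bright and the average dark intensity."""
    bright = [value for value, label in samples if label]
    dark = [value for value, label in samples if not label]
    return (sum(bright) / len(bright) + sum(dark) / len(dark)) / 2


# Hypothetical labelled pixels from two different cameras.
daylight = [(40, False), (60, False), (200, True), (220, True)]
night = [(5, False), (15, False), (70, True), (90, True)]

t_day = learn_threshold(daylight)
t_night = learn_threshold(night)
print(t_day, t_night)  # 130.0 45.0

# The same pixel value gets a different answer depending on the data used.
print(rule_based_is_bright(80), 80 > t_day, 80 > t_night)  # False False True
```

The learning code never changes, yet feeding it different data yields a different decision rule, which is the point the paragraph above makes.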

Deep Learning works with billions or even trillions of data points. This is typical for images: each image contains a huge number of pixels, and creating smart algorithms requires a huge number of images.
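A quick back-of-the-envelope calculation shows how fast the numbers grow. The image size (Full HD colour) and the dataset size (100,000 images) are assumptions picked for the arithmetic, not figures from any specific project.

```python
def values_per_image(width, height, channels):
    """Number of intensity values stored in one image."""
    return width * height * channels


# Assumed example: Full HD colour images.
per_image = values_per_image(1920, 1080, 3)

# Assumed example: a dataset of 100,000 such images.
dataset_size = 100_000
total_values = per_image * dataset_size

print(f"{per_image:,} values per image")      # 6,220,800 values per image
print(f"{total_values:,} values in dataset")  # 622,080,000,000 values in dataset
```

Even this modest dataset contains hundreds of billions of numbers, which is why deep learning needs so much data and computing power.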

Deep Learning comprises various neural network architectures inspired by biological neurons. Trained on a huge image dataset, these architectures learn patterns (features) from the images and are fine-tuned into a model that can then detect the same patterns in images it has not seen before (e.g. face patterns for Face Recognition, object patterns for Object Detection, and many others).
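Learning a pattern from example images can be sketched with a single-neuron perceptron, the simplest ancestor of today's deep networks. This is an illustration under stated assumptions, not how production models are trained: the 2x2 "images", their labels and the function names are all made up, and real networks have many layers and far more data.

```python
def predict(pixels, weights, bias):
    """Single neuron: output 1 if the weighted sum of the pixels is above zero."""
    total = sum(p * w for p, w in zip(pixels, weights)) + bias
    return 1 if total > 0 else 0


def train(examples, epochs=20, learning_rate=1):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for pixels, label in examples:
            error = label - predict(pixels, weights, bias)
            weights = [w + learning_rate * error * p
                       for w, p in zip(weights, pixels)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical 2x2 images flattened as [top-left, top-right, bottom-left, bottom-right].
# Label 1 = bright pixels on the left side, label 0 = bright pixels on the right side.
examples = [
    ([1, 0, 1, 0], 1), ([1, 0, 0, 0], 1), ([0, 0, 1, 0], 1),
    ([0, 1, 0, 1], 0), ([0, 1, 0, 0], 0), ([0, 0, 0, 1], 0),
]

weights, bias = train(examples)
print(weights, bias)  # [1, -1, 1, -1] 0

# Images that were not in the training set:
print(predict([0.9, 0.1, 0.8, 0.2], weights, bias))  # 1 (left side brighter)
print(predict([0.2, 0.7, 0.1, 0.9], weights, bias))  # 0 (right side brighter)
```

Nobody told the program which pixels matter. The positive weights on the left pixels and the negative weights on the right pixels were learned from the examples.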

How can Computer Vision be utilized by Enterprises?

Computer Vision promises huge opportunities in almost every domain, including Agriculture, Cyber Security, Transportation, Defence, Chip Manufacturing, Life Sciences and Healthcare.

In Agriculture, vision-based algorithms can assess the quality of farm produce, monitor the ripening of fruits and vegetables, detect leaf diseases, and then suggest the right amount of nutrients.

In Cyber Security, models can detect mouse movement patterns that point to a fraudulent user and flag that user.

In Chip Manufacturing, trained vision models can detect the components on a PCB and spot minute lamination defects.

In Healthcare, early-stage detection of tumours can help treat cancer or other life-threatening diseases before they progress.

There are numerous other domains where computer vision can be used, and the opportunities are vast. The simplest example we see every day is Face Recognition on our mobile phones: with or without a beard, and across changes in facial expression, the phone still recognizes your face.

Computer vision algorithms today can identify hidden beauties of nature that are not visible to the human eye. Thus, the statement “Beauty lies in the eyes of the beholder” holds true for technology as well.

This blog has been written by Ketan Gaydhani, Technical Project Manager at IVL Global. With 14+ years of working experience in Application Support, he is actively involved in solving functional and technical problems for customers so that they can have a seamless experience with the application. His areas of interest include interacting with customers to understand their business, so that their experience of the application is delightful and their business challenges are solved in a way that increases their revenue stream.