Control your A.I.
Enable an AI to say what it sees.
Image classification is difficult not so much because of the algorithm as because of the need for a dedicated training set. Many if not most academic papers use packages such as the CIFAR image set found at www.cs.toronto.edu/~kriz/cifar.html. It takes one look at its contents (as seen in the image on the right) to appreciate that the number of detection categories in this data set is highly limited.
If we’d written this six months ago, the next step would involve parametrising a specific model. Fortunately, advances in AI tech have led to the rise of the Hugging Face company. One of its most interesting products is shown below. For convenience, we will reuse the virtual environment created before. The next step involves downloading the actual library:
(aitranslator) ~/aitranslatespace$ pip install transformers
Hugging Face’s model zoo is backed by an abstraction layer that makes it easy to trial different models.
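To give an idea of what that abstraction layer looks like in practice, here is a minimal sketch built on the pipeline helper from the transformers library. It rests on a few assumptions that are not part of the article: a backend such as PyTorch and the Pillow imaging library are installed next to transformers, the model name google/vit-base-patch16-224 is just one publicly available example, and the helper names top_label and classify_image are made up for illustration.

```python
from typing import Dict, List


def top_label(predictions: List[Dict]) -> str:
    """Return the label with the highest score.

    The image-classification pipeline returns a list of dictionaries,
    each holding a "label" and a "score" entry.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"]


def classify_image(image_path: str,
                   model_name: str = "google/vit-base-patch16-224") -> str:
    """Classify one image file and return the most likely label.

    model_name is an assumption: any image-classification model
    from the Hugging Face hub can be substituted here.
    """
    # Imported lazily so that top_label() stays usable without transformers.
    from transformers import pipeline

    # The first call downloads the model weights and caches them locally.
    classifier = pipeline("image-classification", model=model_name)
    return top_label(classifier(image_path))
```

Calling classify_image("photo.jpg") with an image file of your own should print-ready return a single label; swapping the model_name argument is all it takes to trial a different model.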
COMPUTER VISION – LOCALLY
Debating whether the task of image classification is computer vision or artificial intelligence is a good way to keep a cigar lounge full of engineers busy. Given that this magazine is printed on non-smokable paper, we will agree to disagree and consider image classification as part of the task at hand.
Various vendors, such as Microsoft Azure, make significant profits by offering computer-vision and image-classification services. While using them is appealing, as the models provided are of extremely high quality, their practical drawbacks also have to be considered.
First, image data is large; even when compressed to JPG or a similar format, transferring images to the server (maybe even one abroad) adds latency to the system. Furthermore, in the case of safety-critical systems, cutting the internet connection is enough to render the AI part of the defence system inoperable.
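The latency argument can be made tangible with a back-of-the-envelope calculation. The sketch below is only a rough model: it assumes that upload time is file size divided by uplink bandwidth plus one network round trip, and the figures used in the example (a 2 MB image, an 8 Mbit/s uplink, a 50 ms round trip) are hypothetical values chosen for illustration, not measurements.

```python
def transfer_seconds(size_bytes: int,
                     uplink_mbit_per_s: float,
                     round_trip_ms: float) -> float:
    """Estimate the time needed to push one image to a remote server.

    Simplified model: payload time plus a single network round trip.
    Protocol overhead and server-side processing are ignored.
    """
    payload_bits = size_bytes * 8
    payload_seconds = payload_bits / (uplink_mbit_per_s * 1_000_000)
    return payload_seconds + round_trip_ms / 1000


# Hypothetical example: 2 MB JPG, 8 Mbit/s uplink, 50 ms round trip.
estimate = transfer_seconds(2_000_000, 8, 50)
print(f"Estimated upload time: {estimate:.2f} s")
```

Even under these generous assumptions, every image spends a couple of seconds in transit before the cloud service has seen a single pixel, a delay that local inference avoids entirely.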
Finally, purchasing cloud services incurs significant costs – while playing around with the actual cognitive services is not particularly expensive, compute costs do add up in the longer run. In addition, vendor lock-in makes you dependent on the (often finicky) mode du jour of your provider.
Fortunately, machine vision is one of the oldest areas of Python AI and was interesting long before the firestorm around ChatGPT and co. hit the tech industry. As a result, developers have a wide variety of options to choose from, one of which we are using in this project.