In the context of digital transformation, Artificial Intelligence (AI) presents a tremendous opportunity to disrupt traditional business models.
I am particularly interested in Computer Vision (CV), a subfield of AI that offers many business value opportunities that could drive digital transformation.
A common misconception is that digital transformation initiatives require significant resources and investment. This is simply not the case: most successful initiatives start small, with a clear outcome and a ruthless focus on execution through iteration.
With this in mind, I plan to create a small Computer Vision test lab for education and testing purposes.
What is Computer Vision?
Computer Vision is a field of study that seeks to develop techniques to help computers “see” and understand the content of digital images such as photographs and videos.
In short, the goal of a Computer Vision problem is to use observed image data to infer something about the world.
Computer Vision is a multidisciplinary field, often categorised as a subfield of Artificial Intelligence (AI) and Machine Learning (ML), and may involve both specialised methods and general learning algorithms.
Common Computer Vision techniques include:
- Image Classification: Assign a label to a whole image based on its content.
- Object Detection: Locate and label objects within an image, typically with bounding boxes.
- Object Tracking: Follow an object of interest across the frames of a given scene.
- Semantic Segmentation: Assign a class to each pixel in an image.
- Instance Segmentation: Separate the individual instances of each class at the pixel level.
The proposed Computer Vision test lab will provide a real-world environment to explore these techniques and to identify their strengths, weaknesses and barriers.
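The difference between the techniques listed above is easiest to see in the shape of the output each one produces. The sketch below is not a model: it uses small hand-written masks (all values, class names and function names are assumptions made for illustration) to show how a per-pixel instance mask relates to detection-style bounding boxes and to image-level labels. Object tracking is left out because it needs a sequence of frames.

```python
from collections import defaultdict

# Toy results for a 6x4 pixel image. All values are made up for illustration.
# Semantic segmentation: one class id per pixel (0 = background, 1 = person).
semantic_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Instance segmentation: one instance id per pixel (0 = background).
# Both people share class 1 above, but get separate ids here.
instance_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]


def boxes_from_instances(mask):
    """Derive one (x_min, y_min, x_max, y_max) box per instance id."""
    pixels = defaultdict(list)
    for y, row in enumerate(mask):
        for x, instance_id in enumerate(row):
            if instance_id != 0:
                pixels[instance_id].append((x, y))
    boxes = {}
    for instance_id, points in pixels.items():
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        boxes[instance_id] = (min(xs), min(ys), max(xs), max(ys))
    return boxes


def image_labels(mask, class_names):
    """Image-level labels: the classes that appear anywhere in the mask."""
    present = {value for row in mask for value in row if value != 0}
    return sorted(class_names[value] for value in present)


detections = boxes_from_instances(instance_mask)
labels = image_labels(semantic_mask, {1: "person"})
print(detections)  # {1: (1, 0, 2, 1), 2: (4, 1, 5, 2)}
print(labels)      # ['person']
```

In practice each output would come from a trained network rather than being derived from another mask; the point here is only that classification yields labels, detection yields boxes, and segmentation yields per-pixel maps.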
Computer Vision is an exciting field of study, which continues to gain momentum across many industries.
In the context of work, the list below highlights a few examples where Computer Vision could potentially add value.
- Anomaly Alert System: Provide real-time surveillance of a specific location, autonomously alerting users when a specific object enters the scene and/or executes a pre-defined movement.
- Security/Privacy Filter: Monitor live video streams (e.g. video conferences, webinars), automatically detecting and redacting specific objects (e.g. locations, individuals, displays).
- Employee Social Distancing: In response to the COVID-19 pandemic, help users practise safe social distancing (physical separation) by monitoring office locations and notifying users of any risk.
- Fever Control: Monitor body temperature to detect potential fever symptoms and deliver an immediate notification to the user.
- Accessibility: Help users with a visual impairment remain safe and productive within the workplace.
The proposed Computer Vision lab must be capable of testing these use cases.
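As an illustration of the logic that sits downstream of the vision model in a use case such as Employee Social Distancing, the sketch below flags pairs of people who are too close together. It assumes that an upstream detector has already produced each person's position on the floor in metres; the function name, the identifiers and the 2.0 metre threshold are all hypothetical choices for this example.

```python
from itertools import combinations
from math import dist

MIN_SEPARATION_M = 2.0  # assumed threshold in metres, chosen for illustration


def too_close(positions, min_separation=MIN_SEPARATION_M):
    """Return pairs of person ids standing closer than min_separation.

    positions maps a person id to an (x, y) floor position in metres.
    """
    return [
        (id_a, id_b)
        for (id_a, pos_a), (id_b, pos_b) in combinations(sorted(positions.items()), 2)
        if dist(pos_a, pos_b) < min_separation
    ]


# Hypothetical positions for three detected people.
people = {"p1": (0.0, 0.0), "p2": (1.2, 0.5), "p3": (5.0, 5.0)}
alerts = too_close(people)
print(alerts)  # [('p1', 'p2')]
```

Converting pixel coordinates from a camera into floor positions requires calibration, which is outside the scope of this sketch.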
I have defined the following high-level requirements to support the creation of a Computer Vision test lab.
- The Computer Vision Test Lab capital investment must be low (< £200.00), with minimal operating costs (< £100.00pm).
- The Computer Vision Test Lab must be viable at rural locations with limited (intermittent) network connectivity, and must consume less than 10 W of power.
- The Computer Vision Test Lab must accurately represent a “real world” production architecture, including edge infrastructure (e.g. Camera, Compute, Storage).
- The Computer Vision Test Lab must support modern AI/ML software libraries, including CUDA, cuDNN and TensorRT.
- The Computer Vision Test Lab must leverage open-source licensed software libraries (e.g. OpenDataCam), ideally licensed under MIT, GPL 2.0/3.0 or Apache License 2.0.
- The Computer Vision Test Lab must support autonomous operation, without physical or virtual human intervention.
- The Computer Vision Test Lab must include appropriate software and physical security controls (Zero Trust), including the ability to physically secure all compute/storage devices.
These requirements aim to ensure the Computer Vision test lab has a defined scope and can be easily reproduced.
The high-level diagram below outlines the proposed Computer Vision test lab architecture.
I have selected an Nvidia Jetson Nano Developer Kit as my edge compute/storage capability.
The Nvidia Jetson Nano is a small, reasonably powerful computer that can run a wide range of advanced neural networks, including the full native versions of popular ML frameworks such as TensorFlow, PyTorch, Caffe/Caffe2, Keras and MXNet.
The table below highlights the performance benchmarks, which are commonly multiple times better than those of competing edge computing devices (e.g. Raspberry Pi).
When combined with the Nvidia JetPack SDK, the Nvidia Jetson Nano delivers an easy-to-use, cost-effective (£100) and power-efficient (5W) compute/storage capability that supports workloads such as image classification, object detection, segmentation and speech processing.
Bill of Materials
The proposed Computer Vision test lab architecture is very cost-effective. I have selected the following components based on their interoperability.
- Nvidia Jetson Nano Developer Kit: £101.50
- SanDisk Ultra 64GB A2 microSD Card: £9.69
- Geekworm 5V 4A 4000mA AC-DC Adaptor: £12.99
- Raspberry Pi Camera v2.0+: £24.00
- EDiMAX EW-7611ULB USB Wi-Fi Adapter with Bluetooth: £12.17
- Microsoft Azure: £75.00pm (Variable)
The total one-time cost is £160.35, with a variable monthly cost from Microsoft Azure (estimated at £75.00).
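The arithmetic behind that total, and the comparison against the cost requirements stated earlier (capital below £200.00, operating costs below £100.00 per month), can be checked with a short script. The prices are those listed above; the variable names are illustrative, and `Decimal` is used only to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

# One-time hardware costs in GBP, as listed in the bill of materials.
one_time_costs = {
    "Nvidia Jetson Nano Developer Kit": Decimal("101.50"),
    "SanDisk Ultra 64GB A2 microSD Card": Decimal("9.69"),
    "Geekworm 5V 4A AC-DC Adaptor": Decimal("12.99"),
    "Raspberry Pi Camera v2.0+": Decimal("24.00"),
    "EDiMAX EW-7611ULB USB Wi-Fi Adapter": Decimal("12.17"),
}
monthly_cost = Decimal("75.00")  # Microsoft Azure estimate (variable)

CAPITAL_LIMIT = Decimal("200.00")  # capital investment requirement
MONTHLY_LIMIT = Decimal("100.00")  # operating cost requirement

capital_total = sum(one_time_costs.values(), Decimal("0"))
within_capital = capital_total < CAPITAL_LIMIT
within_monthly = monthly_cost < MONTHLY_LIMIT

print(capital_total)   # 160.35
print(within_capital)  # True
print(within_monthly)  # True
```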
Over the coming weeks, I plan to have the Computer Vision test lab up and running, at which point I will document the setup and initial findings.