At Artefact, we care about positively impacting people, the environment, and the community. That’s why we are committed to partnering with nonprofit organizations that make these values the basic building blocks of their vision.
Therefore, we collaborated with Smart Parks, a Dutch company whose cutting-edge sensor solutions help conserve endangered wildlife and manage park areas efficiently.
In this series of posts, we chronicle our journey in designing and building an ML system around Smart Parks’ camera-trap media. The goal of the project is to ingest the data coming from the camera traps and use an ML approach to extract insights, such as the presence of people or specific kinds of animals in the captured images and videos. Park rangers then use this information to better protect the wildlife and to detect possible dangers, like poachers, sooner.
Introduction
Smart Parks needed a wildlife monitoring system able to accomplish the following tasks: ingest the media coming from the camera traps, automatically detect the presence of people or specific kinds of animals, and make these insights available to the park rangers.
Our guiding principle favored speed: as we got started, our singular priority was to deploy a barebones but fully functioning end-to-end product as soon as possible.
This is the first article of many, and it focuses on the context of the project, a high-level view of the designed system, and the advantages of our cloud-based solution. In the upcoming ones, we will go more in depth into how to connect camera traps to the Google Cloud Platform and external endpoints using a tool called Node-RED, and how to design a simple web app with Streamlit to manage the camera traps placed in the parks.
Let’s get started!
Camera Traps
Before we jump in, let’s quickly review what camera traps are and how they can be used to support animal protection and conservation.
Camera traps are devices that have built-in sensors so that when activity is detected in front of them, a picture or a video is immediately taken. They let park rangers and wildlife biologists see our fellow species without interfering with their normal behavior.
Patrolling the parks to collect information by hand is a valid technique, but it is an expensive and labor-intensive process. In addition, it carries the risk of running into dangerous wildlife or, even worse, poachers.
While different data-collection techniques come with different tradeoffs, camera traps are an excellent source. Their great advantage is that they operate continuously and silently, recording very accurate data without disturbing the photographed subject. They are helpful both for discreetly monitoring possible illicit activities and for quantifying the number of different species in an area and determining their behavior and activity patterns.
Google Cloud Platform
To store and manage the camera-trap media, we chose a cloud-based solution: the Google Cloud Platform.
Google offers storage solutions such as Google Cloud Storage, an object store with integrated edge caching for unstructured data; compute solutions such as Cloud Functions, a Functions-as-a-Service product for running event-driven code; and useful AI APIs, for example:

- the Cloud Vision API, to analyze images;
- the Cloud Video Intelligence API, to analyze videos.
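To give a feel for the developer experience, here is a minimal sketch of writing a file to Cloud Storage with the official Python client. The bucket and object names are made up for illustration; in our actual setup the traps push media through Node-RED, as covered in the next article.

```python
from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client()
# Hypothetical bucket; each camera trap writes into its own folder.
bucket = client.bucket("smart-parks-media")
blob = bucket.blob("camera-07/2021-06-01_120000.jpg")
blob.upload_from_filename("captures/2021-06-01_120000.jpg")
```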
Having all these components in a single unified environment was the ideal solution for us and helped us provide a working solution in a short time.
The Workflow
First of all, the media are uploaded to a Google Cloud Storage bucket (how exactly this happens will be discussed in the second article of this series). The bucket is organized in folders, one for each camera trap. Once a file is uploaded, a Google Cloud Function is immediately triggered; this function takes care of sending the file to the appropriate machine learning API (the Vision API for images, the Video Intelligence API for videos) and of storing the returned insights, such as the presence of people or specific kinds of animals. A sketch of this function follows below.
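As a rough sketch (the function and helper names are illustrative assumptions, not our exact production code), the trigger can be implemented as a background Cloud Function that fires on every object-finalize event in the bucket:

```python
# main.py: background Cloud Function triggered by google.storage.object.finalize
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def on_media_upload(event, context):
    """Route each newly uploaded file to the right ML API."""
    uri = f"gs://{event['bucket']}/{event['name']}"
    camera_id = event["name"].split("/")[0]  # the folder name identifies the camera trap

    if event["name"].lower().endswith(IMAGE_EXTENSIONS):
        annotations = annotate_image(uri)  # Cloud Vision API (sketched in the next section)
    else:
        annotations = annotate_video(uri)  # Video Intelligence API (sketched in the next section)

    store_insights(camera_id, annotations)  # hypothetical helper that persists the results
```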
This architecture provides multiple advantages:

- it is fully automated: every new file is processed as soon as it lands in the bucket, with no manual intervention;
- it is serverless, so it scales automatically with the number of camera traps and the volume of media;
- it keeps all components in a single unified environment, which made it fast to implement.
Cloud Vision and Cloud Video Intelligence APIs
Using Machine Learning, and specifically Computer Vision, to automatically identify people and animals in images and videos has seen significant advances in recent years, and it is nowadays widely considered a “game-changer” by wildlife researchers. Let’s take a closer look at the APIs we used.
The Vision API and the Video Intelligence API offer powerful pre-trained machine learning models through REST and RPC interfaces. The first is meant to work with images, whereas the second, as the name suggests, works with videos. Both are capable of automatically recognizing a vast number of objects, places, and actions.
For this project, we focused mainly on three of the detection features provided by the APIs.
You can play with the Vision API just by uploading your image over here.
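For reference, here is a minimal sketch of the two API calls behind the helpers used in the Cloud Function above, written with the official Python client libraries (the function names are our own, and the GCS URIs are placeholders):

```python
# pip install google-cloud-vision google-cloud-videointelligence
from google.cloud import vision, videointelligence

def annotate_image(gcs_uri):
    """Run label detection on a single image with the Cloud Vision API."""
    client = vision.ImageAnnotatorClient()
    image = vision.Image(source=vision.ImageSource(image_uri=gcs_uri))
    response = client.label_detection(image=image)
    return [(label.description, label.score) for label in response.label_annotations]

def annotate_video(gcs_uri):
    """Run label detection on a video with the Video Intelligence API (asynchronous)."""
    client = videointelligence.VideoIntelligenceServiceClient()
    operation = client.annotate_video(
        request={
            "input_uri": gcs_uri,
            "features": [videointelligence.Feature.LABEL_DETECTION],
        }
    )
    result = operation.result(timeout=300)  # blocks until the annotation job finishes
    return result.annotation_results[0].segment_label_annotations
```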
The trail ahead
The journey so far has laid the foundation for the exciting and impactful work that lies ahead. With this basic tooling in place, in the near future we’ll be able to create a lot of value, not just for Smart Parks but also for wildlife conservation and beyond!
The next steps will involve the broad areas of work previewed in the introduction: connecting the camera traps to the cloud through Node-RED, and designing the Streamlit web app used to manage them.
In this first article, we discussed how we built a fully automated, scalable pipeline on Google Cloud, enabling us to ingest media and use Machine Learning APIs to extract insights from them. It provides a solid, easy, and fast-to-implement baseline for any kind of project that involves consuming media and using machine learning to extract insights from it.
Thank you for reading, and see you in the next articles of the series, where we will explain in more detail how the presented architecture is connected to the camera traps and walk through the web app designed to manage them. Stay tuned!
Special thanks to Maël Deschamps for his help in reviewing this post’s content, and to Tim van Dam from Smart Parks for his support during the project. You rock!