With a detailed notebook: detecting guns from a live camera using Tensorflow API, Google Colab and GDrive.

Alaa Sinjab
Dec 23 · 15 min read

Model Inference on the Equilibrium, 2002 Video Clip
Police response time is critical when an incident occurs. In the United States, the average police response time is around 18 minutes.¹
In my last project, I tried to reduce police response time by detecting weapons in a live CCTV camera feed so that police can be alerted as soon as a gun is detected. The main motivation for this project is the increasing number of school mass shootings in the U.S.
In this tutorial, we will:
Perform object detection on custom images using the Tensorflow Object Detection API.
Use the free Google Colab GPU for training and Google Drive to keep everything synced.
Walk through detailed steps to tune, train, monitor, and use the model for inference with your local webcam.
I have created this Colab Notebook if you would like to start exploring. It has some steps and notes that were not mentioned here. I suggest looking at it after reading this tutorial.

Model Inference on the Equilibrium, 2002 Video Clip
Let’s get started, shall we?
Road Map:
1. Gathering Images and Labels.
2. Setting up the environment.
3. Importing and Installing Required Packages.
4. Preprocessing Images and Labels.
5. Downloading Tensorflow model.
6. Generating TFRecords.
7. Selecting and Downloading a Pre-trained model.
8. Configuring the Training Pipeline.
9. Tensorboard.
10. Training.
11. Export the trained model.
12. Webcam Inference.
1. Gathering Images and Labels.
I will be using pictures of pistols. The original dataset was collected and labeled by the University of Granada in Spain. It contains 3,000 pictures of guns in various positions, rotations, backgrounds, etc. The guns are already labeled as well (though not perfectly). You may use your own images or use the dataset I am using here!
➊. Collecting Images:
If you have your images already collected, great! If not, depending on your problem, you can take pictures from your phone or scrape Google for images.
Remember: Garbage In = Garbage Out. Choosing the images is the most important part!
Here are some tips that might help when collecting your images:
At least 50 images for each class. The more, the better! Get even more if you are detecting only one class.
Images with random objects in the background.
Various background conditions: dark, light, indoor, outdoor, etc.
One of the easiest ways to collect images is by using google-images-download. You may also download images using this tutorial; it provides multiple ways to collect images from Google.
Save your images in a folder called images
Note: Make sure all the images are .jpg; you might get errors during training if the images have other extensions.
➋. Labeling Images:
Once you have your images gathered, it’s time to label them. There are many tools that can help with labeling; LabelImg is perhaps the most popular and easiest to use. Following the instructions in the GitHub repo, download and install it on your local machine.
Using LabelImg is easy, just remember to:
Create a new directory for the labels; I will name it annotations
In LabelImg, Click on Change Save Dir and select the annotations folder. This is where the labels/annotations will be saved.
Click on Open Dir and select the images folder.
Use the shortcuts to make this faster.

Shortcuts for macOS:
CMD + s : Save the label
w : Create a box
d : Next image
a : Previous image
CMD + + : Zoom in
CMD + - : Zoom out
By default, the labels will be in PascalVOC format. Each image will have one .xml file containing its labels. If an image has more than one class or label, its .xml file will include them all.
2. Setting up the environment.
➊. Set up The Google Colab Notebook:
Create a new Notebook.
From the top left menu: Go to Runtime > Change runtime type > select GPU from hardware accelerator. Some pretrained models support TPU. The pretrained model we are choosing in this project only supports GPU.
(Highly Recommended) Mount Google Drive to the Colab notebook:
When training starts, checkpoints, logs and many other important files will be created. When the kernel disconnects, these files, along with everything else, will be deleted if they don’t get saved on your Google Drive or somewhere else.
The kernel disconnects shortly after your computer sleeps or after using the Colab GPU for 12 hours. Training will need to be restarted from zero if the trained model did not get saved.²
I also highly recommend downloading Google Backup and Sync app so it’s easier to move and edit files as well as to keep everything synchronized.

Head to your Google Drive and create a folder named object_detection
Open the Google Backup and Sync app on your local machine and select that object_detection folder
Note: This method will greatly depend on the speed of your internet connection.
On the Colab Notebook, Mount gdrive and navigate to the project’s folder, you will be asked for the authorization code:
from google.colab import drive

drive.mount('/gdrive')
# the project's folder
%cd /gdrive/'My Drive'/object_detection
➋. Upload your images and labels:
Inside the object_detection folder, create a folder named data that will hold the images and labels. Choose one of the methods below to upload your data.
1. Using the Google Backup and Sync app:

Uploading the images and annotations folders is easy; just move them into the object_detection/data folder from your computer.
2. Uploading directly from the notebook:
Note: This method is the slowest.
Use the following to upload directly to the notebook. You will have to either zip the images folder or upload the files separately (uploading a folder to Google Colab is not supported). Keeping the same file structure is important.
from google.colab import files
uploaded = files.upload()
3. Uploading directly from source:
You could also download directly from source using curl or wget
The working directory so far:
object_detection
└── data
├── images
│ ├── image_1.jpg
│ ├── image_2.jpg
│ └── …

└── annotations
├── image_1.xml
├── image_2.xml
└── …
Tip: You can view the full working directory in the Colab notebook by opening the left panel (click the arrow at the top left, or use ⌘/Ctrl+Alt+P), then clicking Files.
➌. Splitting the images into training & testing:
Depending on how large your dataset is, you might want to split your data manually. If you have a lot of pictures, you might want to use something like this to split your data randomly.
Note: The images inside images don’t need to be split, only the .xml files.
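If you need a starting point, here is a minimal sketch of a random 80/20 split, assuming it is run from inside object_detection/data/ and that the folder names match the tree below:
import os, random, shutil

random.seed(1)
xml_files = [f for f in os.listdir('annotations') if f.endswith('.xml')]
random.shuffle(xml_files)
split = int(0.8 * len(xml_files))   # 80% train, 20% test

os.makedirs('train_labels', exist_ok=True)
os.makedirs('test_labels', exist_ok=True)
for f in xml_files[:split]:
    shutil.copy(os.path.join('annotations', f), 'train_labels')
for f in xml_files[split:]:
    shutil.copy(os.path.join('annotations', f), 'test_labels')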

The working directory at this point:
object_detection
└── data
├── images
│ ├── image_1.jpg
│ └── …

├── annotations
│ ├── image_1.xml
│ └── …

├── train_labels //contains the labels only
│ ├── image_1.xml
│ └── …

└── test_labels //contains the labels only
├── image_50.xml
└── …
3. Importing and Installing Required Packages.
➊. Install Required Packages:
Google Colab has most of the packages preinstalled already: Python, Tensorflow, pandas, etc.
These are the packages we will need that don’t come preinstalled by default. Install them by running:
!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk
!pip install -qq Cython contextlib2 pillow lxml matplotlib pycocotools
➋. Importing Libraries:
Other Imports will be done when needed later.
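The notebook’s import cell isn’t reproduced here; a minimal set, assuming the packages installed above:
import os
import glob
import numpy as np
import pandas as pd
import tensorflow as tf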

We need Tensorflow version 1.15.0. Check Tensorflow version by running:
print(tf.__version__)
4. Preprocessing Images and Labels.
We need to create two CSV files from the .xml files in train_labels/ and test_labels/.

These two CSV files will contain each image’s file name, the label/box positions, etc. More than one row is created for the same picture if it contains more than one class or label.
Other than the CSVs, we will need to create a pbtxt file that will contain the label map for each class. This file will tell the model what each object is by defining a mapping of class names to class ID numbers.
You don’t have to do any of this manually; the code below will convert the .xml files to two CSVs and create the .pbtxt file. Just make sure that:
The folder names where the .xml files are located match train_labels/ and test_labels/ (or change them in the code below)
The current directory is object_detection/data/
The images are in .jpg format
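The notebook’s conversion cell isn’t reproduced here; below is a minimal sketch of the same idea, assuming LabelImg’s PascalVOC .xml layout and producing the CSV columns filename, width, height, class, xmin, ymin, xmax, ymax:
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    rows = []
    for xml_file in glob.glob(os.path.join(path, '*.xml')):
        root = ET.parse(xml_file).getroot()
        width = int(root.find('size/width').text)
        height = int(root.find('size/height').text)
        for member in root.findall('object'):
            rows.append({
                'filename': root.find('filename').text,
                'width': width,
                'height': height,
                'class': member.find('name').text,
                'xmin': int(member.find('bndbox/xmin').text),
                'ymin': int(member.find('bndbox/ymin').text),
                'xmax': int(member.find('bndbox/xmax').text),
                'ymax': int(member.find('bndbox/ymax').text),
            })
    return pd.DataFrame(rows)

class_names = set()
for folder in ['train_labels', 'test_labels']:
    df = xml_to_csv(folder)
    class_names.update(df['class'].unique())
    df.to_csv(folder + '.csv', index=False)

# write the label map: one item per class, ids start at 1
with open('label_map.pbtxt', 'w') as f:
    for i, name in enumerate(sorted(class_names), start=1):
        f.write("item {\n  id: %d\n  name: '%s'\n}\n" % (i, name))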

The working directory at this point:
object_detection/
└── data/
├── images/
│ └── …
├── annotations/
│ └── …
├── train_labels/
│ └── …
├── test_labels/
│ └── …

├── label_map.pbtxt

├── test_labels.csv

└── train_labels.csv
5. Downloading Tensorflow model.
Tensorflow model contains the object detection API we are interested in. We will get it from the official repo.
Navigate to object_detection/ dir, then:
# downloads the models
!git clone -q https://github.com/tensorflow/models.git
Next, we need to compile the proto buffers — Not important to understand for this project but you can learn more about them here. Also, the PATH var should have the following directories added: models/research/ and models/research/slim/
Navigate to object_detection/models/research/ dir, then:
# compiles the proto buffers
!protoc object_detection/protos/*.proto --python_out=.
# exports the PYTHONPATH environment variable with the research and slim paths
os.environ['PYTHONPATH'] += ':./:./slim/'
Finally, run a quick test to confirm that the model builder is working properly:
# testing the model builder
!python3 object_detection/builders/model_builder_test.py
If you see an OK at the end of the test, then everything is going great!
6. Generating TFRecords.
Tensorflow accepts the data as TFRecords (data.record). A TFRecord is a binary file format that is fast to read with low memory usage. It contains all the images and labels in one file. Read more about it here.
In our case, we will have two TFRecords; one for testing and another for training. To make this work, we need to make sure that:
The CSV file names match train_labels.csv and test_labels.csv (or change them in the code below)
The current directory is object_detection/models/research
The path to the data/ directory is the same as data_base_url in the code below
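The generation cell isn’t reproduced here; the sketch below follows the standard generate_tfrecord.py pattern from the API’s examples and assumes the CSV columns created earlier plus a single 'pistol' class (adjust names and paths to match your notebook):
import io
import os
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util

data_base_url = '/gdrive/My Drive/object_detection/data/'  # change if needed
image_dir = data_base_url + 'images/'

def class_text_to_int(label):
    # single-class example; extend this for more classes
    return 1 if label == 'pistol' else None

def create_tf_example(filename, group):
    with tf.gfile.GFile(os.path.join(image_dir, filename), 'rb') as fid:
        encoded_jpg = fid.read()
    width, height = Image.open(io.BytesIO(encoded_jpg)).size

    xmins, xmaxs, ymins, ymaxs, classes_text, classes = [], [], [], [], [], []
    for _, row in group.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    return tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename.encode('utf8')),
        'image/source_id': dataset_util.bytes_feature(filename.encode('utf8')),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(b'jpg'),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))

for csv_name, record_name in [('train_labels.csv', 'train_labels.record'),
                              ('test_labels.csv', 'test_labels.record')]:
    examples = pd.read_csv(data_base_url + csv_name)
    writer = tf.python_io.TFRecordWriter(data_base_url + record_name)
    for filename, group in examples.groupby('filename'):
        writer.write(create_tf_example(filename, group).SerializeToString())
    writer.close()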

7. Selecting and Downloading a Pre-trained Model.
➊. Choose a Pre-Trained Model:
A pre-trained model simply means that it has been trained on another dataset. That model has seen thousands or millions of images and objects.
COCO (Common Objects in Context) is a dataset of 330,000 images containing 1.5 million object instances across 80 classes, such as dogs, cats, cars, and bananas. Check all the classes here.
Training a model from scratch is extremely time consuming; it may take days or weeks to finish training. A pre-trained model has already seen tons of objects and knows how to classify each one of them. So, why not just use it!
Because our goal is to run inference on real-time video, we will choose a model with a low inference time (in ms) and a relatively high mAP on COCO.
The model used for this project is ssd_mobilenet_v2_coco. Check the other models from here. You could use any pre-trained model you prefer, but I would suggest experimenting with SSD ‘Single Shot Detector’ models first as they perform faster than any type of RCNN on a real-time video⁴.
Explaining the difference between object detection techniques is out of the scope of this tutorial. You can read more about them from this blog post, or learn about how their speed and accuracy compare from here.
Let’s start with selecting a pretrained model:
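A minimal sketch of the selection cell; the model names come from the Tensorflow detection model zoo, and the dictionary layout is an assumption:
MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',
        'pipeline_file': 'ssd_mobilenet_v2_coco.config',
    },
    'faster_rcnn_inception_v2': {
        'model_name': 'faster_rcnn_inception_v2_coco_2018_01_28',
        'pipeline_file': 'faster_rcnn_inception_v2_pets.config',
    },
}
selected_model = 'ssd_mobilenet_v2'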

We will now download the selected model. I have created configurations for these two models to make it easier; ssd_mobilenet_v2 is selected here. Try faster_rcnn_inception_v2 later if you would like, by changing selected_model above.
➋. Download the Pre-Trained Model:
Navigate to models/research/
DEST_DIR is where the model will be downloaded. Change it if you have a different working directory.
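A sketch of the download step, assuming the MODELS_CONFIG dictionary above and the official model zoo download URL:
import os
import shutil
import tarfile
import urllib.request

MODEL = MODELS_CONFIG[selected_model]['model_name']
MODEL_FILE = MODEL + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
DEST_DIR = 'pretrained_model'

if not os.path.exists(MODEL_FILE):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

# unpack and rename the extracted folder to pretrained_model/
with tarfile.open(MODEL_FILE) as tar:
    tar.extractall()
os.remove(MODEL_FILE)
if os.path.exists(DEST_DIR):
    shutil.rmtree(DEST_DIR)
os.rename(MODEL, DEST_DIR)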

While training, the model will get autosaved every 600 seconds by default. The logs and graphs, such as the mAP, loss, and AR, will also get saved constantly. Let’s create a folder for all of them to be saved in during training:
Create a folder called training inside object_detection/models/research/
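From the notebook, assuming the current directory is models/research/, that is simply:
!mkdir training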
The working directory at this point:
object_detection/
├── data/
│ ├── images/
│ │ └── …
│ ├── annotations/
│ │ └── …
│ ├── train_labels/
│ │ └── …
│ ├── test_labels/
│ │ └── …
│ ├── label_map.pbtxt
│ ├── test_labels.csv
│ ├── train_labels.csv
│ ├── test_labels.records
│ └── train_labels.records

└── models/
├── research/
│ ├── training/
│ └── …
└── …
8. Configuring the Training Pipeline.
This is the last step before starting to train the model, finally! It is perhaps also the step where you might spend some time tuning the model.
The Tensorflow models repository we downloaded comes with many sample config files. For each model, there is a config file that is ‘almost’ ready to be used.
The config files are located here:
object_detection/models/research/object_detection/samples/configs/
ssd_mobilenet_v2_coco.config is the config file for the pretrained model we are using. If you chose another model, you need to use and edit the corresponding config file.
Because we will probably have to tune the config constantly, I suggest doing the following:
view the content of the sample config file by running:
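For instance (the path follows the directory tree above; adjust it if you chose a different model):
!cat object_detection/samples/configs/ssd_mobilenet_v2_coco.config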

Copy the content of the config file
Edit it by using
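One option, as a sketch, is Colab's %%writefile cell magic (the original notebook may edit the file differently):
%%writefile object_detection/samples/configs/ssd_mobilenet_v2_coco.config
# paste the full, edited config content here and run the cell to overwrite the file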

Or, you can open and edit the config file directly from your local machine if you have everything synced by using any text editor.
Below are the required edits to the sample config file, as well as some suggested edits to improve the model’s performance.
➊. Required edits to the config file:
model {} > ssd {}: change num_classes to the number of classes you have.

2. train_config {}: change fine_tune_checkpoint to the checkpoint file path.

Note: The exact file name model.ckpt doesn’t exist. This is where the model will be saved during training. This is its relative path:
/object_detection/models/research/pretrained_model/model.ckpt
3. train_input_reader {}: set the path to the train_labels.record and the label map pbtxt file.

4. eval_input_reader {}: set the path to the test_labels.record and the label map pbtxt file.
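Putting the four required edits together, the relevant parts of the config look roughly like this (the paths, names, and class count below are assumptions for this project):
model {
  ssd {
    num_classes: 1  # pistol
    # (other fields unchanged)
  }
}
train_config {
  fine_tune_checkpoint: "pretrained_model/model.ckpt"  # relative to models/research/
  # (other fields unchanged)
}
train_input_reader: {
  label_map_path: "/gdrive/My Drive/object_detection/data/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "/gdrive/My Drive/object_detection/data/train_labels.record"
  }
}
eval_input_reader: {
  label_map_path: "/gdrive/My Drive/object_detection/data/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "/gdrive/My Drive/object_detection/data/test_labels.record"
  }
  shuffle: false
  num_readers: 1
}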

That’s it! You can skip the optional edits and head to training!
➋. Suggested edits to the config file:
First, you might want to start training the model and see how well it does. If you are overfitting, then you might want to do some more image augmentation.
In the sample config file: random_horizontal_flip & ssd_random_crop are added by default. You could try adding these as well:
In train_config {}:
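For example (these option names come from the API's preprocessor.proto; this is only a sketch, pick what suits your data):
data_augmentation_options {
  random_adjust_brightness {
  }
}
data_augmentation_options {
  random_adjust_contrast {
  }
}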

Note: Each image augmentation will increase the training time drastically.
There are many data augmentation options that you can add. Check the full list from the official code here.
In model {} > ssd {} > box_predictor {}: set use_dropout to true. This helps counter overfitting.
In eval_config {}: set num_examples to the number of testing images you have, and remove max_evals to evaluate indefinitely.
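A sketch of the resulting block, assuming 600 test images (use your own count):
eval_config: {
  num_examples: 600        # the number of images in test_labels/
  num_visualizations: 20   # how many test images Tensorboard will display
  # max_evals removed so evaluation runs indefinitely
}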

Note: The notebook provided explains many more things in regard to tuning the config file. Check it out!
The full working directory:
(Including some files/folders that will be created and used later)
object_detection/
├── data/
│ ├── images/
│ │ └── …
│ ├── annotations/
│ │ └── …
│ ├── train_labels/
│ │ └── …
│ ├── test_labels/
│ │ └── …
│ ├── label_map.pbtxt
│ ├── test_labels.csv
│ ├── train_labels.csv
│ ├── test_labels.records
│ └── train_labels.records

└── models/
├─ research/
│ ├── fine_tuned_model/
│ │ ├── frozen_inference_graph.pb
│ │ └── …
│ │
│ ├── pretrained_model/
│ │ ├── frozen_inference_graph.pb
│ │ └── …
│ │
│ ├── object_detection/
│ │ ├── utils/
│ │ ├── samples/
│ │ │ ├── configs/
│ │ │ │ ├── ssd_mobilenet_v2_coco.config
│ │ │ │ ├── rfcn_resnet101_pets.config
│ │ │ │ └── …
│ │ │ └── …
│ │ ├── export_inference_graph.py
│ │ ├── model_main.py
│ │ └── …
│ │
│ ├── training/
│ │ ├── events.out.tfevents.xxxxx
│ │ └── …
│ └── …
└── …
9. Tensorboard.
Tensorboard is the place where we can visualize everything that’s happening during training. You can monitor the loss, mAP, AR and many more.

You can also monitor the pictures and the annotations during training. At each evaluation step, you can see how good your model is at detecting the objects.
Note: Remember setting num_visualizations: 20 above? Tensorboard will display that many of the testing images here.
To use Tensorboard on Colab, we need to use it through ngrok. Get it by running:
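A sketch of that step (the ngrok download URL below was the stable Linux build at the time of writing; check ngrok.com if it has moved):
!wget -q https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o -q ngrok-stable-linux-amd64.zip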

Next, we specify where the log files are stored and we configure a link to view Tensorboard:
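A sketch, assuming the logs are written to the training/ folder created earlier:
LOG_DIR = 'training/'
# run Tensorboard and ngrok in the background
get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'.format(LOG_DIR))
get_ipython().system_raw('./ngrok http 6006 &')
# print the public URL of the tunnel
!curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"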

When you run the code above, a URL appears at the end of the output; you can access Tensorboard through it.
Notes:
You might not get a URL when running the code above, but an error instead. Just run the cell again; there is no need to reinstall ngrok.
Tensorboard will not log any files until the training starts.
A maximum of 20 connections per minute is allowed when using ngrok, so you will not be able to access Tensorboard while the model is logging to it (which happens very frequently).
If you have the project synced to your local machine, you will be able to view the Tensorboard without any limitation.
Go to terminal on your local machine and run:
$ pip install tensorboard
Run it and specify where the logging dir is:
# in my case, the path to the training folder is:
tensorboard --logdir=/Users/alaasenjab/Google\ Drive/object_detection/models/research/training
10. Training… Finally!
Training the model is as easy as running the following code. We just need to give it:
model_main.py which runs the training process
pipeline_config_path=Path/to/config/file/model.config
model_dir= Path/to/training/
Notes:
If the kernel dies, training will resume from the last checkpoint, as long as you saved the training/ directory somewhere, e.g., on GDrive.
If you are changing the below paths, make sure there is no space between the equal sign = and the path.
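A sketch of the training cell, assuming the current directory is models/research/ and the config edited in step 8:
!python3 object_detection/model_main.py \
    --pipeline_config_path=object_detection/samples/configs/ssd_mobilenet_v2_coco.config \
    --model_dir=training/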

Now sit back and watch your model train on Tensorboard.

11. Export the trained model.
By default, the model will save a checkpoint every 600 seconds while training up to 5 checkpoints. Then, as new files are created, older files are deleted.
We can find the last model trained by running this code:
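A minimal sketch that picks the newest checkpoint in training/ (the variable name last_model_path is my own):
import os
import re

ckpt_steps = [int(re.findall(r'model\.ckpt-(\d+)', f)[0])
              for f in os.listdir('training') if 'model.ckpt-' in f]
last_model_path = 'training/model.ckpt-' + str(max(ckpt_steps))
print(last_model_path)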

Then we run export_inference_graph.py to convert the model to a frozen model, frozen_inference_graph.pb, that we can use for inference. This frozen model can’t be used to resume training. However, saved_model.pb gets exported as well, and it can be used to resume training as it has all the weights. We need to pass it:

pipeline_config_path=Path/to/config/file/model.config
output_directory= where the exported model will be saved
trained_checkpoint_prefix=Path/to/a/checkpoint
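A sketch of the export cell, using last_model_path from the previous step (names and paths are assumptions):
output_directory = 'fine_tuned_model'
!python3 object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path=object_detection/samples/configs/ssd_mobilenet_v2_coco.config \
    --output_directory={output_directory} \
    --trained_checkpoint_prefix={last_model_path}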
You can access all the exported files from this directory:
/gdrive/My Drive/object_detection/models/research/fine_tuned_model/
Or, you can download the frozen graph needed for inference directly from Google Colab:
# downloads the frozen model that is needed for inference
# output_directory = 'fine_tuned_model' as specified above
files.download(output_directory + '/frozen_inference_graph.pb')
We also need the label map .pbtxt file:
# download the label map
# we specified 'data_base_url' above; it points to
# the 'object_detection/data/' folder
files.download(data_base_url + '/label_map.pbtxt')
12. Webcam Inference.
To run inference with the webcam on your local machine, you need to have the following installed:
Tensorflow == 1.15.0
cv2 (OpenCV) == 4.1.2
You also need the Tensorflow models repository downloaded on your local machine (Step 5 above), or you can skip that and navigate to the folder if your GDrive is synced to your local machine.
Go to terminal on your local machine and navigate to models/research/object_detection
In my case, I am navigating to the folder in GDrive.
$ cd /Users/alaasenjab/Google\ Drive/object_detection/models/research/object_detection
You can run the following from a Jupyter notebook or by creating a .py file. Just change PATH_TO_FROZEN_GRAPH, PATH_TO_LABEL_MAP, and NUM_CLASSES.
Run the code and smile 🙂
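The inference script isn’t reproduced in this post; below is a minimal sketch built on the API's standard utils (it assumes models/research and slim are on your PYTHONPATH, and the three ALL_CAPS values are the ones you need to change):
import numpy as np
import cv2
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

PATH_TO_FROZEN_GRAPH = 'path/to/frozen_inference_graph.pb'  # change me
PATH_TO_LABEL_MAP = 'path/to/label_map.pbtxt'               # change me
NUM_CLASSES = 1                                             # change me

# load the frozen graph into memory
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        graph_def.ParseFromString(fid.read())
        tf.import_graph_def(graph_def, name='')

# build the category index from the label map
label_map = label_map_util.load_labelmap(PATH_TO_LABEL_MAP)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

cap = cv2.VideoCapture(0)  # 0 = default webcam
with detection_graph.as_default(), tf.Session(graph=detection_graph) as sess:
    # grab the input/output tensors once, outside the loop
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    output_tensors = [detection_graph.get_tensor_by_name(n + ':0') for n in
                      ['detection_boxes', 'detection_scores',
                       'detection_classes', 'num_detections']]
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        boxes, scores, classes, _ = sess.run(
            output_tensors,
            feed_dict={image_tensor: np.expand_dims(frame, axis=0)})
        vis_util.visualize_boxes_and_labels_on_image_array(
            frame,
            np.squeeze(boxes),
            np.squeeze(classes).astype(np.int32),
            np.squeeze(scores),
            category_index,
            use_normalized_coordinates=True,
            line_thickness=4)
        cv2.imshow('detection', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()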

Conclusion:
Detecting objects from images and videos is a bit easier than in real-time. When we have a video or an image that we want to detect an object from, we don’t care much about the inference time the model might take to detect the object. In real-time object detection, we might want to sacrifice some precision over a faster inference time.
In the case of detecting guns to notify the police, we don’t care much about detecting exactly where the gun is located. Instead, we probably want to minimize both:
False negatives: not detecting a gun when there is one.
False positives: detecting a gun when there isn’t one.
I hope you found this tutorial on how to detect custom objects using Tensorflow easy and useful. Don’t forget to check the Colab Notebook for more details.
Please let me know if you have any question down below, I will try my best to help!
Resources:
⋆ Tensorflow object detection API. By: Tensorflow.
⋆ Inspired from: Train Object Detection for free. By: Chengwei.
⋆ Raccoon detector dataset. By: Dat Tran.
⋆ Train Yolov3 using Colab. By: David Ibáñez.
Thanks to Brenda Hali.