Kittens vs Tarsiers - an introduction to serverless machine learning

In the first part of this blog post series, we will develop a serverless function to identify Doppelgängers.
tarsier-cat

We will be using Functions as a Service (FaaS), a framework for building serverless functions with Docker, and TensorFlow for image recognition. Needless to say, we will be using Docker.

Throw any image URL at it, and the function will tell you what it (thinks it) sees.

How does it tell you what it thinks? It runs an image classification with Inception, a model provided by TensorFlow. Inception was trained on the ImageNet image database.

ui-tf-jmkh

tabby, tabby cat (score = 0.67659)
tiger cat (score = 0.22639)
lynx, catamount (score = 0.02460)
Egyptian cat (score = 0.01052)
Persian cat (score = 0.00236)
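
The five lines above are the five highest-scoring labels for the image. As a minimal sketch of how such a top-5 list could be produced from a set of class scores, here is some plain Python. The `top_k` and `format_prediction` helpers and the two extra low-scoring entries are hypothetical, added for illustration; this is not the function's actual code.

```python
def top_k(scores, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def format_prediction(label, score):
    """Render one prediction in the same style as the output shown above."""
    return "%s (score = %.5f)" % (label, score)


# Scores copied from the example output, plus two made-up low-scoring labels
# (hypothetical) so that the top-5 cut actually drops something.
scores = {
    "carton": 0.00050,
    "Persian cat": 0.00236,
    "tabby, tabby cat": 0.67659,
    "doormat, welcome mat": 0.00100,
    "lynx, catamount": 0.02460,
    "tiger cat": 0.22639,
    "Egyptian cat": 0.01052,
}

for label, score in top_k(scores):
    print(format_prediction(label, score))
```

The scores are confidence levels between 0 and 1, so 0.67659 reads as roughly 68% confidence that the picture shows a tabby cat.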

cat

Note that the function will be able to identify other animals and objects too:

Boston bull, Boston terrier (score = 0.17929)
king penguin, Aptenodytes patagonica (score = 0.16913)
French bulldog (score = 0.02329)
toy terrier (score = 0.00909)
Italian greyhound (score = 0.00524)

penguin-dog

But let's face it, tarsiers and cats are the REAL Doppelgängers!

tarsier-cat

Will we be able to tell them apart?

Demo

Let's start with a demo, and set up FaaS so we can test out the Doppelgängers and see what the function makes of them.

You can try this out in Play With Docker (PWD) using the PWD button. It works by downloading a Docker Compose Stack file and all the images that are needed for FaaS and for the machine learning function (which we will write).

If you are already familiar with FaaS and cannot wait to play with this, head over to PWD and give it a round of tests.

You can Try in PWD

Wait a while until the TensorFlow Docker image has been extracted. Be patient...

Then click the blue badge at the top to open the FaaS UI:

pwd-blue-badge

Then click on the tensorflow entry on the left.
pwd-ui-tf-jmkh

Type or paste any URL pointing to an image into the Request body field and click INVOKE.

Test this on the internet celebrity Hipster Cat.
grumpy-cat

It was identified as a tabby cat with a 23% confidence level. The function also spotted a carton, a bow tie and maybe a welcome mat.

tabby, tabby cat (score = 0.23278)
Persian cat (score = 0.09450)
doormat, welcome mat (score = 0.05816)
bow tie, bow-tie, bowtie (score = 0.04336)
carton (score = 0.03877)

Let's test a harder one:
penguin-dog

A Boston bull and a king penguin are spotted with the two highest confidence levels (17.9% and 16.9% respectively).

Boston bull, Boston terrier (score = 0.17929)
king penguin, Aptenodytes patagonica (score = 0.16913)
French bulldog (score = 0.02329)
toy terrier (score = 0.00909)
Italian greyhound (score = 0.00524)
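
If you want to work with these results in a script rather than read them by eye, the text output can be parsed back into labels and scores. The following is a minimal sketch, assuming the function always answers in the `label (score = 0.12345)` format shown above; the `parse_predictions` helper is hypothetical and not part of FaaS or TensorFlow.

```python
import re

# Matches lines such as "tabby, tabby cat (score = 0.67659)".
LINE_PATTERN = re.compile(r"^(?P<label>.+) \(score = (?P<score>[0-9.]+)\)$")


def parse_predictions(text):
    """Turn the function's text output into a list of (label, score) pairs."""
    predictions = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:  # silently skip blank or unexpected lines
            predictions.append((match.group("label"), float(match.group("score"))))
    return predictions


output = """Boston bull, Boston terrier (score = 0.17929)
king penguin, Aptenodytes patagonica (score = 0.16913)
French bulldog (score = 0.02329)
toy terrier (score = 0.00909)
Italian greyhound (score = 0.00524)"""

predictions = parse_predictions(output)
(best_label, best_score), (second_label, second_score) = predictions[:2]

# How close is the race between the two top candidates?
print("%s leads %s by %.1f percentage points"
      % (best_label, second_label, 100 * (best_score - second_score)))
```

On the penguin-dog picture the top two labels are almost tied, which is a hint that the model really is seeing both animals.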

You can also use curl if you prefer (this URL points to another cat photo):

curl http://localhost:8080/function/func_tensorflow -d \
       'https://cdn.pixabay.com/photo/2016/07/10/21/47/cat-1508613_960_720.jpg'
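
The same call can be made from Python. This is a minimal sketch using only the standard library, assuming the gateway address and function name from the curl command above; the `build_request` and `classify` helpers are hypothetical names chosen for illustration.

```python
import urllib.request


def build_request(image_url, gateway="http://localhost:8080",
                  function_name="func_tensorflow"):
    """Build the POST request that the curl command above sends."""
    endpoint = "%s/function/%s" % (gateway.rstrip("/"), function_name)
    # The request body is simply the image URL as plain text.
    return urllib.request.Request(endpoint, data=image_url.encode("utf-8"),
                                  method="POST")


def classify(image_url, timeout=60):
    """Send the image URL to the function and return its text response."""
    request = build_request(image_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example, which needs the FaaS stack to be running locally:
# print(classify("https://cdn.pixabay.com/photo/2016/07/10/21/47/cat-1508613_960_720.jpg"))
```

Building the request is kept separate from sending it so that the endpoint and body can be inspected without a running stack.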

You can find all the code and simple instructions to set it up and test it locally in my GitHub repository too - but I advise you to read along as there will be a trick or two (one of them to go async :-) )

A big thank you to Alex Ellis (@alexellisuk) for creating this community and his continuous support!

Ingredients

Serverless

Not familiar with what Serverless can do? Read my guide here.

We will be using FaaS, which is a framework for building serverless functions with Docker.

That means that FaaS can run on a Docker Swarm! Sweet!

TensorFlow

We will use TensorFlow for image recognition. But first, let's define TensorFlow and see what it can do for us.

About TensorFlow

The TensorFlow site defines TensorFlow as such:

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.

In simpler words, you write your code once, and it can execute on a CPU, a GPU and now even on a TPU!
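
To make the data flow graph idea more concrete, here is a toy sketch in plain Python. It does not use TensorFlow, and the `Node` class and `constant` helper are hypothetical; the sketch only illustrates that nodes are operations and that the values travelling along the edges are the data (tensors, in TensorFlow's case).

```python
import operator


class Node:
    """One operation in a toy data flow graph."""

    def __init__(self, op, *inputs):
        self.op = op          # callable applied to the input values
        self.inputs = inputs  # upstream nodes; each edge carries one value

    def evaluate(self):
        # Evaluate the upstream nodes first, then apply this node's operation.
        return self.op(*(node.evaluate() for node in self.inputs))


def constant(value):
    """A source node that takes no inputs and always yields the same value."""
    return Node(lambda: value)


# Build the graph for (a + b) * c; nothing is computed yet.
a, b, c = constant(2.0), constant(3.0), constant(4.0)
result = Node(operator.mul, Node(operator.add, a, b), c)

# Only now does data flow through the graph.
print(result.evaluate())  # prints 20.0
```

Because the graph is described separately from its execution, a real framework is free to decide where each operation runs, which is what lets the same code target different hardware.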

Where can we run TensorFlow?

Imagine that you can run your code on your local machine, a phone, a Raspberry Pi and even on GCP with Cloud TPUs, introduced by Google here.

gcp-tpus

What can we build with TensorFlow?

Nothing better than to browse through Awesome TensorFlow to get some ideas.

For convenience, I'll list a few things which one can do with TensorFlow:

  • build a smart, conversational chatbot
  • translate between the writings of Shakespeare and modern English
  • generate art, music or handwriting...
  • teach a bot to play a game (Deep-Q learning)
  • search, filter, and describe videos based on the objects and places that appear in them
  • lip reading
  • real-time object detection
  • image recognition...

and so much more!

Next

Before going further, check out FaaS and star the project!
You can start building your serverless function and have it featured in the community Alex built! Pull requests are welcome!

Next we will dive deeper. We will be building a Serverless FaaS function for deep learning in 10 minutes.

In future blog posts we will do one of the below:

  • add an overlay of the tags on the image and store that in an S3 bucket (using PIL)
  • do the same thing with FaaS and TensorFlow on a Raspberry Pi 3, with the same code but using the Pi camera
  • go asynchronous, scale the workers and get back the results
  • execute the same thing but on a GPU (and later on a TPU)
  • train the model so it recognizes other things - maybe to distinguish Darth Vader from any father??
  • build a Slack integration that will classify shared images

Tell me which one you'd like to see next!