Bringing Machine Learning to the Web and Beyond – Cambridge ML Summit ‘19

NIKHIL THORAT: My name is Nikhil. DANIEL SMILKOV: I’m Daniel. NIKHIL THORAT: So we’re
from the Google Brain team. We work on PAIR, which
is People in AI Research. So we’re here to talk
about TensorFlow.js. TensorFlow.js, as it sounds,
is a JavaScript library for machine learning. It’s part of TensorFlow. And we’ll kinda go into some
of the nitty-gritty details of that. So our talk is called “Bringing
Machine Learning to the Web and Beyond,” and we’ll talk
about what “beyond” means in a second. OK, so I don’t have to convince
people in the room here, we all know that Python is the
dominating language of machine learning and data science. This is a KDNuggets
software poll that polls people about what
sort of libraries and languages they use. Python dominates. We all know this. There’s good reason for this. There’s lots of
libraries and tooling and educational
coursework around that. At the same time,
the rest of the world sort of works in JavaScript. By most popular language
by pull requests, we see that JavaScript
absolutely dominates. And so we kind of wanted
to bridge these two worlds. And we think there’s a
lot that JavaScript has to offer to machine learning. And this poll was actually
done by [INAUDIBLE]. OK, so have people
seen this before? If you haven’t, I
would definitely recommend checking it out. It’s super cool. Daniel and a couple other
people at Google built this. It’s an in-browser
interactive visualization of a neural network training. You can go and choose
a synthetic dataset, you know, change hyperparameters
in the model, the activation function learning rate,
all that kind of stuff. And you can see
immediately on the right how it generalizes
for that dataset. And this is just
kind of a lot of fun, so I definitely recommend
checking this out. And this thing was a
huge educational success. This is actually used in
courses all over the world now. And I think Daniel still gets
emails about this to this day. So, we were kind of stepping
back and we were wondering, you know, what was
it that made this so successful and so popular? And we think, obviously,
it’s because it’s an in-browser machine
learning application. And in-browser ML has a lot to
offer, so you just click a link and you’re there. There’s no CUDA
installation, there’s no wrangling with
Python dependencies, you just click the
link and it’s open. And it runs on pretty
much any machine. This one was super
interactive, right? Like, you can immediately click
different activation functions and choose how many
neurons are in each layer, and immediately see the results. So this is a ton of fun. We didn’t take advantage
of this in the Playground, but your device has
all these sensors and there’s standardized access
to the sensors through web API, so you have camera
access and GPS and microphone,
all these things. And importantly, the data
can stay on the client. So we can do privacy-preserving
on-device machine learning with this kind of thing. So with that,
obviously, we released a library called TensorFlow.js. This is part of
TensorFlow ecosystem. We released it last
year in March, in 2018. In the browser, we do GPU
acceleration with WebGL. If you don’t know WebGL is,
it’s a web-based graphics rendering library, normally
to put a 3D scene on the page. It turns out that
the math for that is very similar to
machine learning, so we took advantage of
those kinds of shaders to do all of our linear algebra. TensorFlow.js allows
you to do inference. You can take pre-trained
models from the Python world and just execute them,
or you can train entirely in the browser, and node.js,
and even more platforms which Daniel will talk about soon. OK, so when we
built the library, there were a couple of goals– two sort of overarching goals. One is that we wanted to take
these JavaScript developers and sort of empower them,
give them sort of the ML tools that they would need to
succeed in the space. At the same time,
we wanted people with a machine
learning background to be able port their
work to the web, and sort of have a fully
featured library there. And these goals
are kind of often at conflict with one
another– like, you want to make it simple, but you
want to make it fully featured. So we’ll talk a little
bit about the decisions we made to resolve that. So two overarching
principles that we had are we want it to be super easy to use. So we just want people to get
started and just start going. And we kind of,
sometimes, decided to take a hit on performance
for those reasons. Then we’ll talk about a couple
of decisions there as well. Again, we don’t want to
sacrifice functionality for simplicity. So those are the two
overarching principles. We decided to go eager only. So from the very beginning,
we’ve never had a graph, we’ve never had session
APIs, we just said eager. This is a lot simpler to sort
of wrap your head around. This is when TensorFlow 1.0 was
out, so it’s all graph-based. We also have first-class support for the Layers API. This is Keras-inspired. This is what we tell people
that they should probably use if they’re getting started. This is completely
compatible with Keras. It provides serialization,
ability to add layers, and that kind of thing. And then even on top of that,
we have a whole repository, which we’re working
very hard on, to provide pre-trained
models, like PoseNet, that give you a tensor-free, no-machine-learning-jargon API. So you just kind of
get a JSON object from an image element,
which is pretty nice. At the same time– oh, sorry, here– so
we focus on performance here when and where it matters. So what this means
is, we make sure that our [INAUDIBLE]
are fast, we make sure our comms
are fast, but we’re not going to force the user to
have to take that burden. So we provide gradient support. So on-device, you
can actually train models, compute gradients. We have an eager-based training
API just like TensorFlow does. We support about
130 TensorFlow ops. And for those models that
we talked about above, so for PoseNet, if you want
to do transfer learning, you can actually reach in
and get the Tensor out, so it’s not just
completely wrapped for you. So if you want to do
something like that, you can. OK, so what can you do
with the library itself? So there’s kind of this
spectrum of, like, easy to hard, and that’s subjective, obviously. But starting on
the easiest side, you can take a Python model
that’s been trained already and you run a script and
you get a bunch of artifacts that you can use on the web, and you can just make predictions on those. So right there, on device. And you can do that in the
browser, in Node, et cetera. You can also retrain
those models, so once you have that
model in the browser, we have APIs for
fitting, for data, that may be on your webcam,
or from your microphone or anything else. And this is sort
of in the middle. And on the harder
side, you know, is writing the model
entirely in JavaScript. So what this means is,
we have a full library for stitching together
your own model and training it in the browser. OK, so just a quick high-level
architectural overview, these are sort of
layers of abstraction. At the very, very highest,
we have the Models repo. These are pre-trained
models with nice, easy APIs. We put them on NPM. We host all the
weights on GCP for you, so you just have to NPM install
and everything comes for you. We have the Layers API. This is a Keras-compatible API. It lets you stitch
together layers, serialize models,
that kind of thing. And then, our Core
API, below that, is Linear Algebra Kernels. These are all
GPU-accelerated, and it’s all eager-based for gradients. In the Browser, as of today– Daniel will talk about
some future stuff– we sit on top of WebGL. Again, this is the
graphics rendering library. And in Node, we actually bind
to the TensorFlow C++ binary and we can get acceleration with
GPUs and CPUs, and eventually, maybe, TPUs. So these are completely
compatible with one another, which is pretty cool. And we have– again,
we have the ability to take models from
Python and import them into the browser for
execution or retraining. So that’s kind of the high-level
view of what TensorFlow.js looks like. So I’m just going to poke into
the Models repo really quick. So if you haven’t
checked this out, go check out that GitHub repo. There’s a bunch of models there. A bunch of different modalities. I’m going to just show you
how easy one of them is. So the model I want to point out
is a human segmentation model. And so this human segmentation
model is called BodyPix, and the goal of this model is
for every pixel on the screen, just tell me if that pixel is
part of a human being or not. It’s very simple, conceptually. And also, we give
body parts as well. So it’ll tell you if this pixel
is part of an arm or a head, or that kind of thing. So this is what the output
of that model looks like. So I’m going to try to show you
how easy this is to get going. So, all you have to do is
script source tf.js and BodyPix, two things we host for you. This is a fun little
visualization by Dan. These are also in
NPM, so you can NPM install those like that. And the first line–
it’s two lines of code– so, the first line is you await bodyPix.load(). This downloads all the
weights from GCP for you and puts it in this
little net object, and then, you just call estimate
single person segmentation. And that image
object right there is actually an
HTML image element. So, like, a regular
query selector on image. And you get this
segmentation object out, which is a JSON object. That’s it. And it has all the mask
information in there. And we also provide a bunch
of utilities inside of those to help you draw
things like this, so we can do background blurring
for you, or like coloring of your body parts,
and that kind of thing. So that’s BodyPix,
I’d definitely recommend checking it out. We have a bunch of
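For reference, the whole setup he just walked through fits on one page. The method name here follows the @tensorflow-models/body-pix package README and may differ across versions, so treat it as approximate:

```html
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/body-pix"></script>
<img id="person" src="person.jpg" />
<script>
  async function run() {
    const net = await bodyPix.load(); // downloads the weights for you
    const img = document.querySelector('#person');
    // Returns a plain JSON-like object with the per-pixel mask.
    const segmentation = await net.estimatePersonSegmentation(img);
    console.log(segmentation);
  }
  run();
</script>
```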
other models, as well, that have very similar APIs. And with that, I’m going
to hand it off to Daniel. DANIEL SMILKOV: Thanks. So when we started this
project, we were initially focused on the browser. And a lot of people, when
you hear JavaScript today, the first thing you
think is the browser. But modern JavaScript is so
versatile and so powerful that it runs on all
sorts of platforms. And to give you some
idea, obviously, we all know the major
browsers here, but we know that there is Node.js, which makes JavaScript compelling for the server side. There is Electron, which people use to build modern desktop applications like Slack. And what was surprising to me,
also, in the last two years as I was learning
more about this, is that more and
more platforms– there’s more and
more hybrid apps that allow you to write
JavaScript and then deploy it to a mobile device. So like, WeChat, React
Native, and Angular in conjunction with NativeScript allow you to do this. So the orange boxes
are the platforms where TensorFlow.js runs today. And I just want to quickly go
over some of these platforms and showcase a few interesting
real-world examples that use us. So on the Browser, there is
this Creatability project, which is a set of
experiments by Google, to explore how to make creative tools accessible to everyone, including people who have impairments. And they’re using all of
the sensors on the laptop, like the microphone and the
accelerometer and the camera, to allow people to
express themselves. Uber Manifold is using
us, also in a web app. Their web app allows you
to analyze an ML model and compare two ML
models together, but a lot of the computation– so it has a server, but also,
a lot of the computation to make it super
fast and interactive is done on the browser itself. And what’s interesting here
is they don’t use us as an ML library, they actually use our lower-level ops, like matmul and other linear algebra operations to do fast linear algebra, because we accelerate on the GPU. And the third example
I want to showcase is more of the classic example
of using us in the browser. Airbnb has a web page. They ship a model
on that web page that, when you’re trying
to upload a profile photo on device, it tries to
run a simple image recognition model that warns you if you
have some personal or sensitive information. Turns out, a lot of people are
scanning their government ID and uploading them
as profile photos, so Airbnb warns you on device. And the reason why they’re
doing this on-device is it’s much simpler
for privacy reasons than to upload this and
analyze it on the server. On the desktop side,
especially with Electron, I want to showcase two examples. So Clinic.js is a desktop
application that allows you to monitor your Node.js
application, for CPU cycles, for memory usage, and they
actually use a TensorFlow.js model– you know, with a bit of machine learning– to distinguish between
memory use done by the kernels of Node.js itself
versus your app, your user app. Magenta Studio is also
an interesting example. So what they provide– Magenta is a team
at Google Brain that explores how machine learning
can come together with music and music generation. So they have a few models that
can generate music and music sequences, and they built
plugins that plug into Ableton. So that allows musicians
to quickly interact with machine learning and
augment their creativity. And what’s interesting here
is that Ableton has a plugin system, and all of the plugin
system is powered by Electron, so it’s very
interesting that you can write JavaScript applications. We’ve heard similarly
for Photoshop, that Photoshop has
now these filters, or plug-ins, that you can add to Photoshop via Node.js. And the third example I
want to showcase is WeChat. So, this is on the mobile side. So WeChat is a mega-popular– it started as a chat app, but
it’s a mega-popular platform nowadays, has more than a
billion users, mostly in China. It has over a million
mini programs. So this is very similar
to the Apple App Store. And it turns out that you can
write a mini program using JavaScript. And we actually
recently released the plugin for
TensorFlow.js that allows you to run models in WeChat. And this particular
one is a makeup app that allows you to overlay
makeup and quickly see. All right, so those
are the few examples. I just want to briefly touch
upon three other projects that are around our library: tf.data, tf.vis, and some Node updates. So, for those of you that have used Python, tf.data here is analogous to tf.data in Python. It’s a library that
allows you to construct data processing pipelines
such as batching, shuffling, filtering. And the secret here–
or not the secret, but the gain that
you get is that all of these transformations
are done lazily, so you can work with
data that doesn’t fit in your main memory. Tf.vis on the
browser side allows you to visualize and gain
interesting insights about your ML model, so it
has this hideable drawer called The Visor and
you can place in charts that we provide. We provide a bunch of built-in charts. But we also give you high-level
visualization methods to visualize layers and
filters, and some properties of your training data. And lastly, we also give you
a built-in evaluation utility. So you can quickly plot accuracy
or confusion matrix or AUC curves. All right, so on
the Node.js side, Nikhil mentioned previously
that we are not constrained by– when we’re server side,
unlike the browser, we don’t have a sandbox,
we can bind to any library. So we bind to the
TensorFlow C library and we get the full performance
of TensorFlow for CPU. And if you have
CUDA installed, we can utilize your
GPU efficiently. And this is great for servers. However, we’re also working
on a headless WebGL backend for Node.js, and we’re
very close to releasing this. And the reason why
we’re working on this is because we want to make
sure that if you are writing an Electron app, you can
take advantage of the GPU even if you’re on a Mac or you don’t have CUDA drivers installed. And also, the package
is much, much smaller compared to downloading the
full TensorFlow library. Lots of libraries
started using us and I think that’s just amazing. So, a few libraries here: face-api.js, handtrack.js, ML5.js– check them out. Also, if you’re more of a beginner at machine learning, ML5.js is a great library to start with because it provides more high-level APIs. And I just want to say thanks
to all of the libraries that they’re using us
and they’re helping us make TensorFlow.js better. So we launched this about a year
and a half ago, in March 2018, and we’ve been getting
lots of momentum. We have a lot of NPM downloads. We have almost 200 contributors. And I think it’s
pretty exciting. Everything is open source. And I just want to also thank
the community for actively helping us make this better. All right, so the
future, we’re working to extend our
Hybrid-Native platforms. So we’re working on a
React Native plug-in, so that if you’re
a React developer and you’re using React
Native, you can load a model and make predictions in a native app. And this is coming very soon. On the side of trying to
make machine learning more accessible, we’re working on an
integration with Cloud AutoML, as you’ve seen
before, where you can train a model without
having to do any coding. And then, you will be able
to download the TensorFlow.js version of that model if
you’re using the edge product, and then just run
it in the browser. We’re also constantly
looking for new models, pre-trained models, that
are useful out of the box to add to our models repo. So we’re working on
an object segmentation model, for example, that we
are planning to release in the next month or so. On the performance
side, performance can always get better. So we’re actually working
and have prototypes on two different technologies. So WebAssembly, that helps
us accelerate our computation on CPU, because not
all devices have GPUs, not all devices have
WebGL drivers or drivers that can run WebGL,
so WebAssembly is going to help there. Everything is open source. We just started
this, but feel free to follow and also
make contributions. And on the GPU side, there
is a new platform coming in, WebGPU, that gives us much
closer access to the [INAUDIBLE], to the GPU. So we should be more
performant than WebGL. And this is, I would say,
still at least a year out, if not longer, for
all browsers to adopt, but it is a standard
that has been accepted by all the browsers,
so it’s pretty exciting. There’s also a book. A few of our
wonderful colleagues are working on a book
that’s going to come out, I think, somewhere in the fall. So you can preorder it now and
you can use this coupon code to get a discount. AUDIENCE: Hey,
hey, hey, go back! [LAUGHING] DANIEL SMILKOV:
All right, so I’m very optimistic about the future
of machine learning on the web. Two years ago, none
of this was possible. We didn’t even think that we
could be doing something that can be performant and
fast in JavaScript, but today, it is very
obvious that things are moving at a high pace. And there are three things
that are driving this progress. There’s new web
standards coming in, like, WebGPU, WASM, which
stands for WebAssembly, there are potentially
new specs like WebNN, like a neural network spec. All of this would give us
closer access to the [INAUDIBLE] to do faster computation. There is another force,
and that is just ML research. ML researchers are coming
up with more efficient architectures. For example, MobileNet V2 has
half the flops of MobileNet V1, but almost the same accuracy. And now with AutoML and
neural architecture search, you can let computers find the
most optimal, lightweight model that can do the task for you. And the third avenue,
where this is pushing, is us working on optimizing
TensorFlow.js itself. In the last year, we’ve
made a lot of progress, we’ve become more efficient at packing WebGL textures, fusing ops so we minimize calls to the WebGL backend, et cetera. So all of these three
things combined, I think, makes the web move really fast. Everything is open source. And we have a wonderful website. Please check it out. We are also on GitHub and
we have a mailing list. And I just want to say that
the work that we’ve shown here was created by lots
of different people, both inside and outside Google. We have a lot of amazing
[INAUDIBLE] contributors as well. And we are hiring a
developer advocate. If you’re passionate about
JavaScript and machine learning, here’s the
link to the application. And thank you. [APPLAUSE] [MUSIC PLAYING]
