Bringing ML and Linear Algebra to NodeJS


Ping Yu & Sandeep Gupta

No Python required - this session will highlight unique opportunities by bringing ML and linear algebra to Node.js with TensorFlow.js. Nick will highlight how you can get started using pre-trained models, train your own models, and run TensorFlow.js in various Node.js environments (server, IoT).
EVENT: NodeJS Interactive 2019
SPEAKER: Ping Yu & Sandeep Gupta, Google
PUBLICATION PERMISSIONS: The Linux Foundation provided Coding Tech with the permission to republish NodeJS Interactive talks
CREDITS: Original video source:


OK, I guess we can get started. Thank you so much for joining us. My name is Sandeep Gupta. I'm from the TensorFlow team at Google, and this talk is about machine learning and JavaScript. I also have my colleague Ping Yu here, and he will co-present with me. So, again, thanks and welcome. We are seeing that machine learning is beginning to have a really big impact in almost all fields of life around us, right? Every day we see major news headlines. We see new breakthroughs, whether it's in transportation, health care, environmental sciences, all kinds of applications, even arts and creativity. Problems that were really, really difficult to solve just a few years back are now being solved in very impactful, very significant ways by these new computational approaches and new ways of using computer systems. Maybe just a quick show of hands: how many of you are actively practicing machine learning, or have some familiarity with machine learning? OK, so a relatively small number. Because many of you are somewhat new to the field, let me just take a couple of minutes and introduce some terminology here, and this will give you a flavor of why these kinds of methods are becoming so popular. In most of classical programming, when we are trying to write a program for a computer to solve a problem, the way we have usually done this is that we first come up with rules and then write explicit code to codify those rules. So, for example, if you're trying to write a program that takes images and tries to detect whether the activity is, let's say, walking or running or bicycling, you might come up with some features or rules to describe that. One way to do that might be to measure the speed of the person. If the speed is less than some number, then we call this walking. If the speed is more than that number, maybe it is running.
So you come up with these kinds of rules and you implement a computer program to solve the problem that you want to solve. The issue with these approaches is that they very soon run into limits. You encounter situations where your rules no longer work. And even if your rules work well for the problem you're trying to solve, they're generally not generalizable or extensible to a slightly different problem that you want to solve another time. So that's where classical programming runs into its limits. Machine learning turns this whole concept on its head. It turns it upside down. The way you approach this is: what if I had some examples where I already know the answers, and what if I could feed a lot of these examples and answers (we call this training data, and the answers on that training data) into a computer program or a model? This model has the property that it can learn from the examples it has been fed, so that it can come up with what these rules are. These rules may be in a form that humans can understand, or they may be in some abstract form that the computer program is choosing to describe that problem. So this becomes a very generalizable way of solving a problem. In practice, the way you do that is you collect a lot of training data and you have that data labeled. These are called human-generated labels. You feed it into a machine learning program or a model, and then out come these rules, or a trained model. So this is the training phase of the machine learning process. Once you have trained a model, which is some representation of those rules, you can feed new data into this model, and the model is ready to give you new answers, or new predictions, on that data. This is called the inference phase. So at a high level, conceptually, this is what a machine learning way of solving a problem looks like.
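To make the contrast concrete, here is a minimal, dependency-free TypeScript sketch of the walking-versus-running example: first as a hand-written rule, then as a threshold learned from labeled examples. The 2.5 m/s cut-off, the sample speeds, and all function names are illustrative assumptions rather than anything shown in the talk, and a one-number threshold search stands in for a real model.

```typescript
type Activity = "walking" | "running";

// Classical programming: a hand-written rule with a hard-coded threshold.
// The 2.5 m/s cut-off is an illustrative guess, not a value from the talk.
function classifyByRule(speedMetersPerSecond: number): Activity {
  return speedMetersPerSecond < 2.5 ? "walking" : "running";
}

// Machine learning in miniature: the threshold is derived from labeled data.
interface LabeledExample {
  speed: number;
  label: Activity;
}

// "Training": try the midpoint between each pair of neighbouring speeds and
// keep the threshold that misclassifies the fewest training examples.
function trainThreshold(examples: LabeledExample[]): number {
  const speeds = examples.map((e) => e.speed).sort((a, b) => a - b);
  let bestThreshold = speeds[0];
  let fewestErrors = Infinity;
  for (let i = 0; i + 1 < speeds.length; i++) {
    const candidate = (speeds[i] + speeds[i + 1]) / 2;
    const errors = examples.filter(
      (e) => (e.speed < candidate ? "walking" : "running") !== e.label
    ).length;
    if (errors < fewestErrors) {
      fewestErrors = errors;
      bestThreshold = candidate;
    }
  }
  return bestThreshold;
}

// "Inference": apply the learned threshold to new, unlabeled data.
function predict(threshold: number, speed: number): Activity {
  return speed < threshold ? "walking" : "running";
}

// Made-up, human-labeled training data.
const trainingData: LabeledExample[] = [
  { speed: 1.0, label: "walking" },
  { speed: 1.4, label: "walking" },
  { speed: 1.8, label: "walking" },
  { speed: 3.0, label: "running" },
  { speed: 3.6, label: "running" },
  { speed: 4.2, label: "running" },
];

const learnedThreshold = trainThreshold(trainingData);
```

Retraining on different labeled data (say, cycling speeds) changes the learned threshold without any change to the code, which is the generalizability the speaker is pointing at.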
So just to look at this a little bit visually, let's say you're trying to classify images. This is what's happening there. A model is a collection of layers, and all these layers are just computational blocks. Each layer, and each element in that layer, is just doing a very simple math operation. It's taking in some numbers, it's multiplying those numbers with some other numbers, and it's producing a new output. You do all this, you feed it forward, and you wire this model so that when an image is fed in, it produces an output that is close to the output that we expect. So if you feed in an image of a cat, we want all these things to flow in, and the output of our model should be a number that we designate as indicating cat. If it does not produce that, then we calculate an error or a difference and we propagate that error back through our model, and we tweak or adjust all the parameters of our model until we get the right answer. We keep on doing this for lots and lots of examples, and then we have a trained model for the particular task that we are trying to solve. So this is what a machine learning approach to solving a problem looks like. Now, machine learning is really taking off now, and has become such an important part of problem solving today, for three main reasons. First, as we just saw, it relies on the availability of a lot of data, and not just quantity of data but good-quality data: data that's labeled and curated and represents the full variety of the situations you will encounter in real life. The good news is that there are now lots and lots of very large publicly available data sets, which make it easy for any developer to get started and train powerful machine learning models.
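The "multiply some numbers with some other numbers" step and the error calculation described earlier in this passage can be sketched in a few lines of plain TypeScript. This is a toy, assuming a single layer with two inputs and two outputs; the weights, the input values, and the encoding of "cat" as `[1, 0]` are all made up for illustration.

```typescript
// One layer: each output is a weighted sum of the inputs plus a bias.
function denseForward(
  inputs: number[],
  weights: number[][], // one row of weights per output
  biases: number[]
): number[] {
  return weights.map((row, j) =>
    row.reduce((sum, w, i) => sum + w * inputs[i], biases[j])
  );
}

// Error between what the layer produced and what we expected
// (sum of squared differences).
function squaredError(outputs: number[], expected: number[]): number {
  return outputs.reduce((sum, o, i) => sum + (o - expected[i]) ** 2, 0);
}

const pixelValues = [0.5, 1.0];            // a made-up two-value "image"
const layerWeights = [[1, 2], [0, -1]];    // arbitrary starting parameters
const layerBiases = [0, 1];

const layerOutputs = denseForward(pixelValues, layerWeights, layerBiases);
// Hypothetical target: [1, 0] is the output we designate as "cat".
const layerError = squaredError(layerOutputs, [1, 0]);
```

Training consists of nudging `layerWeights` and `layerBiases` so that `layerError` shrinks, repeated over many labeled examples.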
The second aspect is that these models can be computationally quite expensive. These are simple computations that they are running, but they run millions and millions of them. So you need very significant computation power to be able to run these models in a practically useful time frame. And now there is custom hardware, there are GPU advances and new types of accelerators coming up, which have made it very practical to run machine learning models in a very reasonable amount of time. And then lastly, the research in the field of ML has been growing and advancing at a very fast pace. There are new publications that come out all the time and new ways of solving problems. So basically all of these things now give everyone the ability to do this for whatever problem you are interested in. And this is where frameworks like TensorFlow come in. TensorFlow is an open source library for doing machine learning. What that means is that a lot of the mechanics and logistics of training a model, of doing this kind of backpropagation and adjusting all the weights, of running your experiments, of creating a deployable model which can then be used in production, all of these things are managed for you by a library like TensorFlow. So you don't have to reinvent all of this stuff. It also has a bunch of pre-trained models available that you can use off the shelf. But TensorFlow was written with a Python front end. Most of the machine learning tools that are out there today require one to learn Python. In fact, Python is termed the language for data science, which is unfortunate because JavaScript is the most widely used programming language.
And really, there haven't been too many accessible machine learning frameworks that JavaScript developers could use natively in JavaScript without the burden of learning a new stack or a completely new programming paradigm. So, motivated by that, we released this library called TensorFlow.js, which is basically a version of TensorFlow that is JavaScript native. It lets you run machine learning models, and even train machine learning models, in the browser. So you can run it in web browsers through JavaScript, you can run it server side with Node.js, and, as we'll show you, you can run it on a variety of other platforms where JavaScript can be used. This library is GPU accelerated. We'll talk a little bit more about performance later, but we use WebGL acceleration in the browser, so it is very performant for the common types of machine learning models. And it is completely open source, so it's open for the community to use, build on, extend, and contribute back into the library. When we released this, our hope was that it would give web developers, JavaScript developers, and back-end developers an easier way to get started with ML. And we have been very happy, actually, to see the adoption and uptake. This is an interesting example. This person [name unclear] gives a lot of very influential talks in the JavaScript community. He gave a talk recently at Nordic.js, and he highlighted exactly these motivational points that we had, which is that the JavaScript community is kind of missing out on this whole excitement around ML. And now with TensorFlow.js there's an easier way to get started and to be able to use machine learning. In fact, he says down there that now you can bring the power of machine learning to your web application or JavaScript application with 10 lines of code.
And we love this testimonial, except that it's five lines of code. It's not even ten lines of code, I assure you. So here, for example, is the rough template for bringing machine learning into your application. The first two lines of code are basically importing the library. Here we are showing the Node example on top, so you import TF.js as a Node package. The second line is importing one of our many pre-trained models. This is the COCO-SSD model, which is an object detection model trained so that when you give it images, it will recognize a bunch of common objects present in that image and give you bounding boxes for where those objects are. So you load the library and the model. If you're running this in the browser, the alternative is to just script-source it from our hosted libraries. Then you create an instance of the model, which is right there on that first line. So basically I'm creating an instance of my COCO-SSD model and loading it. Then I load an image, a PNG image, and decode it to convert it into a form that my machine learning model can ingest. And that's it. Then I call my model's detect function. So, you know, model.detect: you pass it the image object and you get back your predictions. What you get back you can see in that image on the right. It has identified a cup and a phone and a mouse, and it puts bounding boxes on those objects. You get a JSON object, which tells you the names of those objects and the coordinates of where they are, and it also gives you a probability of how confident it is in that prediction. So, a super powerful model, five lines of code, and your web application can now be using an object detector. And you can do similar things with text and speech and lots of other types of models. So why is this a good idea?
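As a rough illustration of what an application might do with the JSON result described above (object names, box coordinates, and a confidence value), here is a dependency-free TypeScript sketch. The `Detection` field names and the `[x, y, width, height]` ordering are assumptions for this example, and the sample values are hand-written stand-ins for the cup, phone, and mouse on the slide, not real model output.

```typescript
// Assumed shape of one detection; field names are illustrative assumptions.
interface Detection {
  bbox: [number, number, number, number]; // assumed order: x, y, width, height
  class: string;                          // name of the detected object
  score: number;                          // confidence between 0 and 1
}

// Keep only detections at or above a confidence cut-off, most confident first,
// and format them for display.
function confidentLabels(detections: Detection[], minScore: number): string[] {
  return detections
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((d) => `${d.class} (${Math.round(d.score * 100)}%)`);
}

// Hand-written sample data resembling the result described in the talk.
const sampleDetections: Detection[] = [
  { bbox: [220, 90, 80, 160], class: "cell phone", score: 0.88 },
  { bbox: [40, 60, 120, 150], class: "cup", score: 0.93 },
  { bbox: [330, 200, 70, 45], class: "mouse", score: 0.41 },
];

const detectedLabels = confidentLabels(sampleDetections, 0.5);
```

Filtering on the confidence value is a common design choice: low-scoring boxes are usually dropped before drawing anything on screen.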
On the client side, in the browser, there are many advantages to running machine learning. The browser gives you a lot of interactivity, with easy access to sensors like the webcam and microphone, so you can immediately take advantage of all this sensor data and put it into your machine learning model. In that object detection example, the images could be coming from a webcam stream. There is nothing to install. You just share a URL with your users, and they have a web page which has the model in it, and they're using that model directly from that URL. It has huge privacy implications, because you're running these models locally, client side. No data is going to the servers. So for health care or any other privacy-sensitive type of application, this has enormous implications. Also, it reduces server-side costs because you don't have to stand up complicated architecture. I lost my projector. OK, I think the connector was a little loose. Yeah. OK, let's hope this works. And then lastly, as I mentioned, because we use WebGL acceleration, you get really good performance. On the server side, you can run more powerful models that may not be practical to run client side in the browser, and you can basically take full advantage of whatever hardware you have. These could be multi-core machines or GPUs or even other custom hardware. There is a very large npm package ecosystem. So if you are using machine learning in Node, you can benefit from this whole ecosystem of Node packages, and you can bring machine learning directly into your Node stack. You don't have to have a separate Python data science team and a separate back-end Node team that are not really talking to each other. You can bring machine learning directly into Node and have a single stack.
And because we bind to the TensorFlow C library when we run server side, we get a lot of performance benefits and, directly, the ability to use any conventional machine learning model that has been trained on the Python side. There are three main ways of using this library. The first, and that object detection example was one example of this, is that you can take an existing machine learning model, whether it's a TensorFlow.js model or one of your Python models, and you can bring it in and run it with TensorFlow.js. The second way is to take a pre-trained model and retrain it, because often you have to customize it on your own data to solve a particular problem. So we have easy ways of retraining a model and modifying it with a small amount of additional data. And by the way, some of you may be familiar with the Google Cloud AutoML service, which is a really nice cloud-based way of bringing your own data to the cloud and getting a custom model trained for you with no ML expertise needed. That's also compatible with TensorFlow.js, so you can train an AutoML model on the cloud and get a trained TensorFlow.js model that you can run. And lastly, for those of you who want to experiment and write new models from scratch, there is a full, low-level programming API in JavaScript. So you're writing JavaScript code and you can write new models. The library can be used in any of these ways. And because JavaScript is such a versatile language and it runs on many platforms, you can use TensorFlow.js in all these places. You can run it in the browser. You can run it on native mobile hybrid platforms such as React Native. We just recently announced integration with React Native, and you get first-class support with WebGL acceleration in React Native. You can run it with Node, and then on the desktop.
There are examples of people building Electron applications and using TensorFlow.js through Electron. So there are many ways of using this, and we are continuously working on adding support for more and more platforms. As I said, we pre-package a bunch of ML models for common ML tasks, and here are some examples. There are models for image classification and object detection, and there are models for recognizing human poses. We have some of those demos at our booth, and you're welcome to stop by after this and see some more of these. We do pose detection. There's a very nice model for audio commands, so if you speak words, it can recognize the spoken words and you can use that to drive actions. And we are increasingly doing more and more on text. Text has a variety of use cases like sentiment and toxicity. All of these models can be used with just a script source from our hosted scripts, or you can npm install them. So here are some good examples. Using these models as building blocks, you can build applications that solve these types of problems, like accessibility, sentiment analysis, conversational agents, and a variety of different things. All of those examples you are seeing on the right are instances of models running client side. So let me just take a couple of minutes and show you a very quick demo, just to show you how easy it is to retrain a model like this. This is an application called Teachable Machine. Has anybody seen Teachable Machine so far? This is something that I would encourage you to try out on your own time as well after this. But let's just see how this works. This is the Teachable Machine website, and it will show you how you can take an existing machine learning model and retrain it in a matter of seconds. I'm going to skip this tutorial. What you're seeing here is a simple image classifier.
This web session has already loaded a powerful image classification model called MobileNet, and it's running in my browser session. We are going to modify this model to do a very simple rock, paper, scissors classification. OK. So we're going to output the word rock for my first class, paper for the second, and scissors for the third. And now we are going to record these training images. So this green class will be rock, then this will be paper, and this will be scissors. We'll just record some images from my webcam. So let's record rock. I just hold down this button and record some training images. So this is rock. Now let's do paper. And now let's do scissors. I'm just recording some training images right now, so ignore its predictions. And it's trained, and now it's ready. So now I have a new version of the model running in my browser. Let's try it out. Rock, paper, rock, paper, scissors, paper, rock, scissors. So there you see how easy it is to train it with, you know, 50 images or so. And the nice thing about this is that you can very similarly train a speech model or a pose model. And now, with a new version of Teachable Machine, it gives you the ability, once you have trained a model, to save it. You can put it on your Drive, or you can download a TensorFlow.js-compatible model that you can run offline. So it's a very approachable way of getting started with machine learning. At this point, I want to turn it over to my colleague Ping, who will tell you more about the API. Thank you.
Thank you. Hi, everyone. First I want to show you a diagram of the library. As Sandeep mentioned, you can use the models directly. Sorry, can you turn on my mic, please? All right. So there is a model-building API.
It gives you a more abstract API for building the things you need. And there is a core API, a fine-grained kind of API, which lets you control how you construct your network internally and gives you fine control over the training. And as Sandeep mentioned, we support both the client side and the server side. In the browser we use WebGL, so it gives you automatic GPU acceleration. And recently we released an alpha version for WebAssembly, so it gives you better acceleration on the CPU side as well. On the Node side, we use Node.js C bindings directly to the TensorFlow C library, so you can utilize the TensorFlow C library from the Node.js runtime. We also support GPU on any device that has CUDA support, like NVIDIA GPU cards, where you can use the TensorFlow GPU library. And we also have something called [name unclear] for devices that don't have that kind of GPU. What it gives you is WebGL: the software gives you the WebGL API, so you can use it that way if you want to. All right. So we do give you a lot of models, but the fact is that you may have your own models, or you may have an ML department that builds your own models, or you have seen some nice model out there and you want to bring it into your JavaScript application. You can do that. We give you a converter that can convert your TensorFlow models into a JavaScript-friendly format, and we do a lot of optimization for you so that it runs faster in your browser and on native mobile devices. We also give you an API to download your model from any static file serving service, like S3 or Google Cloud Storage, and you can load it directly into your application and run predictions as Sandeep showed earlier.
So that's for the browser side. For the server side, we just announced a new API. Basically, if you want to run a Python TensorFlow model inside Node, you now don't need to convert it. You can run it directly using our C library, which means you have better op support. Our converter supports about two hundred ops, the core ops of TensorFlow, but TensorFlow actually has about a thousand ops. So this gives you all thousand ops, so you can run really powerful machine learning models inside Node right now. It supports both TensorFlow 1.x and 2.0 versions, and it gives you better performance. Why is that? Because Python TensorFlow actually runs through Python, so you have the Python layer, which causes a lot of delay, and, you know, V8 is much faster than Python. That's why we are slightly faster than Python when you run your TensorFlow model directly inside Node.js. So here are some sample performance numbers. For MobileNet, which is the image recognition model Sandeep was showing, TensorFlow.js gives you about 20 milliseconds of inference time, which means every recognition of an image takes about 20 milliseconds, which gives you 50 frames per second. If you're building a real-time application, that's plenty for you to play with. And TF Lite, if you don't know it, is a native implementation from Google, another open source project, that runs on mobile devices. Its performance is around 19 on the iPhone. We do have some room for performance improvement on Android, but we are actively working on that. On the server side, you can see that the Node performance is very close to the Python performance. If you've got a really good GPU, you get [figure unclear] milliseconds of inference time. Are you guys ready for some code? All right. So now you can load your own model, and you can load our pre-trained models. What if you want to build your own model?
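The frame-rate figure quoted in the performance numbers above follows from simple arithmetic: one second divided by the time per inference. A minimal TypeScript sketch, assuming inference is the only per-frame cost (real applications also spend time capturing and drawing frames); the function names are made up for this example.

```typescript
// Frames per second achievable if each frame costs `inferenceMs` milliseconds.
function framesPerSecond(inferenceMs: number): number {
  return 1000 / inferenceMs;
}

// Whether a given inference time fits inside the budget for a target frame rate.
function fitsFrameBudget(inferenceMs: number, targetFps: number): boolean {
  return inferenceMs <= 1000 / targetFps;
}

const mobileNetFps = framesPerSecond(20);       // the 20 ms figure from the talk
const fitsThirtyFps = fitsFrameBudget(20, 30);  // budget at 30 fps is about 33.3 ms
```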
I want to show you how to do that. First, we want to tackle the high-level API. The first thing is loading our npm packages. This is loading our Node package. With the Node package, we have the C bindings to the TensorFlow C library, and we also package the C library inside this npm package. You can also load the GPU version if you have an NVIDIA graphics card. Let's go back to the image recognition model that was shown earlier. As Sandeep mentioned, a typical neural network is constructed layer by layer. Each layer expresses certain features of the previous layer and passes the result on to the next layer. We'll show you how to do that. The example shows how to build an image recognition model using the Layers API. It looks like a lot of code, but to be honest it's not that bad. If you have used a library like Keras, it's not more complicated than that. Here, the first thing is that you use a sequential model. What is a sequential model? Basically, something like 90 percent of the models out there are sequential. It means that you stack the layers one by one. Next we add a couple of layers. Those are very typical neural network layers: the convolution layers give you feature extraction, and max pooling gives you a kind of zoom-in feature so you can look at the detail of that image. At the end, we flatten the image into a one-dimensional vector, so that the final layer can output the classification for the classes that we want. What this last layer outputs is the probability for each class. So that's about it. That's the model we built, and then we use the compile method to set up how we will train this model. In particular, we set the loss function to categorical cross-entropy, which is, you know, a long name.
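For readers wondering what "a probability for each class" and "categorical cross-entropy" compute, here is a dependency-free TypeScript sketch. It assumes the last layer uses a softmax activation, which is the usual choice for this kind of classifier but is not stated in the talk; the raw scores and the three-class setup are made up for illustration.

```typescript
// Softmax: turn raw scores into probabilities that sum to 1.
function softmax(scores: number[]): number[] {
  const maxScore = Math.max(...scores); // subtract the max for numerical stability
  const exps = scores.map((s) => Math.exp(s - maxScore));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Categorical cross-entropy for one example with a one-hot label:
// the negative log of the probability assigned to the correct class.
function categoricalCrossEntropy(
  probabilities: number[],
  oneHotLabel: number[]
): number {
  const epsilon = 1e-12; // avoids log(0)
  return -oneHotLabel.reduce(
    (sum, y, i) => sum + y * Math.log(probabilities[i] + epsilon),
    0
  );
}

const rawScores = [2.0, 1.0, 0.1];          // made-up outputs for three classes
const classProbabilities = softmax(rawScores);
const lossIfFirstClass = categoricalCrossEntropy(classProbabilities, [1, 0, 0]);
const lossIfThirdClass = categoricalCrossEntropy(classProbabilities, [0, 0, 1]);
```

The loss is small when the model puts high probability on the labeled class and large when it does not, which is what the optimizer then tries to reduce.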
But the fact is that you don't need to care; you just copy that. The optimizer is gradient descent, SGD, which is also easy to copy, just three letters. With that, the model is ready to train. The training is very simple, which is one method. You give it x, that's your input, and y, that's the output of your data set, and epochs is how long you're going to run this training session. After the training is done, of course, we have other methods to show you the details of the training, what the accuracy is, and all the other good stuff. I'm kind of rushing here, but after training is done, you can save the model to a file. That's for Node, but you can also save it to local storage in the browser, or upload it to some server through HTTP. So we provide a lot of different options. At the end, the model is ready. Now you can plug it into your application and start making your predictions. So that's the high-level API. What is the low-level API? At the low level, you actually drill down inside the layers. Each layer is made of many smaller operations. For example, there is matrix multiply, there are math functions like cosine or sine, and there is some kind of activation function you would use. I want to show you how you can do that with TensorFlow.js. Let's say we have a data set; you can see those dots. Those are the output of a kind of polynomial function, but it's not really perfect, so you can't really know exactly what the function is. We want to estimate the parameters of that function, so a, b, and c will be the goal for this model to estimate. We're doing the same thing: we're loading our library. We create three variables with an initial value of 0.1. What is a variable? Variables are the tensors that the model is going to update during the training session.
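The curve-fitting setup introduced here (estimate a, b, and c starting from 0.1, using a mean squared error loss and gradient descent, as the speaker goes on to describe) can be sketched without any library. This is a plain TypeScript approximation of the idea, not the TensorFlow.js code from the slides: the gradients are written out by hand instead of being computed automatically, the data is noise-free and synthetic so the result is reproducible, and the target function, learning rate, and epoch count are assumptions chosen for illustration.

```typescript
interface Coefficients {
  a: number;
  b: number;
  c: number;
}

// The model: y = a*x^2 + b*x + c
function predictY(p: Coefficients, x: number): number {
  return p.a * x * x + p.b * x + p.c;
}

// Loss: mean squared difference between model output and data set output.
function meanSquaredError(p: Coefficients, xs: number[], ys: number[]): number {
  const total = xs.reduce((sum, x, i) => sum + (predictY(p, x) - ys[i]) ** 2, 0);
  return total / xs.length;
}

// One gradient descent step, using the hand-derived gradient of the loss.
function sgdStep(
  p: Coefficients,
  xs: number[],
  ys: number[],
  learningRate: number
): Coefficients {
  let gradA = 0;
  let gradB = 0;
  let gradC = 0;
  xs.forEach((x, i) => {
    const residual = predictY(p, x) - ys[i];
    gradA += (2 * residual * x * x) / xs.length;
    gradB += (2 * residual * x) / xs.length;
    gradC += (2 * residual) / xs.length;
  });
  return {
    a: p.a - learningRate * gradA,
    b: p.b - learningRate * gradB,
    c: p.c - learningRate * gradC,
  };
}

// Synthetic data from y = 2x^2 - x + 0.5 (an arbitrary target for this sketch).
const sampleXs = [-1, -0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1];
const sampleYs = sampleXs.map((x) => 2 * x * x - x + 0.5);

// All three coefficients start at 0.1, as in the talk.
let fitted: Coefficients = { a: 0.1, b: 0.1, c: 0.1 };
const initialLoss = meanSquaredError(fitted, sampleXs, sampleYs);

// Manually run a fixed number of epochs of loss minimization.
for (let epoch = 0; epoch < 2000; epoch++) {
  fitted = sgdStep(fitted, sampleXs, sampleYs, 0.1);
}
const finalLoss = meanSquaredError(fitted, sampleXs, sampleYs);
```

With noisy data, as on the speaker's slide, the fitted coefficients would only approximate the underlying function and the final loss would not go to zero.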
Training keeps tuning them to make sure the output is the same as the data we gave it. Here we use the TensorFlow.js low-level API, operations like add, multiply, and square, to construct a tiny neural network graph. This looks a little bit crazy, right? For just a tiny little function, it's so long. So we have a better way to do that: we can use chained function calls to make it more concise while expressing the same thing. Here we define the loss function. Remember, last time we just said it is categorical cross-entropy. Here we have to define our own, because we want more control. So here it is a mean squared error. What it does is calculate the difference between your model output and your data set output. Then we use the same SGD function, the gradient descent function, as our optimizer. At the end, we manually run a number of epochs of minimization of the loss function. So at the end of the day it is the same as what happens internally in the high-level API. So that's the low-level API. TensorFlow.js is also part of the TensorFlow ecosystem. We not only give you JavaScript, we also integrate it into the TensorFlow world. We have integration with TensorBoard, the visualization tool, to give you a visual idea of how the training happens. It gives you a lot of charts to show how the accuracy increases while the training happens. The last line there is how you plug it into the TensorBoard visualization. There you can see it live when you have TensorBoard open. Oh, yes, here we are, we do have a graph. So that's what you see while the training happens in the previous example. All right, now I'll hand it back to Sandeep to talk about our users and community. Thanks.
Thank you, Ping. I know we're running almost out of time.
I'll just take about three more minutes to show you a couple more examples and also point to some resources to get started. TensorFlow.js is a growing community. As I mentioned, it's an open source project. We are very happy to see the download statistics and also to see more and more people join as contributors. More than 200 people are actively contributing code to TensorFlow.js, and we invite you all to become part of this. Also, many developers are building really powerful extensions and libraries on top of TensorFlow.js. There are some examples there which let you do some specialized custom stuff on top of the library. I want to show you three examples. This first one is from NearForm, and NearForm is here at this conference, one of the sponsors of the conference. They have built this really nice application called Clinic.js, which is basically a profiling tool. It plugs into your Node workloads and helps you profile and look for performance issues, memory utilization, CPU consumption, things like that. In this tool, they run a TensorFlow.js model for denoising and understanding what this profile data means. You can check out a demo of this at the NearForm booth here at this conference. Node-RED is an open source library from IBM. This is a flow-based way of wiring together IoT solutions. You have a drag-and-drop model and you can create IoT workflows, and Node-RED offers integration with TensorFlow.js. So all the TensorFlow.js capabilities are easy drag-and-drop modules that you can use to bring ML into your IoT stack. And very similarly, Losant is another company that builds enterprise-grade IoT services and solutions. Losant has been looking at ways of incorporating TensorFlow.js-based prediction for client-side edge prediction with machine learning.
And they wrote a really nice blog post to show how you could use this to build a predictive maintenance application with sensor data. So a lot of these types of examples are beginning to show the power of an easy ML workflow in Node and IoT. Just in closing, I wanted to show this. If you have machine learning needs, as JavaScript developers or as Node developers, if you have certain needs or requirements from the library, we would love to get your input. There's a UX study that we are doing, so please feel free to join us and give us your feedback. We would love to hear from you. And here are some links that are very useful for getting started. Of course, tensorflow.org/js, that's our main website. It has all of the things: examples, documentation, tutorials, and the link to the GitHub repositories up there, where we have, again, a lot of these examples built. We have a mailing list, in red there. If you join the mailing list, that's an excellent way to interact with other developers and directly with us on the TensorFlow.js team. One thing I wanted to show you is that if you go to the Google Codelabs, and we have these running at the Google booth outside, and search for TensorFlow.js, all of the examples are available as interactive notebooks. So you can click through this and basically get up and running and explore all these different features we talked about here today. Lastly, there is a new textbook that has come out, which is really a very nice way to learn the basics of machine learning from a JavaScript programmer's point of view. All the examples in this book are written in JavaScript. So that's something that's worth checking out. And also, we have an overarching, comprehensive TensorFlow course on Coursera, which has a TensorFlow.js module now, just released earlier this week.
So, plenty of resources to get started. We're looking forward to your involvement in the community. So, again, thank you so much for your attention.