How to Build a Netflix Clone with GraphQL, React, and DataStax Astra


Watch this workshop to learn how to build a simple ReactJS Netflix homepage clone running on Astra DB that leverages a GraphQL API with paging and infinite scrolling.
PUBLICATION PERMISSIONS: Original video was published with the Creative Commons Attribution license (reuse allowed). Link:


Today we're going to be talking about building a Netflix clone with JavaScript, GraphQL, and DataStax Astra. The nice thing about this workshop is that you don't really have to install anything. You can stay in your browser; that's all I'm using to execute everything in this workshop. You have GitHub for the source code, the exercises, and the slides. You have Gitpod, which gives you an environment like Microsoft Visual Studio Code. You have Netlify to deploy, run, and host the things that we're going to be building today. And then you have the Astra database, which gives you GraphQL and a GraphQL playground that connects directly to the database. It's pretty simple, and it's one of the native interfaces on our Astra database.
So your mission, should you choose to accept it, in the old-school Mission: Impossible sense, is to deploy a Netflix clone to production over a global CDN using a NoSQL database with paging and infinite scrolling. Good luck. We're going to cheat a little bit, because we have a GitHub repo that gives us some guidance and some setup. But I think this will be a fun exercise, and it shows you how you can use the GraphQL API on top of Cassandra.
This is the basic setup: we have a React interface on top of GraphQL, and that GraphQL goes down into the database. It's just another seamless API that can be written on top of Cassandra, in this case through the gateway itself. When we talked to Netflix about our plans to include better APIs so that people didn't have to rewrite, there was always the capability of writing to the native database interface, where you put it in your dependencies and all of that.
But we wanted to meet developers where they were, so we created an open source gateway, and you can host it on top of your existing Cassandra cluster or whatever Cassandra-based solution you have. It's also a native interface within Astra. I was going to show you that after a little architectural background, but why don't we just jump in?
Go to this URL (Eric, if you wouldn't mind posting that for everyone): it's the DataStax Devs appdev-week3-graphql repo. If you want to follow along, that's great. All of this is self-contained. There's also a live demo from Ania Kubów. If you haven't seen some of her live coding exercises, she's pretty amazing. She blasts through stuff and is very engaging. She did a three-hour version of a similar exercise with us, and this takes some of that and puts it into a shorter form so that you can get it all done and have an example. If you want to do this yourself and want some examples of how to hook it all up, this lets you play with it a little bit more.
As we're starting, note that you can do it on your own computer. I like Gitpod and Netlify because they give you persistence out in the cloud, so you don't have to worry about it. You don't have to pay for anything; everything in the workshop is free. You'll need a GitHub account and probably Chrome or Firefox. Netlify and Astra DB will be set up as we work through the exercise, so it's no big deal to follow along. There are some slides in here, and there's the Discord chat, but let's just jump in.
What we want to do first is go to Astra. Just go to astra.datastax.com and you get to this screen, which lets you register.
There's no credit card required. It's a free developer account, and you can deploy databases and all sorts of stuff. I have a free account; I don't even use a paid account, which is handy because I can do any kind of proof of concept or create whatever databases I want.
It's a little different from what you might normally expect from a Cassandra database. A company like Netflix typically uses Cassandra this way: they have thousands of instances that are persistent and there all the time, spanned across multiple availability zones, and they run on a lot of AWS. In our case, we've seen that cloud native applications and serverless operations have been more of a draw from an ease-of-use, elasticity, and cost perspective. So we've created a cloud native, serverless version of Cassandra that hides the infrastructure. All you really pay for, or even care about, is throughput and storage: you pay per million read requests and per million write requests, and then a monthly cost for storage. It operates on AWS, GCP, and Azure.
So let's go ahead and create our database. I'm going to create a database in Astra and name it Netflix Workshop DB. I'm going to put it in Australia East in Azure, so that's nice and local. Then I'm going to make a Netflix keyspace to give us a foundation here. Like I said, all of this is free, so you don't need to worry about any cost. Let's create this database, and it'll be just a minute for it to get spun up. It's activating right now. You can have any number of databases, and you don't pay for a database at all. Even if I were on a paid plan, you just pay for the consumption that you use, which is handy. Let's go on. It's pending.
It'll take a couple of minutes, and then I'll receive an email. I've done this a couple of times today, so I'm restarting all of this. Oh, it looks like it may be online, or rather the region availability is online. Never mind. It's still pending. It might just be another minute.
What I need to do now is create a security token. Astra operates on the principle of least privilege, being a good multi-tenant system. So I can go into the organization settings. The nice thing here is that if I have a buddy I'm working with who is creating a database, I can add them to my organization, or vice versa, and we can work on the exact same database. It's still a developer account across us. I have my own personal organization, so I go in here and say I need to create a token.
Like it says in the instructions, if you're following along, you create a token with the Database Administrator role. I did that earlier today, so I already have one. You just select Database Administrator, and this gives you specific access. You can have a custom role as well, give it specific read and write access, have a service account, and do all sorts of stuff. The role also allows you to do things like create a private endpoint, a private-link type of endpoint. I've already generated a token, but let me generate another one just for fun. I can download it and save it to my environment.
Let me get a new screen here and go into my database. It looks like it's active now. This is my database ID, and I have my Netflix keyspace. Let's take a look at the health of this and see what's next on our checklist.
Next I want to open the GraphQL playground, so I'll connect to my database. Here's my Grafana display for the database, but I need to go into the GraphQL API. You can connect with a lot of different things. You can use Spark with it, and you can use all these interfaces, but one of the native interfaces is GraphQL. So I'm going to start a playground to mess around in. What it asks for, in the HTTP headers, is a token. That's the token that I just created. So I go to the headers down here and populate them with my Astra token, the database administrator token, and come back.
What do I do next? I'm in the playground and I want to create something, so I'll copy this over from the instructions and create a table. In the GraphQL schema tab, my keyspace name is the one I already created, the Netflix keyspace, and the table name I'm using is reference list. This is just the syntax to create a table within that keyspace, with "if not exists" set to true, so it's a no-op if the table happens to be there. The partition key is the label, and the clustering key is the value, which determines how it's sorted and stored on disk. Let's go ahead and create that with the token I have in there. So I've created the reference list.
Let's see what we have next: insert some data. This reference list is for genres, so we have a bunch of movie genres that we want to put in here. Like I said, this is a simplified example of what Netflix does, but it's true to form in the sense that Netflix stores every piece of data that's not a CDN-optimized movie file or picture file in Cassandra, including account data, recommendations, and where you left off. Say you are on your phone watching Netflix, and then you start up on your laptop.
Once you get home off the train or whatever, it starts where you left off, and all of that data is stored in Cassandra.
So let me copy this out. I'll go to my GraphQL screen and write my query or mutation here. I also need to put the token in this tab too, so let me get that token again and populate it so it knows how to authenticate. I'll get rid of what's there and add all this data. I have some sort of problem here. Oh, that's right. One step I missed is changing the keyspace in the URL from system to the Netflix keyspace. It was trying to insert into my system keyspace, and it's like, what? There we go. So we have all of that in there, and we can add other data if we want to.
In terms of the syntax, you can see that inserting data into our cluster is just an insert where the label is "genre" and the value is the genre name. Let's go ahead and get all the genres back. Oops, wrong one. I thought I copied that. Never mind. So we can see that it's stored in the database.
As an interesting exercise, you can store data through GraphQL, and somebody who's more familiar with Cassandra, like one of the back-end Cassandra engineers at Netflix, can read it with CQL. Let me turn on expand, which lists each row's columns one per line instead of as a wide table, so it's easier to read, and then select star from the Netflix reference list. You can see it stores the data in a way that you can reach with the Cassandra interface, the GraphQL interface, or the REST interface, and it really doesn't care. It's a native interface into the database.
So let's go back. What else do we need to do here? I suppose we need some movies, so let's create a movie schema. Let me bring it over to the playground, under the GraphQL schema tab. We have a movies by genre.
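The genre inserts described above can be sketched as a single GraphQL mutation built in JavaScript. This is a minimal sketch under assumptions, not the workshop's exact text: it assumes the table is named `reference_list` with `label` and `value` columns, and that the gateway exposes a generated `insertreference_list` field for it. The helper name `buildInsertGenresMutation` is hypothetical.

```javascript
// Build one GraphQL mutation that inserts several genre rows.
// Assumptions: a table named reference_list with columns `label` and `value`,
// exposed by the GraphQL API as the generated field `insertreference_list`.
function buildInsertGenresMutation(genres) {
  // Each insert gets its own alias (g0, g1, ...) so the same field
  // can appear several times in a single request.
  const inserts = genres.map(
    (genre, i) =>
      `  g${i}: insertreference_list(value: { label: "genre", value: ${JSON.stringify(genre)} }) {
    value { value }
  }`
  );
  return `mutation insertGenres {\n${inserts.join("\n")}\n}`;
}

// Print the mutation text, ready to paste into the playground.
console.log(buildInsertGenresMutation(["Action", "Sci-Fi"]));
```

Every row shares the partition key value "genre", so all genres land in one partition and come back sorted by the clustering column.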
What movies by genre means is that my partition key is going to be the genre, so we're going to put a bunch of movies for each genre into the cluster. The clustering keys are going to be the year and the title of the movie, with the year descending, so the more recent movies are at the top, or at the front from a data perspective, which optimizes for that. Then we have the other fields for each movie: the title as a clustering column, then the synopsis, the duration, and a thumbnail, which is typed as text. We're not storing the actual thumbnail file in the database; we want that to be a CDN-replicated thing. I'll show you in just a minute, but let's create this table first. So that's done.
We'll go back here and insert a handful of movies: Inception, Prometheus, Aliens, Blade Runner, just the movies that we might like to see in our database. Let's put those in the database. It looks like they're there, but let's double-check and play with that a little bit. Getting the movies out, we can see that this is just basic GraphQL with the order-by and all that kind of thing.
One thing that's part of our task for today is paging. You look at Netflix, and everything doesn't load on the screen at once. You want to load on demand as you scroll or as you click the arrow. So let's start with the paging that's built into our GraphQL here, which we can implement in the back end as well. Just as an example, we have a page size of two. That gives us two movies in a page, and it gives us a page state at the end, which is handy because we can do something like this. If we go down a little bit further, we have options with a page size of two and then put the page state in there, and it keeps the cursor alive, basically. Or not a cursor, but you know what I mean.
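The paging pattern described above can be sketched in JavaScript as follows. This is a minimal sketch under assumptions: the table and field names (`movies_by_genre`, `year`, `title`, `duration`, `synopsis`, `thumbnail`) are taken from the spoken description and written in snake_case, and `buildMoviesQuery`, `fetchAllMovies`, and `runQuery` are hypothetical names introduced only for illustration.

```javascript
// Build the GraphQL query text for one page of movies in a genre.
// Table and field names are assumed from the walkthrough; adjust to your schema.
function buildMoviesQuery(genre, pageSize, pageState) {
  // JSON.stringify produces a quoted, escaped string literal; JSON string
  // escapes are also valid GraphQL string escapes.
  const options = pageState
    ? `{ pageSize: ${pageSize}, pageState: ${JSON.stringify(pageState)} }`
    : `{ pageSize: ${pageSize} }`;
  return `query {
  movies_by_genre(
    value: { genre: ${JSON.stringify(genre)} }
    orderBy: [year_DESC]
    options: ${options}
  ) {
    values { year title duration synopsis thumbnail }
    pageState
  }
}`;
}

// Collect every page by feeding each response's pageState into the next request.
// runQuery is any async function that takes query text and resolves to the
// movies_by_genre object: { values, pageState }.
async function fetchAllMovies(runQuery, genre, pageSize) {
  const movies = [];
  let pageState = null;
  do {
    const page = await runQuery(buildMoviesQuery(genre, pageSize, pageState));
    movies.push(...page.values);
    pageState = page.pageState; // null or missing on the last page
  } while (pageState);
  return movies;
}
```

In an infinite-scroll UI you would not loop like `fetchAllMovies` does; you would keep the last `pageState` in component state and request one more page each time the user scrolls or clicks the arrow.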
The page state is what lets us get to the next page as we scroll, for instance in a UI setting. So if we do that, these are our first two by year descending, and so we get, what is that, Aliens and Blade Runner. You can also see here that the thumbnail is an MP4 file. I think all of them, or most of them, are set up correctly to point to a movie file that has to do with that movie.
Another thing we can do is download a dataset. Hopefully you're able to follow along; if not, everything is laid out in this GitHub repo for you. I have the dataset baked into my back end here. It's not very large, but it lets you see how loading works. Let's go back in here and load some data. If the file is up to about 40 megabytes, you can upload your dataset in this UI. I cloned the repo in my local environment, so it's in the data directory, and it's just a CSV file. When I select it, it says, okay, let's upload this, and then it gives you a preview of what's in the file with all of the fields and the intended types. The only thing I really need to specify is the partition key, which in this case is the genre. I can add clustering columns if I want to, but as soon as I have the partition key, it should be fine. Then I just need to tell it what keyspace and table to put it in, which is the Netflix keyspace.
So it's importing now. What it does is trigger a job in the background, which takes about a minute or so. It says that I have my workshop DB created, and it'll do the load in the background. Let me see if I've got everything caught up here.
So it's a handy way to load some data in the background. If you go on here, there's also a sample app gallery with a bunch of different apps, and a lot of those apps use that same interface to give you an idea of how to mess around with the database.
One thing that we added yesterday or today, as a matter of fact, is the ability to add another region. If I wanted to, I could add one of the other regions in Azure and replicate over to that region. Cassandra has always been able to do that, and that's one of the reasons why Netflix uses it. But with the serverless model, we needed some specifics around how to handle that in a cloud native environment.
So let's see. Okay, the load job started. I think it's going to take less than a minute; it's a pretty small file. Let's see where we're at now. It's going through and telling me what I need to do with all of that, and then it asks for the target. I should get some emails to say that I can check it out. Let me see if I can look at it in CQL now. It looks like it's done. So we have Suicide Squad, Toy Story, Teenage Mutant Ninja Turtles too, of course, I guess. So a bunch of different movies are loaded, and that's how we've bootstrapped the database for the purposes of what we're trying to do here.
You may notice that there's not just data in this repo. Let's go ahead and deploy to Netlify. I'll connect to my GitHub account, and that's why you need a GitHub account: it gives Netlify access so that it can essentially fork that repo into your own GitHub account and then deploy it. It's deploying now. You can see where it deploys from if you look at the bottom here.
My roommate from college decided that my computer was called the Jerometron 5000. He just decided that, so I thought it would make an easy handle. If you look at the bottom of this window, Jerometron is my handle. So it forked the repo into my GitHub account, and it's now deploying that repo in Netlify. It'll be just a minute for it to download, do some building, deploy, and all of that. What it does is deploy from my own repo.
What we need to do next is go to my repo that has been configured here and open that in a new window. It's still doing that right now. Okay, I won't be impatient. I'll let that finish. Hopefully everybody's doing well in the background there. Let's see if there are any questions. Thanks, Eric, for answering questions in here. Like he says, there's Discord if you want to join; we have a continual, persistent conversation in there if you have any questions about this. I'm going through this fairly quickly, but hopefully it's not too hard to keep up. The site deploy is still in progress. It'll take just a minute, and then what we'll do is open Gitpod, which is a nice environment to do all of this.
Let me scroll up and show you what else is in this repo. There's the source, which has some basic stuff in there. There are the components: you have the scroll bars, a nav bar, a card for each movie, and that sort of thing. Then you have an index.js, and it's fairly simple when it comes down to it. You can see in here there's the hero section, and it calls getMovies in the Netlify functions. If we go to getMovies in the functions, the GraphQL is fairly straightforward. Right. So, the query.
You have this Astra GraphQL endpoint, which comes from the environment that we'll set, and then all we're really doing is what we did in the GraphQL playground. We have the movies by genre query, the genre comes out of the request, and the page state comes out of the request as well. We also ask for the page state down here, to keep in the background, along with the various values. Then there's just what we do on the various responses. The fetch authenticates with the token, and the token is the other thing that we need to populate in here.
Let's see if the deploy has completed. It looks like it's done. If we go into here, this is my fork of the same repo. I could open Gitpod from the original page, but then it would open the DataStax Devs repo in Gitpod, and that's probably not a good idea. So let's go back down to where we were, and this will open my own fork in GitHub. Where are we here? Okay. So this will be using the Jerometron repo, and what it does is set up an environment for me. It has the repo, and it goes into an online Visual Studio Code environment, in my own dedicated environment. Then I can configure some things to make it all work together.
So this is going to jump in. It has all of the information in here, and it's doing all of the npm setup. While it's doing that, I'm going to add a new file. Let me jump back into here and verify that I'm in the appropriate environment. That should be fine. The other thing I need is an environment file. There's a sample environment file, so let me create a new one. If you're familiar with npm and Netlify, you just need an environment file to hold some of the variables that you need. So I'll get rid of this placeholder, and what I need here is the token that I got earlier. Where is that?
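The getMovies function described above can be sketched roughly like this. It is a hedged sketch, not the repo's actual file: the environment variable names `ASTRA_GRAPHQL_ENDPOINT` and `ASTRA_DB_APPLICATION_TOKEN`, the page size of 6, and the `makeGetMoviesHandler` factory (used so the fetch function can be swapped out) are assumptions for illustration. The `x-cassandra-token` header is how the GraphQL gateway receives the token.

```javascript
// Sketch of a Netlify function that proxies one page of movies from GraphQL.
// fetchImpl and env are injected so the handler can run without a network.
function makeGetMoviesHandler(fetchImpl, env) {
  return async function handler(event) {
    // The React front end is assumed to POST { genre, pageState } as JSON.
    const { genre, pageState } = JSON.parse(event.body || "{}");
    if (typeof genre !== "string") {
      return { statusCode: 400, body: JSON.stringify({ error: "genre is required" }) };
    }
    // Only send pageState when the client already has one (page 2 onward).
    const options = pageState
      ? `{ pageSize: 6, pageState: ${JSON.stringify(pageState)} }`
      : `{ pageSize: 6 }`;
    const query = `query {
  movies_by_genre(value: { genre: ${JSON.stringify(genre)} }, orderBy: [year_DESC], options: ${options}) {
    values { year title duration synopsis thumbnail }
    pageState
  }
}`;
    try {
      const response = await fetchImpl(env.ASTRA_GRAPHQL_ENDPOINT, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-cassandra-token": env.ASTRA_DB_APPLICATION_TOKEN,
        },
        body: JSON.stringify({ query }),
      });
      const result = await response.json();
      return { statusCode: 200, body: JSON.stringify(result) };
    } catch (err) {
      return { statusCode: 500, body: JSON.stringify({ error: String(err) }) };
    }
  };
}

// In a real function file you would export it, for example:
// exports.handler = makeGetMoviesHandler(fetch, process.env);
```

Keeping the token in an environment variable on the server side means the browser never sees it; the React code only talks to the function.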
Okay, there's my token. I'll put that in here, and then my endpoint. If I go to Connect in here and scroll down to where it says "Write data," it gives me the endpoint with the Netflix keyspace already built in. So I'll go to Connect, then GraphQL.
What's interesting, just as an aside, is that there's also a Document API. Cassandra usually needs some schema associated with it, and it hasn't acted in the past like a graph or a document database. But the Document API allows you to just persist JSON, even nested JSON, and it creates a table and indexes for you and all that kind of stuff. We can add indexes to our GraphQL tables ourselves, but the document interface does that for you.
So let me jump back here. The instructions say to change the location to the keyspace URL in the GraphQL playground, which we already know is the Netflix keyspace. Then we can go back into our environment. I don't want that one; I want to go here. Okay. So that's my Astra endpoint, and this auto-saves, which is handy. What else do I need to do here? Let's double-check that I'm in the right place. This will ask, are you sure you want to allow this? Sure. Okay. So I am in Jerometron. No worries there.
Then I install the Netlify CLI in this environment, downloading the world, so to speak. Luckily, this environment is not locked down, so I can download from all sorts of places and don't have to worry about it too much. That's all done. Then I have this information, and the instructions say not to leave the curly braces in. Oh, yeah. Well, I've already done all that. Let me make sure I did it correctly. I have my environment file here. Perfect. Then I'll run npm install to build this all up, and then we'll launch the application. This is just a normal Netlify thing. We'll launch it in the Netlify environment, so it's just spinning up a dev server.
So netlify dev basically starts a local dev environment. You can do this on your machine as well. It spins up on localhost, and there's my Netflix clone. You can see the very simple interface that we have in here, with all the components, the nav bar and all that. We can open it in a new browser tab by clicking on this. So, Aliens. I watched this as a kid, and it scared me quite a bit, with Ripley and Newt and everybody. You have all these different genres, and it auto-plays the MP4 that gives you a little blurb about what the movie is. We can go to the movies, and this gives you an idea of how you can make it all work. These are the Imgur, CDN-type data files that we talked about earlier. If we refresh this page, it might show a different one: Inception. This one is still Aliens, but it's got the wrong... it's just a data thing anyway. So this all seems to be working.
Okay, so that's our dev environment. We can also log in to Netlify and have this go to production. Let's quit out of that, and I'll log in. The login needs to go to a new window; the simple browser doesn't really help with that. I'll authorize this, since I'm already logged in to Netlify, and then I'll link it. Let me go on here again. It doesn't like that either. Oh, now it asks, how do you want to link this? Let's just use the remote origin. It says, sounds good. Let's do this thing.
Then I'll take the env file and upload it to Netlify. If I do that in here, it imports the environment variables that I put in the environment file, which is handy. Then I'll deploy to production in Netlify, in my little free environment. I'll build it first, making the production build. So this is taking it from dev all the way up through to production.
Then I'll run the production deploy to make it live. This is where it is. And then there's the Netlify CLI command to open up the site. Let's do that. I think it'll open in here, and then we can put it into this window. So this is just the simple interface here. There's Terminator 2, which also scared me as a kid. There's that guy with a sword, and it was all trippy when I was a kid. Okay, so we have the same interface, and it's been deployed in the Netlify app environment with the basic default URL. So there it is. It's all put together.
Hopefully you've been able to do this as well and have deployed everything to Netlify. From start to finish, this isn't really too bad. We had some pre-baked goods, so to speak. Astra is pretty handy for this. There are other cloud-hosted databases out there, like DynamoDB and Cosmos DB, but I think one of the nice things here is that this is cloud agnostic, so you can run it in whatever cloud you want, and you get these APIs as a native interface into your database. They're first-class citizens. So you can do essentially what Netflix does. You can create your own streaming service to compete with Stan, and host The Castle or whatever, if you want to be your own ideas man, so to speak. I watched that movie; it's pretty ridiculously funny.
So you have the databases, and we also have a managed Pulsar in here, which allows you to do things like CDC from your database into other services like Elastic or whatever. That's a nice feature to have as well. Hopefully this has been helpful for you today. Let's see, are there any other questions that people have? I know this looks a little trippy in the interface, but let me jump out of presentation mode. Well, first let me finalize what we have in here and show you how this all comes together. This is a replay of what we've done.
We have this GraphQL layer. Just to recap: we created the database instance, and then we created a token. We used GraphQL as a native interface into the database. Then we have Netlify, which does its thing to push to its CDN through their system. And Gitpod is the one that puts everything in an online dev environment, where you can edit the code, set it up, and run it locally with netlify dev. We then linked and pushed that environment up to the Netlify site manager. So this is a serverless infrastructure up here, and this is a serverless infrastructure down here, and everything's serverless now. Obviously, there are servers running in the background, but it makes things nice from a billing perspective. You just pay for what you use, and you get the nice smiley-face emoji.
The other nice thing about the Astra database as a service is that it scales elastically with you. We tested from a few thousand requests per second to 20 million requests per second, and I think it had a P99 latency of about 20 milliseconds, which is actually really good. The serverless nature of it means that you can scale in seconds. If you have a bursting event, like opening up vaccinations to a new site, or an event like this one, you can scale very easily and very quickly so that you never have any downtime. You don't have to worry about pre-provisioning or the ops. It provisions that for you, because you're paying for the throughput, and then it scales back down. So you're not really paying for persistent infrastructure. That's the nice thing about the serverless system, and it's all secured through the keys for your organization.
So we did the deploy to prod, and there's a bonus exercise in the repo if you want to do that.
And then again, I just wanted to call your attention to our DataStax Cassandra Day, which gives you a hands-on workshop for a full day. It's free, and it also has a call for papers. If you want to build some sort of demo or do something like that, we'd love to have you involved, and I just want to invite you out to that. Thank you again for your time today. I hope this wasn't too trippy, and I hope it gives you some ideas for applying your GraphQL expertise in a new way and some new thoughts about Cassandra.
I know Cassandra traditionally hasn't been as developer friendly, but I think we're making strides with Astra, with Stargate, and with that sort of thing. You leave the management of Astra and Cassandra to us, you leave the API gateways to us, and you just focus on developing whatever applications you might dream up for yourself or for your organization. We work with a number of banks here in Sydney, telcos, and others, and also a lot of startups. Netflix is just one of several that use it for streaming gateways and other things.
I'd also like to call your attention to the fact that we have a cloud native database in a self-managed form, and we call that K8ssandra. If you're looking for a way to deploy this style of cloud native database, we're working on open sourcing the ways that we do that in Astra. But the cloud native style database that's out there now in a self-managed form is under the banner of k8ssandra.io. That's the website to get to for using Cassandra in a cloud native way. So if there's nothing else, I think that's everything for today. I hope that you've enjoyed building this, and I look forward to seeing some of you at Cassandra Day next month.