Video details

Serverless Prisma 2 with GCP Cloud Run - Émile Fugulin @ Prisma Day 2020

Serverless
07.01.2020
English

The talk was streamed on #PrismaDay | June 26th, 2020
Prisma Day is a community-focused online conference on modern application development and databases.
https://prisma.io/day
Subtitles for this video have been professionally transcribed. Prisma can provide the subtitles for your tech video as well - get in touch with us at [email protected]

Transcript

- Hi, everybody. So the topic of this talk is serverless Prisma 2 with GCP Cloud Run. During this talk, we're obviously gonna talk about Prisma, and also about Docker: why we should use Docker and how to build an efficient image. After that, we're gonna jump to Cloud Run: how it is a good serverless platform for us, how to do the secret management, and finally how to do the deployment. At the final stage, we're also gonna talk about how to integrate with an existing database.

A bit about me first: I'm a backend and DevOps freelancer based near Montreal, Canada. I started working with Prisma at the beginning of the year and I really enjoyed the technology. I have already done two projects for some clients using it, and I'm an active member of the community: I contributed a bit to the JS client and also to the Rust engine, and I'm helping a lot of people. So if you need any help or have any questions, feel free to reach out through any of this contact information.

So let's jump into the topic. Prisma 2 is really great for development. The engine is downloaded in the background for you (probably a lot of you didn't know there is an engine at all). The CLI makes client generation easy. TypeScript has first-class support. The imports all come from @prisma/client, making them easy and convenient throughout your project. And most importantly, there's the magical auto-complete that we love, which at the end of the day is what makes us productive.

But I also found that Prisma is somewhat hard to deploy. By default Node only understands JavaScript, so if you're using TypeScript at all, that's already a problem. The CLI is usually not included in the production dependencies, only in the dev dependencies. We need an engine that is ready to run in production, not something that is downloaded afterwards. And we wanna make our build artifacts small, so they're easy to deploy to production. So if you have something like a mono-repo with an API package and a common package that is not published to any central repository, you might have a hard time deploying Prisma.

My solution for you today is to use containers, and more specifically Docker and the Dockerfile specification. So why containers? Most of you are probably already familiar with the idea: they are ready to run, they contain all the dependencies plus your application code (usually you only wanna ship the generated JavaScript code), and they run on a variety of platforms. Like I said, today we're gonna use Cloud Run, but you could use a Kubernetes cluster, Docker Swarm, Mesos or any proprietary thing that runs Docker images. And just a small distinction between an image and a container: an image is like a blueprint, and the container is the runtime. If you build a house, you have the blueprint and the house; the container is the house.

So let's build one. This is the most simple Dockerfile you usually find for a Node project.
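The slide itself isn't in the transcript, but a minimal Dockerfile along the lines of the one being critiqued here might look like this (a sketch, not the speaker's exact slide):

```dockerfile
# Untagged base image: always pulls whatever "latest" Node is
FROM node

WORKDIR /app

# Copies everything at once, so any file change invalidates the cache here...
COPY . ./

# ...which re-installs all dependencies (dev ones included) on every build
RUN yarn install

# Compiles the TypeScript, but the sources stay in the final image
RUN yarn build

CMD ["node", "dist/index.js"]
```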
It is simple, but is it really good? No. There are several problems with a simple Dockerfile like that. The first one is that you're always using the latest Node version, because you didn't pin the version you're gonna use, and that's usually not something we wanna do in production. Next up, we copy everything from the local disk into the Docker image, and the way Docker caching works is per line: if something changes in one line, all the next lines are invalidated and need to be rebuilt. We want the caching to be as efficient as possible, so our build time is as low as possible. Here, every time you change one file in your file system, it invalidates this line and copies everything back. The next line is a yarn install, meaning you're gonna install your dependencies every time you build your container. Another problem with yarn install is that it installs both the production and the dev dependencies, so you get a bigger image, and you don't need the dev dependencies in production, right? And finally, you have a yarn build. That means you take your TypeScript code and generate some JavaScript code, but you still keep the TypeScript code inside your image, which is not needed in production and, frankly, is not a good practice. Most of the best practices I'm gonna use are based on Docker's own blog posts about Dockerfile best practices; you can read more about them at the link below.

So here's my improved build that I designed for my projects (a reconstruction of it follows below). I originally posted it on my Medium blog, and it's a lot beefier, a lot more complex. We're gonna go through it to see why it's a better approach to building a Docker image for Prisma. The first thing you'll notice is that it's a multistage build. That means you can have all the dependencies you want during the build, because effectively we're gonna build three images: the base image, the builder image and the runtime image. Only your production dependencies will be inside the runtime image, and you copy only the things you need at runtime into the final image, so you won't have the TypeScript code inside it. The size reduction is usually pretty significant: without any other optimization, you can go from 1.2 gigs to usually 200 or 300 megabytes for the production image.

One thing to note is that I usually prefer slim images to alpine images, even if people usually use alpine. This is my personal preference, but the trade-off to go from slim to alpine is that you lose glibc and you gain musl, and you can sometimes have unexpected behaviors at runtime with musl, so I prefer the glibc implementation. You also lose apt, which is kind of a big deal, since apk, the package manager in alpine, is fine but not perfect.

Let's just go back a bit to show the multistage build: the first stage is the base that we just saw, the second stage is the builder around here, and the final stage is the runner; only this part will be shipped to production. Going forward, we're at the builder. Here in the builder, you can see that we copy the package.json and tsconfig.json with this *.json glob. This is interesting because it allows us to cache the dependency installs before we copy any of the other files. Since the caching is done per line, and invalidating one line discards all the following lines, if we don't modify any of our JSON files, everything here stays cached between builds, and that improves the build time quite a lot.
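For reference while the walkthrough continues, here is a hedged reconstruction of the kind of multistage Dockerfile being described. The workspace layout, stage names, paths and build scripts are assumptions on my part, not the speaker's exact file:

```dockerfile
# --- Base stage: a pinned slim image shared by the other stages ---
FROM node:14-slim AS base
WORKDIR /app

# --- Builder stage: has the dev dependencies and does the actual build ---
FROM base AS builder
# Copy only the manifests first so the dependency layers stay cached
COPY package.json yarn.lock ./
COPY packages/api/*.json ./packages/api/
COPY packages/common/*.json ./packages/common/

# Install the production dependencies first and stash them away;
# cp -RL follows the symlink that the workspace creates for the
# unpublished common package
RUN yarn install --production \
  && cp -RL packages/api/node_modules /tmp/node_modules

# Now install everything, dev dependencies included
RUN yarn install

# Copy the sources and build in dependency order
COPY packages/common ./packages/common
COPY packages/api ./packages/api
RUN yarn workspace common build \
  && yarn workspace api prisma generate \
  && yarn workspace api build

# --- Runner stage: only what's needed at runtime ---
FROM base AS runner
# The production dependencies stashed earlier
COPY --from=builder /tmp/node_modules ./node_modules
# The generated Prisma client lives in two folders of the dev node_modules
COPY --from=builder /app/packages/api/node_modules/.prisma ./node_modules/.prisma
COPY --from=builder /app/packages/api/node_modules/@prisma ./node_modules/@prisma
# The built common package, the compiled API code, and package.json for scripts
COPY --from=builder /app/packages/common/dist ./node_modules/common/dist
COPY --from=builder /app/packages/api/dist ./dist
COPY --from=builder /app/packages/api/package.json ./

# Drop the privileges: don't run as root
USER node
CMD ["node", "dist/index.js"]
```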
One thing to note here is that we're installing the production dependencies first. This is a trick because our common package is not published to any central repository: if you try to install just your production dependencies inside the final image, it will say that it cannot find the common package, which is normal. So the trick is to install them before your dev dependencies, so they are not polluted by your dev dependencies, and copy them over to somewhere else. If your common package is published to a central repository, or if you don't have one, you can skip this step and put the install of the production dependencies inside the final image, but then you lose the caching, because like I said, all the next steps get invalidated on every build. So I prefer to put them here.

Notice that we follow the symlink here. This is needed because the common package will usually be symlinked into the API package's node_modules. We also use nohoist, so that the API package's dependencies live inside its own node_modules and are not merged together with all the other node modules. That makes it just a bit easier to grab the runtime dependencies of the API alone and not the other node modules you might have. And finally, we're installing the dev dependencies around here so we can proceed with the build.

The next step is to copy your source code from the API and common packages. Like I said, that means you won't have to re-fetch your dependencies on a rebuild if you cache your builder image, and this is really great for speeding up your builds. Next, we build the common package first, generate the Prisma client here, and then build our API package, in the right order. Obviously, if you have other packages, they need to be built in the right order too.

Then we go from the builder stage to the runtime stage and copy everything we need for the runtime. First the dependencies, meaning our production dependencies that were copied into /tmp; then some of the generated stuff from Prisma, namely the @prisma/client and .prisma/client folders inside our dev node_modules. Then we copy the common package's dist into our node_modules. The last step is to copy the compiled JavaScript code from our API package so we can actually run it. You also usually wanna copy back your package.json, since that allows you to run your scripts.

And the really last step of the build is to drop the privileges. By default Node runs as root, and it's usually not a good idea to run your stuff as root: if there's some kind of vulnerability, the attacker will basically have root if they break out of your application.

So now that we have an image, we need a runtime and we need a database. My solution for you is to use Cloud SQL and Cloud Run, which are both GCP products. So why Cloud Run? It's a serverless platform, meaning you don't have a server to maintain; there's still a server despite the name, but you don't have to maintain it. It's also pay-per-request: you start paying when your first request comes in, and you stop paying after your last request. So if you have low traffic, if you're a startup, or if you just have a small project, it's a really great way to save money. And it's almost infinitely scalable; we're gonna see what the limiting factor actually is. It's based on Knative, which is an open-source technology on top of Kubernetes.
Cloud Run is basically GCP's way of implementing this technology, so it's easy to switch to GKE, their managed Kubernetes, if need be. They do have an offering called Anthos if you have an existing cluster or if you really want to run on your own cluster, and that works pretty well. Basically, what we're gonna get from Cloud Run is an HTTP endpoint that you can use to hit your API, and we're gonna see that in a moment in the demonstration.

But first, let's talk a bit about the database, since it's an important part of our deployment. I decided to use Cloud SQL. It's not mandatory, but it has a great integration with Cloud Run; anything else could work too. One thing to know is that your SQL database is usually gonna be your limiting factor. Because of the way databases were designed (it's usually old software, even if it's good software), they're usually limited in the number of connections, since every connection uses a bit of RAM. It's not a serverless platform where you can connect as many clients to the database as you want.

So, how do you measure the connections? It's not an exact science, but Prisma uses a formula to determine how many connections it's gonna use per instance of the Prisma client (there's a small worked sketch of this arithmetic at the end of this section). And usually you have 100 connections for a basic, non-optimized database on GCP. So if you have only one CPU, that means you can usually have 33 instances, but I would still give it a bit of a margin of error, because it's not a perfect science and we don't wanna bust the connection limit of our database. One thing to make sure of in GCP, if you use Cloud SQL, is to enable the Cloud SQL Admin API.

So I have a small demonstration for you that just shows how to set up your database if you've never done it inside GCP. Let's go full screen. Here we choose the Postgres database, give it a name, and give it a master password; it doesn't matter, since we're not gonna use the master user. Choose whatever region you want. For this demonstration, it's important to use a public IP: even if it's not reachable from the internet, since it has a firewall, we need a public IP. Then it's always a good idea to choose a backup window during the night, and to choose high availability if you're in production. That creates two machines, basically, and you can swap between the two, so you have less downtime if your database dies. So here, we create the database.

In the meantime, we enable the Cloud SQL Admin API. That's something people always forget; I forgot it the first time, for some reason. That's why I'm showing it to you, so you remember that you need to enable this API. Otherwise, it's just not gonna work. Then I'm creating a new user. What I usually prefer is to create a service user that I can de-scope later, so it doesn't have, say, DROP permissions; that way you can protect your database a bit and limit the permissions. And finally, you probably wanna create a new logical database too, since using the default one is usually not recommended. So, just a simple video to show you how to go about creating a database.
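On the connection formula mentioned above: Prisma 2's documented default pool size per client instance is num_physical_cpus * 2 + 1, which is where the "33 instances" figure comes from. A small worked sketch (the 100-connection limit is the typical default for a small, untuned Postgres instance):

```typescript
// Prisma's default connection pool size: num_physical_cpus * 2 + 1
const cpus = 1;                         // one vCPU per Cloud Run container
const poolSize = cpus * 2 + 1;          // => 3 connections per instance
const dbConnectionLimit = 100;          // typical default for a small Postgres

// Maximum Cloud Run instances before exhausting the database connections
const maxInstances = Math.floor(dbConnectionLimit / poolSize); // => 33
console.log({ poolSize, maxInstances });
```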
Also, I wanted to touch a bit on migrations. I'm not gonna talk a lot about them; inside this demonstration, it will already be done for you. The way to go about it is to connect to your database using the Cloud SQL proxy, which basically opens a tunnel between your computer and GCP so you can talk to your database. Create another user for migrations; it's always a good idea to have a user with all the privileges that is separate from your service user. If you're using schema-first, you can use the experimental migrations; I found they work quite fine for basic needs. And if you're using introspection, then use whatever tool you're already using. I prefer to run migrations during the CD, the continuous deployment phase, so whether you're using GitHub Actions, Cloud Build or anything else, it will work: as long as you have a public IP, the Cloud SQL proxy will be able to connect to your instance. So, yes, I'm not gonna go much more over migrations, but if it's a topic you wanna hear more about, let me know and we can discuss it later.

Next up, I wanna talk a bit about secrets. Why do we need secrets management? You should not put secrets inside your image, just like you don't put secrets inside your code; it's not secure. We want the secrets to live in memory only, at runtime. One great thing is that there's a service called Secret Manager inside GCP where you can keep your secrets secure, have an easy time rotating them, and see who has access, since access is based on service accounts. And if you need to keep track of who rotated those secrets and when, for compliance purposes, it's a really great service. What you're gonna put inside this secret for the database URL is the format shown here. It's a pretty typical format for Postgres; the only difference is that you use localhost and pass the Cloud SQL socket path in the host parameter.

Here we're gonna see it in action. First you copy the connection name from your database. Then you go over to Secret Manager, create a new secret, give it a name and just write the value in the format I showed you. Here you can see the first version, and we can create new versions.

I'm gonna go full screen. The only problem with Secret Manager is that there's no real integration with Cloud Run, so my solution was to use a custom script that downloads the secrets at startup, writes them into a .env file (this might require some Dockerfile changes, since we dropped the privileges to the node user, which doesn't have any write permissions) and injects them as environment variables. So that's your startup script, and those are the changes you need to make to the Dockerfile: just give it some folder it can write to. And obviously you need to use the env() function in your schema.prisma.
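The exact slides aren't in the transcript, so both the connection-string shape and the startup script below are hedged sketches of what was described, using the official Secret Manager client library; the project, secret name and paths are illustrative:

```typescript
// Sketch of a startup secret fetch, assuming @google-cloud/secret-manager is
// installed and the runtime service account may read the secret. The stored
// value follows the Cloud SQL socket format described above, e.g.:
//   postgresql://USER:PASSWORD@localhost/DBNAME?host=/cloudsql/PROJECT:REGION:INSTANCE
import { SecretManagerServiceClient } from "@google-cloud/secret-manager";
import { promises as fs } from "fs";

async function loadSecrets(): Promise<void> {
  const client = new SecretManagerServiceClient();
  // "latest" resolves to the newest enabled version of the secret
  const [version] = await client.accessSecretVersion({
    name: "projects/MY_PROJECT/secrets/DATABASE_URL/versions/latest",
  });
  const databaseUrl = version.payload?.data?.toString() ?? "";
  // Write the .env somewhere the unprivileged node user can write, so that
  // env("DATABASE_URL") in schema.prisma picks it up
  await fs.writeFile("/app/.env", `DATABASE_URL=${databaseUrl}\n`);
}

// Fetch the secrets, then start the actual server (hypothetical entry module)
loadSecrets().then(() => import("./server"));
```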
So finally, let's deploy the service. Here we have a pretty standard mono-repo, like I said, with an API package. I don't have a common package here, but still, it's the same thing. It's also using Nexus, so it's a GraphQL API, pretty standard, with the Dockerfile I showed you throughout the presentation, so it's not really different. The first thing you'll need to do is create a new service account for your service. After that, you have to give it some permissions: the permission to read the secret we just created, and the permission to connect to the database. And then we're ready to build our Docker image. It's using all the caching, so you see it doesn't take long to build. Finally, we push it to GCR, which is the Container Registry of GCP.

Next, we go into GCP Cloud Run and create a new service. Usually you wanna be in the same region as your database, for latency purposes. You can also use Cloud Run for Anthos here. Give it a name, and "allow unauthenticated" is usually what you want, because if it's a public API, you need people to be able to query it. Select the Docker image we just built and go into the advanced settings. There you have the container port, the container command if you need to change it, and the service account, where you select the one we just created. Usually you want the maximum concurrency, which right now is 80 concurrent requests per container. The request timeout, the CPU and the memory can all be customized, and you want to change the autoscaling. Currently it's 1,000, and like I said, your database is gonna explode if you leave it at 1,000. Five is usually fine for starting; you can increase it up to around 33, like I said, but still give it a bit of margin. Then you want to put in some environment variables, like NODE_ENV for production, but don't put anything sensitive here, since it's not encrypted. And lastly, connect the database: we're gonna use the Cloud SQL connection with our database here, which uses the public IP. If you wanna use a private IP, you can do it with a VPC connector, but that's out of scope for this presentation.

So finally, we create the service. We can see that it started, we have the randomized URL at the top, then we hop into Postman and do a small GraphQL query to see that the service works. And we have the response. Cool, let's go back.

A bit about the downsides of this service. There is currently no integration with the GCP load balancer, but it's coming soon; I tried it and it's pretty great, but I can't talk too much about it because it's still under NDA. Also, there is no minimum number of instances. That means that if your service doesn't receive any requests, it will stop all the instances automatically, so you will have a cold start on the first fresh container that runs. Like you saw in the video, it's usually six seconds; that's my experience, too. So if you can do some lazy loading, do it. It's usually a frontend concept, where you load pieces of code instead of loading everything up front, but it could become a backend concept with serverless, because cold starts are a problem with any serverless platform. You can also kind of cheat the platform by using a GCP health check that pings your service: you're only gonna pay for this health check every, I don't know, 30 seconds or so, and it's gonna keep at least one container running in the background for you.

One other problem is that there are no background jobs, so you usually need a worker with a queue for that. And the Prisma engine used to die, but that's really not a problem anymore; we just need to keep it in mind, because it makes us more sensitive to regressions in the future.

Going further, we can do custom DNS with domain mapping, TLS, monitoring and infrastructure as code. That's one of the topics I really like. Usually I use Terraform, but I also discovered Pulumi recently. Basically, you write code to build your infrastructure, and Pulumi lets you write TypeScript code to do it. So that's the same thing we just built in the console, but with code.
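On that infrastructure-as-code note, here is a minimal Pulumi sketch of roughly the Cloud Run setup configured in the console above. The project, region, image path and names are illustrative assumptions, not from the talk:

```typescript
import * as gcp from "@pulumi/gcp";

// A Cloud Run service roughly matching the console walkthrough
const api = new gcp.cloudrun.Service("api", {
  location: "us-east1", // same region as the Cloud SQL instance
  template: {
    metadata: {
      annotations: {
        // Cap autoscaling well below the database connection limit
        "autoscaling.knative.dev/maxScale": "5",
        // Attach the Cloud SQL instance (public-IP connection)
        "run.googleapis.com/cloudsql-instances": "my-project:us-east1:my-db",
      },
    },
    spec: {
      containerConcurrency: 80, // the current per-container maximum
      containers: [{ image: "gcr.io/my-project/api:latest" }],
    },
  },
});

// "Allow unauthenticated", since it's a public API
new gcp.cloudrun.IamMember("invoker", {
  service: api.name,
  location: api.location,
  role: "roles/run.invoker",
  member: "allUsers",
});

// The randomized URL you'd otherwise read off the console
export const url = api.statuses.apply((s) => s[0]?.url);
```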
In conclusion: Prisma is really production ready. Containers are a nice way to package your project, and building a good image is not easy. Cloud Run is a good serverless compute platform. Cloud SQL is not perfect, but it does the job. And do not be complacent with your secrets management. Thank you very much, and if you have any questions, I'm here for you.

- Emile, thank you so much for your talk.

- Hi, yeah, my pleasure.

- I really enjoyed learning about some of those Docker optimizations and what is possible these days with Cloud Run. I think it's super cool to see how Docker and containers in general have really solved the packaging problem; I mean, I think most of the serverless platforms these days are essentially running on containers. So are you ready to start with the Q&A?

- Yes, yes. Sure.

- Okay, all right. I think I have a question too. One of the interesting things I'm curious to hear about from your experience is: what are some of the trade-offs when comparing something like Google Cloud Run with serverless platforms like AWS Lambda, Netlify, and Vercel?

- Yeah, great question. One thing I found is that most of the other serverless platforms are still using their own custom infrastructure, in the sense that you need to use some kind of handler that is specific to that platform, or you're using something like the Serverless Framework. With the container approach that Google Cloud Run has, you just take your already existing workflow and frameworks, what you're used to working with, and you can translate that directly to a production-ready environment. You just need to listen on the right port for it to work.

- Right.

- And that's it, so, yeah.

- I also imagine it's much easier for local development, 'cause essentially, in your local environment, you're running exactly the same runtime as you would in production.

- Yeah, exactly. Basically you're running an Express server like you would in any production environment, or whatever server you're using for Node or another language.

- So someone asked whether you can share the link, and you shared the link to the article, which was sort of a basis for this talk. So if you're interested, you can look in the Slack channel and find the article, which explains all of this. The talks will also obviously be available later. And a question came in from Pascal, who's also really active with the questions, so thank you, Pascal. Is Google Cloud Platform cheaper than AWS Lambda in your experience? I imagine he's talking about Google Cloud Run.

- Yeah, yeah. Well, I haven't really checked the numbers, but it is a bit more expensive than Lambda. I think the trade-off is acceptable, though. I still think that if you're switching from 24/7 running machines to something like serverless, you're gonna see some cost reduction anyway, because you're only paying when you have requests. Unless you're running at really big scale, where it makes more sense to switch to instances rather than pay-per-request, you're still gonna see a lot of cost reduction.

- Right, and under what kind of conditions would you recommend moving from something like Google Cloud Run to Kubernetes?

- Well, yeah, there are some downsides, like I talked about. If the latency for starting a new app, like the four to six seconds, is not acceptable, I think it's probably not the right product for you.
And also, whether you have the engineering team or the scale to set up Kubernetes and maintain it and update everything: even if Google does a lot of stuff for you, updating the nodes and everything is still a lot of overhead, overhead that you don't have to worry about on Cloud Run. If you're a small startup and you're just starting out, you just want something up and running quickly. Or you have a new project and you wanna do a proof of concept or something; you don't wanna spend the time to set up a whole Kubernetes cluster. Those, I think, are the most important things.

- And you talked a bit about migrations of the production database in your talk. Can you expand a bit on that subject? I'm really curious to hear how you make sure the database is ready when a new version is deployed.

- Yeah. From my experience, it's really done on your CI system. I think I've done one inside GitHub Actions and one inside Cloud Build...

- Where you ran the migration inside the CI pipeline?

- Yeah, yeah, exactly. You build your Docker container, you push it, and then you run your migration. And what's great with it is that you can basically reuse the container you just built to run the migrations, because it has the Prisma code and the Prisma migrations inside the builder, for example. So that's a great way to do it, and it works well with the Cloud SQL proxy to connect to your database.

- And lastly, let's see if we have any... okay. You also went into detail on some of the optimizations with regard to Docker builds and caching. Can you talk a little bit more about some of the tricks there that help with caching?

- Yeah, yeah, sure. I think the best trick is really to have your package.json copied first and your dependencies installed after that; not copying your code at the beginning is really the biggest trick. And after that, what I did is basically copy your production dependencies before your dev dependencies, so they're really separated, and you end up with only your production dependencies and not a mix of dev and prod dependencies, which you don't want.

- That's what you did with the multistage build there. Is that right?

- Yeah, yeah, exactly.

- Okay.

- So you copy your production dependencies somewhere else in the meantime, so you're not trying to fight your package manager to remove some dependencies that you don't need; you just really do it in order.

- So we had another question from Pascal: would subscriptions be possible OOTB with such a setup? Not so sure what he means by OOTB.

- Currently no, not with Cloud Run. It's something they were working on, but the problem with subscriptions is that they're based on WebSockets, and WebSockets require a constant connection, so it would be like paying for one request that never ends. It's not supported right now. So you can look at something like Pusher or another third-party system if you want to, but it's not gonna be native.

- Okay, so basically, with Pusher you would have something like a WebSocket connection from the clients to Pusher, and then the Docker containers would just push the events to Pusher, essentially. I see, I see, okay. Okay, Emile, we're coming towards the end of the Q&A. Is there anything else you would like to mention before we finish off?

- I'm glad that people enjoyed my talk. And don't hesitate to contact me if you have any questions or you need help with anything.

- Thank you for your talk.
And also for all of your great contributions to the Prisma ecosystem. I think you shared with me recently the awesome Nexus repository that you started, so I'll probably share a link in the Prisma Day channel if you're interested in that. And yeah, thank you, Emile.

- Yeah, thank you very much.