Video details

Can We Double the Node.js HTTP Client Throughput?


The Node.js HTTP client is a fundamental part of any application, yet many think it cannot be improved. I took this as a challenge and I'm now ready to present a new HTTP client for Node.js, undici, that doubles the throughput of your application. The story behind this improvement begins with the birth of TCP/IP and is rooted in one of the fundamental limitations of networking: head-of-line blocking (HOL blocking). HOL blocking is one of those topics that developers blissfully ignore, and yet it deeply impacts the runtime experience of the distributed applications they build every day. Undici is an HTTP/1.1 client that avoids HOL blocking by using keep-alive and pipelining, resulting in a doubling of your application throughput.
PUBLICATION PERMISSIONS: Original video was published with the Creative Commons Attribution license (reuse allowed). Link:


Hi everyone. I am Matteo Collina, and today we are going to talk about the Node.js HTTP client. What can be so interesting about it? Well, what if I told you that you could double the throughput of your HTTP client, or even better, maybe triple it? We can do that, and we'll see how. A couple of things about me before we start. I am Matteo Collina; you can follow me on Twitter at @matteocollina. Please follow me. I'm also Technical Director at NearForm, a professional services company based in Ireland, so if you are interested in working for us, check us out: we hire globally and remotely. I'm also a member of the Node.js Technical Steering Committee, so I'm part of Node, and I'm the co-creator of a few bits and bobs on npm. I also write a weekly newsletter called Nodeland, so check it out. Anyway, I maintain a lot of open source, including Node. As part of my activity, both as a consultant and as a maintainer, I try to strike and maintain a balance, and there is a feedback loop in my learning, from when I help companies to when I maintain things in the ecosystem. As part of this activity, I tend to write and build new things when the opportunity arises. Most applications start as a monolith, and there's nothing bad about that: most apps start as a monolith, and they're great. Fastify, the framework that I have built, is great for building monoliths. It has almost 9,000 downloads per month and 15 collaborators; we are working on version 4 now, and we have a lot of ecosystem plugins, so it's pretty great and it works really well. It's also part of the OpenJS Foundation. I have another talk about it, so I'm not going to spend much time on Fastify. However, once we have built our monolith, one of the hardest questions is: how do we start scaling out? How do we start scaling our teams?
How do we start improving our system? Well, we'll see that in a second. What we want to do in those cases is move to what's called a microservice architecture, where we have some sort of gateway that talks to several microservices, as you can see in the slide. However, it's not just that, because in the most complex enterprises and projects, reality gets very complicated very quickly: you have several layers of microservices that talk to each other. Now, the throughput and latency that each one of those links introduces is critical for the system, and it is one of the bits that tend to be overlooked when opting for a microservice architecture. And by the way, Fastify is great for building microservices, and lambdas too, by the way, so please use Fastify when building microservices. As part of this talk, we are going to look at these links between the microservices, and at how we can use HTTP to provide a link between the microservices that is easy, debuggable, but also very, very performant. So let's dig deep into this topic, because that link can add up pretty quickly: if you add more levels of latency, if that link becomes slow, then so does your whole system. Let's start with a very simple server. This is an HTTP service in Node that just responds after a given timeout. Easy. And the HTTP client for this is, again, very simple: you call http.get — but you could also use axios, node-fetch, request, got, whatever you have — and then you can just pipe the answer back somewhere. Note that this system by default allocates a new TCP socket for every HTTP request. So, one new socket for every request, and then we dispose of it. This is not efficient.
Creating a socket, we'll see, is not efficient, because it involves several round trips to the server. Essentially, when you establish a new socket, you need to send a SYN TCP packet, which then needs to be acknowledged by the server. This is a back and forth, and you're actually losing milliseconds here. If you're on a very fast network, like inside a data center, this is not a big problem, but you still have to allocate file descriptors in your operating system and so on, and there's a limited amount of those that you can use. Note that you get a little bit of latency as well. Part of the problem is also due to the so-called TCP congestion window, as shown in this diagram. By the way, these diagrams are from a book called High Performance Browser Networking, which I would highly recommend you go and read. With this congestion window, what you can see is that if I'm sending a big file on a freshly created socket, I need to do a lot of round trips, because every once in a while I need to wait for an ACK from the server. This is very important for TCP because it ensures that all the packets arrive in order. However, the congestion window grows over time, and at some point, using an algorithm and so on, it stabilizes. I don't want to go into all of that; it's very important, but outside the scope of this talk. What is important is that if my socket has been open for a while, the congestion window is higher, and because it's higher, we can send a lot of data before waiting for an ACK. In the example on the right, you can see that if we have a bigger congestion window, we can send all our data without waiting for an ACK from the receiver. In this way, we are reducing the latency quite a lot. So, in order to ensure the maximum bandwidth and the minimum latency, we must reuse existing connections. Right?
Is that a fair assessment? What we were doing before, creating a new connection every time, is not efficient. In Node.js core we offer a construct to do this, which is creating an HTTP agent. HTTP agents keep the connection alive, so they avoid the handshake and maximize the congestion window. They use HTTP/1.1 keep-alive, which is a key feature of HTTP/1.1. This is actually critical for TLS connections, because on top of the TCP handshake you will also have the TLS handshake. So if you don't have an agent and you're calling HTTPS services, you are setting yourself up for trouble. Note that this is not the default, so you need to configure it manually for your HTTP client, or set it as the default in Node. It's very important that you do, because the difference can be staggering. Now let's turn the idea into reality. The scenario that I'm going to test involves doing 500 requests to one server, which in turn does another five requests to other microservices. So essentially, it's a lot of requests, okay? The server takes ten milliseconds to process each request, and the client has a limit of 50 sockets. All of this is fictional, so don't worry too much about it; do your own measurements. But the difference is staggering: if you don't do keep-alive, you will be in very, very huge trouble. Use an HTTP agent with keep-alive, full stop. The difference can be enormous. So, the secondary question, after we've seen how we can improve our client: we can use an agent, right, but can we still improve things further? Well, we can. How? We need to go back into the spec and look at what's there. Is there something in HTTP that can allow us to work at a higher speed? Well, there is. One of the important bits is this thing called HTTP/1.1
pipelining. HTTP/1.1 pipelining allows us to send more than one request at the same time over the same connection. It's great, minus one thing: all the responses, with HTTP/1.1, need to be received in order. So it suffers from the so-called head-of-line blocking: if the first request takes a lot of time, everything else will have to wait. However, it's a good technique, and it's important to know that it's possible, because we can actually use it for our server-to-server traffic. We do need to talk a little bit about head-of-line blocking, though, because if you're doing this and you start losing packets, or you have a slow request, you can actually block all the other in-flight requests for a while. So be careful about how much you are pipelining. The other important thing we need to talk about, before making a recommendation and discussing how and why we can actually improve the speed, is the event loop. In the Node.js event loop we have events; that's why it's called a loop. Events are not JavaScript, though. You have I/O events; events can be produced by the kernel or by a thread pool, and those events get put into a queue. Once an event is there, Node.js can fetch that event and process it with JavaScript. That's what it does; that's all it does. With JavaScript, you can schedule more events to happen in the future, and those will be queued, or you can just say: I've finished processing this event, please send me the next one. How does this relate to pipelining? We'll get there in a second. But one of the important bits is understanding how we can make things fast in Node. In order to make things fast, you need to understand when the event loop is blocked. If you're doing massive I/O, like in this case, you want to maximize the time that Node.js spends doing I/O. This means minimizing the time the event loop is blocked. Right?
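The point about the blocked event loop can be seen in a tiny experiment: while synchronous JavaScript runs, the loop cannot deliver any pending I/O or timer events (the 10 ms and 100 ms values here are arbitrary, just chosen to make the effect visible):

```javascript
function blockedTimer () {
  return new Promise((resolve) => {
    const start = Date.now()
    // Ask for a callback in 10 ms...
    setTimeout(() => resolve(Date.now() - start), 10)
    // ...then block the event loop with ~100 ms of synchronous work.
    while (Date.now() - start < 100) {}
  })
}

blockedTimer().then((elapsed) => {
  // The 10 ms timer can only fire once the loop is free again,
  // so elapsed will be at least ~100 ms.
  console.log(`10 ms timer fired after ${elapsed} ms`)
})
```
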
Well, the event loop is blocked when we are executing JavaScript. So, in order to make things fast, we need to reduce the amount of time we spend in JavaScript, down to zero if possible. And this is the key technique we can use to improve things, because with HTTP pipelining, combined with the event loop logic, we can make sure that when we are processing things, we are processing a lot of events at once, from the server or from the client and so on. It's also important to note that with HTTP pipelining, you are risking a lot of ECONNRESET errors, where the other side truncates your connection before you send any data. Recently we have changed the keep-alive agent to a LIFO logic, last in, first out. This means it tends to use the freshest sockets, so it reduces the risk of them timing out. The risk is still there, though, so you need to configure your keep-alive timeout well. Considering all of the knowledge that we have gone through so far, I am going to show you one of the best things that I have written in a while, which is this new library called undici. Well, it's an Italian word that means eleven. Why eleven? Well, if you consider the number eleven as one one, then we have HTTP/1.1. And if you're wondering whether the name is also a Stranger Things reference, it totally is, because when I started working on this library the show had just come out. In fact, I've been developing this library for quite a long time. So how does it work? Undici is a new HTTP client library for Node, and it does keep-alive by default, so you don't need to configure a keep-alive agent. It also has a LIFO scheduler.
By default it does not do any pipelining, but it can be configured to do so, and it can cap the number of connections. By the way, you can also follow redirects. Note that this is a fresh syntax: it uses promises. You can also use callbacks if you want, but maybe not; basically, you can just use the style that you like the most. It's very simple to use. Note that, similar to Node core, undici has a concept of an agent, and you can configure a global agent if you want to. You can configure, for example, the pipelining factor — it is fully capable of doing HTTP pipelining; don't pipeline too much, because you risk a lot — and you can also configure the number of connections to each destination. So it's actually pretty great that you can configure all those things. Note that if you're using it for testing, you might want to disable or deeply reduce the keep-alive, so you can change that setting and essentially configure the global dispatcher accordingly. You can also use the lower-level APIs: you can create a Pool for a target destination, with a target pipelining factor and number of connections, and then you can just call request as before. You can create a Client, which maps to one single socket, and again do all those things one at a time. We also have these interesting methods called stream, request, dispatch; we have pipeline; we have a lot of options in terms of integration. Oh, by the way, I was almost forgetting: we also have support for mocks. This is one of the greatest things that I really wanted to get into v4. This is not in v3; v4 is coming out these days. When you watch this it might already be out, but at this time it's still in the release candidate phase, so for now you can install the release candidate. With the mock support, you can actually configure a global agent,
a global dispatcher, that will mock the responses. And you can also enable a pass-through mode, so you can mock only certain things. This is really important because, in order to support mocks, libraries like nock that mock HTTP rely a lot on monkey-patching Node's internals. Here, however, there is no monkey-patching happening at all: you can just create a mock agent and it will just work. So it's pretty great, because you can also use it for testing your libraries. I really love it. How does this compare? Is this fast enough? Well, let me show you. If we're considering just a very simple system, the same simple system as before with keep-alive: if we don't enable pipelining, there's not much difference here, but if we enable pipelining, we can drastically increase the number of requests per second compared to what Node core can send. Why? Because we are minimizing the number of round trips to the kernel and the operating system, and we are using our socket in the most efficient way. For me, this is pretty great. I have also done some benchmarks on HTTP/2 using this library called fastify-http-proxy. It's a simple HTTP proxy system built on top of Node.js and Fastify that can do HTTP/1.1 to HTTP/1.1, but also HTTP/2 to HTTP/2, HTTP/2 to HTTP/1.1, and vice versa. It also uses undici by default for HTTP/1.1 to HTTP/1.1, which is great. So you see, it's actually fast, and I'm pretty happy about this. Now, I know that this can potentially be improved quite a lot, because this is using one single connection and we might be hitting some sort of HTTP/2 limit. So yeah, this can be improved even more, but I'm pretty happy about it. I am almost wrapping up. If you take nothing else away from this talk, I want you to get the most important recommendation:
Always set an HTTP agent, and check out undici v4. If you have a system doing a lot of microservice-to-microservice calls, undici can drastically reduce the overhead of your system. So hey, pretty great. We have a new docs website at undici.nodejs.org — yes, undici is part of the Node.js project now, which is pretty great from my point of view. We need help: we need people to use undici and file bugs so that we can fix them. Please do that. You can also send PRs; it's one of the most active projects in the nodejs organization. So hey, pretty neat, right? I also want to show you a little bit of a demo of undici. So here we go. Let's say we have a server. This is a server that does a few things, and you can see it's pretty new syntax: using the new events utilities, we can actually iterate over the incoming requests. We wait for the server to be listening, and then we start processing each request. Pretty nice, right? I like the syntax. Then I can run the server with node, and once I've done that I can, for example, curl it — and here you go, it actually works. Cool. Now, how can we use undici to call the server? Well, we can open up the client code, which uses the global request method from undici. We extract a bunch of things from the response and then we console.log them, essentially. Let's see if it works. Cool, it seems to be working: we get back a date, it's a 200, and it tells us that the server wants to keep this connection alive for 5 seconds and that it has a content length of eleven characters. Those eleven characters are a "hello world" reply. And if you look at the server, you can see we have logged the access. So it's pretty great.
Going back to our slides, I just want to finish up by pointing you to the fantastic High Performance Browser Networking book by Ilya Grigorik. You can also read it online for free at hpbn.co; check it out. I talked a lot about Fastify, so there's fastify.io, and then we have undici, and there is a guide on the event loop, and finally Node Clinic, if you want to optimize your servers. We are about to wrap up, so I will just say thank you very much for having me. You can reach me on Twitter at @matteocollina, and you can also send me an email asking for anything essentially, like the universe and everything, at matteo.collina at nearform.com.