Video details

Choreography vs Orchestration in serverless microservices - Mete Atamel & Guillaume Laforge

Microservices
12.16.2021
English

We went from a single monolith to a set of microservices that are small, lightweight, and easy to implement. Microservices enable reusability, make it easier to change and scale apps on demand but they also introduce new problems. How do microservices interact with each other toward a common goal? How do you figure out what went wrong when a business process composed of several microservices fails? Should there be a central orchestrator controlling all interactions between services or should each service work independently, in a loosely coupled way, and only interact through shared events? In this talk, we’ll explore the Choreography vs Orchestration question and see demos of some of the tools that can help.
Check out more of our featured speakers and talks at https://ndcconferences.com/ https://ndcoslo.com/

Transcript

So, yeah, unfortunately, we couldn't be there with you. It's still a bit complicated to travel Europe-wide and beyond, but Mete and I are very happy to be there, even if only virtually, to speak about choreography versus orchestration in serverless microservices, or microservices in general. I'm Guillaume Laforge. I'm a developer advocate for Google Cloud and I'm based in Paris. And today I'm with Mete.
Hi, everyone. My name is Mete Atamel. I am also a developer advocate at Google, based in London. And as Guillaume mentioned, we are sorry that we are not there. NDC Oslo is one of my favorite conferences and I have spoken there a few times, but this time we couldn't make it. Hopefully this will be good enough. If you have any questions, please put them on Slido; I will be monitoring that as we go along and will try to answer them. We should also have some time at the end, and I understand there is a chance to ask questions then as well, depending on the time. But feel free to ask questions as we go along, because we'd like this to be an interactive session. So thanks very much.
All right. Thank you. So, yeah, we'll be speaking about choreography versus orchestration, and we'll explain what those two approaches or concepts are. Of course, this is not about that kind of choreography and orchestration, although there are some similarities, let's say. It's really two architectural approaches, two different ways of messaging and having microservices interact with each other. We'll go through the different approaches, but really I'm looking forward to the end, where we speak about a concrete use case. Mete and I built an application called Pic-a-Daily, a picture sharing application. And this talk is actually what we learned while building and rewriting this application together.
So stay tuned, especially till the end, where we cover the lessons we learned along the way when building this application. Let's first have a look at a simple e-commerce kind of transaction where you have multiple services and different technologies. Here I've illustrated it with a few icons from Google Cloud products, but it doesn't have to be about Google Cloud at all. Imagine any technology that you prefer, any framework, et cetera. So we have a frontend application that's going to receive incoming requests from users. We have a second service, a payment processor, to authorize and charge a credit card. Then we have a shipper service that's going to prepare and ship the items, as well as a notification service that is going to notify users whether the transaction was successful, whether the parcels will be delivered to their address, etc. So one approach is to chain those microservices. The frontend calls the payment processor. When the payment processing is done, the payment processor calls the shipper to prepare the order. And then once the order is ready, this shipper service calls yet another service. It's a chain of services. So it's one approach, but as you can see, the services are pretty tightly coupled, as they invoke each other in a chain. So if something fails in the middle, what's going to happen? There are some pros and cons to this approach. Let's imagine we're using REST to have the services call each other. When you use this approach, usually you've already split a big monolith into smaller microservices. So it's easier to work on individual services, because you can have different teams focusing on different services, rather than having to ensure that the monolith is still in a consistent state. It's pretty easy to implement as well, in the sense that services simply call each other via REST.
It's pretty straightforward to do. However, there is the coupling between all those services: especially when one service fails, what happens to the overall transaction? The other thing that might be tricky is how error reporting, retry, and timeout logic work in this context, because it's really at the service call site that you have to handle this. So you have to add some technical or infrastructure-related code at the invocation sites, and that adds boilerplate code inside your business code, basically. So it's not necessarily always super clean to do it that way. Depending on the technology or framework you use, things might be better or not. An alternative approach would be to use an event-driven approach. That's the choreography pattern, basically, where instead of having a service call another service, which calls another service, et cetera, in a chain, we use some kind of message broker. It can be some pub/sub kind of product with topics, subscriptions, et cetera. So here the frontend sends a message through the broker saying, hey, I've got a new request. The payment processor service listens to a particular topic: oh yeah, there's an incoming payment processing request, okay, I'm going to handle it. If I can charge this credit card, then I'm going to send another event, another message. Then the shipper service might be interested in that particular message, et cetera. The notifier might be interested in the shipping messages as well. So it's more loosely coupled. There's no responsibility for one service to call the other service, so things are also more reusable. Let's say the shipper service might be used by other parts of your application, or the notification service, which is more generic. It's also easier to reuse those services and scale them independently, depending on how much one particular service is used.
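To make the choreography idea concrete, here is a minimal sketch in Python. It is not code from the talk: the `Broker` class is a toy in-memory stand-in for a real pub/sub product, and the topic names (`order.received`, `payment.charged`, `order.shipped`) are made up for illustration. The point it shows is that each service only subscribes to a topic and publishes its own event; no service calls another one directly.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Broker:
    """Toy in-memory stand-in for a pub/sub message broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self.log: List[str] = []  # topics, in the order they were published

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        self.log.append(topic)
        # Deliver synchronously to every subscriber of this topic.
        for handler in list(self._subscribers[topic]):
            handler(event)


broker = Broker()
notifications: List[str] = []


def payment_processor(event: dict) -> None:
    # Reacts to new orders; knows nothing about the shipper or the notifier.
    broker.publish("payment.charged", {"order_id": event["order_id"]})


def shipper(event: dict) -> None:
    # Reacts to successful payments and announces the shipment.
    broker.publish("order.shipped", {"order_id": event["order_id"]})


def notifier(event: dict) -> None:
    # Reacts to shipments; here it just records the message it would send.
    notifications.append(f"order {event['order_id']} shipped")


# Hypothetical topic names: each service declares what it is interested in.
broker.subscribe("order.received", payment_processor)
broker.subscribe("payment.charged", shipper)
broker.subscribe("order.shipped", notifier)

# The frontend only publishes an event; it calls no other service.
broker.publish("order.received", {"order_id": 42})
```

Adding another consumer, say an analytics service, would only take one more `subscribe` call, which is the extensibility argument made above. A real broker would also deliver asynchronously and persist messages, which this sketch leaves out.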
So let's look at another, slightly more complex example. Let's say here you've got to check what's in the inventory. Perhaps you're going to call the inventory database and have different branches. In one branch, the item is in the inventory, so you can prepare the order confirmation. If it's out of stock, then you might need to call a supplier API to replenish the stock, et cetera. Then in either branch, you also have to update the inventory, prepare the order, and notify the customer. And if it's a big order, perhaps you want to involve the sales rep or not. So you might have different branches. Compared to the previous example, where it was still a chain of messages and things were pretty much serial, here we see that for more complex business processes, a message-driven approach is still interesting, but it's harder to make sense of the flow of messages going around in our system. So pros and cons again. In terms of pros, services are loosely coupled, which is a nice characteristic. Services can be changed and scaled independently, which is also a good thing. There's no single point of failure, because even if a service goes down, it will pick up the messages when it's ready to serve them again. And another nice characteristic of this approach is that events are usually easy to extend: a new event can be created, and new or existing services might be interested in that new event. Or if there's a new service interested in an existing event, you can add that service fairly easily without changing the overall architecture. So it's fairly easy to add new messages, new kinds of events, and new services. In terms of cons, it's usually harder to monitor the whole system. It's more complicated to figure out that this particular message triggered this flow of messages.
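One common way to tell which message triggered which flow is to stamp every event with a correlation ID that each service copies onto the events it emits. The sketch below illustrates that idea in Python. The `new_event` helper and the event field names are assumptions made for this example, not something shown in the talk.

```python
import uuid
from typing import Optional


def new_event(kind: str, payload: dict, cause: Optional[dict] = None) -> dict:
    """Build an event that inherits the correlation ID of the event that caused it.

    An event with no cause starts a new flow and gets a fresh ID.
    """
    correlation_id = cause["correlation_id"] if cause else str(uuid.uuid4())
    return {"kind": kind, "correlation_id": correlation_id, "payload": payload}


# One business transaction: each event is caused by the previous one.
order = new_event("order.received", {"order_id": 42})
charged = new_event("payment.charged", {"order_id": 42}, cause=order)
shipped = new_event("order.shipped", {"order_id": 42}, cause=charged)

# A second, unrelated transaction gets its own correlation ID.
unrelated = new_event("order.received", {"order_id": 43})

# Monitoring can now rebuild a single flow by filtering on the ID.
events = [order, charged, shipped, unrelated]
flow = [e["kind"] for e in events if e["correlation_id"] == order["correlation_id"]]
```

Filtering logs or traces on that ID gives back one transaction's path through the system, even though no single component defines that path.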
So you have to keep track of some correlation IDs and things like that to ensure that those messages are correlated. In terms of errors, retries, and timeouts, it's still complicated. You might be able to set some logic in the pub/sub system, so that for this particular queue or topic of messages you want to have retries, and that many retries. But it might be pretty much global to that topic and not per service. The other downside, I would say, is that the business flow is not really captured explicitly, in the sense that the behavior of your application, the behavior of your business process, emerges out of the flow of messages and events. There's not really a place where you can look it up, like in our flowchart. You don't have a flowchart saying, okay, it's going here and there, it's going to this other service because that event is going to trigger that service, et cetera. So it's not easy to monitor, track, and understand what's going on with this flow of messages and events. And also, who ensures that the whole transaction is successful? You've got messages flying around, but you don't necessarily have a more global service that would ensure that all those messages led to the end of our workflow, of our business process. Mete, do you want to say a few words about the orchestration approach, a different approach?
Yes. So let me share my screen, and you can let me know.
Stop sharing. This is your turn.
Okay. So you can see my screen?
Yes.
Okay. And by the way, someone mentioned in the questions that we should try to zoom in on the slides. I'm sharing the full screen, so I don't know how I can zoom in more, but let's keep that in mind: when we show code later, we'll try to zoom in as much as we can. Okay. So as Guillaume described, when we are designing an architecture, we can have services calling each other directly, and that has these issues of tight coupling.
And then we can go with a choreography or event-driven approach, where things are more loosely coupled. But then all of a sudden you have this mess of messages, with all these different message types flying around, and it becomes really hard to monitor your system. And most importantly, there's nobody ensuring that the transaction is successful. Of course, there are some patterns, like the Saga pattern, that you can use to make sure that certain transactions are successful, and if something fails, you can apply compensation steps. But there's nothing overall watching your events and checking that your event-driven system is working correctly. So you need to think about how you're going to do that. Now, another approach you can take is called orchestration. In this approach, instead of services calling each other, or services sending messages indirectly to each other, you have an external service orchestrating the calls in the order that you want. If we go back to our previous simple example of an e-commerce transaction, in this case the frontend receives the request like before, and then the frontend kicks off an orchestration. So there's an external orchestrator. By the way, this orchestrator doesn't have to be external per se. For example, you can have the frontend orchestrate things if you like, or you can have one of your services orchestrating the other services. So it is possible to do orchestration within your frontend, or in a service within your architecture, but it is usually a better idea to have something external. We'll talk more about this when we talk about Workflows. So typically this is an external service whose job is really to orchestrate your calls. And first of all, this orchestrator has the rules of your business logic.
So whatever you want to do, whatever services you want to call, in whatever order, you describe that in your orchestration. We'll get to this again when we talk about Workflows. Then the orchestrator just follows those rules, and in this case it calls the payment processor, then the shipper, and then the notifier. So it makes those step calls to each service. And since the orchestrator is the one making these calls, it can make sure that each step is successful. For example, if something happens with the payment processor, the orchestrator can retry the call. It can also take results back. So if the payment processor returns something, the orchestrator can take that result and pass it to the next service, like the shipper. In this approach, of course, there are pros and cons. The pro is that the business flow can be captured centrally. And since it can be captured centrally, it can be source controlled. This is really useful, because you have this central place where you can say, this is how things are supposed to work. You can version that, so you can have multiple versions, and if something needs to change, you can have a new version. So if someone comes to your system and wants to know how the system is supposed to work, there's a place they can go and look at it, and they can see its history as well, since you're versioning it. Now, each step can be monitored. In orchestration, there are steps, and if a step fails, it's very obvious what failed. Again, we'll look at this when we show our example. The errors, retries, and timeouts that you need to think about can now be centralized, because if you want to apply a central error policy, for example, you can apply it to each step in your orchestration rather than implementing that error policy within each of your microservices. So that allows your microservices to be simpler.
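As a rough illustration of the orchestration approach described above, the Python sketch below runs a list of named steps in order, feeds each result to the next step, and applies one retry policy in a single place. It is only a sketch under stated assumptions: `run_workflow` is a made-up helper rather than the API of any orchestration product, the three services are plain functions standing in for HTTP calls, and the exponential backoff is one possible choice of central error policy.

```python
import time
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_workflow(steps: List[Step], data: Any, max_retries: int = 3,
                 base_delay: float = 0.0) -> Tuple[Any, List[str]]:
    """Run steps in order, passing each step's result to the next one.

    The retry policy lives here, in the orchestrator, not in the services.
    """
    history: List[str] = []
    for name, call in steps:
        for attempt in range(1, max_retries + 1):
            try:
                data = call(data)
                history.append(f"{name}: ok (attempt {attempt})")
                break
            except Exception as err:
                history.append(f"{name}: failed (attempt {attempt}): {err}")
                if attempt == max_retries:
                    # Give up: the history shows exactly which step failed.
                    raise RuntimeError(f"workflow stopped at step '{name}'") from err
                # Exponential backoff between attempts (an assumed policy).
                time.sleep(base_delay * 2 ** (attempt - 1))
    return data, history


# Hypothetical services: plain business logic with no retry code inside.
calls = {"payment": 0}


def payment_processor(order: dict) -> dict:
    calls["payment"] += 1
    if calls["payment"] == 1:
        # Simulate a transient failure on the first call only.
        raise ConnectionError("payment gateway timeout")
    return {**order, "charged": True}


def shipper(order: dict) -> dict:
    return {**order, "tracking": "TRACK-1"}


def notifier(order: dict) -> dict:
    return {**order, "notified": True}


result, history = run_workflow(
    [("payment", payment_processor), ("ship", shipper), ("notify", notifier)],
    {"order_id": 42},
)
```

The `history` list plays the role of per-step monitoring: the first entry records the simulated payment failure, the second the successful retry. The list of steps passed to `run_workflow` is the centrally captured, versionable flow the talk refers to.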
The services just do the business logic; they don't really need to deal with the retry logic or the error logic that all microservices usually need to deal with. And the good news is that the services are still independent, right? They are still microservices that are deployed independently. They can scale independently, but you are bringing them together into what I call a temporary monolithic application. The orchestration, if you think about it, brings these independent services into a temporary kind of monolith. So you're getting the best of both worlds: the independence of microservices, plus the benefits that you get from coupling them with an orchestration, kind of like a monolith. But an orchestration service is something new to learn and maintain. And it's a different kind of paradigm than what you might be used to, because usually we are used to independent services calling each other, or events flying around. Orchestration is basically a new service where you need to define how it's going to call these microservices. And if you need to change something, the orchestration needs to change as well. So that's something you need to get used to. The orchestrator could be a single point of failure, because if the orchestration service doesn't run, then nothing gets run. So you need to make sure that you're using something resilient and redundant. And you are losing some flexibility by choosing orchestration, because in eventing, for example, if you want to add a new service, all you need to do is deploy the service and configure what messages that service should receive. That's really it. So the configuration is quite simple. Whereas in orchestration you need to do more: you need to deploy the service, you also need to change your orchestration to take that service into account, and you usually need to deploy your orchestration again. Right?
So that makes it a little bit harder to extend your system, and you're losing a little bit of flexibility there. And you still need to think about the whole transaction. Now, the orchestrator is the thing that can watch your whole flow, so it can check that all steps are successful. So in that sense, we're doing better, right? We're making sure each step is successful, and if all the steps work, then your whole transaction is successful. So in that sense, it's better than eventing, and I think it's easier than eventing in terms of making sure your whole transaction is successful. But you still need to worry about what happens when a step fails in the middle. Then you need to do what are called compensation steps. Let's say you charge the user and then you want to ship an item afterwards, but shipping the item failed, or maybe you ran out of stock or something like that. Now you need to go back and refund the user, or maybe remove the hold from the credit card. So you need to have this compensation step, and you need to implement the Saga pattern again. But I guess it becomes a little bit easier to do this, because orchestration services give you tools to capture the errors and add new steps to handle them, like try/catches in your steps, if you will. All right, so the question is, which one is better? Choreography or orchestration? Which one should I use? It depends, as with every software engineering question. But there are certain things that I look for when I'm thinking about my architecture. If my services are not closely related, meaning these services are independent, they don't really need to know about each other, and they can also exist in different kinds of transactions, then usually an event-driven system is better, because these services can live in different contexts, and it will be easier to plug them into those contexts if you're using eventing.
So that's when I start thinking about event-driven. On the other hand, if your services are closely related, meaning they are usually deployed together and they usually call each other in a certain order, then that's when I start thinking orchestration. Because if your services kind of act like a unit already, then why bother with the eventing flexibility? You don't really need it. Also, if you can define the flow of your services in a flowchart, like the example that Guillaume showed, then usually that's a good candidate for orchestration, because that flowchart has a lot of useful information about how your system should work. Orchestration allows you to codify that flow in the orchestration itself, and it also allows you to preserve and version it. So if that's important to you, then orchestration is probably a better option. And of course, you can also take a hybrid approach. I think this is what I usually see in real-world applications. You don't really go with just event-driven or just orchestration; you usually go with a mix of both, when it makes sense. What you can do is orchestrate what makes sense together. If there's a transaction with multiple services that go together all the time, then you orchestrate that. But then at the end of the transaction, you might want to signal other orchestrations with eventing, because they don't really need to know each other. They just need to be able to signal to each other. So for that, you can use event-driven. Similarly, you can have your orchestration be triggered by an event. For example, a user saves an image to a bucket and you want that event to trigger an orchestration for image processing. That's a perfect candidate for an event triggering an orchestration.
And then your orchestration takes over and does the work that it needs to do, in a tightly coupled way, with the services that you have. Okay, so that was a really quick summary of choreography and orchestration. Now, Guillaume will tell us what kind of services and tools exist in this landscape. After that, we'll get back to orchestration and look in more detail at Google Cloud Workflows.
Yeah. Thank you, Mete. And we actually have a question, an interesting one: if we have an error on some step and we need to show it on the UI, do we better notify the UI directly, or via the orchestration? I guess it depends a bit on the situation. But for example, on a recent application that I built (it was a small demo, not a real application), I created an expense report application. Sometimes, let's say, you are sending some receipts that are not valid, or I wanted to see what the ongoing status of the expense report was. In that particular application, at least, I was using the Firestore real-time database to notify of status changes. If there's an error, I can notify the UI with that. If I'm just going to the next step, I want to update and say, okay, we're currently in this particular step. So in my workflow definition, I actually had steps to update the status. If I wanted to say that the expense report was submitted, or that the receipts are not correct, I could just update the UI thanks to this real-time notification mechanism. Because that's a real-time kind of database, I was able to check from the frontend what the ongoing status was. So yeah, we can show it on the UI, and it can be a step of the actual orchestration, of the workflow definition. Another question: could you combine choreography and orchestration? Yes, that's what Mete explained, especially when you've got several bounded contexts.
The bigger picture might actually be a choreography of events, but for well-defined business processes within a bounded context, it's a great idea to use orchestration instead. So you can combine the two. And in bigger systems, I think having several approaches together would be the norm. All right, so let me share my screen again. Let me say a few words about the landscape. For event-driven systems, for choreography, there are several solutions from each cloud provider: AWS, Azure, Google Cloud. On Google Cloud, you can use Pub/Sub, or you can use Eventarc for events. But there are also things we're seeing a lot in the wild, some well-known open source projects like Kafka, Apache Pulsar, et cetera. I'm often seeing Kafka being used for event-driven systems, and quite successfully. All the big cloud providers either offer their own solutions or some integration with those open source projects. In terms of orchestration, same thing. For example, on AWS, you can combine Lambda functions with Step Functions. On Azure, you've got Logic Apps. On Google Cloud, you've got Cloud Composer, which is based on the Apache Airflow project. Composer, like Airflow, is more for data-driven pipelines and business processes, whereas Workflows, the product we'll be using in the demonstration, is more for business processes with some logic like branching and things like that, not necessarily data-driven. I'm just going to show those two screens very quickly. We don't have to look into the details, but we tried to have a look at the orchestration products: Step Functions, Logic Apps, and Google Cloud Workflows. Basically, those three products have more or less the same coverage, and we've been trying to keep track of some of the features that are available.
But yeah, things like branching logic, various steps, calling APIs, et cetera: those are the kinds of actions you can usually do in these products. Mete, do you want to add something?
Yeah, I just want to add a couple of things here. When we did this research, one theme that emerged is that the base features are there for most cloud providers. So the base kind of orchestration is there, but Step Functions seems to be more geared towards the AWS ecosystem. You can orchestrate AWS Lambda and other AWS services, but they don't really allow you to orchestrate external services. I think that's a serious limitation that you should be aware of. On the other hand, Logic Apps seems to be the most feature-rich orchestrator. They let you orchestrate pretty much everything on Azure and outside Azure. They also have a really rich ecosystem of external things that you can pull in, and nice features like including code in your steps that runs quickly, things like that. So that seems to be the most feature-rich one. Workflows is second, with most of the main features, and Step Functions is mostly for the AWS ecosystem.
We have a few questions, and I'll take them as we go. So: is it ideal and feasible for an orchestrator to be simple enough that it can demand 100% test coverage? Potentially, yes. First of all, when you use an orchestrator, since it's a cloud-based product, at least for the ones we're mentioning, it's complicated to do unit tests. You really have to do integration tests. And depending on how complicated the workflow is, if you've got tons and tons of branches and conditions, the combinatorial explosion might make it difficult to have 100% test coverage. So I wouldn't necessarily go as far as requiring that high a level of coverage. Do you agree, Mete?
Yeah.
I mean, it's hard to unit test an orchestrator, at least at this stage. I think you basically have to do some kind of integration testing to test your full orchestration. That's the story today. But that might change with the open source project that Guillaume is going to talk about right now. And by the way, someone asked if these slides are available. They're already available. I will put a link as a question there so that people know where they are.
Thanks, Mete. Yeah. So another thing you might want to have a look at is the CNCF Serverless Workflow project. It's something fairly new; it's still in incubation, I think, at the CNCF. It's a specification for defining workflows with a kind of DSL, a domain-specific language, and it also provides various SDKs so that developers can call those workflows. But there are two things. It's still fairly new, and as far as I know, none of the products we've mentioned currently implements that specification. So your mileage may vary: until one or more providers or open source projects implement that specification, it's not of much use yet. The other interesting aspect of this project is that it's using the CloudEvents specification, another specification, which describes and standardizes the format of event payloads. And another thing it takes advantage of is the OpenAPI specification to describe your services. The services you want to interact with within your serverless workflow have to be described with OpenAPI, so they have to be REST services, and you have to provide an OpenAPI description for them. It's interesting to keep an eye on this, to see how it evolves and whether cloud providers adopt this standard specification. Now, it's Mete who's going to zoom in on one particular example, which is Google Cloud Workflows.
Yeah. All right. So now, in the rest of the time, I want to talk about Workflows.
And my point here is to show you what you can expect from an orchestrator. We're going to use Google Cloud Workflows as an example because that's what we know, but you can expect similar things from Logic Apps and Step Functions. It's just to give you an idea of what you can expect from an orchestrator. Then we'll talk about our case study at the end, to show you how we used these tools in an app that we wrote. Okay, so what is Workflows? Workflows is a service to orchestrate and integrate other services, basically. The things you can orchestrate are, first of all, your code running on Google Cloud. And by the way, it doesn't have to be serverless code. Any code that you have running that Workflows can access, you can include in your orchestration. So this can be serverless functions, serverless containers, web applications on App Engine, or even your app running on virtual machines, as long as it is accessible over HTTP. So: any code that you have, any Google API or Google Cloud API that you might want to call, and any external APIs as well. This allows you to write orchestrations where you combine your code, the APIs of the cloud, and also external APIs, like the Twitter API or Twilio or something like that. I like this, because I think in a typical workflow orchestration, it's not just about your code. It's a combination of many things, right? So this allows you to include those in your orchestration. Now, how do you define your workflow? Well, there is a workflow definition language, and it can be written in YAML or JSON. By the way, sorry for the YAML. I don't think YAML is meant for this, but this is what we have. I think we can blame Kubernetes for all this YAML craziness. But anyway, we have this definition language, and in there you have this idea of steps.
So you define and name your steps, you define what happens in each step, you define the inputs and outputs of your steps, and so on and so forth. If you look at our previous example of these three services calling each other, you would orchestrate it using something like this, where you define three steps, and in each step you define what happens. Here we are saying that we are making an HTTP POST call to this URL, which is the URL of the Cloud Run service. And we are passing in as input the payment details that we received as a parameter. So you receive it from the previous step and pass it as an input to this step. This call is going to be made by Workflows, and once you have a result, you can capture it in a variable called process result. Then in the next step, again, you can take this and parse it: you do a little bit of JSON parsing and provide it as an input, now to a Cloud Function. So you see how this works: defining the steps, passing the variables, and parsing them a little bit as you go along. This step sequencing is what orchestration is about. You can do things like a serverless pause. Let's say you ship an item, but you want to wait a little bit before you notify the user. The orchestration can add a timer and wait. And as I mentioned, you can pass parameters between your functions, and you can do a little bit of JSON parsing: once you receive a parameter, you can parse out what you need and pass just that part. Although I have to warn you that this JSON parsing is very primitive. If you need to do something more complicated, it's not there yet, but it allows basic parsing. So that's steps. Then on top of that, you can have error handling in Workflows.
So for example, in the payment processor, if you want to make sure that you retry a few times before you fail the step, you can specify a maximum of five retries with an exponential backoff policy for that step, and Workflows will do that for you. This is very useful. It takes the error and retry logic out of your functions or containers and gives you a way to do it globally from the workflow. Also, sometimes you want to do different things depending on the outcome of a function. So if a function is successful, you might want to call the Notifier service, but if the function is not successful, you get an error. And by success and error I mean the HTTP code that you get back, but you can define precisely what it means to be successful or to be an error. And then depending on that you can branch to another service, and you can maybe call the Pager service. So that kind of logic can be part of your orchestration rather than trying to do it in the code. And then there are conditionals. Sometimes you want to read, let's say, the database and check a value and determine if something is in stock or out of stock, and do different things depending on that, right? That conditional logic is very easy to do in Workflows. You have switch statements and you can basically switch and do different things. And then there are third-party API calls. Let's say at some point you want to request something from a supplier. That can be part of your orchestration as well. You don't need to deploy code to do that. Workflows can call that for you. And to wrap it up, there are some other useful features. You can define subworkflows to encapsulate common reusable workflows and use them from multiple places. We have connectors to connect to Google Cloud services.
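The retry behaviour mentioned earlier in this passage (a maximum of five retries with exponential backoff) can be sketched in Python roughly as follows. This only illustrates the general technique, not how Workflows implements it. The function name `call_with_retry`, its parameters, and the flaky payment stand-in are assumptions for illustration.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    max_retries: int = 5,
    initial_delay: float = 1.0,
    multiplier: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying up to `max_retries` times with exponential backoff."""
    delay = initial_delay
    retries = 0
    while True:
        try:
            return call()
        except Exception:
            if retries >= max_retries:
                raise  # retries exhausted: fail the step
            sleep(delay)
            delay = min(delay * multiplier, max_delay)
            retries += 1


# Hypothetical payment call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_payment() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("payment processor unavailable")
    return "paid"


# Record the delays instead of really sleeping, so the example runs instantly.
delays: List[float] = []
result = call_with_retry(flaky_payment, sleep=delays.append)
print(result, attempts["count"], delays)
```

Keeping this loop in the orchestrator means the payment service itself does not need any retry code.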
So instead of trying to construct an HTTP call from the orchestration, you can use connectors. Connectors make it easier to call Google Cloud services, and they also have the logic to retry things if something needs to be retried, or to wait for things. And there's a notion of iteration. So in your workflow you can do for loops. Let's say you want to call the Twitter API, get some tweets, and then for each tweet do something. If you want to do that in a for loop, you can do that. We have callbacks, which allow you to write orchestrations with people in the loop. So let's say you have some orchestration running and at some point someone needs to approve the next step. Then you can have a callback that waits, and an external user can send an HTTP request to the callback to continue the rest of the orchestration. So features like this help a lot to build pretty sophisticated orchestrations. To deploy a workflow, you can do it from the command line. So you define your YAML and deploy it. This deploys the workflow but it doesn't execute it. To execute it you run another command, and then you can see the execution by describing it. And all of this is also available from the UI. So if you go to the Google Cloud console, you can see the orchestration defined here and you can see the visualization of the orchestration in here. And I think if we have time we'll show you this in our example. But this visualization is really useful to visualize how things are supposed to flow in your system, and then you can execute it here. You can see the logs here, you can see the executions under here. And if an execution failed, you can see what step failed, things like that. But we'll take a look at this when we show the application. All right, so in the last 20 minutes or so, let's talk about our case study. So I will hand over to Guillaume to talk about this app and the different parts of that app. Thank you. All right. Yeah. And there are a couple of questions.
So let's have a quick look at them. There was a question about Azure Functions orchestration versus Logic Apps. So perhaps you've got some more info on this, Mete, since you looked at it. Yeah. Azure has Durable Functions, where it allows you to call other functions. So in that sense, it is an orchestrator. But from what I've seen, it's basically a simple orchestrator, I would say. Logic Apps, on the other hand, is more than that. It's not just about calling functions, it's about calling any HTTP-driven thing, so not just functions but any HTTP service, basically. And it also has more than just calling functions: it has the logic, that's why it's called Logic Apps, the logic to handle the calls, to parse the results, to call the next service. So I would say Logic Apps encapsulates that and more. But maybe in a simple kind of use case where you just need to call some other functions, maybe Durable Functions is the way to go. Yeah, that's it. Also, there was something about actors, which can be a good method for implementing an orchestrator. Is that something you've seen, Mete? So I've had a look at some platforms like Akka and things like that. That's also another valid approach. I don't have personal experience with these systems, so it's a bit complicated for me to contrast them. And there was a question, but I think it disappeared, about monitoring, et cetera. We're not really focusing on this aspect. But for example, on Google Cloud, there are various tools for logging, for monitoring, for creating alerts, et cetera. For example, if you see that your workflow is failing lots of times, you could create alerts, create metrics and so on to track this, and create dashboards and so on. But it's probably not the focus today. All right, so my screen is shared. So let me tell you about this Pic-a-daily application. It sounds like Piccadilly in London, but it's Pic-a-daily in the sense that we share a picture a day.
It's an application that we built a year or two ago, actually before the start of the pandemic, and that we rewrote recently using an orchestration approach. But initially we had created it with an event-driven approach, a choreography of events. It looks like this. I'm going to show you a demonstration. Basically, it's a way to share pictures, so anonymous users can upload pictures. We're using machine learning to analyze the pictures and also to check if it's a picture that we are allowed to display, to avoid showing bad things on screen. And let me show you here. So you see my app, correct? Yes. So the app is here. I've already uploaded a few pictures. Perhaps I can upload a new one. So there's a page here for submitting pictures. It's going to be processed, and it's going to appear in a short moment. There's a page where you can see a collage of the four most recent pictures. And the home page displays all the pictures. You can see that there are some labels. That's the machine learning API that we're calling, the Vision API, which analyzes pictures and finds labels for what's recognized in the picture. There's also the colorful border around the pictures, because the Vision API also analyzes things like the color palette of the picture. And if I hit reload once more, the new picture has been processed and uploaded, and I can see my picture from the time when I used to travel, in Egypt. That was a trip with my family. So we created this application, and here's what it looks like in terms of architecture. So here, I don't know if it works when I zoom in. Do you see my zoom? No, I just see this one. Okay, well, never mind. So let me use this great pointer. A user is going to upload a picture through the web front end here. I'm using App Engine as my web front end. The picture is uploaded into a Cloud Storage bucket, and here it's going to trigger some events.
So when a new picture lands in the bucket, it's going to trigger the image analysis function, which is going to call the Cloud Vision API. And once the Vision API returns, we're going to store some picture metadata here in the Cloud Firestore NoSQL database, things like the name of the picture, the labels, the color, et cetera, and whether the thumbnail was created for the picture or not. And also, in parallel, we're creating thumbnails of the pictures. So that's another service here. The first one was a Cloud Function, but this one is using Cloud Run, which is a way to containerize applications and run containers serverlessly. So this one is creating thumbnails, and it's going to update the database, saying whether the thumbnail was generated or not. And we're also going to save the thumbnail of the picture in another bucket. What else? At a regular interval of time, thanks to Cloud Scheduler, we are going to create a collage of the pictures. So if there are new pictures, we're going to update the collage with the latest pictures stitched together. We also created another service at the bottom here, the garbage collection service, so another Cloud Run service using a container. Again, what's going to happen is that if you're deleting the full-size picture, we also want the thumbnail to be deleted automatically, as well as the metadata in the database. So it's some garbage collection to clean up the other things. You delete the big picture, it's deleting the thumbnail and it's deleting the metadata. So we had basically events flying around. There are different ways to deal with those events because we were using different products. I think that's the next slide. So for example, just showing a few lines of the headers of those services. Here we implemented it using Node.js, but we also have some implementations in C#, Java, and some Go as well in the open source code. So here we are using Cloud Functions.
Well, there you just receive an event and a context about that event, and the event contains the name of the file of the picture and the bucket in which the file was stored. But if you look at another service, I think that's the thumbnail creation service, this time we're using Cloud Run. Cloud Run is a container service that receives HTTP requests. But here the events that are flying around are actually Cloud Storage events. So how do you pass such events to an HTTP-based service like Cloud Run? Well, the thing is, for Cloud Run, the wrapping is done automatically, but you have to do the unwrapping yourself in your service. You're receiving the Cloud Storage event, which is wrapped inside a Pub/Sub message, and the Pub/Sub message is wrapped into an HTTP request. It's like an onion that you have to peel one layer after another. So it's a different way of dealing with an event. And the third way, that's for the garbage collection service. This service is using the Eventarc product, which is a way to direct events from one place to another. So there are different sources and sinks, and you can have events flying in different directions. And Eventarc is using the CloudEvents specification, like the Serverless Workflow specification. What's nice with that is that you've got a standardized syntax for events, but then again, that's another way to unwrap the data within those events. So you've got three different ways of dealing with events in that particular choreographed, event-driven system. And Mete, we rewrote this application to use an orchestrated approach. Right. So I'll describe that quickly. Okay, so the reason why we went event driven is that someone uploads a picture and then that triggers an event. So it kind of makes sense to follow the event across different services. Right. So that was our initial design.
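The onion peeling described above for the Cloud Run service can be sketched in Python like this: the HTTP request body is a JSON envelope, its `message.data` field holds the base64-encoded Pub/Sub payload, and that payload is the Cloud Storage event as JSON. The `unwrap_storage_event` helper and the sample bucket, file, and subscription names are assumptions for illustration, and the sample request is built by hand rather than captured from a real service.

```python
import base64
import json


def unwrap_storage_event(http_body: bytes) -> dict:
    """Peel the layers: HTTP body -> Pub/Sub message -> base64 data -> storage event."""
    envelope = json.loads(http_body)              # layer 1: HTTP request body (JSON)
    message = envelope["message"]                 # layer 2: the Pub/Sub message
    payload = base64.b64decode(message["data"])   # layer 3: base64-encoded data
    return json.loads(payload)                    # the Cloud Storage event itself


# Build a sample request body the way a push delivery would wrap the event.
storage_event = {"bucket": "uploaded-pictures", "name": "egypt.jpg"}
body = json.dumps(
    {
        "message": {
            "data": base64.b64encode(json.dumps(storage_event).encode()).decode(),
            "messageId": "1",
        },
        "subscription": "projects/example/subscriptions/example",
    }
).encode()

event = unwrap_storage_event(body)
print(event["bucket"], event["name"])
```

With a Cloud Function the platform hands you the equivalent of `event` directly, which is why the same event ends up being handled differently in each service.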
But the problem that we were running into with that design is that even though it was quite a simple application with a few services, not too many, we were already getting problems with complexity. First, we had to deal with three different event types, and how to parse those event types and how to deal with them. And every time we had to update something, we had to think: okay, this is the service receiving this type of event. So there was already complexity there. Secondly, when something failed, let's say the user uploaded a picture and it didn't show up in our UI, we didn't know why things failed. We always had to guess, and we always had to say: okay, let's look at the logs of this service. Do we see anything? No? Okay, let's go to the next one. And that's the problem with event-driven systems: they make it hard to trace things. Of course you have things like error reporting, and you can do correlation IDs and try to trace things, but it's something you need to think about and implement. It doesn't come out of the box. So when Workflows came along at the beginning of this year, we said, okay, let's just take this and try to orchestrate it. Because when you think about it, we don't need the flexibility of full eventing. We want users to upload a picture, and then it triggers an event. And from then on, what needs to happen is actually quite clear, and it needs to happen in a certain order. So we did that. We used Workflows, and this is the architecture that we came up with. When you first look at this, it might look a little bit more complicated, but actually anything in this box is one service, the Workflows service. So if you think of this as one service, then it's actually simpler. And the way this works is that we have the user at the front end uploading the pictures to Cloud Storage, like before, and that triggers an event. So in that sense, it's still event driven.
That goes to a Cloud Function, and from there the Cloud Function starts the orchestration. So it starts the orchestration that we defined in Workflows. Now, in this orchestration, we check the event type. Is it a new object creation event or is it an image deletion event? If it's an image deletion, we delete the image from storage and from Firestore. This happens within Workflows, without needing extra services. And if it's a new object, we call the Vision API from Workflows. We get the data back in Workflows and we need to do some JSON parsing, so here we deployed functions to do that. Then we check the safety of the image, within Workflows again. If it's safe, we store it in Firestore, and if it's not safe, we end the orchestration. And finally, once we know the picture is safe, we make calls to the thumbnail service and the collage service to create the thumbnail and the collage. So you can see that in this model we are combining a couple of things. First, it's event driven, but at the end of the day that kicks off an orchestration. Secondly, we are doing some of the work in Workflows and some of the work by calling services. If it's simple enough, we do it in Workflows, but if it requires more logic, then we delegate to an external service. So that's the approach that we came up with. And I think I can quickly show you. We are running out of time, but I just want to quickly show you here. If I go to Google Cloud Workflows, let me make it as big as possible. You can see that I have the workflow name, and then we have the source here. It's probably hard to see, but here the workflow is defined, and you can get this on GitHub, by the way. We define all the steps and what happens, and then there's a visualization here that shows you what should happen in different kinds of scenarios. So this makes it really easy to visualize what's happening. So that's it for our workflow.
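The branching just described (check the event type, delete on deletion, otherwise analyze, check safety, store, then create the thumbnail and collage) can be summarized in a short Python sketch. This is only an illustration of the control flow, not the actual workflow definition from the speakers' repository. The function name, the `"create"` and `"delete"` event type strings, the shape of the analysis result, and the use of a log list in place of real service calls are all assumptions.

```python
from typing import Callable, List


def run_picture_workflow(
    event: dict, analyze: Callable[[str], dict], log: List[str]
) -> str:
    """Follow the orchestration order; each log entry stands in for a service call."""
    name = event["name"]

    # Branch 1: deletion event, clean up the thumbnail and the metadata.
    if event["type"] == "delete":
        log.append(f"delete thumbnail {name}")
        log.append(f"delete metadata {name}")
        return "deleted"

    # Branch 2: new picture, analyze it first (the Vision API call in the talk).
    analysis = analyze(name)
    if not analysis["safe"]:
        return "rejected"  # unsafe picture: end the orchestration here

    log.append(f"store metadata {name} {analysis['labels']}")
    log.append(f"create thumbnail {name}")
    log.append("update collage")
    return "processed"


# Hypothetical analysis function standing in for the Vision API.
def fake_analyze(name: str) -> dict:
    return {"safe": not name.startswith("bad"), "labels": ["pyramid"]}


log: List[str] = []
outcomes = [
    run_picture_workflow({"type": "create", "name": "egypt.jpg"}, fake_analyze, log),
    run_picture_workflow({"type": "create", "name": "bad.jpg"}, fake_analyze, log),
    run_picture_workflow({"type": "delete", "name": "egypt.jpg"}, fake_analyze, log),
]
print(outcomes)
print(log)
```

Because the whole sequence lives in one function, a failure points at a specific step, which is the traceability benefit the speakers describe for the orchestrated version.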
I'm not going to run it because we're almost out of time, but we want to talk about the lessons learned here. Should I continue with this part or do you want to take over? I was going to take over. Okay. Just to finish: all the code is available on GitHub under GoogleCloudPlatform, in the serverless-photosharing-workshop repository. If you want to have a look at the two approaches, be sure to check this link out. Yeah. So in terms of lessons learned, what was interesting was that as we rewrote the application, it was refreshing to get back to using simple REST calls instead of the various event formats that we had to deal with. We also had less code to handle, so no event parsing. And some of our services were even going away, because services that were just calling some other services or APIs could be expressed declaratively within the YAML workflow definition. We also had less setup, because we didn't have to set up the Pub/Sub topics. We didn't have to set up the scheduler or Eventarc to create the routes of events in our system. And error handling was also easier, because the whole chain could stop on error and we would know at which step it stopped. It was easier to actually track errors and see that, okay, it was this particular image that triggered the problem, instead of having to go through the various messages and understand which one is correlated to which. Then there are the other things that we learned along the way. We did the rewrite when the Workflows product wasn't GA yet. There were a few quirks and limited documentation, but fortunately this has greatly improved in the past year. Another thing you have to be aware of is that YAML is not a programming language. Okay. So don't put too much in your workflow definition. What is really a business rule, the logic, et cetera, is probably better in a function or in a service rather than in a YAML definition file. And as for testing, you can't really test YAML files, right? So it's more complicated.
And if you're trying to build too much logic into your workflow, instead of simple checks and branching and so on, you don't have the same support as with your IDE for your programming language, for example. Also, potentially you might have a bit less parallelism or eventing flexibility, in the sense that you have to test the whole workflow again when there's a new event or something new happening, whereas with choreography, usually it's really just adding an event. It doesn't mean that you shouldn't run tests, but still, it might be easier to add new things. And when I say parallelism, it's that we tend to write more sequentially in the workflow definition. But parallelism is a feature that is also coming to Cloud Workflows, so we don't necessarily lose parallelism per se. So that's about it. We are out of time. Here are a few links you might be interested in looking at, with some feeds, some getting started guides, and also some codelabs if you want to get your hands on those technologies. So don't hesitate to ping us on Twitter for further questions as well. And sorry for the font size, which was too small for you. We'll have to update the slides accordingly. Yeah, there are a couple of questions. I know we are out of time, but I'll try to answer them quickly here. So one question: someone mentioned that they're using Logic Apps for orchestration, and their experience is that they are hard to test automatically and manually because they cannot run locally. That's true for Workflows as well. Workflows is a service. There's no local Workflows, not yet, at least. So, yeah, writing unit tests for workflows or running them locally is a challenge. We usually just run against the cloud, and it's really quick, so it doesn't really affect us so much in that sense. But it's something to be aware of, and maybe a drawback of using orchestration services.
And the last question is: in choreography, you typically send events saying that something happened, but when orchestrating, do you send commands and asynchronously wait for an event? No, I mean, in orchestration you send HTTP calls. Basically, that's what it is. And you wait for the result of that HTTP call. You can also send events. So for example, Workflows can send a Pub/Sub message. That is also possible. But at the end of the day, it's just making HTTP calls. That's how it is. All right. I think that's it. Thank you, everyone, for coming, and sorry that we were not there. Hopefully this was useful to you. If there are any other questions, feel free to reach us on Twitter, and enjoy the rest of the conference. Thanks very much. Thank you. Bye.