Video details

RustConf 2021 - Supercharging Your Code With Five Little-Known Attributes by Jackson Lewis

Rust
09.14.2021
English

Supercharging Your Code With Five Little-Known Attributes by Jackson Lewis
Attributes are one of the most useful and convenient features in the Rust language, enabling programmers to auto-derive traits, set up a test suite in minutes, and conditionally compile code for different platforms. But there are many other useful attributes, both inside and outside of the standard library, that often go unnoticed. Join us as we go through five attributes you might not know about - and how you can use them to both optimize and strengthen your code.

Transcript

Hi, folks, and thank you for listening in. My name is Jackson Lewis, and today I want to talk about cool ways to use Rust attributes that you might not have heard of before. I'm going to assume that you have an intermediate understanding of Rust for this talk, because today is going to be a deep dive into some of the more obscure aspects of the language. That being said, I've done my best to make this accessible to everyone. Let's get right into it, shall we? So as Albert Einstein once said, Rust attributes are pretty lit. I agree completely with Dr. Einstein. As a man who conceptualized spacetime, he would surely understand that part of being a programmer is acting like a time traveler: you must protect your code from your future self, who will forget all the nuanced details of what you wrote and ruin everything. Luckily, we Rustaceans have tools to help mitigate this issue: code comments, idiomatic guidelines, and strong typing. But in my opinion, the attribute is one of the greatest tools we have as Rust programmers, one that can save on boilerplate, reduce the chances of making mistakes, and boost productivity all at the same time. So for those of you who don't know, an attribute is like a tag that you can put on a function, struct, or statement to tell the compiler to do something special. If you've ever derived a trait for a struct, then you've used an attribute from the standard library, or you might have used the test attribute before to write a unit test. But that's not all. You can also define your own attributes and import them from third-party crates. For example, Serde is a popular third-party library whose attributes will automatically derive functions to manage the serialization and deserialization of your structs. You can see it being used here, and there are other popular libraries besides Serde, like tracing, async-trait, and so on. But I'm not here to talk about those, because those popular libraries are only the tip of the iceberg.
If you browse through crates.io, you will find hundreds of attribute libraries that get completely neglected. And even in the standard library, there are attributes that no one seems to use. So today I want to talk about five of these attributes that are hidden in both the standard library and the package registry, and how they can give your code superpowers. To keep this talk interesting, I'll be explaining these attributes through a hypothetical story. Okay, what we have right here is the Option enum, one of the most foundational types in Rust. One day, a controversial feature request proposes adding a third variant to the Option enum, called Possible. While the maintainers insist that it's rather redundant, the creator of this feature request says that it will revolutionize the language. But there's a problem. Currently, people can write code that matches on an Option like so: we have a branch that matches on the Some variant and a branch that matches on the None variant. This code is an exhaustive match right here, since it matches on all the variants of the enum. However, with the introduction of the new Possible variant, this match on Option will no longer be exhaustive, and this code will fail to compile. As you can see, exhaustive matches are fragile to changes in enum variants or struct fields. Because of this, crate maintainers can often feel pressure to push a new major version every single time they make modifications to public enums or structs, worrying that the changes will break downstream crates. But is there a way to prevent this from happening? As a matter of fact, there is. So let's rewind time a bit and go back to our hypothetical story. If the Rust team had planned from the very beginning to expand Option in the future and didn't want to break existing code when they did expand it, what might they have done? The answer is the non_exhaustive attribute.
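Here is a minimal sketch of the idea, using a hypothetical MaybeOrder enum in place of Option (the real Option cannot be redefined, so this stands in for it):

```rust
// A stand-in for the story's Option: marking an enum #[non_exhaustive]
// forces any *downstream* crate to include a wildcard arm when matching.
#[non_exhaustive]
pub enum MaybeOrder {
    Some(String),
    None,
}

// Within the defining crate, exhaustive matches are still allowed, so the
// wildcard below triggers an "unreachable pattern" lint; a downstream
// crate, by contrast, would be *required* to write it.
#[allow(unreachable_patterns)]
fn describe(m: &MaybeOrder) -> &'static str {
    match m {
        MaybeOrder::Some(_) => "got an order",
        MaybeOrder::None => "no order",
        // Covers any variant added in a future version of this enum.
        _ => "unknown future variant",
    }
}

fn main() {
    println!("{}", describe(&MaybeOrder::Some("margherita".into())));
}
```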
non_exhaustive tells the compiler that more variants or fields will be added to the enum or struct in the future, so any crate which uses it needs a wildcard when pattern matching, even if they've already matched all the existing variants or fields. It should be noted that adding non_exhaustive to your enum or struct is a breaking change to your API, because it invalidates any match statements that do not have a wildcard branch. So if you want to make your enum or struct non_exhaustive, it's best to do it when you first create it. Okay, let's go back to our code from earlier. If non_exhaustive had been added to the Option enum, our pattern matching code wouldn't just look like this; we would need to add this wildcard branch too. Yeah, even though we've exhaustively matched all the variants, we need to add that wildcard, or the compiler will throw an error, because the wildcard will cover future changes to this enum. As you can see, even after we add the new variant, this code will not break, since the new variant will be covered by the wildcard branch that we added. And that's the power of non_exhaustive. Okay, so shortly after this controversial rewrite, you take a break from open source to work for a local pizza restaurant. They are very forward thinking, so their website's back end has been written entirely in Rust. However, they also have a legacy codebase written in C, which is slowly being replaced with Rust code. They have a function in this Rust code called pizza_validator, which checks if a customer's pizza order is valid. Unfortunately, it has two major problems: one, it's poorly optimized, and two, it allows customers to order pineapple on their pizza. Gross. However, for one reason or another, the company doesn't want you to make any modifications to the existing function. Instead, they want you to write a new function that will gradually be phased in to replace the old one.
So a short while later, you've written a new and improved pizza validation function that outlaws pineapples, and you've gone around the code for the new site and replaced many of the existing calls to pizza_validator with the new function. But there's a problem: a few days later, an order for pineapple pizza comes in, which shouldn't be happening. You identify the problem in a function called add_to_order. It turns out that someone else recently rewrote this code and in the process used the old validation function. But to be fair, there was little indication that using it was wrong. So let's rewind time again. What should have been done here to prevent this? Well, it would be nice if we could mark this old function as being improperly used without outright removing it, and it turns out that the standard library gives us a way to do this. All we need to do is put the deprecated attribute at the top of the function, and the compiler will emit a warning when someone tries to use it. You can even provide a custom message to go with this warning. So now, if we try to use the old pizza validation function in the new code, we're going to get a warning from the compiler. And in some cases you don't just want a warning; you want to make sure that using a deprecated function will throw an error for certain parts of your codebase. So if you put a deny(deprecated) attribute here, or at a higher level in the crate, the compiler will make this an error instead of a warning. And that's the power of deprecated. A few days after this incident, a report comes in about the legacy codebase. A segmentation fault is occurring due to a mistake that a programmer made while using an unsafe function called to_box. This legacy codebase has several unsafe functions that require the usage of raw pointers in order to work with the old C library, and to_box is meant to convert these raw pointers back into a safe Box pointer.
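To make the deprecation pattern from the story concrete, here is a small self-contained sketch; the function names, version number, and validation logic are hypothetical:

```rust
// The old validator from the story: still callable, but using it now
// emits a compiler warning with a custom message.
#[deprecated(since = "0.2.0", note = "allows pineapple; use validate_order instead")]
fn pizza_validator(toppings: &[&str]) -> bool {
    !toppings.is_empty()
}

// The new and improved validator that outlaws pineapple.
fn validate_order(toppings: &[&str]) -> bool {
    !toppings.is_empty() && !toppings.contains(&"pineapple")
}

// For modules that must never call the old function, escalate the lint:
// #[deny(deprecated)]
// mod add_to_order { /* calling pizza_validator here is a hard error */ }

fn main() {
    // This call compiles, but without the allow() it would warn:
    // "use of deprecated function `pizza_validator`: allows pineapple; ..."
    #[allow(deprecated)]
    let old = pizza_validator(&["pineapple"]);
    let new = validate_order(&["pineapple"]);
    println!("old validator: {old}, new validator: {new}");
}
```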
As you can see, whoever wrote this function left safety documentation in the comments, which is the right thing to do. However, there are still two problems with this code. The first problem is that to_box can still be used without the programmer looking at the safety documentation. And the second problem is that this safety documentation fails to mention that passing in a null pointer will also cause a segmentation fault, and that it is unsound to pass a pointer that points to memory which was not allocated by the global Rust allocator. But you know, I can understand that, because when you're writing a safety comment, it's possible to accidentally overlook some edge cases. Now, what can we do to fix this? Well, we could start by expanding the safety section and putting in a debug assertion to check if the pointer is null, but proving that the function is being used safely is still dependent on the function itself and not on its caller. Wouldn't it be better if the caller was responsible for proving that they're using the function safely? Well, this is where pre comes in. pre, which is short for precondition, is a very underutilized third-party macro crate that requires assertions to be placed at both the function definition and the call site. First, we need to add the preconditions to the function definition of to_box, as seen here. So in this case, the pointer must not be null, the pointer must be unique and not be shared with another object, and the pointer must point to memory allocated by the global allocator. This looks very similar to what we wrote in the safety documentation. Okay, now let's go to the code that was segfaulting and see how pre can help programmers avoid mistakes with unsafe functions. So at a glance, it looks like this code is run when the user wants to create a new pizza with random toppings. As you can see, the code first creates a pizza struct by calling into the legacy C library, which appears to be wrapped with a safe binding on the Rust side.
So we don't need to use unsafe here. The pizza struct does have a default method on the Rust side of things, but it looks like we're using a function from C to allocate it, for whatever reason. Then it passes the pointer into the randomize function, which will randomize the toppings for the pizza. Though this also calls across the FFI, it appears to be wrapped here, so we don't need to use unsafe. And then finally, it gets a unique Box pointer to the pizza back using to_box. Since to_box is an unsafe function, we need an unsafe block here. However, since we've added preconditions to the to_box function, we also need to assure that we're using this function correctly, or this code will no longer compile, and adding these assurances will help us spot the reason why this code is segfaulting. So first, we need to assure that this pointer won't be null by putting the assure attribute, which is part of the pre library, on this to_box call. This will run an assertion in debug mode at the call site with the provided condition, in this case that the pizza pointer is not null. Note that we also have to add a reason as to why this precondition is satisfied, which makes users more aware of whether or not the provided condition will hold. As we add in the other assurances, along with their reasoning, we catch the problem: to_box is only safe when we're using a pointer which was allocated with the global Rust allocator, so the first line of code, where we allocated the struct from C, was the offender. After we rewrite the code to allocate the pizza on the Rust side, we can provide a proper reason in the assure attribute for why these conditions are satisfied. As you can see, pre is a great way for users of unsafe functions to be cognizant of what needs to hold true for the function to be safe. And that's the power of pre.
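As a sketch of what the corrected code might look like (this requires the third-party pre crate as a dependency; the Pizza type and the exact wording of the conditions here are hypothetical, based on the story):

```rust
// Sketch only: add `pre` to Cargo.toml to compile this.
use pre::pre;

#[derive(Default)]
pub struct Pizza {
    pub toppings: Vec<String>,
}

// Preconditions attached to the definition, mirroring the safety docs.
#[pre("`ptr` is not null")]
#[pre("`ptr` is unique and not shared with any other object")]
#[pre("`ptr` points to memory allocated by the global Rust allocator")]
unsafe fn to_box(ptr: *mut Pizza) -> Box<Pizza> {
    Box::from_raw(ptr)
}

// The caller must now assure every precondition, with a stated reason,
// or the code will not compile. After the fix, allocation happens on
// the Rust side, so the reasons actually hold.
#[pre]
fn random_pizza() -> Box<Pizza> {
    let ptr = Box::into_raw(Box::new(Pizza::default()));
    unsafe {
        #[assure("`ptr` is not null", reason = "Box::into_raw never returns null")]
        #[assure(
            "`ptr` is unique and not shared with any other object",
            reason = "we just created it and hold the only copy"
        )]
        #[assure(
            "`ptr` points to memory allocated by the global Rust allocator",
            reason = "it was allocated by Box::new on the Rust side"
        )]
        to_box(ptr)
    }
}
```

In debug builds, pre also turns each assurance into a runtime assertion at the call site, so a violated condition fails fast instead of silently corrupting memory.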
That being said, I highly recommend that you limit this attribute to functions that you define internally, so downstream users don't need to add pre as a dependency to their crate just to call a function. The next week, you're asked to perform a refactor on the pizza struct. When you go to make some changes to it, the definition looks off. Something about it is seriously bothering you, but you can't put your finger on it. Oh, that's right. It's these visibility modifiers right here. The fields are public so that you can read from them directly instead of using a getter function. That's useful, but any programmer can also modify these fields if they own this pizza variable mutably. Furthermore, since all the struct fields are public, the struct can be initialized elsewhere using raw struct initialization syntax, and all of this bypasses the existing functions, which are meant to handle state changes and initialization of the pizza struct. But the good news is there's an attribute to fix this. It's so useful, in fact, I wish it was in the standard library. It's called readonly. All you have to do is add readonly as a dependency to your crate, and then put the readonly::make attribute on the struct of your choosing. Any function that would have had access to these fields even if they were private can still write to them. For example, this mozzarella method can still write to the struct directly. But outside the module, you can only read from the public fields, and you can't initialize the struct with struct initialization syntax. So this function right here will compile just fine, since we only read from the cheese field, even though we're taking the struct as a mutable reference. However, this function will not work, since we try to modify the field directly, and then we also try to build a new struct using initialization syntax; this will create two separate compile errors. But to solve our particular problem, we don't need to make all the fields read-only.
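As a sketch of the whole-struct form just described (this requires the third-party readonly crate as a dependency; the field and method names are hypothetical, following the story):

```rust
// Sketch only: add `readonly` to Cargo.toml to compile this.

#[readonly::make]
pub struct Pizza {
    pub cheese: String,
    pub toppings: Vec<String>,
}

impl Pizza {
    // Code in the defining module can still write to the fields directly.
    pub fn add_mozzarella(&mut self) {
        self.cheese = "mozzarella".to_string();
    }
}

mod elsewhere {
    use super::Pizza;

    // Outside the defining module, the public fields are read-only.
    pub fn inspect(pizza: &mut Pizza) {
        println!("cheese: {}", pizza.cheese); // reading compiles fine

        // Both of these would be compile errors out here:
        // pizza.cheese = "cheddar".to_string();           // cannot assign
        // let p = Pizza { cheese: String::new(),          // cannot construct
        //                 toppings: Vec::new() };         // with literal syntax
    }
}
```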
We just need to keep a few fields from being changed. The good news is that readonly supports this too, by putting the readonly attribute on individual fields; the other fields can still be written to based on the normal visibility modifiers. And that's the power of readonly. Okay, finally, there is one major problem that still needs to be resolved with the pizza restaurant's website: the legacy system and the new system use separate databases for storing customer orders. Since the new system should read from both databases, a trait called Order has been created, which provides a common interface over the two order formats, which will let the new system deserialize, operate on, and then serialize these two schemas interchangeably. But unfortunately, that doesn't work, because Serde can't serialize and deserialize trait objects, due to object safety rules with the Serialize and Deserialize traits. As you can see here, if we try to create a Box pointer to a dynamic trait object, the compiler will tell us that Order is not object safe, and this is because the Serialize and Deserialize traits are requirements for this trait, and those are not object safe themselves. So let's go back to the Order trait, this time with no requirements for Serialize or Deserialize, because that clearly didn't work as planned. In order to serialize and deserialize an arbitrary object of this trait, we're going to need the third-party typetag attribute. typetag lets you serialize the trait object as an enum, where each variant is an implementer of that trait and the content is the serialized version of that implementer. Here you can see that we serialize the Order trait as a tagged enum, using the arguments we passed into the attribute to specify the names of the tag fields, in this case type and content. And then, as long as each implementation of Order implements Serialize and Deserialize themselves, we can use typetag to make them serializable trait objects.
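Here is a sketch of what that looks like (this requires serde, serde_json, and typetag as dependencies; the two order types and their fields are hypothetical stand-ins for the story's legacy and new schemas):

```rust
// Sketch only: add `serde` (with the derive feature), `serde_json`,
// and `typetag` to Cargo.toml to compile this.
use serde::{Deserialize, Serialize};

// Adjacently tagged: serialized as {"type": "...", "content": {...}}.
#[typetag::serde(tag = "type", content = "content")]
trait Order {
    fn total_cents(&self) -> u64;
}

#[derive(Serialize, Deserialize)]
struct LegacyOrder {
    cents: u64,
}

#[typetag::serde]
impl Order for LegacyOrder {
    fn total_cents(&self) -> u64 {
        self.cents
    }
}

#[derive(Serialize, Deserialize)]
struct NewOrder {
    cents: u64,
}

#[typetag::serde]
impl Order for NewOrder {
    fn total_cents(&self) -> u64 {
        self.cents
    }
}

fn main() -> serde_json::Result<()> {
    // Serialize through the trait object, then round-trip it back.
    let order: Box<dyn Order> = Box::new(NewOrder { cents: 1250 });
    let json = serde_json::to_string(&order)?;
    let back: Box<dyn Order> = serde_json::from_str(&json)?;
    assert_eq!(back.total_cents(), 1250);
    Ok(())
}
```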
All we need to do is put the attribute above each implementation, and this works even if the implementations are in different crates, which I think is really cool. Now the code which previously failed will compile without any problems. And that's the power of typetag. All right, that just about wraps it up. I hope you learned at least a few new attributes and how you can apply them to real-world scenarios. Here is a list of cool attributes that I didn't have the time to talk about; I highly recommend you check them out on your own time. Thank you so much for listening, and have a great rest of your RustConf.