Video details

React Native EU 2021: Lars Thorup - Sub-second integration tests for your RN app & Bluetooth device

React Native
10.12.2021
English

This talk was presented during the React Native EU 2021 - the largest community conference in the world focused exclusively on React Native.
Abstract: This talk is targeted at developers creating apps for a Bluetooth device, such as a loudspeaker, a toothbrush or a dishwasher. I report on my experience using the technique of "mock recording" to get very fast and robust integration tests.
In collaboration with SOUNDBOKS, a Bluetooth speaker company, I have developed an open source tool for creating and using recordings of Bluetooth (BLE) traffic to test a React Native app using Jest. The tool makes it possible to run several integration tests per second as opposed to several minutes per end-to-end test. The tool is based on years of production experience using mock recording for web traffic.
In this talk I introduce the methodology, perform a live demonstration of the tool, and report on our experience using the tool during app development: how is the quality and speed of feedback from these tests, and how easy is the tool to use for developers?
The tool is available at https://www.npmjs.com/package/react-native-ble-plx-mock-recorder.
Lars is an experienced software developer, architect and coach. Lars is an expert in fast and robust test automation with a track record of successfully running 200+ integration tests per second. Currently Lars focuses on web and mobile development with React and React Native.
Twitter: https://twitter.com/larsthorup
Github: https://github.com/larsthorup
Additional Links: Link to slides: https://www.fullstackagile.eu/2021/09/02/react-native-bluetooth-ble-mock-recording/

Transcript

This conference is brought to you by Callstack, React and React Native development experts. Hello. My name is Lars Thorup. I will talk with you about how I have done integration testing, really fast integration testing, for React Native apps, specifically when you are testing against a Bluetooth device that you control with your app. I am a software engineer from Denmark, currently working freelance and doing open source development. I will start by talking about SOUNDBOKS, where I did this work, and how we arrived at the tool we have developed. Then I will go into more detail about testing against a Bluetooth device: what is a good way to do that? You can do end-to-end testing, which covers a lot of ground, but those tests are also quite slow and fragile. You can also do classic unit testing, where you write mocks for your Bluetooth messages by hand, but that is not good either. This talk will focus on how you can do mock recording, which gives us really fast and really robust, but true, integration testing, and you will see some demos along the way. As you can see here, you can do full integration testing in a few hundred milliseconds, which is really nice.

I started last year working with SOUNDBOKS, a Danish company producing really awesome Bluetooth speakers that you can carry around with you. They can play really loud; you can have a small party, or maybe even a small festival, using them. In addition to the very good sound, they come with a number of interesting features from an app perspective. You control the speaker with an app on your phone, and you can control things like equalizer settings, and you can lock your speaker if you don't want anybody else to play on it.
One of the interesting features is that if you have more than one SOUNDBOKS, you can join them together and play simultaneously, thereby creating a larger sound system. Initially we started out doing end-to-end testing of the app and the speaker, and that is what you can see in the photo on the right: two phones, in this case a Samsung phone and an iPhone, which are part of the testing station. You can see the control panel of the SOUNDBOKS and a lot of cables and various other things. What is not in this picture is a MacBook that controls everything and runs the scripts. We were running end-to-end tests using Appium, a very common tool for testing mobile apps, including React Native apps, and that works pretty well, so we have been quite happy with that setup: Jenkins running on the MacBook and a number of phones to test on. That was the setup, and I was hired to write some of these end-to-end tests. Let me tell you a little bit about what that experience gave us. We ended up with around 15 different scenarios, which are full use cases that a user can go through on their phone. When automating these scenarios, we found that Appium takes a while: generally one to five minutes per scenario, so the total time to run all the scenarios on all the phones was two and a half hours, and waiting two and a half hours for feedback on your pull request is really too much. So we weren't completely satisfied with this. It was an improvement on manual testing, because you get feedback quite a lot faster, but it's not very robust either. We saw quite a lot of false negatives, where a test fails, you run it again with the exact same setup, and the second time it passes.
We were also looking into whether we could do better than these end-to-end tests, so let's compare the options. End-to-end testing is much faster and cheaper than manual testing, and one of the nice things about it is that it covers your entire system: you test not only your device, but also the interaction with the phone and the interaction with the server, and you are exposed to real-world timing and wireless noise, which is good from a testing perspective. It might be bad from a robustness perspective, but it's still a good thing: end-to-end testing can help you uncover really hard-to-reproduce issues by relentlessly testing the same scenario over and over, and you can look at the Bluetooth logs to see what's going on when things occasionally fail. On the other hand, unit testing has a number of benefits: there is no physical setup, so you can run unit tests just on your laptop; you get much faster feedback, from a large test suite in maybe a minute instead of hours; and unit testing is usually much more robust, giving almost 100% trustworthy feedback. So it's good to have a few end-to-end tests, but you should really use unit testing for most of your automated testing. For unit testing with React Native and Bluetooth, what we are doing at SOUNDBOKS is using the quite popular library react-native-ble-plx, and we are going to mock it so that we can write unit tests with Jest, the common testing tool for React Native developers. The simplest approach, and what we did initially, is what you normally do with unit testing: you write your mocks manually, by hand. You write some code to simulate the behavior of a Bluetooth device, and you end up hard-coding your Bluetooth messages and traffic patterns.
These manual mocks are not very reusable, so you end up with a lot of mock code to maintain, but that's not their biggest problem. The bigger problem is that manual mocks lie to us. When we test our app code in isolation with unit tests and manual mocks, what happens when something changes in the protocol, for example because the device gets new firmware? If you don't change the app code, the app will probably break, because the protocol is now different. But with manual mocks, the unit tests will still pass, and that's not good from a testing perspective: the app breaks because something changed, but your tests still pass. Manual mocks really do lie to us, and we would like the best of both worlds: the true integration testing that end-to-end testing provides, with the speed and robustness of unit testing. That is what mock recording gives us, and that's what this talk is about. I have been doing mock recording for many years, not specifically for Bluetooth, but for normal web traffic: testing web front ends by recording HTTP traffic to web services. What I did here was to take that learning and apply it specifically to the react-native-ble-plx library. Let's take a look at how that works. The core of this method is the recording. Occasionally, maybe once a week, we run on a phone against the actual physical device, such as this control unit from the SOUNDBOKS. We run a recorder app on the phone, record the traffic with the device, and store those recordings in a JSON file that we keep in our Git repository. Those recordings then act as reusable mocks: when we run Jest on our computer, we want to test various components and various pages in our app.
So we write Jest tests that use these recordings, and we can run those tests really fast, all the time. Because the recordings come from an actual recording of the traffic with the device, this is true integration testing: the messages we recorded are actual messages that were transferred while communicating with the device. Let's look at how it works. You record traffic occasionally, maybe weekly, or whenever you make a protocol change: on a PR where you change both the firmware code and the app, you run the recorder again to record the changes in the traffic. In the recorder you can also verify specific expectations about values that change every time you run a test; I'll get back to that. We run this on the phone, and you can see in the video here how the phone runs through various steps: verify power state, scan for the device, connect, discover, read characteristics, monitor characteristics, et cetera. With this recording, we can now write tests for the app, and those are just normal tests, except that they're really fast and they use real traffic. In this particular case I'm showing you a subset of a test that just demonstrates how to connect to the device, and we exchange around 50 BLE messages. As you can see, the test runs in about 200 milliseconds, so you can run quite a lot of these tests, as often as you like. So what do these recordings look like? They are quite simple: they basically mirror the API of react-native-ble-plx. A recording is just a long list of records, and each record is either a command, an event, or one of a few other things that can be done with the ble-plx library; we record all the requests and all the responses in this recording file.
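To make the idea concrete, here is a hypothetical sketch of what such a recording file might contain. The field names and record shapes are illustrative assumptions, not the actual format used by react-native-ble-plx-mock-recorder; the point is only that each record mirrors one ble-plx command or event, request and response included.

```javascript
// Illustrative recording: a list of records mirroring react-native-ble-plx
// calls (commands) and callbacks (events). Field names are assumptions.
const recording = {
  records: [
    {
      type: 'event',
      event: 'deviceScan',
      args: { device: { id: 'device-id-canonical', name: 'SOUNDBOKS' } },
    },
    {
      type: 'command',
      command: 'connectToDevice',
      request: { id: 'device-id-canonical' },
      response: {},
    },
    {
      type: 'command',
      command: 'readCharacteristicForDevice',
      request: {
        serviceUUID: '0000180f-0000-1000-8000-00805f9b34fb', // Battery Service
        characteristicUUID: '00002a19-0000-1000-8000-00805f9b34fb', // Battery Level
      },
      response: { value: 'Kg==' }, // base64 for the single byte 42
    },
    { type: 'label', label: 'connected' },
  ],
};

// A playback mock walks this list in order, answering each command with the
// recorded response; labels mark points the test can play forward to.
const commands = recording.records.filter((r) => r.type === 'command');
console.log(commands.length); // 2
console.log(Buffer.from('Kg==', 'base64')[0]); // 42
```

The base64-encoded values are exactly what react-native-ble-plx exchanges for characteristic reads, which is why a recorded response can stand in for the live device byte for byte.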
To use such a recording, we have to mock react-native-ble-plx, and with the tool this talk is about, react-native-ble-plx-mock-recorder, we can do that automatically. The usual Jest mock file will just be these five lines, and then you write your app tests with Jest as you normally would. In this case, in the first few lines we read the JSON file with the recording and configure the BleManager to use this recording as a mock. Then we can use React Testing Library to render whatever component; in this case it's a device list screen, actually from a different app, not the SOUNDBOKS app. We render the device list screen, and then we can control the timing of the recording: we can play forward in the recording to specific labels that we have put into it. We can also just continue with the testing, click on things in the UI, and verify that things are showing data that comes from the recording. So we run through normal app testing without a physical speaker and without its timing, and we get really fast and robust tests. The recorder app needs to be a dedicated app, dedicated to recording this traffic. It's basically a very lightweight React Native app, and you can initialize it with the command at the bottom: that's how you create a new recorder app with the provided tool. It comes with a template, so you can get started from that, and then you have a recorder: a small test runner that you write tests for, which lets you generate traffic. The reason we do it this way, instead of reusing the app tests in both playback and recording mode, is that such reuse is not really possible.
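The playback half of this setup can be sketched in plain JavaScript. This is a simplified stand-in for what the tool actually provides (the real mock is wired in via Jest's module mocking); the `MockBleManager` class, the record shape, and the method signature here are assumptions made for illustration.

```javascript
// Minimal sketch of recording playback: answer each BLE command with the
// response that the real device gave when the recording was made.
// Not the real react-native-ble-plx-mock-recorder API.
class MockBleManager {
  constructor(recording) {
    this.records = [...recording.records];
  }

  // Consume the next recorded command with this name and return its response.
  invoke(command) {
    const index = this.records.findIndex(
      (r) => r.type === 'command' && r.command === command
    );
    if (index < 0) throw new Error(`No recorded response for ${command}`);
    const [record] = this.records.splice(index, 1);
    return Promise.resolve(record.response);
  }

  readCharacteristicForDevice(deviceId, serviceUUID, characteristicUUID) {
    return this.invoke('readCharacteristicForDevice');
  }
}

// Usage: a "test" reading the battery level purely from the recording,
// with no phone or speaker involved.
const recording = {
  records: [
    {
      type: 'command',
      command: 'readCharacteristicForDevice',
      response: { value: 'Kg==' }, // base64 for the byte 42
    },
  ],
};

const manager = new MockBleManager(recording);
manager
  .readCharacteristicForDevice('id', 'battery-service', 'battery-level')
  .then(({ value }) => {
    console.log(Buffer.from(value, 'base64')[0]); // 42
  });
```

Because nothing here touches the network or native code, a test built this way completes in milliseconds, which is where the sub-second test times in the talk come from.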
You cannot really run Jest on the phone, unfortunately, and you cannot really run react-native-ble-plx on a laptop, so we need to do it this way. It also has a number of benefits that we'll get back to. So how do you write such a recorder app? You write a number of small tests that generate the actual traffic you want in your recording. Before the tests start, you start the recorder. Then you do the things with BleManager that you would normally do, like listening for state change events, starting a device scan, looking for scan events, and putting labels into the recording; you control exactly how that happens. When you're done, you save the recording, which you can then use. Architecturally, it looks like this. If we go through it in steps: when the app runs in production, not under test, as a real user would use it, the app just uses ble-plx to talk to the firmware running on the device over BLE. When we run the recorder, we instrument ble-plx with the recorder module, so that every command and event sent back and forth between ble-plx and the firmware is recorded into the recording file. And finally, when you run your app tests, the phone and the device are not in the picture at all. We are just running on the computer, really fast, and with no build step either, because this is just JavaScript. You run your app tests against the app, and the app uses the mocked BLE manager for all the responses. So it looks like this. Now we have time for questions, except that I did this recording a month ago, so I can't really wait for your actual questions. I have tried to imagine what questions you might want to ask, and I have answered a few of those. Let's see.
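The recording side, where ble-plx is instrumented so every command and response is captured, can be sketched with a simple wrapper. The function name, record shape, and use of a `Proxy` are illustrative choices of mine, not the tool's real implementation; the fake manager stands in for real react-native-ble-plx talking to firmware.

```javascript
// Sketch of the recorder: wrap a BLE manager so that every command and its
// response are appended to a record list, which can later be saved as JSON
// and committed to Git. Illustrative only.
function createRecordingManager(realManager, records) {
  return new Proxy(realManager, {
    get(target, prop) {
      const original = target[prop];
      if (typeof original !== 'function') return original;
      return async (...args) => {
        const response = await original.apply(target, args);
        records.push({ type: 'command', command: prop, request: args, response });
        return response;
      };
    },
  });
}

// Fake "device" standing in for real ble-plx + firmware over BLE.
const fakeManager = {
  async readCharacteristicForDevice() {
    return { value: 'Kg==' }; // base64-encoded battery level byte
  },
};

const records = [];
const recorder = createRecordingManager(fakeManager, records);
recorder
  .readCharacteristicForDevice('id', 'service', 'characteristic')
  .then(() => {
    // The captured traffic is now ready to serialize.
    console.log(records[0].command); // readCharacteristicForDevice
  });
```

The key property is that the wrapper is transparent: the recorder app exercises the device exactly as the real app would, and the recording falls out as a side effect.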
So: if we are more than one developer on this team, and every developer has their own device, and with BLE every device has its own device ID, how do you handle that? Do you put the device IDs directly in the recordings? We do put a device ID in the recordings, but you can specify a canonical device ID, so that every developer can have their own local device, and we only connect to those recognized devices. What is put into the recording file is the canonical ID, so that you don't get diffs in the recording file every time someone else makes a recording. Another question: what about values that, by their very nature, change all the time? A number of characteristics are like that, for instance RSSI, battery level, or even the volume control, which might be set to different values between recordings. Again, we would like the recording to use a canonical value, so, as you can see here, you can specify the recorded value. If you follow the test, we call bleRecorder.queueRecordedValue, saying that we will record a value of 42 for this battery level characteristic. Then we read the characteristic from the device, and it will probably return a different value, but what we put into the recording file will always be 42. So again, we don't have a value that changes every time somebody makes a recording. But readCharacteristicForDevice will still return the actual value while you are making the recording, and we can leverage that in the test, as you can see here, to verify that battery level, for example, is always a number between zero and 100. That verification becomes part of the test, so the recorder app is also doing a little bit of actual integration testing against the device. And what about still doing manual mocking?
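The queued-canonical-value trick can be sketched as follows. The method name `queueRecordedValue` follows the talk, but this implementation, and the `record` helper around it, are my own simplified illustration of the idea: the live value flows back to the test, while a stable value lands in the recording file.

```javascript
// Sketch: record a fixed canonical value for volatile characteristics
// (RSSI, battery level, ...) so the recording file stays diff-free,
// while the test still sees the real value read from the device.
function createValueCanonicalizer() {
  const queued = [];
  return {
    // Called by the recorder test before reading a volatile characteristic.
    queueRecordedValue(value) {
      queued.push(value);
    },
    // Called when a read completes: writes the canonical value (if queued)
    // into the recording, but returns the live value to the caller.
    record(records, command, actualValue) {
      const canonical = queued.length > 0 ? queued.shift() : actualValue;
      records.push({ type: 'command', command, response: { value: canonical } });
      return actualValue;
    },
  };
}

const records = [];
const bleRecorder = createValueCanonicalizer();
bleRecorder.queueRecordedValue(42);

// Suppose the device reports a battery level of 87 during this run.
const live = bleRecorder.record(records, 'readBatteryLevel', 87);
console.log(live); // 87  -> the recorder test can assert 0 <= live <= 100
console.log(records[0].response.value); // 42 -> stable value in the file
```

This is also why the recorder app doubles as a lightweight integration test: it is the one place where assertions run against genuinely live device values.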
If we have at least one integration test per BLE message type, then we really have good enough integration coverage, but sometimes you want additional testing: running a large number of tests with varying parameters, doing boundary testing or combinatorial testing. Maybe you want to test corner cases for battery level: what if it's zero, what if it's 100? You can easily use manually mocked BLE traffic with this library: as you can see here, you just pass in another list of records that you write by hand. That's a good way of combining your integration testing with combinatorial testing. The last thing here is a developer-experience improvement, because one of the nasty things about BLE traffic is that it is quite low level: it refers to services and characteristics by UUID, and values are base64 encoded. Maybe eventually you will know all those UUIDs by heart, but it's really hard to read all these long numbers and encoded values. What we do in the tool is add a debug section to the recording, where we translate the service and characteristic UUIDs using a translation table that you provide in the recorder app, and the values are also automatically translated where possible: at least to hex, as a buffer, and if all the bytes are ASCII, we also show the ASCII string, which makes it easier to read. And that's what I had to talk about today. You can read more in the blog post that I wrote about this; you can scan the QR code to go to the URL. The tool itself is on GitHub as react-native-ble-plx-mock-recorder, and there is a package on npm that you can install. There are plenty of guides in the repo already, so it should be quite easy to follow, but you are also very welcome to reach out to me.
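Hand-written records for boundary cases can plug into the same playback machinery. The record shape and the trivial `play` helper below are illustrative assumptions, not the tool's real API; the point is that a manually authored list covering battery levels 0 and 100 is consumed exactly like a recorded one.

```javascript
// Sketch: combine mock recording with hand-written records for boundary
// testing. Here we author records for battery-level corner cases 0 and 100.
const manualRecords = [0, 100].map((level) => ({
  type: 'command',
  command: 'readCharacteristicForDevice',
  response: { value: Buffer.from([level]).toString('base64') },
}));

// A trivial player that answers commands from the list in order,
// standing in for the tool's playback mock.
function play(records, command) {
  const record = records.shift();
  if (!record || record.command !== command) {
    throw new Error(`Unexpected command: ${command}`);
  }
  return record.response;
}

for (const expected of [0, 100]) {
  const { value } = play(manualRecords, 'readCharacteristicForDevice');
  const level = Buffer.from(value, 'base64')[0];
  console.log(level === expected); // true
}
```

Recorded traffic guards against protocol drift, while hand-written records like these cover the parameter space no single recording session would reach.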
I am available on Twitter, and of course there are still improvements that can be made to this tool, so I'm very interested to hear if someone starts using it, and in getting contributions, maybe from you. Thank you for listening. Bye.