Building Operable Software with TDD (but not the way you think) - Martin Thwaites - NDC London 2023

Building operable software is becoming more important as microservice-based systems become more common. Developers increasingly rely on long-running "integration" tests in deployed environments because it feels like the only way to gain confidence to deploy their applications. There is a better way: "outside-in" testing that focuses on the boundaries of your service.

In this talk, we'll go through some of the pitfalls of relying on unit testing to give you confidence in an application. We'll then look at how you can use TDD as a workflow to build tests in a "contract first" way and how much more flexible your testing becomes. We'll talk about the benefits over a unit-testing focus, and how this approach can aid in understanding service boundaries. Finally, we'll show how you can correlate all of this with tracing tools like Honeycomb to see the performance of your tests and how the internal code interacts.

This talk focuses on WebApplicationFactory in .NET to provide the scaffold and Honeycomb to provide the visibility; however, the concepts will likely apply to other languages.
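
For readers unfamiliar with the scaffold, here is a minimal sketch of the kind of outside-in test the talk builds around WebApplicationFactory; the endpoint, DTO, and assertions are illustrative, not taken from the session.

```csharp
// Minimal outside-in test: drive the real HTTP pipeline in memory.
// "/todos", TodoDto, and the Program wiring are illustrative.
using System.Net;
using System.Net.Http.Json;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class GetTodoTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public GetTodoTests(WebApplicationFactory<Program> factory)
        => _client = factory.CreateClient(); // in-memory server, no network

    [Fact]
    public async Task Returns_the_todo_that_was_created()
    {
        // Arrange: set up state through the public contract only
        var created = await _client.PostAsJsonAsync("/todos", new { title = "buy milk" });

        // Act: read it back through the same boundary
        var response = await _client.GetAsync(created.Headers.Location);

        // Assert on the contract, not on internals
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
        var todo = await response.Content.ReadFromJsonAsync<TodoDto>();
        Assert.Equal("buy milk", todo!.Title);
    }

    private record TodoDto(Guid Id, string Title);
}
```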

Check out our new channel:
NDC Clips:
@ndcclips

Check out more of our featured speakers and talks at
Comments

Nothing to add to the discussion, just wanted to say thanks to the speaker for engaging in the comments. It's missing from most other conference talk videos online.

animanaut

Love the practical examples in the talk. For the areas that "can't be tested this way", I personally consider the last two points (connecting to external dependencies, and configuration being correct) more important to verify in our deployed environments than what can be achieved via ODD. For a greenfield project I'm working on, we have been writing our tests in a very similar manner (albeit leaning more towards the BDD style with Given/When/Then), and we're currently deciding how to achieve confidence on those points. We have two options that we're tossing up:
1. A separate health check API that we ping during deployment to verify those things are working, giving us explicit confidence (see the sketch below), or
2. Running the BDD test suite against the deployed environment. The checks here are implicit.

I want option 2 to work because it means we won't need to maintain a separate part of the system; but if we find those tests take too long, are flaky, or don't work for other reasons, we may fall back to option 1.
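
A minimal sketch of what option 1 could look like with ASP.NET Core's built-in health checks; the check names, the CanReach helper, and the config key are hypothetical.

```csharp
// Option 1 as an ASP.NET Core health check endpoint, pinged during deployment.
// "payments-api", CanReach(...), and "Payments:ApiKey" are hypothetical.
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHealthChecks()
    // Explicit check: can we actually reach the external dependency?
    .AddCheck("payments-api", () =>
        CanReach("https://payments.example/ping") // hypothetical helper
            ? HealthCheckResult.Healthy()
            : HealthCheckResult.Unhealthy("payments API unreachable"))
    // Explicit check: is the required configuration present?
    .AddCheck("config", () =>
        string.IsNullOrEmpty(builder.Configuration["Payments:ApiKey"])
            ? HealthCheckResult.Unhealthy("Payments:ApiKey missing")
            : HealthCheckResult.Healthy());

var app = builder.Build();
app.MapHealthChecks("/healthz"); // the deployment pipeline asserts on this
app.Run();
```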

aaronzhong

That's quite in line with "Growing Object-Oriented Software, Guided by Tests", plus the clarification "When you don't follow the 'Tell, Don't Ask' rule, prefer social over solitary tests."

mv

From my experience with whole-app automation testing in video games, I find that such tests tend to be slow and brittle. For one thing, having to boot the app means that every test run pays the app startup cost, which in a AAA video game can be on the order of minutes as the game builds and caches the data it needs (customers don't typically see this apart from shader compilation, because this data-cooking process happens prior to shipping, but internally it must be redone every time the game's content changes). Similarly, having to boot the app means paying the compile-time cost of building the entire app, which for a AAA video game can also be on the order of minutes (possibly tens of minutes depending on what's changed!). Then even if it does boot up, any bug that causes the app not to function will block *every* test from running, which causes people to scramble to fix tests that aren't *actually* broken (somebody else's code was broken). Heisenbugs that only show up 1% of the time will randomly fail test runs for no clear reason (threading issues are a common cause of this). It has also been my experience that a test harness that can command a video game in a shipping environment needs more maintenance and has more ways it can fail than a unit test; not only does the game itself need a bigger API surface to talk to the tests (because the actual output of a video game is graphics and audio, not something that one can easily measure in a test harness), but also some tests need to take into account network latency, which is a source of flakiness as the time between test actions may be measured in milliseconds...

None of this is to suggest that we shouldn't have outside-in tests, only to give some perspective on what it was like to mainly have outside-in tests in the context of something that isn't a banking app: I have not had a good experience with them, and I don't think they're sufficient to avert manual testing, so if I'm going to do TDD, I would rather write more unit tests. Frankly, I find the main value proposition of TDD (which for me is "iterating faster") hard to realize with outside-in tests, so I don't feel incentivized to use it with this kind of testing. Every attempt at it has been frustrating, and I eventually gave up and went back to test-after with outside-in integration tests.

I would also like to note that it is nice to see someone acknowledging that you can do TDD with things that aren't unit tests, even if I'd generally find doing it with unit tests more valuable.

babgab

I saw a lot of systems where devs did this type of "BDD" and just checked a status code from the response. They were proud of their 80% code coverage.

None of these systems were easy to maintain.

There are a lot of articles and videos on why e2e and acceptance tests don't help you with software development. These high-level tests can't give you enough trust to deliver your software fast.

They are a good addition to unit tests and integration tests. But it is not enough to write only behavior tests, because it is not possible to test all the logic of the application through high-level tests.
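
The contrast being described, inside a WebApplicationFactory test class like the one sketched under the description; endpoint and DTO are invented.

```csharp
// Shallow: passes as long as the route exists, whatever the body says.
[Fact]
public async Task Status_code_only()
{
    var response = await _client.GetAsync("/orders/42");
    Assert.Equal(HttpStatusCode.OK, response.StatusCode);
}

// Behavioral: pins the business outcome, not just the transport.
[Fact]
public async Task Status_code_and_contract()
{
    var response = await _client.GetAsync("/orders/42");

    Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    var order = await response.Content.ReadFromJsonAsync<OrderDto>();
    Assert.Equal(42, order!.Id);
    Assert.Equal("Dispatched", order.Status); // coverage alone never proves this
}

private record OrderDto(int Id, string Status);
```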

alekseimenkov

All the things he talks about from 13:15 (including abstracting into reusable steps, etc.) are what I have been doing as part of BDD.
I guess I never used BDD "correctly", just in the way it made sense to me...
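
For context, those reusable steps look roughly like this in a WebApplicationFactory test; the basket endpoints and helper names are made up.

```csharp
// Given/When/Then step helpers that hide the HTTP noise. All names invented.
using System.Net.Http.Json;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class CheckoutTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public CheckoutTests(WebApplicationFactory<Program> factory)
        => _client = factory.CreateClient();

    [Fact]
    public async Task Checkout_empties_the_basket()
    {
        var basketId = await GivenABasketContaining("book");
        await WhenTheBasketIsCheckedOut(basketId);
        await ThenTheBasketIsEmpty(basketId);
    }

    private async Task<Guid> GivenABasketContaining(string item)
    {
        var response = await _client.PostAsJsonAsync("/baskets", new { items = new[] { item } });
        return (await response.Content.ReadFromJsonAsync<BasketDto>())!.Id;
    }

    private Task WhenTheBasketIsCheckedOut(Guid id) =>
        _client.PostAsync($"/baskets/{id}/checkout", content: null);

    private async Task ThenTheBasketIsEmpty(Guid id)
    {
        var basket = await _client.GetFromJsonAsync<BasketDto>($"/baskets/{id}");
        Assert.Empty(basket!.Items);
    }

    private record BasketDto(Guid Id, string[] Items);
}
```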

dogoku

Why do we use caches? Isn't it to get faster responses, faster than the original data source? Wouldn't a performance test then ask for the caching mechanism: "I need the results in less than 200ms"? Same for parallelization?
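
Expressed as a test, that idea might look like the sketch below, inside the same kind of WebApplicationFactory test class as earlier. The endpoint and the 200 ms budget are invented, and wall-clock assertions like this are notoriously flaky on shared CI hardware, which is why some prefer checking the SLO in production instead (a point made further down the thread).

```csharp
// The test demands the latency; the cache is just the implementation
// detail that makes it pass. Endpoint and budget are illustrative.
[Fact]
public async Task Second_read_meets_the_latency_budget()
{
    await _client.GetAsync("/reports/monthly"); // warm whatever cache exists

    var stopwatch = System.Diagnostics.Stopwatch.StartNew();
    var response = await _client.GetAsync("/reports/monthly");
    stopwatch.Stop();

    response.EnsureSuccessStatusCode();
    Assert.True(stopwatch.ElapsedMilliseconds < 200,
        $"took {stopwatch.ElapsedMilliseconds} ms, budget is 200 ms");
}
```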

unsegnor

Great talk and a great practical application of contract testing. You lost me on the pro tip though: 'can introduce path approval checks'. How would that work?
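
One possible reading, offered as a guess rather than as what the speaker meant: approval-test the application's registered route templates so any change to the public API surface has to be explicitly accepted by updating a checked-in file.

```csharp
// Hypothetical "path approval" test: approved-routes.txt is the reviewed,
// checked-in snapshot of the API surface.
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

public class RouteApprovalTests
{
    [Fact]
    public void Route_surface_matches_the_approved_list()
    {
        using var factory = new WebApplicationFactory<Program>();

        var actual = factory.Services.GetRequiredService<EndpointDataSource>()
            .Endpoints
            .OfType<RouteEndpoint>()
            .Select(e => e.RoutePattern.RawText!)
            .OrderBy(p => p)
            .ToArray();

        var approved = File.ReadAllLines("approved-routes.txt").OrderBy(p => p).ToArray();

        // A deliberate route change means updating the approved file in the same PR.
        Assert.Equal(approved, actual);
    }
}
```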

WilliamPowerDental

Great talk with some interesting takes on the approach of using outside-in TDD. Curious how you would proceed in your test class as further requirements are added to the system. For instance, would the progression of requirements for a specific endpoint mean adding complexity to the arrange code for the tests (extracted to methods, of course, to keep the test intention clear)? In other words, would adding business requirements mean a continuous focus on refactoring the arrange code into test helpers, making them more generic and flexible over time (but also potentially more complex)?
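
A common pattern for exactly that evolution is the test-data builder: the arrange code grows behind a fluent helper while the tests stay declarative. A sketch, with every name invented; note the builder still drives the real API, keeping the test outside-in.

```csharp
// Test-data builder that grows with the requirements. All names invented.
using System.Net.Http.Json;

public class CustomerBuilder
{
    private string _tier = "standard";
    private readonly List<string> _orders = new();

    public CustomerBuilder AsVip() { _tier = "vip"; return this; }
    public CustomerBuilder WithOrder(string sku) { _orders.Add(sku); return this; }

    // Creates the customer through the public contract, not the database.
    public async Task<Guid> BuildAsync(HttpClient client)
    {
        var response = await client.PostAsJsonAsync("/customers",
            new { tier = _tier, orders = _orders });
        return (await response.Content.ReadFromJsonAsync<CustomerDto>())!.Id;
    }

    private record CustomerDto(Guid Id);
}

// In a test: new requirements become builder methods, not arrange noise.
// var vipId = await new CustomerBuilder().AsVip().WithOrder("sku-1").BuildAsync(_client);
```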

nytofteAS

I'm not familiar with dotnet programming, but I would like to know what Span means in this context.
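
"Span" here is the distributed-tracing term, one timed, attributed unit of work within a trace, rather than .NET's Span<T>. In .NET a tracing span is modelled by System.Diagnostics.Activity, which OpenTelemetry exports to tools like Honeycomb; a small sketch with invented names:

```csharp
// A tracing span in .NET: an Activity created from an ActivitySource
// that OpenTelemetry (and hence Honeycomb) can subscribe to.
using System.Diagnostics;

public record Order(int Id);

public static class OrderLoader
{
    private static readonly ActivitySource Source = new("MyApp"); // invented name

    public static Order LoadOrder(int id)
    {
        // StartActivity returns null when nothing is listening (no exporter attached)
        using var span = Source.StartActivity("LoadOrder");
        span?.SetTag("order.id", id);

        // ...the work the span times goes here...
        return new Order(id);
    }
}
```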

pendax

37:29 this isn't how you'd write caching code? Why not? It's like every single example ever given on the web. I lol when people say "this isn't production-ready code"... OK, well, just show the production-ready code instead of the simplified "never use this in production" code, so we can all see what production code actually looks like and what we should be doing instead. Don't perpetuate what we shouldn't be doing in prod.
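
Without claiming to speak for the talk, a more production-shaped version of the usual read-through example adds the pieces demos tend to drop, such as expiry and size accounting; the store and product types below are invented.

```csharp
// Read-through cache with expiry and size accounting. IProductStore and
// Product are invented stand-ins for the real data source.
using Microsoft.Extensions.Caching.Memory;

public record Product(int Id, string Name);

public interface IProductStore
{
    Task<Product?> FindAsync(int id);
}

public class ProductReader
{
    private readonly IMemoryCache _cache;
    private readonly IProductStore _store;

    public ProductReader(IMemoryCache cache, IProductStore store)
        => (_cache, _store) = (cache, store);

    public Task<Product?> GetAsync(int id) =>
        _cache.GetOrCreateAsync($"product:{id}", async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
            entry.SetSize(1); // needed when the cache is configured with a SizeLimit
            return await _store.FindAsync(id);
        });
    // Caveat: GetOrCreateAsync does not prevent cache stampedes; under heavy
    // concurrency several callers may run the factory for the same key.
}
```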

brnto

Is there a link for the code examples?

bikerd

There's no silver bullet. Testing from the edge ends up quite messy once you have complex business requirements, since you will need quite a big arrange phase. If you're writing a CRUD app, sure, go this way.

iorch

I do wish people started with this approach, but reasonably complex systems do suffer from combinatorial growth in test cases at a certain point (e.g., "given my system has 100 widgets, it cleans up the oldest 50 widgets" -- am I really going to call the API to set up the Arrange phase?). I've worked with a lot of people who start with the class-as-the-SUT approach, so I value this talk as an introduction to an extreme alternative, but I hope the speaker hits on the downsides and how to cope with them.
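
One common escape hatch for that combinatorial arrange problem keeps the assert at the boundary but seeds state behind the API, by swapping the store in the test host. Everything widget-related below is hypothetical.

```csharp
// Boundary assert, but arrange by seeding a fake store directly.
// IWidgetStore, InMemoryWidgetStore, and Widget are hypothetical.
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.AspNetCore.TestHost;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

public class WidgetCleanupTests
{
    [Fact]
    public async Task Cleans_up_the_oldest_fifty_widgets()
    {
        var store = new InMemoryWidgetStore();
        for (var i = 0; i < 100; i++)
            store.Add(new Widget(Age: i)); // 100 widgets without 100 API calls

        using var factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(b => b.ConfigureTestServices(s =>
                s.AddSingleton<IWidgetStore>(store)));

        var client = factory.CreateClient();
        var response = await client.PostAsync("/widgets/cleanup", null);

        response.EnsureSuccessStatusCode();
        Assert.Equal(50, store.Count); // only the 50 newest remain
    }
}
```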

KyleSmithNH

This talk is really let down by YouTube inserting ads for shower gel and laundry detergent every five minutes...

RoamingAdhocrat

24:52 that assert makes no sense and won't ever pass (or even compile).

It's comparing the whole object to the ID.
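
The shape of the mistake being pointed out, with invented types:

```csharp
var todo = await response.Content.ReadFromJsonAsync<TodoDto>();

// As on the slide: comparing the whole object to its id. Generic type
// inference can't unify Guid and TodoDto, so this doesn't even compile:
// Assert.Equal(expectedId, todo);

// Compare like with like instead:
Assert.Equal(expectedId, todo!.Id);
```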

NickMaovich

Some good stuff, but also a lot of needless grumpy rants and a bit too much stating of the obvious. "No one cares about your 4000 unit tests!!" Good one... 😴

interstellar

Observability code is not part of the application's behavior: when the code is touched by many developers, your tests will eventually break if you assert on spans, even when there are no behavior changes in your code. I think it follows the same path as comments: they are valid until they are not.
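
A counterpoint in code: span assertions don't have to pin the whole trace. Asserting only that a span exists and carries the attributes the team owns keeps the coupling low. A sketch using the runtime's ActivityListener, inside the same kind of test class as earlier; the source and tag names are invented.

```csharp
// Record only our own spans and assert loosely on them.
using System.Diagnostics;
using Xunit;

[Fact]
public async Task Emits_a_LoadOrder_span_with_the_order_id()
{
    var recorded = new List<Activity>();
    using var listener = new ActivityListener
    {
        ShouldListenTo = source => source.Name == "MyApp", // only our spans
        Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
            ActivitySamplingResult.AllDataAndRecorded,
        ActivityStopped = activity => recorded.Add(activity)
    };
    ActivitySource.AddActivityListener(listener);

    await _client.GetAsync("/orders/42"); // drive the boundary as usual

    // Loose assertions: the span exists and carries the attribute we own.
    var span = Assert.Single(recorded, a => a.OperationName == "LoadOrder");
    Assert.Equal("42", span.GetTagItem("order.id")?.ToString());
}
```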

vikas

This is so wrong. This guy needs to listen to James Coplien as soon as possible.

GoodTechConf

We don't want to test the cache. The cache is not the goal; it's just a means to an end. The goal is to satisfy NFRs, request latency for example. One possible way to do that is to define clear SLOs and check them in production.

AndreiMoiseev-gg