Day 12 - Investigate your slowest test(s)

Today’s task is to investigate your slowest tests.

If you happen to use Ruby and RSpec, finding your 10 slowest tests is simply a matter of adding --profile to your test command. If you’re using something else, a similar flag is hopefully only a quick Google search away.

Once you’ve identified your 10 or so slowest tests, give each of them a once-over.

Some things to consider:

  1. Are any of these tests duplicates? It’s easier than you might think to accidentally write the same test more than once. You can’t see that you’ve done this by reviewing a diff in a PR, so this type of duplication can often creep into your codebase over time.

  2. Can any of these tests be replaced with a faster variant? If you’re testing a conditional in the view, can you use a view spec instead of an integration spec? Can you stub expensive calls (there’s a stubbing sketch just after this list)? Can you perform less setup? Can you hit less of the stack and still feel confident things work? This video I recorded on integration vs. unit tests might be helpful.

  3. Are all these tests still pulling their weight? Remember: slow tests act as a constant drag against your forward progress. Make sure the cost is outweighed by the benefits of keeping them around. I encourage you to be pragmatic. If a very slow test is verifying something that’s not mission-critical, it’s worth considering deleting it.
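
To make the stubbing idea from point 2 concrete, here’s a minimal sketch in Python using unittest.mock. The functions are hypothetical stand-ins for real production code; the point is that the test never pays the cost of the slow dependency:

```python
import time
import unittest
from unittest import mock

# Hypothetical stand-ins for production code; the sleep simulates a slow
# network call that the test doesn't actually need to make.
def fetch_exchange_rate(currency):
    time.sleep(2)  # imagine a real HTTP request here
    return 1.0

def invoice_total(amount_usd, currency):
    return amount_usd * fetch_exchange_rate(currency)

class InvoiceTotalTest(unittest.TestCase):
    def test_total_converts_currency(self):
        # Stub the expensive dependency so the test runs in microseconds.
        with mock.patch(f"{__name__}.fetch_exchange_rate", return_value=1.25):
            self.assertEqual(invoice_total(100, "EUR"), 125.0)

if __name__ == "__main__":
    unittest.main()
```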

As always, please share how things go. Did you end up making any changes?

More often than not, I find the database layer to be the bottleneck in my tests.

Using transactions where supported, and maybe an in-memory “alternative” like SQLite, can help with this (see the sketch below).

Just keep in mind the differences between the two databases. :slight_smile:
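
To sketch the idea (the table and names are made up), a pytest fixture can hand each test an in-memory SQLite connection and roll the transaction back at teardown instead of cleaning up by hand. The dialect caveat above still applies if production runs on, say, Postgres:

```python
import sqlite3
import pytest

@pytest.fixture
def db():
    # In-memory SQLite: no disk I/O, and the database vanishes on close.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    yield conn
    # Roll back whatever the test wrote rather than deleting rows by hand.
    conn.rollback()
    conn.close()

def test_insert_user(db):
    db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    assert db.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1
```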

Tests are one of our fundamental problems. We have a relatively small test suite (7500 tests) that takes a ridiculously long time for its size. Our problem is that nobody really knows how to fix it, and when one of us tries, somebody shoots down the idea or doesn’t like the way it’s done…

That being said, for a period of time we had config.profile_examples = 5 in our spec helper, which reports the 5 slowest specs in each run. We did look at it for a while (and found it helpful) before starting to ignore it and eventually removing it.

I looked into the slowest tests on a project that’s pretty young for us. Luckily, finding the slowest tests is trivially easy when Jenkins is your CI server: you can click on “Test Result” and see a breakdown of execution times by test.

I was very surprised to see that the slowest test was one for code that formats log messages. Peculiarly, one test took about 0.9 seconds, while the other 3 in the same test class (for the same module) took a combined 26 milliseconds. Clearly that test was the candidate to look at.

Looking at it, however, it wasn’t obvious why it was slow. The only difference in this specific test was that it covered the case where the log record contained stacktrace info, verifying that the stacktrace was included in the emitted log message. Aside from that, it was virtually identical to the other 3 tests in the class.

I then spent some time trying to debug why it was slow. As it turns out, pytest (which we use for running the tests) has a --durations=N argument that gives you a breakdown of the N slowest parts of your test suite. This is even more finely grained than you might expect, as it breaks things down not only by the test time itself but also by setup and teardown times. What was surprising was what I saw when running just the 4 tests in that class: it revealed that the slow part was the setup() for that one test.
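
If you haven’t seen that flag, here’s a tiny illustration; the sleep stands in for whatever the real setup cost is:

```python
# test_slow_setup.py -- run with: pytest test_slow_setup.py --durations=5
import time
import pytest

@pytest.fixture
def handler():
    time.sleep(0.9)  # expensive fixture work is reported as "setup" time
    yield object()

def test_uses_handler(handler):
    assert handler is not None  # the test body (the "call" phase) stays fast
```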

I then instrumented the setup method (added timing statements) to try to figure out which part of the setup was slowest. As it turned out, it was the line that simply instantiated the class under test. The weird part: this line was identical for all the tests in this class, so why was it slow for this particular test?
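
The instrumentation was nothing fancy; something along these lines (names made up):

```python
import time
import unittest

def make_handler():
    # Stand-in for instantiating the real class under test.
    time.sleep(0.9)
    return object()

class LogFormatterTest(unittest.TestCase):
    def setUp(self):
        # Crude but effective: bracket the suspect line with timing statements.
        start = time.perf_counter()
        self.handler = make_handler()
        print(f"setUp: instantiation took {time.perf_counter() - start:.3f}s")

    def test_handler_exists(self):
        self.assertIsNotNone(self.handler)

if __name__ == "__main__":
    unittest.main()
```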

I then wondered if it was an ordering thing: pytest runs tests in a deterministic order, i.e. if you run them once and then run them again, they’re guaranteed to run in the same order. I found a plugin (https://pypi.org/project/pytest-random-order/) that randomizes the order of the tests, and this was very insightful: suddenly one of the other tests in the class was the slowest. As it turned out, it was instantiating the class under test for the first time that was slow, so whichever test runs first in the class will always pay this “tax”. I believe the reason is I/O related (the class under test inherits from Python’s logging.FileHandler class, so I’m guessing there’s some I/O operation that’s done lazily on demand).
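
One way to probe that guess: logging.FileHandler opens its file during __init__ unless you pass delay=True, so construction itself can do real I/O (and the first open in a process may additionally pay filesystem warm-up costs). A quick check, with timings that will obviously vary:

```python
import logging
import os
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "probe.log")

# FileHandler opens the file during __init__ unless delay=True.
start = time.perf_counter()
eager = logging.FileHandler(path)
print(f"eager construction: {time.perf_counter() - start:.6f}s")
eager.close()

# With delay=True, opening is deferred until the first record is emitted,
# which moves that cost out of construction (and out of test setup).
start = time.perf_counter()
lazy = logging.FileHandler(path, delay=True)
print(f"lazy construction:  {time.perf_counter() - start:.6f}s")
lazy.close()
```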

I’ve now committed the change to have the tests run in random order, which is good practice anyway: otherwise you can accidentally write tests that depend on execution order, and I’ve been bitten by that in the past. Aside from that, this was some good practice with debugging skills, and finding that --durations argument for pytest is almost certainly going to be useful in the future.


Database-related setup and teardown, as an app gets bigger, definitely has me preferring to build service objects for business operations and to inject instance spies instead of database-bound objects where I can (rough sketch below).
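
A rough Python analog of that pattern (all names hypothetical): the service object takes its repository as a constructor argument, so a test can inject an autospec’d double instead of anything database-bound:

```python
from unittest import mock

class OrderRepo:
    def save(self, order):  # the real implementation would hit the database
        raise NotImplementedError

# Hypothetical service object: it depends on whatever repo is passed in,
# not on the database itself.
class PlaceOrder:
    def __init__(self, repo):
        self.repo = repo

    def call(self, user_id, sku):
        order = {"user_id": user_id, "sku": sku}
        self.repo.save(order)
        return order

def test_place_order_saves():
    repo = mock.create_autospec(OrderRepo, instance=True)  # spy, no DB
    order = PlaceOrder(repo).call(user_id=1, sku="ABC")
    repo.save.assert_called_once_with(order)
```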

A certain group of tests was taking some 60 seconds just to start up. I’m new to this project and thought that surely this couldn’t be expected. After some extensive profiler debugging, which involved spelunking through the Django, Nose, and unittest2 libraries, I managed to trace the cause down to the __import__ function.
Looking through the tests, I noticed that the tests for one of our app’s modules had been tweaked to allow running them from the command line instead of through Django. This meant, though, that when those tests were run through Django, they created an import explosion: every test run imported the entire app all over again.

A well-placed if statement checking whether the tests were being run through Django (sketched below) brought those tests down to a much more reasonable sub-1-second run.
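
The shape of such a guard looks roughly like this (the settings module name and path setup are hypothetical, not the original code):

```python
import os
import sys

# Only do the standalone (command-line) bootstrap when the tests are NOT
# already running under Django; without a guard like this, the bootstrap
# re-imported the entire app on every test run.
RUNNING_UNDER_DJANGO = bool(os.environ.get("DJANGO_SETTINGS_MODULE"))

if not RUNNING_UNDER_DJANGO:
    # Standalone mode: set up the path and settings so tests can run directly.
    sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
    os.environ["DJANGO_SETTINGS_MODULE"] = "myapp.settings"  # hypothetical
```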

Ha!


Whoa! That is a ridiculously good result. Well done!