Today’s challenge is to investigate the files with the highest churn. Is there anything that stands out? Is it worth trying to make them more stable?
This is really interesting. I ran the script in my project but I couldn’t spot anything that stands out. I think, as it counts the commits made to a file, it will depend a lot on how you work with the SCM (whether you make several small commits or one big chunk of change).
This one was a bit funny: the project I’m currently working on is new, but started as a fork of an existing project. As a result, a git churn shows churn for files that are no longer present (they were in the original repo). As such, the --since argument proved quite useful.
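For anyone who wants to reproduce the date-limited count without the churn script, here is a minimal sketch in Python. It assumes churn simply means "number of commits that touched a path" and uses `git log` with `--since`, `--name-only` and an empty `--format`. The function names `count_churn` and `git_churn` are made up for this example and are not part of any tool mentioned in the thread.

```python
import subprocess
from collections import Counter


def count_churn(log_output: str) -> Counter:
    """Count how often each path appears in `git log --name-only` output."""
    # Blank lines separate commits when the format string is empty; skip them.
    paths = (line.strip() for line in log_output.splitlines())
    return Counter(path for path in paths if path)


def git_churn(since: str, repo_dir: str = ".") -> Counter:
    """Tally file changes for commits newer than `since` (hypothetical helper)."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--format=format:"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return count_churn(result.stdout)
```

Calling something like `git_churn("6 months ago").most_common(10)` inside a repository should list the ten most frequently changed paths since that cutoff. The cutoff is only an example; in a forked project you would probably pick the date of the fork.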
In the 20 minutes I wasn’t able to identify any refactoring opportunities past stuff we’re already actively working on, but I did notice something interesting. We tend to practice TDD, and I saw a pattern in the churn output: the churn on a test file (e.g. “test_foo.py”) was consistently lower than the churn on the file containing the code under test (“foo.py”), often by a ratio of about 2:1 (code to test). The TDD cycle is to write the test, get it green, then refactor to make the code better, and it was really cool to see that pattern manifest itself in the churn numbers (the test file changes less than the code under test, which suggests we’re probably doing well at writing good, non-brittle tests).
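If you want to check that ratio across a whole project rather than eyeballing it, a rough sketch like the one below could help. It assumes you already have a mapping of path to churn count, that test files are named with a `test_` prefix, and that a test file and its code file share the rest of the basename. The function name `code_to_test_ratios` and the sample numbers are invented for illustration.

```python
from pathlib import PurePosixPath


def code_to_test_ratios(churn, prefix="test_"):
    """Return {code_path: code_churn / test_churn} for files that have a matching test."""
    # Index non-test files by basename. If two code files share a basename,
    # the later one wins; that is a simplification of this sketch.
    code_by_name = {
        PurePosixPath(path).name: (path, count)
        for path, count in churn.items()
        if not PurePosixPath(path).name.startswith(prefix)
    }
    ratios = {}
    for path, test_count in churn.items():
        name = PurePosixPath(path).name
        if not name.startswith(prefix) or test_count == 0:
            continue
        match = code_by_name.get(name[len(prefix):])
        if match:
            code_path, code_count = match
            ratios[code_path] = code_count / test_count
    return ratios


# Made-up counts, only to show the shape of the result.
example = {"src/foo.py": 10, "tests/test_foo.py": 5, "src/bar.py": 3}
print(code_to_test_ratios(example))  # {'src/foo.py': 2.0}
```

A ratio well above 1 matches the pattern described above (code changing more often than its test). A ratio near or below 1 might be worth a look, though it is only a hint and depends heavily on commit habits.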
@vinicius that’s right. The assumption the approach makes is that there will be a number of small atomic commits. If the codebase has evolved on Git in big chunks it might not reveal as much information.
The fact that the script doesn’t give valuable insight is itself a valuable insight.
There’s a strong argument to be made for the benefit of having small atomic commits. For example, in this post the author writes:
Commit the bug fix as one change, and the layout changes as a separate one. That way you can easily roll back the bug fix without affecting the layout change. I would even say to commit each layout change separately as well, because it makes it easier to change the layout on the fly, or roll back a simple color change without affecting the other updates involved.
The extra time it will take to split the work in dedicated small commits will pay off when browsing through your Git history looking for why something was done in a certain way, or trying to fix a bug.
The ratio you found definitely points to your tests being an aid to refactoring, and focusing on the behaviour of the code rather than its implementation. That’s what good tests should look like.
I wasn’t able to identify any refactoring opportunities past stuff we’re already actively working on
This can be seen as a validation of the fact that the stuff you are actively working on is valuable for improving the codebase.
Totally agree! This is how I like to work and usually do. However, getting everyone on the team to do the same is not so simple. I feel like one needs to suffer the pain in order to learn it (like having to debug everything they have changed in order to find the problem, instead of reverting the last commit).