Today’s challenge is to investigate the files with the highest churn. Is there anything that stands out? Is it worth trying to make them more stable?
You can use it for plain JavaScript, too. If there are complexity analyzers for other languages like PHP, I’d be happy to provide that, too. It’s actually really easy to set up.
Cool command indeed. git-churn --since='1 year ago' app lib returned the Rails User class as our main offender. Like in many Rails apps, it is a large “god mode” class that is doing too much and could definitely use some separation of concerns here and there.
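For anyone curious what a churn count boils down to, here is a rough Python sketch. It is not the git-churn script itself; it assumes "churn" simply means the number of commits that touched a file, and the function names count_churn and git_file_log are made up for illustration.

```python
import subprocess
from collections import Counter


def count_churn(log_output: str) -> list[tuple[str, int]]:
    """Count how often each path appears in `git log --name-only` output.

    Blank lines (the separators between commits) are ignored.
    """
    paths = (line.strip() for line in log_output.splitlines())
    counts = Counter(path for path in paths if path)
    # most_common() sorts by count, highest first.
    return counts.most_common()


def git_file_log(since: str = "1 year ago", *paths: str) -> str:
    """Ask git for the files touched by each commit, one path per line."""
    cmd = [
        "git", "log", "--name-only", "--format=format:",
        f"--since={since}", "--", *paths,
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# Hypothetical usage inside a repository:
#   for path, count in count_churn(git_file_log("1 year ago", "app", "lib"))[:10]:
#       print(count, path)
```

Keeping the parsing separate from the git call makes it easy to try the counting on saved log output first.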
Didn’t really discover anything out of the ordinary: the features we’ve been working on in the past few months dominate, and the top file is changelog.md, as expected.
We are currently working on improving our codebase, refactoring here and there. And most of the unmaintained code has a lower churn rate.
Thanks for the tip. Will use the command to watch progress.
Will also check the attractor gem, seems pretty cool, thanks for mentioning @julianrubisch!
Didn’t discover anything really suspicious, but this confirmed my feeling that we have one particularly long CSS file worth modularizing.
The command came at a handy time as I’ve just started an assessment of an existing codebase this week. There are a couple of files that stand out from the list already. So thank you, I’ll definitely keep an alias of that in my dotfiles.
Yes, the book “Deep Work” is a really good one. Currently, being in a fasting period, I practice “boring” a bit, as I have decided to skip YouTube, Twitter and reading news.
While detecting high-churn files, one strange thing happened: the crud.py file again showed up as somehow off the rails, just as it did before when counting the number of function arguments.
If all goes well, I’ll refactor crud.py tomorrow. Looking forward to it.
Btw, files which depend somehow on configuration values (such as configuration files themselves, but also CI/CD automation stuff) are often high-churn ones. Sometimes that is a warning on its own, as application code and deployment configuration shall be kept separated (otherwise any further deployment distinct from the first one will teach us that we have some deployment-dependent stuff baked into our app).
How do you use this in your day-to-day job? I can almost imagine this being a consulting gig, helping teams learn to read the analysis and use it to address risk areas.
as application code and deployment configuration shall be kept separated (otherwise any further deployment distinct from the first one will teach us that we have some deployment-dependent stuff baked into our app).
Hey @vicinsky! Could you elaborate on this? I’m curious if you have any real-world example.
Are you referring to the fact that sometimes the deployment code is not atomic, so if you run the same deployment via CI twice, the second one will have unexpected consequences?
For context, I’m currently working on the team that manages all the CI pipelines for a bunch of mobile apps, so the topic of keeping deployment and configuration code separate, tidy, and robust is of great interest.
Hey folks! This is “guest contributor Giovanni Lodi”, or Gio for short. Glad you found this challenge useful!
I really recommend adding this script to your PATH. If you use git-churn as the file name, Git will even recognize it as a custom command and you’ll be able to call it like git churn.
Analyses of the Git history are fascinating and instructive. They clearly reveal that the codebase “is alive” and how each contributor affects it.
Yes, partly consulting, but also when I’m asked to join a team working on a legacy app (I’m a freelancer) and need to get an idea of the codebase quickly.
start your first (e.g. web) app, keeping all the code and configuration in one repo
deploy it (e.g. git push and then, from a server, git pull) and run it
all done, relax while awaiting a call from Silicon Valley
a request to deploy a second, independent instance comes in
you git push, then git pull to the second server, and learn that some configuration files (present in the repo) need to be modified.
This tricky situation often results in either a messy git repo (containing config files for multiple servers) or in not storing (thus ignoring) some configs for particular servers.
The proper solution is to strictly separate the application code (which provides clear configuration methods and also allows some sort of application packaging) from the actual deployment configurations. If the deployment configs live in a (deployment) git repo, that repo typically refers to a somehow packaged application (a Python package, gem, Docker image) and adds specific configuration values for particular deployments.
From then on, a change in deployment configuration does not have to touch the application code, which is the desired result.
Your “high churn” test revealed to me that configuration files are often high on the churn scale. There could be at least two types of changes:
deployment-node dependent: with each new node, a new config modification may come
data changing as users use the application: e.g. you have an enumeration of device types and, as the application is used over time, users ask for new device types to be added (and this may differ per deployment)
Neither should happen in the application code. If it does, it could be a sign that the code needs refactoring to get this configuration stuff out of the application code and into external configuration files, which should ultimately live (and change) in deployment repositories.
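To make the device-types example a bit more concrete, here is a minimal Python sketch of one way it could look. Everything named here is an assumption for illustration: the APP_CONFIG_FILE environment variable, the device_types key, the JSON format and the default values are hypothetical, not taken from any real app.

```python
import json
import os
from pathlib import Path

# Hypothetical defaults shipped with the application package.
DEFAULT_DEVICE_TYPES = ["sensor", "gateway"]


def load_device_types(env=os.environ) -> list[str]:
    """Return the device types for this deployment.

    The application only knows *how* to read configuration. The actual
    values live in a file owned by the deployment, whose path is passed
    in through the (assumed) APP_CONFIG_FILE environment variable.
    """
    config_path = env.get("APP_CONFIG_FILE")
    if not config_path:
        # No deployment config given: fall back to the packaged defaults.
        return list(DEFAULT_DEVICE_TYPES)
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return list(config.get("device_types", DEFAULT_DEVICE_TYPES))
```

With something like this, adding a device type for one customer is a change in that deployment's config file, and the application repo sees no churn from it.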
Regarding the “high churn” test, I propose the following three questions about files with a higher number of changes:
is this configuration-related churn?
should we refactor it out into external configuration files?
should we move the actual configuration files out of the application repo into a deployment one?
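The first of these questions can be roughly pre-sorted by a script. The sketch below is only a heuristic in Python: the pattern list is a guess at what "configuration-like" paths look like and would need adjusting per project, and the names CONFIG_PATTERNS, is_config_like and split_churn are hypothetical.

```python
from fnmatch import fnmatch

# Assumed patterns for configuration-like files; adjust for your project.
CONFIG_PATTERNS = [
    "*.yml", "*.yaml", "*.ini", "*.toml", "*.env",
    "Dockerfile", ".github/*", "config/*",
]


def is_config_like(path: str) -> bool:
    """Guess whether a path is configuration or CI/CD related."""
    name = path.rsplit("/", 1)[-1]
    return any(
        fnmatch(path, pattern) or fnmatch(name, pattern)
        for pattern in CONFIG_PATTERNS
    )


def split_churn(paths: list[str]) -> tuple[list[str], list[str]]:
    """Split high-churn paths into (config-like, everything else)."""
    config = [p for p in paths if is_config_like(p)]
    code = [p for p in paths if not is_config_like(p)]
    return config, code
```

The remaining two questions still need a human to answer them, but this narrows down where to look.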
Adam Tornhill’s books are already on my read-next list.
Michael Feathers also once gave a talk or wrote a blog post about that topic, combining churn on one axis and lines of code in a file on the other axis. Maybe the gem in the first comment does the same. I’ve also written a command line tool around that at some point. Fun exercise and good insights.
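A small Python sketch of that churn-versus-size idea, in case it helps. Note the simplification: instead of plotting two axes, it multiplies churn by line count to get a single score, which is my own shortcut and not necessarily what the talk or the gem does. The function name rank_hotspots and the sample numbers are made up.

```python
def rank_hotspots(churn: dict[str, int], lines: dict[str, int]) -> list[tuple[str, int]]:
    """Rank files by churn multiplied by lines of code, highest first.

    Files missing from `lines` (e.g. deleted since) are skipped.
    """
    scores = {
        path: count * lines[path]
        for path, count in churn.items()
        if path in lines
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers for illustration only.
churn = {"user.rb": 40, "changelog.md": 90, "util.rb": 3}
lines = {"user.rb": 800, "util.rb": 50}
print(rank_hotspots(churn, lines))
```

Files that are both large and frequently changed float to the top, which is usually where refactoring pays off first.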