Day 24 - Investigate high-churn files

(This challenge was created by guest contributor Giovanni Lodi. I recommend checking out his blog.)

Today, please spend ~20 minutes looking into the churn your project has experienced.

In this case, churn refers to the number of times a file has changed.

High-churn files are worth investigating. They’re not necessarily bad, but they might indicate a good refactoring opportunity.

Perhaps a file changes often because it is unclear and therefore buggy. Perhaps it’s doing too many things.

If you are using Git, you can measure how many commits there have been on each of your files using this command:

git log --all -M -C --name-only --format='format:' "$@" \
  | sort \
  | grep -v '^$' \
  | uniq -c \
  | sort -n \
  | awk 'BEGIN {print "count\tfile"} {n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0}'

Consider saving it in an executable script. Call it git-churn, perhaps. This will make it easier to reuse and to pass parameters.
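
One possible layout for such a script is sketched below. It assumes a POSIX-style shell such as bash and that you run it from inside a Git repository. The function names count_paths and git_churn are only illustrative, and the awk step is written so that paths containing spaces are printed in full.

```shell
#!/usr/bin/env bash
# Sketch of a git-churn script. The function names are illustrative only.

# Read one path per line on stdin and print "count<TAB>file", least changed first.
count_paths() {
  grep -v '^$' \
    | sort \
    | uniq -c \
    | sort -n \
    | awk 'BEGIN {print "count\tfile"} {n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0}'
}

# List the files touched by each commit, then count them.
# Extra arguments (for example --since or a path) go straight to git log.
git_churn() {
  git log --all -M -C --name-only --format='format:' "$@" | count_paths
}
```

To run it as a command, add a final line reading git_churn "$@", save the file as git-churn somewhere on your PATH, and make it executable with chmod +x git-churn.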

For example, what are the top 10 files in the core of your codebase that have changed the most in the past 3 months? Find out with:

git-churn --since='3 months ago' <core_of_the_app> | tail -10
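
Note that tail -10 keeps only the last ten lines, so the count/file header is usually cut off. If you would like to keep it, a small helper along these lines could replace tail. The name top_n is hypothetical, and it assumes the first line of its input is the header.

```shell
# Print the header line, then the last N data rows (default 10).
top_n() {
  n="${1:-10}"
  awk -v n="$n" '
    NR == 1 { print; next }          # always keep the header
    { rows[NR] = $0 }                # remember every data row
    END {
      start = NR - n + 1
      if (start < 2) start = 2       # fewer than N data rows: print them all
      for (i = start; i <= NR; i++) print rows[i]
    }'
}
```

With the function defined in your shell, the command above becomes git-churn --since='3 months ago' <core_of_the_app> | top_n 10.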

Today’s challenge is to investigate the files with the highest churn. Is there anything that stands out? Is it worth trying to make them more stable?