2 Jul 2009

Book Club: Logging - Release It (Michael Nygard)

Our latest technical book club session was a discussion of the logging section in Michael Nygard’s Release It.

I recently listened to an interview with Michael Nygard on Software Engineering Radio, so I was interested in reading more of his work. Cam suggested that the logging chapter would be an interesting one to look at, as logging is often something we don’t spend a lot of time thinking about on software development teams.

These are some of my thoughts and our discussion of the chapter:

An idea which Nick introduced on a project I worked on last year was to have a 'SupportTeam' class that does any logging of information useful to the operations/support team who look after our application once it is in production. This approach is also suggested by Steve Freeman and Nat Pryce in Growing Object-Oriented Software (in the 'Logging is a feature' section). The idea is that we then focus on logging the type of information that is actually useful to that team rather than just logging what we think is needed. One thing which Dave pointed out is that it’s often difficult to get access to the operations team to find out what logging and monitoring they need, so this work often ends up being done very late on. On projects I’ve worked on there has often been a story card for logging and I think this is a good way to go: the operations team is a stakeholder of the system, so logging shouldn’t be dealt with as a nice extra.
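
To make the 'SupportTeam' idea a bit more concrete, here is a minimal sketch in Python of what such a class might look like. The method names, the logger name and the message wording are all my own assumptions for illustration; they are not taken from the project or from either book.

```python
import logging


class SupportTeam:
    """Hypothetical facade: one method per event the operations team cares about."""

    def __init__(self, logger=None):
        # The "support" logger name is an assumption; any dedicated logger would do.
        self._logger = logger if logger is not None else logging.getLogger("support")

    def notify_service_unavailable(self, service_name, retry_in_seconds):
        # Hypothetical event: a downstream service cannot be reached.
        self._logger.error(
            "Service '%s' is unavailable; retrying in %d seconds",
            service_name,
            retry_in_seconds,
        )

    def notify_data_retrieval_failed(self, source, error):
        # Hypothetical event: a read from the service layer failed.
        self._logger.warning("Could not retrieve data from '%s': %s", source, error)
```

Application code then calls something like `support_team.notify_service_unavailable("billing", 30)` instead of talking to a logger directly, which keeps the messages aimed at operations in one place where they can be reviewed with that team.
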
Something which I hadn’t considered until reading this book is the idea of making logs both human readable and machine parseable. The default format of most logging tools is not that useful when you’re trying to scan through hundreds of lines of data. It was intriguing how dramatically a little indentation could improve this, with the added benefit of making it much easier to create a regular expression to find what you want.
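
As a rough illustration of the readable-and-parseable idea, the sketch below uses Python's standard `logging` module to pad the level and logger name into fixed-width columns and then reads a line back with a regular expression. The column widths, the field order and the helper names `format_record` and `parse_line` are assumptions of mine rather than the layout from the book.

```python
import logging
import re

# Assumed layout: timestamp, padded level, padded logger name, then the message.
LOG_FORMAT = "%(asctime)s  %(levelname)-8s  %(name)-20s  %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

# Because the columns are whitespace separated, one pattern covers every line.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+"
    r"(?P<message>.*)$"
)


def format_record(name, level, message):
    """Render a single log line using the column layout above."""
    formatter = logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT)
    record = logging.LogRecord(name, level, "", 0, message, (), None)
    return formatter.format(record)


def parse_line(line):
    """Return the fields of a log line as a dict, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None
```

With the levels lined up in their own column it is much easier to scan for the ERROR lines by eye, and `parse_line` gives a script the same fields without any guesswork.
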
One thing I’m interested in understanding is how we work out what’s too much logging and what’s too little, since it seems that the answer to this question is fairly context sensitive. For example, on a recent project we logged all unhandled exceptions that came from the system as well as any exceptions that happened when retrieving data from the service layer. In general the data we’ve had available has been enough to solve problems, but we could probably have done more; it’s just not obvious how to work out what would be useful.
I think it was Alex who pointed out that it’s often useful to have an explicit step in the build to remove any debug logging from the code so that it doesn’t end up in production by mistake. This seems like a pretty neat idea although I haven’t seen it done yet. It also leads towards the idea that logging is for the operations team, which I think is correct, although it is often suggested that logging is actually for developers since it is assumed that they would be the ones to eventually solve any problems that arise.
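
I haven't seen this build step in practice, so the following is only a guess at a simpler variation: rather than rewriting the code, the build could scan the source for leftover debug calls and fail if it finds any. The function name `find_debug_logging` and the assumption that debug calls look like `logger.debug(...)`, `log.debug(...)` or `logging.debug(...)` are mine.

```python
import re

# Assumed naming convention for debug calls; adjust to match the codebase.
DEBUG_CALL = re.compile(r"\b(?:logger|log|logging)\.debug\s*\(")


def find_debug_logging(source_text):
    """Return (line_number, stripped_line) pairs for lines containing a debug call."""
    return [
        (number, line.strip())
        for number, line in enumerate(source_text.splitlines(), start=1)
        if DEBUG_CALL.search(line)
    ]
```

A build script could run this over each source file and exit with a non-zero status when the result is non-empty. It is a plain text match, so it will also flag debug calls that appear inside comments or strings.
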
The idea of having message codes for specific error messages seems like a really cool way of allowing easy searching of log files. We’ve done this on some projects I’ve worked on and not on others. I guess the key here is to ensure we don’t end up with too many different error codes, otherwise it’s just as confusing as not having them at all.
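
Message codes could be sketched along these lines, keeping the full set of codes in one small enum so that it's obvious when the list is getting out of hand. The codes themselves, the `APP-` prefix and the helper names are made up for the example.

```python
import logging
from enum import Enum


class MessageCode(Enum):
    # Hypothetical codes; a real project would agree these with the operations team.
    SERVICE_TIMEOUT = "APP-1001"
    DATA_RETRIEVAL_FAILED = "APP-1002"


def log_with_code(logger, level, code, message):
    """Prefix the message with its code in square brackets so it is easy to search for."""
    logger.log(level, "[%s] %s", code.value, message)


def lines_with_code(lines, code):
    """Return only the log lines that carry the given message code."""
    marker = f"[{code.value}]"
    return [line for line in lines if marker in line]
```

Searching a log file for a single problem then comes down to looking for one short string such as `[APP-1001]`, rather than guessing at the wording of the message.
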

About the author

I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.