In an ideal world, developers would not need to waste precious time writing countless log lines along with every few lines of code they create. We would instead be focused on building the best features we possibly can.
In a perfect world, if a problem comes up and some data is missing, devs would be able to effortlessly extract that data. Of course, we don’t live in an ideal world. In fact, we live in a world where logging hasn’t changed in more than 30 years.
“Logging FOMO” is the Fear of Missing Out on log lines and the data they contain: the anxiety that something may go south in the future and you won’t have a log line to tell you exactly what happened. Trying to avoid that anxiety is the main motivation behind setting log lines before every line of code.
When a dev writes a feature, she may write anything between 100 and 1000 lines of code, and quite a few of these will be log lines. Why, some may wonder? Well, it’s all due to those pesky ‘What Ifs’. What if something goes wrong? What if I need more data for my APM? What if I won’t be able to understand what is happening? What if someone else needs this data point? What if in the future, this product will do something different? What if I forget how this code works? What if I’ll need to add another feature here?
Of course, things often go wrong, and when they do, the first step to solving the problem is understanding what happened. To get the data that will add observability to our application, we spend a lot of our time adding, tweaking and removing log lines to collect just the data points we require. These log lines often make up a large share of the pull requests we open each day.
Although it may seem like no big deal to an untrained eye, the process of getting logging “just right” for your application is no small task. You are going to spend a lot more time and effort here than you had initially expected; time and effort you would have otherwise used to build better software. You will read through the code and find the meaningful flows you want to record while avoiding potential hot spots that can fill up your logs with noise. You will identify the handful of useful variables to extract and will transform these into a log-friendly format while keeping the log volume reasonable without accidentally sending sensitive data out of the system.
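To make that last step concrete, here is a minimal sketch of turning a handful of variables into a log-friendly record. The field names in `SENSITIVE_KEYS`, the `MAX_VALUE_LENGTH` cap, and the `to_log_record` helper are all assumptions made up for illustration, not part of any particular logging library.

```python
import json

# Hypothetical list of field names we never want to leave the system.
SENSITIVE_KEYS = {"password", "token", "ssn"}
# Assumed cap on value length, to keep log volume reasonable.
MAX_VALUE_LENGTH = 200


def to_log_record(event, fields):
    """Return a compact JSON string that is safe to hand to a logger."""
    safe = {}
    for key, value in fields.items():
        if key.lower() in SENSITIVE_KEYS:
            # Redact instead of dropping, so readers can see the field existed.
            safe[key] = "[REDACTED]"
            continue
        if isinstance(value, (int, float, bool)) or value is None:
            safe[key] = value
            continue
        text = str(value)
        if len(text) > MAX_VALUE_LENGTH:
            text = text[:MAX_VALUE_LENGTH] + "...(truncated)"
        safe[key] = text
    return json.dumps({"event": event, **safe}, sort_keys=True)
```

Even a helper this small has to be written, reviewed, and kept in sync with every new field someone decides to log.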
Worse still, whenever you want to change your logs, either to get an extra piece of data to fix a bug, or to remove a piece of data you no longer want to extract, you’ll embark on your own little private Odyssey. You’ll start by re-running the tests, hoping they are stable and deterministic, praying that no one else wrote a test which depends on the log you’ve just changed. You’ll search for someone with the skill, the authority, and the time to approve your pull request. Then, if you’ve managed to successfully deploy that change to production, it had better not cause that awkward moment when your innocent log line throws an exception or otherwise impacts the flow in some unexpected edge case.
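That awkward moment is easy to reproduce. The sketch below uses a made-up `order` dictionary and hypothetical helper names; it only illustrates how a log message that dereferences optional data can take down the code path it was meant to observe.

```python
import logging

logger = logging.getLogger("checkout")


def describe_order_unsafe(order):
    # Raises TypeError when "customer" is None (for example, a guest checkout).
    return "order %s for %s" % (order["id"], order["customer"]["name"])


def describe_order_safe(order):
    # Falls back to placeholders instead of raising on missing data.
    customer = order.get("customer") or {}
    return "order %s for %s" % (order.get("id", "?"), customer.get("name", "unknown"))


def process(order, describe):
    # The message is built before any real work happens,
    # so a failure here aborts the whole request.
    logger.info(describe(order))
    return "processed"
```

With `{"id": 7, "customer": None}`, `process(order, describe_order_unsafe)` raises before doing any work, while the safe variant logs a placeholder and carries on.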
While you’re busy writing logs to your heart’s content, you may not realize it, but too many of these logs have issues of their own. You might be logging security-sensitive information such as passwords. It could be that the piece of information you are logging falls under GDPR or CCPA and must be removed. Worse yet, your manager might be paying too much for your log aggregation service and will ask you to remove big or repetitive logs to save on costs. In some companies we spoke to, entire dev teams spend days or weeks at a time just to reduce the verbosity of logs that generate too much volume. They stop building and developing features and spend significant time removing the scaffolding and the dust: the useless logs. These are the very logs they initially spent so much time setting up.
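One common way teams trim repetitive logs is a filter in front of the logger. Here is a minimal sketch using Python's standard `logging.Filter`; the `RepeatLimitFilter` class and its `max_repeats` parameter are our own illustrative names, and a production version would likely also reset its counters over time.

```python
import logging


class RepeatLimitFilter(logging.Filter):
    """Let each distinct message template through only a limited number of times."""

    def __init__(self, max_repeats=3):
        super().__init__()
        self.max_repeats = max_repeats
        self.counts = {}

    def filter(self, record):
        # Key on the unformatted template, so "cache miss for %s" counts as
        # one message no matter which argument it was called with.
        key = (record.name, record.levelno, record.msg)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.max_repeats
```

Attach it with `logger.addFilter(RepeatLimitFilter(max_repeats=3))`. It cuts the volume, but it is one more piece of logging code to build and maintain.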
To make matters worse, log lines may not accurately reflect the true state of the system. Your log may not mean what you think it means if, for instance, whoever wrote the line misinterpreted the code or didn’t write a clear enough message. If the application around it changes, your log might grow stale. Have you ever had your favorite log line disappear because the application now takes a different flow to handle that request? Fixing a faulty log requires more releases and hotfixes, and of course, more waiting. All of this just for the sake of logging.
Building, maintaining and paying for logging pipelines is an expensive and complex undertaking. To get it right, many companies try to learn from the people who literally wrote the book about DevOps and observability: companies like Uber, Netflix, and Google. A quick read through these famous publications, however, reveals an uneasy truth: even if you are as big and as smart as these “DevOps unicorns”, building your observability pipeline will still be a great effort.
In a way, it’s like developing a whole new product on top of your existing product, just for logging and the resulting observability. The more you rely on it, and the more data you send through this pipeline, the more costly it is to maintain. It entails setting up log shippers such as Fluentd and Logstash, scaling and load balancing them using queues, and configuring or purchasing that Elasticsearch cluster. Add a malformed or unexpected log to the mix and your ETL might drop it, or your Elasticsearch might refuse to index it.
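To see how a single unexpected log can fall out of the pipeline, consider this toy model of a validation step. It is a deliberately simplified assumption of ours, not how Fluentd, Logstash, or Elasticsearch actually work: the first type seen for a field becomes the expected type, and any later record that disagrees is rejected.

```python
import json


def partition_by_schema(lines):
    """Split raw log lines into accepted records and rejected lines."""
    expected = {}  # field name -> type first seen for that field
    accepted, rejected = [], []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            rejected.append(line)  # malformed: not valid JSON
            continue
        if not isinstance(record, dict):
            rejected.append(line)  # unexpected shape: not an object
            continue
        conflict = any(
            key in expected and type(value) is not expected[key]
            for key, value in record.items()
        )
        if conflict:
            rejected.append(line)  # field type disagrees with earlier records
            continue
        for key, value in record.items():
            expected.setdefault(key, type(value))
        accepted.append(record)
    return accepted, rejected
```

In this model, a service that logs `"status": 200` for months and then one day logs `"status": "timeout"` silently loses the very record someone will be looking for.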
Are you frustrated with logging? We’re right there with you. What developer doesn’t dream that a day will come when we can take observability for granted and focus more on the product we’re building? We all wish for a day when we don’t have to spend so much coding time and effort on adding, tweaking, and removing logs. The good news is, this is more than wishful thinking. There are new and effortless ways of getting the exact data points we need from our code without planning ahead, and they’re becoming the new industry standard.
With technologies like non-breaking breakpoints and virtual logging, data can be available to you without the constant worry about missing out on that crucial log line. These tools aren’t only more effective, they are also light on resources and don’t cost as much. One of our customers told us that adding a missing log line usually entails a 5-hour rebuild/test/redeploy process, significantly slowing down dev work. Another customer shared that adding log lines is a process that can take over an hour in staging, and sometimes a week in production. Today, they get the data points they require in mere minutes, using virtual logging to decouple data points from code and deliver data instantly without stopping the application.
We are now well underway towards a new world of logging, one that does not include logging FOMO. A world where you can bravely add new features or release a quick hotfix, without anxiously considering which log lines you may miss when things go south. You no longer have to worry as much or waste precious time, because, in most scenarios, you can just measure and get the data later. In this new world, you spend your mental resources on building the perfect product rather than managing your observability pipeline and trying to cut cloud costs.
Are you still waiting for some extra data that you need from your software? The future is here. Give non-breaking breakpoints a try and join us in our shared journey toward agile data and logging sustainability. :)