When Debugging Meets Performance
Our ongoing goal at Rookout, the Live Debugging company, is to turn the debugging of live, remote applications into something that every developer can easily do as part of their daily workflow. Recently, we have taken this challenge one step further. What if developers could also solve performance issues as part of that daily workflow? Some recent additions to the Rookout platform are the first step towards turning that vision into a reality.
Our investment in tracking server metrics followed a series of requests from our customers. When these requests started coming in, we were quite surprised. We used to think that debugging was first and foremost about seeing the code, while CPU and memory were what you looked into only when the APM tool woke you up. But as our customers taught us, live debugging and production debugging require a shift-left attitude. Seeing the code is no longer the sole responsibility of the developer, and watching for performance hits is no longer the sole responsibility of the IT / Ops / DevOps group.
As we launched some recent performance tracking features, we had a chance to think about where debugging and performance fit together. Here are some of the highlights.
Traditionally, the world of debugging applications has been clearly split into two practices: local debugging and live, remote debugging.
Local debugging happens in the developer’s own IDE, most likely by running a single instance of the application locally. The developer sets a breakpoint, runs the app in debug mode, reaches the line of code where the breakpoint is set, and steps through the code. With this method, the developer’s main interest is the behavior of each individual line of code and each individual variable.
This method of debugging is most commonly used to reproduce and troubleshoot issues with the behavior of the application, its business logic, and its user experience.
Live, remote debugging happens either by attaching to a remote server (or servers) or by relying on a logging pipeline. This method is less intuitive, and it demands a higher engineering skill set and a steeper learning curve. It also takes more effort and time every time it is used. This is often why only senior engineers use it, and when they do, well, let’s be honest: it’s only because they had no other option (that is, they just couldn’t debug locally).
Performance troubleshooting is debugging
Troubleshooting performance issues, such as a CPU spike or a memory leak, is one case where local debugging just won’t cut it.
There are a few reasons why performance debugging can only be done by way of live/remote debugging:
- The hardware on the developer’s desktop is different from the hardware used in the ‘live environment’, which means that local performance measurements will not represent real-world behavior.
- Running the application locally is often done by using a different app configuration than the one used in live environments. Running one pod or lambda function will not have the same footprint as running dozens or hundreds of pods or functions, all dynamically spinning up and down and all interacting with each other.
- Most desktop computers don’t have a built-in set of performance monitoring tools. For instance, you don’t install AppDynamics or Datadog on your desktop. You may use the “resource monitor” or whatever local app is provided by your favorite OS vendor, but in most cases the local monitoring tool will not give you the granularity you need to troubleshoot a performance issue.
- Most developers don’t know how to troubleshoot performance issues. It is often considered a high-expertise practice, which in many teams is handled by a so-called ‘performance expert’. Knowing what to measure, how to measure it, and how to fix an issue that causes a performance hit are all considered unique specializations in the world of software engineering.
For these reasons and others, troubleshooting performance issues usually happens later in the development cycle. It is often done on a dedicated environment (some organizations have a “performance” environment, which sits somewhere between staging and production), and often by a performance specialist.
These environments bring the very set of challenges that remote/live debugging sets out to solve: attaching to remote servers, attaching to multiple, dynamically deployed servers, handling changes in source code versions, and more.
Raise your debugging game by handling performance
However, as good as all this sounds, having a remote/live debugging solution doesn’t mean it can be used to troubleshoot performance issues. Some live debugging tools simply don’t have the capability yet. In many cases debugging is done using one tool and by the developer who wrote the app, while performance troubleshooting is done using another set of tools and by the SRE or developer on call.
Thanks to some recent additions to our live debugging platform, we have now extended our core debugger into a tool that can also be used for solving performance issues. And that turns performance troubleshooting into something that non-experts can do, just as Rookout turns remote/live debugging into something that non-experts can do.
It’s all about the timing
As we mentioned at the top of this post, the initial reasoning behind this move was a repeated request from our customers. When debugging live, remote environments, they realized that it would help to get a CPU or memory sample along with a debug snapshot, in the dev-friendly IDE experience, without struggling to install a server monitor and correlate its results with logs from another tool. That would make their debugging more efficient and enable more engineers to troubleshoot such issues. So faced with that much possibility, how could we not?
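To make the idea of a CPU or memory sample that travels with a debug snapshot a bit more concrete, here is a minimal sketch in plain Python. It is not Rookout’s API and not how Rookout is implemented. The names capture_snapshot and build_report are hypothetical and exist only for illustration, and the sketch uses nothing but the standard library (time.process_time for CPU time, tracemalloc for memory).

```python
import time
import tracemalloc


def capture_snapshot(local_vars):
    """Hypothetical helper: bundle local variables with CPU and memory samples."""
    current_bytes, peak_bytes = tracemalloc.get_traced_memory()
    return {
        # Truncate long values so the snapshot stays readable.
        "locals": {name: repr(value)[:80] for name, value in local_vars.items()},
        # CPU time consumed by this process so far, in seconds.
        "cpu_seconds": time.process_time(),
        # Memory traced by tracemalloc since tracemalloc.start().
        "memory_current_bytes": current_bytes,
        "memory_peak_bytes": peak_bytes,
    }


def build_report(item_count):
    """Hypothetical application function we want to inspect."""
    rows = [str(i) * 10 for i in range(item_count)]
    # Stand-in for a breakpoint that collects data without pausing the app.
    snapshot = capture_snapshot(locals())
    return len(rows), snapshot


tracemalloc.start()
row_count, snapshot = build_report(10_000)
tracemalloc.stop()

print("item_count:", snapshot["locals"]["item_count"])
print("memory in use (bytes):", snapshot["memory_current_bytes"])
print("cpu time (s):", snapshot["cpu_seconds"])
```

The point of the sketch is the shape of the result: the variable values and the resource numbers come from the same moment in the same line of code, so there is nothing to correlate afterwards. Unlike this sketch, a live debugger gathers that data without any code being added to the application.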
Secret sugar coating cherry sauce
Some traditional debuggers have the ability to track CPU and memory, with Visual Studio being the most notable among them. Some of the newer exception catching tools also provide similar capabilities, usually from a production debugging perspective.
From conversations with the same customers who requested these features in Rookout, our guesstimate is that even though these tools have these abilities, few users actually take advantage of them. This is either because they don’t know the features exist, because they don’t know how to use them, or because they simply don’t perceive these tools as tools for performance troubleshooting (which, if we’re not sugar-coating it, is probably a challenge that Rookout will face as well).
As with other problems that Rookout solves, the unique value in Rookout is giving developers something they are used to. A debugger’s look and feel is much more dev-friendly than setting up and learning to use a server monitor, which is also external to the developer environment and far away from the code.
And to put the cherry on top, Rookout lets you measure these server metrics without adding code, which is our thing.
So what are you waiting for? Go ahead. Try it out. See how easy performance debugging is. You’re welcome 😉