Rookout cuts Eagle.io’s debugging and troubleshooting time from hours to minutes

Pushing Forward With Complex Code

As Eagle began migrating to a modernized, containerized tech stack, they realized they had a problem. They struggled not only to see what was going on in their production environment, but also to troubleshoot errors when they occurred, which led to long resolution times.

Even worse, to understand and resolve a ticket, the team had to reproduce the issue in a dev environment. That left the customer support team with the near-impossible task of recreating each customer’s local conditions to simulate their experience. Because the team worked to keep as many customers as possible happy, they had to struggle through queues of tickets whose issues were especially difficult to reproduce, which put extreme stress and workload on the support team as they tried to meet target resolution times.

According to David Julia, Head of Engineering at Eagle, “Working on support resolution without a tool like Rookout was time consuming, difficult, and sometimes ineffective. We had to grab logs from individual EC2 instances and try to figure out the problem and recreate the situation locally. That would involve grabbing an export of data, an export of the charts, the workspaces, and the different configurations the user had made. To make it worse, none of this was guaranteed because sometimes in IoT you have these weird conditions – such as a device losing connectivity or something unusual happening – that you simply aren’t able to recreate in a local environment. We would try our best, but there was often no guarantee we would be able to recreate the issue.”

The team at Eagle understood that they needed to find a better solution.

The Live Debugging Experience

Using Rookout has been a game-changer for Eagle’s team.

Rookout has been especially beneficial when the developers at Eagle need to troubleshoot areas where they don’t have access to the relevant data. “When our developers find that they haven’t logged enough, there isn’t data jumping out at them from their APM tool, or they aren’t seeing anything in distributed traces – that’s when I’ve found that the team will jump to Rookout,” said David. “Rather than try to guess and recreate the situation locally, we are able to get real data in the real circumstance in which the bug is occurring in production, critically, without having to do another deployment.” This has enabled the team to find the data they need much more quickly, cutting debugging that used to take hours down to mere minutes and saving at least 83% of their troubleshooting time.

According to David, the best part of using Rookout is its ease of use, especially during installation.

“I initially installed Rookout before even reaching out to their sales team. It’s super easy. You simply install a JVM agent on your running application when you’re on Java, and it’s not much harder on any of the other applications. We also use it on Node. I tried it locally, it worked. That simple.”

Troubleshooting Customer Issues And Getting to Resolution Faster

The adoption of Rookout into Eagle’s developer toolbox has been beneficial in two critical ways:

  1. When troubleshooting customer support issues.
  2. When handling a production incident and needing to reach resolution faster.

Being able to get the data they need at the exact time they need it has been critical for David’s team when resolving customer issues.

“With Rookout, when the customer tells us there’s an issue, we can go straight to what we think the offending code path is, set a bunch of Non-Breaking Breakpoint snapshots, and ask them to check again or trigger the behavior again in our production environment. We then watch the data come in and are able to narrow it down to a line of code, a variable condition, or even a situation. This allows us to troubleshoot very quickly,” said David.

Now David’s team doesn’t have to suffer through their classic debugging frustrations of trying to recreate problems locally. They are instead able to nail down the exact line of code, fix exactly what they need within as little as 10 minutes, and get back to developing features and resolving other support tickets.
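
To make the idea of a snapshot that does not pause the application more concrete, here is a minimal conceptual sketch in Python. It is not Rookout's API or implementation. It only illustrates the general technique of capturing local variables at a chosen line while the code keeps running, using the standard library's `sys.settrace`. The names `SnapshotCollector`, `add_point`, and `convert_reading` are hypothetical and exist only for this illustration.

```python
import sys


class SnapshotCollector:
    """Records local variables at chosen lines without halting execution.

    Illustrative only: a toy stand-in for the non-breaking snapshot concept.
    """

    def __init__(self):
        self.points = set()    # (code object, line offset from the def line)
        self.snapshots = []    # captured data, in the order it was seen
        self._previous = None

    def add_point(self, func, line_offset):
        # Register a snapshot point at runtime; the function itself is unchanged.
        self.points.add((func.__code__, line_offset))

    def _global_trace(self, frame, event, arg):
        # Invoked on each function call; trace only functions that have points.
        if any(code is frame.f_code for code, _ in self.points):
            return self._local_trace
        return None

    def _local_trace(self, frame, event, arg):
        # A "line" event fires just before the line at f_lineno executes.
        if event == "line":
            offset = frame.f_lineno - frame.f_code.co_firstlineno
            if (frame.f_code, offset) in self.points:
                self.snapshots.append({
                    "function": frame.f_code.co_name,
                    "line_offset": offset,
                    "locals": dict(frame.f_locals),  # copy, so later changes do not leak in
                })
        return self._local_trace

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._global_trace)
        return self

    def __exit__(self, *exc_info):
        sys.settrace(self._previous)
        return False


def convert_reading(raw, scale):
    adjusted = raw * scale
    result = adjusted - 2.0
    return result


collector = SnapshotCollector()
# Offset 3 is the `return result` line, three lines below the `def` line.
collector.add_point(convert_reading, 3)

with collector:
    value = convert_reading(10, 0.5)  # runs to completion, never pauses

for snap in collector.snapshots:
    print(snap["function"], snap["locals"])
```

The function returns normally while the collector records the variable values seen at the chosen line, which is the property the quote above relies on: the data arrives without stopping the application or redeploying it. A production tool would also need to handle concerns this sketch ignores, such as overhead, multiple threads, and safe serialization of captured values.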

“In a production incident, for companies like us that are trying to maintain SLA with high uptime, the ability to speed up debugging and troubleshooting time from hours to minutes is a necessity,” said David.

A Few Words About Eagle

Eagle.io is an innovative software platform built with love from the ground up to meet the needs of demanding users. Their team has extensive experience designing, developing, and delivering data management and control systems around the world. Eagle.io is the culmination of this experience.

Eagle.io is an environmental IoT platform that turns time series data into actionable intelligence. Eagle enables real-time data acquisition, visualization, alerting, and analysis of data from fleets of IoT devices and arbitrary text sources.

Get real-time data

Reduce troubleshooting time by 83%

Maintain high SLA uptime
