XM Cyber's Tamar Stern - A Whole New World Of The SDLC
Tamar Stern: Hi, it’s really nice to be here. Well, I’m Tamar Stern, 38, married with three kids. Currently, I manage backend development at XM Cyber, and over the last few years, I’ve been diving deep into Node.js and its internals. I’m an expert Node.js developer, I love to code, and I’m also a public speaker.
Tamar Stern: I had a friend who was lecturing a lot in the C# area, and I saw that lecturing was helping her learn a lot of things. If you decide that you would like to lecture, that you would like to develop a career as a public speaker, then this is something you need to aim for. You need to study it, you need to prepare yourself, you need to look at conferences and understand how to get accepted: what topics are the conferences looking for? What were the main talks last year? But really, you have to decide that this interests you and that this is something you want to do, because you have to invest your time in it. You cannot do it without investing your time, and you cannot do it without preparing and studying the subject.
So after I got accepted to one conference, Geektime Code, I tried to get accepted to an international JS conference, and I was lucky and got accepted, with two talks. I worked on them a lot and rehearsed them for, I don’t know, two months in advance. One of them was about performance and one of them was about security, and I built a lot of cool security demos. I have one demo there where you can break into a server without knowing even one username and password, using a NoSQL injection.
And I was like, wow, the Node.js core team developers are coming to hear me, oh, my god, that’s amazing. I’ve done a lot of talks since then; some of them have a few thousand views on YouTube, which is a lot for technical talks. After all of that, I also started to moderate at international conferences. This is how everything developed. Preparing a talk really helps you when you structure something. First of all, your knowledge of the subject becomes very, very deep, then you organize everything, and you really master the knowledge in a new way. So I think it’s cool, it’s really helpful, and it opened a lot of doors for me.
Liran Haimovitch: What do you like speaking about?
Tamar Stern: In Node.js, I have several talks about performance, about the architecture of the engine, how the internals of the engine work, and how to perform operations. For example, Node.js is built for non-blocking operations, for a lot of I/O operations. But if you would like to run CPU-intensive algorithms in Node.js, let’s take machine learning algorithms as an example, then it’s not the ideal platform for that, mainly because CPU-intensive operations don’t really perform well in Node.js.
For example, I have a lecture about that, about how to handle this problem using an internal module of Node.js called worker_threads. I also show there how to improve the performance of a server that runs a lot of in-memory algorithms using that module. I have a lot of talks about serverless, about how to build a good microservices architecture in Node.js, and about the best ways for microservices to communicate with one another. Everything actually comes from the field, from my working experience, because I have had the chance to work on a lot of cloud systems with complex architectures, and I get a lot of inspiration from the technical problems I face at work to build my lectures.
Liran Haimovitch: Performance is very close to your heart, and you seem to be giving a lot of talks about it, why is that?
Tamar Stern: Yes, that’s a good question. Actually, performance has been accompanying me since before I became a Node.js developer. Before that, I worked for several years in C#; over there, I developed a lot of new features, but I also did a lot of performance optimizations. I find it interesting because of the challenge inside it: you’re looking at a problem and you have to understand how to solve it, you have to get really deep into the technology and understand what the correct solution would be, what the best solution would be. Also, you have to understand how many resources you have. Well, currently in cloud environments, the resource question is less relevant, because we can have a lot of resources.
So a lot of performance improvements right now come down to building a good microservice pipeline and understanding how to apply, let’s say, single responsibility: which activities should live where, and how to build services that you can replicate with no problem. But yes, I started mastering the subject before getting into Node.js.
Liran Haimovitch: Now, how is performance in Node.js different from other languages? How do you approach the unique nature of the runtime and the language differently?
Tamar Stern: If you take languages like Java, Python, C#, or Ruby, languages that are multi-threaded, those languages work in a pattern called blocking I/O. What does that mean? The notion of an I/O operation covers things like an HTTP request, a database query, reading from a file, et cetera. If you’re working in a blocking I/O approach, let’s say I’m working with Python and I would like to run a database query, I would open a new thread, run the database query on it, and my thread would wait until the database query comes back; during that time, it won’t do anything. Usually, what goes on in those languages is that for every request, there is a new thread handling it. Also, to optimize, for example the database queries I was talking about, I’ve seen mechanisms like having a thread pool: you send the queries to it, the thread pool processes them, and when they’re ready, the responses come back. But that adds up to a lot of threads being opened.
And in Node.js, it’s different, because the approach is non-blocking I/O. At a high level, the architecture of Node.js is an event loop. Once a request comes in to the server, a callback is registered to the event loop, and the event loop starts executing the flow of that request. Of course, the event loop has several phases; I mention them in some of my lectures and get really deep into them. But once an I/O operation happens, or to be more precise, an asynchronous operation, let’s assume it’s an I/O operation like an HTTP call or a DB query, it is offloaded to another component, the worker threads, and that component processes the operation. Meanwhile, the event loop can do other things, like execute flows for other requests that have arrived at the server and need to be served.
So you have to remember that you have a constant number of threads: the event loop runs in one thread, and the worker threads are a thread pool with a constant size, so if you block one of them, that’s very bad. You have to think about how to work asynchronously all the time. If you work with synchronous APIs, then you can block the event loop, which is, I think, the worst thing that can happen. You block the event loop, for example, if you run complex in-memory algorithms synchronously, without offloading them to the worker threads, or if you do I/O operations with synchronous APIs. In those cases, the event loop is executing the operation and cannot serve other requests. The power of the event loop comes from its ability to serve multiple things in parallel: it can offload the heavy activities to the worker threads and serve other things.
Liran Haimovitch: So how do you go about that? How do you offload computations or activities to the worker threads?
Tamar Stern: If you look at Node.js code, you have asynchronous APIs; in old Node.js code, they worked with callbacks, and in the newer versions, we work with async/await. Async/await is actually syntactic sugar for promises. So by convention in the language, you can assume that when you have an API which returns a promise, or is implemented with async/await, it has asynchronous operations inside it, and every asynchronous operation will be offloaded to the worker threads. But this is only a convention. Actually, in order to submit a task, to offload an operation to the worker threads, you have to write code for the library that actually does it, which is called libuv.
That code is written in C++. By convention in the language, all the libraries build on a set of basic modules, and in those modules there is code written in C++, at the libuv level, which offloads the operations to the worker threads. And by convention, every operation that is offloaded to the worker threads is wrapped with an asynchronous API. When we have an asynchronous API, we can assume that behind the scenes, somebody wrote C++ code that sends that operation to libuv. Of course, I could create a new promise, or a new asynchronous API, and write synchronous code inside it, for example, just adding numbers, but this is not something that you should do.
I mean, if you’re writing an asynchronous API, then you should work with asynchronous operations. And as I said, usually you’re sending your heavy operations to the worker threads, which are I/O operations, and also, in advanced libraries, CPU-intensive operations. For example, you can look at the crypto library: several APIs in the crypto library are asynchronous. You can look at TensorFlow, which is a library for machine learning algorithms; if I remember correctly, all of that library is asynchronous. Things are done efficiently over there and offloaded to the worker threads. So I/O operations and CPU-intensive operations that are wrapped with an asynchronous API are executed inside the worker threads.
Liran Haimovitch: So how do you go about monitoring that? How do you go about understanding performance throughout the software development lifecycle?
Tamar Stern: All right. So performance during the software development lifecycle, it’s a whole world; it doesn’t end with the abilities of the Node.js language. You have the database, you have the resources of the machine, it’s a whole ecosystem. Let’s take a problem that we’re facing now, for example. It’s a legacy API in our software, written a long time ago. I have a UI, and for the UI, I get a page of entities, let’s say a page size of 100 entities. The problem, and this is something I’ve been dealing with the last few days, is that every time you call the API, a very complex computation happens on every record. When you’re serving an HTTP request, what you want to do is run one query on your database, take all the data, do zero processing, and send everything to the front end, really zero processing. And here we have very complex processing on every record.
And even the DB queries go one by one, meaning for each record, we do one query, and then we do that processing on the database as well. What we could do instead, which would be much more efficient, is compute the complex computation for all the records in a different microservice and save it to the database. Let’s say it doesn’t have to be real time, so we can compute it every 10 or 15 minutes in that separate microservice, and then the HTTP request handler would just do one DB query, get the top 50 or top 100 with all the data, throw it to the front end, and that’s it. So how did we discover that? As I said, that problem combines the limitations of Node and the limitations of the database. Another aspect we need to look at with performance problems is the database itself. We have to see which queries run on the database. For that, you have to do database profiling: look at the database query log and see which queries are running.
In this case, we started to see that for every record, we had several queries fetching the data. So after examining the DB queries, we did profiling on the application. In Node.js, you can do that with tools like Node Inspector. There are other profilers, like Clinic.js, for example, but I’m a fan of Node Inspector; I really love it, it helps me find a lot of problems, and I’m very used to it. In the CPU profiling we did on the Node application, we saw that after fetching a lot of data from the DB, for every record we were doing some kind of in-memory computation, adding things to one another, computing stuff, and then returning it. So it’s really inefficient. The process started with doing, in parallel, CPU profiling for the application and DB profiling for the database; in the DB profiling, we saw those queries running, and in the application profiling, we saw all the in-memory computation.
And as I said, when we do in-memory computation in a synchronous way like that, we block the event loop; the power of Node.js is working asynchronously and offloading things to the worker threads, doing everything efficiently. So after we profiled both sides, we solved those problems. There are other things I can say about that. You also asked about monitoring, and I would like to say that I never recommend profiling your production environment, because profiling a process adds overhead to the process. One of the experts in my group debates me on this, saying that CPU profiling doesn’t put that much load on the process, but even he doesn’t do it for more than 15 or 20 minutes in production. What we do have in our systems is a duplicate of the production environment. It’s like a shadow environment.
The version is installed over there as well, and since it imitates the production environment exactly, we can do the profiling there and see the problems. Of course, we don’t do that on the production environment itself. But yes, this shadow environment we created is very useful: it’s a really good replica, and we can see all the real problems that happen in production over there. Regarding monitoring, as I said, profiling the production environment is not really good; we only profile the shadow environment. If you don’t have the ability to spin up a shadow environment, what you can do is use performance logs, measuring how much time every operation takes. And then there are really good tools like Prometheus and Grafana.
Well, Prometheus is a tool for collecting metrics from your server, important metrics that you choose, and Grafana is a tool that sits, let’s say, above Prometheus and helps you build a lot of dashboards. What goes on is that Prometheus feeds live data into Grafana, and the Grafana dashboards are updated. This is something we also use. You can define metrics for responses that come back very slowly. You can say: all right, if a response from my server is slower than three seconds, please report it to Prometheus, and then Grafana takes it from Prometheus. If you don’t have the possibility to spin up a shadow environment for your production environment, then performance logs and monitoring are the best option. Other things we do for monitoring include Elasticsearch and Kibana. We use them to monitor logs, but we also monitor other important metrics there. Elasticsearch and Kibana also help you when, for example, you have a fatal exception and something is collapsing: you can create an alert from there, which is also very helpful.
Liran Haimovitch: So, Tamar, thanks very much for joining us.
Tamar Stern: Thank you very much. It was very nice to be here.
Liran Haimovitch: So that’s a wrap on another episode of the Production-First Mindset. Please remember to like, subscribe, and share this podcast. Let us know what you think of the show, and reach out to me on LinkedIn or Twitter @productionfirst. Thanks again for joining us.