Rookout CTO Liran Haimovitch sits down with Victoriya Kalmanovich, Engineering Manager at Aspectiva. They discuss operating NLP at scale, her experience joining a company post-acquisition, how it is to collaborate with such a huge enterprise from so far away, and the key to good software architecture.
Victoriya Kalmanovich, Liran Haimovitch
Liran Haimovitch 00:02
Welcome to The Production-First Mindset. A podcast where we discuss the world of building code from the lab all the way to production. We explore the tactics, methodologies, and metrics used to drive real customer value by the engineering leaders actually doing it. I'm your host, Liran Haimovitch, CTO and co-founder of Rookout. Today's episode is all about software architecture and operating NLP at scale. Joining us is Viki Kalmanovich, software engineering manager at Walmart. Thank you for joining us and welcome to the show.
Victoriya Kalmanovich 00:43
Hi, thank you for having me.
Liran Haimovitch 00:45
So Viki, can you tell us a little bit about yourself?
Victoriya Kalmanovich 00:47
Sure. So my name is Viki Kalmanovich, as you said. Hi, nice to meet you. And actually, we've met before, but it's nice to meet everyone who's listening. I'm an engineering manager here at Aspectiva. We are a startup that was acquired by Walmart about three years ago. And in previous roles, I was a software engineer and group lead at the Navy's technological unit. And basically, I lead here, the software team.
Liran Haimovitch 01:14
So what is Aspectiva?
Victoriya Kalmanovich 01:17
So Aspectiva is an NLP expert company. And what we actually do is we analyze reviews. So, we have a lot of data that we can utilize: the reviews that people write about a lot of products. And what we do is we take all this data and we gather insights from the data. We can gather insights using many NLP methods, but what we-- basically the output of this analysis is something called an aspect, which means if you write a review about your phone, for example, you can write, "This phone has a great battery life, but poor screen resolution." What we know how to do is to take this sentence and to separate it into all its textual parts, and then find the important words: the battery life, the screen resolution. And we can also tell you the sentiment that you felt about each and every one of those words, those aspects. That's why we're called Aspectiva. Based on those aspects and sentiments, we are able to build a lot of experiences within Walmart, whether it's search systems based on aspects, or recommendation systems based on aspects, or a lot of other experiences within Walmart.
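To make the idea of an aspect concrete, here is a minimal Python sketch of aspect-level sentiment on a single review sentence. It is a toy, not Aspectiva's pipeline: the `ASPECTS` list, the `SENTIMENT_WORDS` lexicon, and the `extract_aspects` function are all hypothetical names invented for this illustration, and a production system would presumably rely on trained NLP models rather than fixed word lists.

```python
import re
from typing import Dict

# Hypothetical toy lexicons (assumptions for illustration only).
ASPECTS = ["battery life", "screen resolution"]
SENTIMENT_WORDS = {
    "great": "positive",
    "good": "positive",
    "poor": "negative",
    "bad": "negative",
}


def extract_aspects(review: str) -> Dict[str, str]:
    """Map each known aspect found in the review to a sentiment label."""
    results: Dict[str, str] = {}
    # Split the review into clauses on "but" and on punctuation, so that
    # each aspect is paired with the sentiment word from its own clause.
    clauses = re.split(r"\bbut\b|[,.;!?]", review.lower())
    for clause in clauses:
        words = clause.split()
        # Use the first sentiment word in the clause, or "neutral" if none.
        sentiment = next(
            (SENTIMENT_WORDS[w] for w in words if w in SENTIMENT_WORDS),
            "neutral",
        )
        for aspect in ASPECTS:
            if aspect in clause:
                results[aspect] = sentiment
    return results


print(extract_aspects("This phone has a great battery life, but poor screen resolution."))
```

The clause split is what lets one sentence carry a different sentiment for each aspect, which is the property described in the answer above.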
Liran Haimovitch 02:31
So essentially, you're taking the raw data available in reviews all over Walmart, and then redesigning it and making it available as actionable insights for every aspect of the digital experience.
Victoriya Kalmanovich 02:43
Essentially, yes, but there's another product that we're working on, which is bringing in reviews from outside of the Walmart ecosystem. So for example, people write reviews within the Walmart app and website, all over the world, and those are called organic reviews. And we also have a system that's syndicating reviews from partners of Walmart. So Walmart works with a lot of brands and with a lot of companies. And then when new brands and new companies onboard onto Walmart, we also have a platform to get all their data with them, which is all the reviews on their external websites.
Liran Haimovitch 03:18
So now today's Aspectiva is part of Walmart, and you've joined the company post-acquisition, right?
Victoriya Kalmanovich 03:23
Liran Haimovitch 03:24
What was it like?
Victoriya Kalmanovich 03:26
Joining a company post-acquisition has a lot of challenges. The main challenge that I faced was that I joined Aspectiva right when COVID started, which was April 2020. So that was the biggest challenge, which was to get into a new company when I'm working from home. And I don't know anyone and I have only met my superior when he interviewed me. So that was the biggest challenge. The main challenges I've faced from working in Aspectiva post-acquisition-- I think I can separate them into two worlds. The first one was the personnel challenge, which is, before the acquisition, they were six people. And they ran with the company for I think about eight years. And after the acquisition, suddenly we were growing, suddenly there were new people joining. And then you needed to tread lightly between the pre-acquisition people and the post-acquisition people. Because people would work in different-- They would know different best practices, they would work under different prioritization systems. For example, when you're in startup mode, then you always have to run quick and dirty, and when you're growing and maturing a product, then you have to start thinking about processes. So there were a lot of discussions on the personnel level and on the processes level. And I think those are the two main challenges that I had to face as I was leading the software team.
Liran Haimovitch 04:46
How many people are on the software team today? How many people are in Aspectiva in general?
Victoriya Kalmanovich 04:51
So Aspectiva is around 20 people, I think maybe 22 is the correct number, but I'm not sure. My team goes between three and four. It went between three to four people in the past couple of years, because you know, a lot of people changed and a lot of people changed positions within the company and outside of the company. So between three and four.
Liran Haimovitch 05:08
So now you're 20 people, working out of Israel, and you're working with the global Walmart-- I think it's the biggest employer in the world.
Victoriya Kalmanovich 05:17
Liran Haimovitch 05:18
How is it collaborating with such a huge enterprise from so far away?
Victoriya Kalmanovich 05:23
So the thing that really mattered to us, we thought about this number: as of the moment of the purchase, Walmart had 2.3 million workers all around the world, or associates, as they like to call them. That's a lot of people. How do you make an impact when you're 20 people sitting in a remote site in Israel? Like, they hadn't worked with Israel before our purchase, before the acquisition of Aspectiva. So, there are a lot of challenges in that realm. And the thing that really mattered to us was to try to understand how we can make maximum impact, like trying to understand exactly what Walmart's goals are, and the goals of the group that we were a part of. And I can share a little bit about the group that we were a part of at the beginning, and where we are now, if you want. So we're trying to understand the goals at a Walmart level and try to understand which goals we need to set in order to make maximum impact on the Walmart goals.
Liran Haimovitch 06:19
I'm assuming today you own your systems, you own the code, you're running the code you're writing in production.
Victoriya Kalmanovich 06:25
Liran Haimovitch 06:26
And this code has to integrate with services all across Walmart. So how do you go about managing those integrations?
Victoriya Kalmanovich 06:34
Right. So when we were acquired, the first thing that we needed to do was actually onboard onto the Walmart ecosystem. And we were supposed to be a part of a group that handled all reviews in Walmart. And basically, they helped us during the integration, they helped us onboard into all the metrics systems and the deployment and everything that was related to our system within the Walmart ecosystem, all the environments, everything, everything was Walmart handled. So they really helped us onboard there. And as time went by, we first of all made a-- COVID started, right? And then there was a bigger change within Walmart. And then we were moved to work with a different group. And moving to work with a different group-- Currently, we're a part of the search and personalization group within Walmart, which is a core team. So when we moved to work there, we understood more and more the need for us to work autonomously, and to have our own environments within the Walmart ecosystem, and be aware of how to run them in the way that fits our needs. So this is something that we've been working on, to gain this autonomy over our system and over our processes and over visibility, whatever, production autonomy.
Liran Haimovitch 07:52
So what does it look like today? I mean, you're in the business of AI or in the business of machine learning, you're doing NLP, and you have so much data lying around, from Walmart itself, from partners. How do you write code that works with so much data?
Victoriya Kalmanovich 08:08
So that's a really good question. Because we also saw that we have a lot of data just lying around. And besides the main purpose of using the data, which is to gain insights and opinions of people using our aspects and sentiments, we saw that we can also utilize the huge amount of data that we have in order to get insights on things we didn't think we could gain insights on. So we were in discovery mode to see what exactly our data can give insights on. And so we have a lot of discoveries at the moment that we're working on, to understand exactly what insights we can gain on things we never thought of, like outside of our group, right, outside of the search and personalization team.
Liran Haimovitch 08:50
And so today, when you're building a new NLP algorithm, how do you know it's going to work well with all that amount of information? How do you know it's going to be fast enough? How do you know it's going to get decent results? I mean, you can only test so much in the lab, right?
Victoriya Kalmanovich 09:07
Yes. So basically, I'm going to separate the answer into two. So first of all, how do we know the NLP models are good? Because our NLP team rocks, that's how we know. And they test their models all the time. But what you're saying is right, we don't know how a model behaves in real life until we try it on production, right? So a lot of our efforts are to get the data and to get our models as quickly as possible to production in order to understand that the insights-- That the model is working correctly on real production data. Although we do have a lot of testing on staging, and we do have a lot of testing right before we get to production, so we have to make it work on both environments. But the model itself, I guess, when it comes to production, it's already very well tested on staging before it gets there.
Liran Haimovitch 09:54
Yeah. So what challenges lie ahead for Aspectiva?
Victoriya Kalmanovich 10:00
So one of the main challenges that we've been working on ever since we onboarded onto Walmart was to mature our systems. Because we basically, and you probably know how it is when you're running a startup, and everything is in startup mode and the systems that you're working on are, you know, you have to do things quick and dirty, and you have to do things just in order to make it work and to reach production as fast as possible. So we've been maturing our systems in the past year or two, and the plan is to continue to grow. Because the scale that we've seen before the acquisition is very different from this huge scale Walmart is presenting to us. And so we want to mature our systems to handle this amount of scale. And that's basically the biggest challenge we face.
Liran Haimovitch 10:44
So how do you know a system is becoming more mature? How do you go about-- how do you know the system today is more mature than it was a year ago?
Victoriya Kalmanovich 10:50
So for example, when I joined Aspectiva, one of the systems that we've been working on was a monolith, which handled a lot of things all at once. And it was really hard to understand which part of the system caused which problems, and it was hard to understand which part was the bottleneck. And so, the first thing that we did was to try to sort of do a separation of concerns and to make the system work in different services. And currently, we're in the part where we're already extracting it to microservices. So we definitely know where the bottlenecks are and we definitely know the issues that we don't have anymore that we used to have. Like, historically speaking, we keep remembering all these annoying things that used to happen to us, that don't happen anymore, because the system is so much more mature and can handle the larger scale that Walmart has brought with it.
Liran Haimovitch 10:52
I think this brings us to one of your favorite topics: software architecture.
Victoriya Kalmanovich 11:49
Yes, it is one of my favorite topics.
Liran Haimovitch 11:52
So in your mind, what's good software architecture and how do you know it's good?
Victoriya Kalmanovich 11:57
So the key to good software architecture is to really understand the needs, because the fact that we took the monolith and broke it into microservices doesn't mean necessarily that this is like-- that we had to do it because monoliths suck and microservices are amazing. We did it because we saw that the needs of our system required making exactly that change. So I guess the key to good architecture is really understanding the needs, and really understanding the current pains and problems, and not solving a problem based on a pattern that we think is the hype.
Liran Haimovitch 12:29
So what kind of pains brought you to break down the monolith?
Victoriya Kalmanovich 12:33
So basically, it was really the bottlenecks. Because there was like one service that took a lot, a big amount of time to go through. And there was another service that kept failing. So before we separated into different services, if one service would fail, one function would fail, for example, then the others couldn't work at all, right? So that was one of the large pains, and then the functionality couldn't work if one component kept failing over and over. And so right now, we have separated everything and we distributed the system almost completely, so that if one component fails, the others don't. And we could also work better on the performance of each component. For example, if you're deploying it on the cloud, then you can make sure that this component, which requires a lot more CPU, can get it when others don't. So the thing that we really wanted was to look at each and every one of those components separately, and understand the technical needs of each and every one of those components, and not of the system as a whole.
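The failure-isolation point can be illustrated with a small in-process Python sketch. It is only an analogy under stated assumptions: real microservices are isolated by running as separate deployments, and the component names here (`normalize`, `flaky_enrichment`, `count_words`) and the `run_isolated` helper are hypothetical, not taken from Aspectiva's system.

```python
from typing import Callable, Dict, Tuple


def run_isolated(
    components: Dict[str, Callable[[str], str]], review: str
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run every component independently; return (results, errors)."""
    results: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for name, component in components.items():
        try:
            results[name] = component(review)
        except Exception as exc:
            # Contain the failure: record it and keep running the others.
            errors[name] = str(exc)
    return results, errors


# Hypothetical components standing in for separate services.
def normalize(review: str) -> str:
    return " ".join(review.lower().split())


def flaky_enrichment(review: str) -> str:
    raise RuntimeError("enrichment unavailable")


def count_words(review: str) -> str:
    return str(len(review.split()))


results, errors = run_isolated(
    {"normalize": normalize, "enrich": flaky_enrichment, "word_count": count_words},
    "Great  battery life",
)
print(results)
print(errors)
```

In the monolith described above, the equivalent of `flaky_enrichment` raising would have stopped the whole run; here the other two components still produce results and the failure is reported separately.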
Liran Haimovitch 13:35
Yeah, I think resource allocation and separation of error handling are classic cases for moving to microservices.
Victoriya Kalmanovich 13:42
Liran Haimovitch 13:43
I know you're also giving workshops about software architecture, kind of what are those?
Victoriya Kalmanovich 13:48
So the reason I created the software architecture workshop was because I saw that a lot of people from a certain seniority level of software engineering, they assume they have to go straight into leadership roles, right? To be a software manager. And not everyone likes to be a manager and not everyone is good at being a manager, which is really good because we're in an industry which allows not everyone to be managers, and it's okay. So the reason I created the workshops was to introduce the software architect role, and to allow people to actually have hands-on experience of what it means to be a software architect, from the moment of talking about needs, through actually doing a lot of software architecture exercises, and up to the point where we can actually test our software architecture in production and understand whether we did a good or bad job.
Liran Haimovitch 14:42
So how do you identify weaknesses in a software architecture?
Victoriya Kalmanovich 14:45
So the answer to that I can also separate into two answers. Weaknesses before I create the architecture, and weaknesses after I create the architecture. When?
Liran Haimovitch 14:57
Victoriya Kalmanovich 14:58
Both. So I guess I don't want to assume weaknesses of an architecture beforehand. What I want is to understand all the problems and understand which of these problems my architecture is supposed to fix. And then if I can create like an architecture design, and I can go over all the problems, all the issues that I said that I have, and I'm trying to understand if my design can really fix those issues, then I'm assuming it's a good design. I guess one of the most important things in creating an architecture is to really have your-- Always make sure that you're working on the problems that you're trying to solve and to understand exactly the picture that you're trying to paint in the future.
Liran Haimovitch 15:43
You know, I've seen an amazing talk by Martin Fowler about software architecture, and what he's saying is that architecture comes from the Greek word for, literally, 'hard to change'.
Victoriya Kalmanovich 15:54
Liran Haimovitch 15:55
So essentially, when you make choices about software architecture, you're kind of choosing what you're going to have to stick with; whether it's good or whether it's bad, it's going to be very hard to change. And in many ways, in Agile software architecture, or modern software architecture, we're trying to minimize those 'hard to change' choices, because we're often wrong. Regardless of the choices we make, we can be wrong. So the more we can delay those choices, the more we can minimize them, the easier it's going to be down the road when we figure out that we were wrong, that there are weaknesses in our architecture, and the less we're going to have to worry about changing them.
Victoriya Kalmanovich 16:32
Definitely, I think one of the most important things we need to do when we create a design like that is to understand the price of change. If we have certain components that are risky, or certain components that are really difficult to create, but we think they should solve some of our problems at the moment, we need to be aware of the fact that maybe even in six months, this architecture will be outdated, because reality changes all the time. And we always need to consider the price of change, exactly like you said.
Liran Haimovitch 17:01
There's one question I love asking all of my guests, and it's about bugs. We're a debugging company, so I deal with bugs all day long. And I'm kind of wondering - for you, what's the one bug you remember the most?
Victoriya Kalmanovich 17:14
So there's one really annoying bug that I really can't talk about because it was in the Navy.
Victoriya Kalmanovich 17:22
I'll share something that I've encountered here. We were working on something, I don't remember what, and then we went to take a look at the database. There was some data that we saw that was a little bit weird. And we saw that one of the timestamps was negative.
Liran Haimovitch 17:37
That's fun. That's always fun.
Victoriya Kalmanovich 17:38
That was really-- it was so annoying. That was one of the weird bugs that we worked on. It's kind of an easy one, right? Because of the negative timestamp, you know exactly where in the code, you need to take a look and which component had the problem. But that was something that we discovered a little bit too late and we had a lot of negative timestamps at that point. So we needed to both handle the data and handle the code changes. So that was annoying, but yes.
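A bug like this typically calls for two fixes, one for the code path and one for the rows already stored. The Python sketch below shows one hedged way to approach both. It assumes the timestamps are Unix epoch seconds and that rows carry a `created_at` field; neither detail is stated in the conversation, and `validate_epoch_seconds` and `split_rows` are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def validate_epoch_seconds(ts: int) -> datetime:
    """Reject negative epoch timestamps before they reach the database."""
    if ts < 0:
        raise ValueError(f"negative timestamp: {ts}")
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def split_rows(rows: List[Dict[str, int]]) -> Tuple[List[Dict[str, int]], List[Dict[str, int]]]:
    """Separate already-stored rows into valid ones and ones needing repair."""
    valid, invalid = [], []
    for row in rows:
        try:
            validate_epoch_seconds(row["created_at"])
            valid.append(row)
        except ValueError:
            invalid.append(row)
    return valid, invalid


# Assumed sample data: "created_at" is a hypothetical column name.
rows = [{"id": 1, "created_at": 1600000000}, {"id": 2, "created_at": -5}]
valid, invalid = split_rows(rows)
print([r["id"] for r in valid], [r["id"] for r in invalid])
```

Validating at write time addresses the code side, while `split_rows` identifies the stored records that need cleanup, matching the two kinds of work described above.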
Liran Haimovitch 18:08
Anything else you want to share with us about Aspectiva, about Walmart, about Amazon?
Victoriya Kalmanovich 18:15
I guess the most important thing that I can share about being a Walmart company is that it has a lot of challenges, to be acquired by a company as big as Walmart. But Walmart is such a people-oriented company that, first of all, it gives us a lot of space and a lot of autonomy for our ideas, and we get to share a lot of our initiatives. And we get full backup from executives within Walmart, which is really cool. And so because it's so people-oriented, we can really feel like we can share ideas and push forward our initiatives, and this is something that I really appreciate in Walmart, and it makes my job really interesting. It makes our day-to-day super interesting.
Liran Haimovitch 19:01
You always live in an interesting time. Sounds super cool, Viki. So if people want to learn more about Aspectiva, or Walmart, what should they do?
Victoriya Kalmanovich 19:09
They can totally reach out to me on LinkedIn or on Twitter. I'm famous on Twitter now.
Liran Haimovitch 19:13
Yeah, you have a podcast.
Victoriya Kalmanovich 19:14
As of last week. And also, I have a podcast. So feel free to reach out to me or to anyone at Aspectiva. We're not that many people here and we're all very friendly and nice.
Liran Haimovitch 19:33
So that's a wrap on another episode of the Production-First Mindset. Please remember to like, subscribe, and share this podcast. Let us know what you think of the show and reach out to me on LinkedIn or Twitter at @productionfirst. Thanks again for joining us.