The Data Canteen: Episode 16

Lak Lakshmanan: Data Science...Broader Than Ever!

 
 
 

Lak Lakshmanan is an operating executive at Silver Lake focused on improving the value of portfolio companies through data and AI-driven innovation. Prior to Silver Lake, Lak was the Director for Data Analytics and AI Solutions on Google Cloud and a Research Scientist at NOAA. He co-founded Google's Advanced Solutions Lab and is the author of several O'Reilly books and Coursera courses. He was elected a Fellow of the American Meteorological Society (the highest honor offered by the AMS) for his data science work.

In this episode, Host Ted Hallum and Lak dive into Google Cloud's evolved view of what data science encompasses (IT'S BROADER THAN EVER!), the biggest developments that Lak sees on the horizon for our field, how to deal with the blistering pace of change in the datasphere, and the challenges faced by all non-tech organizations that are striving for an edge with AI.

FEATURED GUEST:

Name: Lak Lakshmanan

LinkedIn: https://www.linkedin.com/in/valliappalakshmanan/

Twitter: https://twitter.com/lak_luster

Medium: https://lakshmanok.medium.com/

SUPPORT THE DATA CANTEEN (LIKE PBS, WE'RE LISTENER SUPPORTED!):

Donate: https://vetsindatascience.com/support-join

EPISODE LINKS:

Lak on Medium: https://lakshmanok.medium.com/

Lak on Google Scholar: https://scholar.google.com/citations?user=qphajtkAAAAJ&hl=en

Lak's Website: https://www.vlakshman.com

Lak's Books: https://aisoftwarellc.weebly.com/books.html

Lak's Technical Articles: https://aisoftwarellc.weebly.com/articles.html

Lak's Recorded Talks: https://aisoftwarellc.weebly.com/talks.html

Lak's Courses: https://aisoftwarellc.weebly.com/courses.html

Lak's Journal Articles: https://aisoftwarellc.weebly.com/research.html

Lak's Resume/Vitae: https://aisoftwarellc.weebly.com/resumevitae.html

PODCAST INFO:

Host: Ted Hallum

Website: https://vetsindatascience.com/thedatacanteen

Apple Podcasts: https://podcasts.apple.com/us/podcast/the-data-canteen/id1551751086

YouTube: https://www.youtube.com/channel/UCaNx9aLFRy1h9P22hd8ZPyw

Stitcher: https://www.stitcher.com/show/the-data-canteen

CONTACT THE DATA CANTEEN:

Voicemail: https://www.speakpipe.com/datacanteen

VETERANS IN DATA SCIENCE AND MACHINE LEARNING:

Website: https://vetsindatascience.com/

Join the Community: https://vetsindatascience.com/support-join

Mentorship Program: https://vetsindatascience.com/mentorship

OUTLINE:

00:00:00 - Introduction

00:02:15 - How Lak got into data science

00:10:06 - How to deal with the blistering pace of change in the datasphere

00:16:45 - Google Cloud's evolved view of what data science encompasses

00:25:48 - Is Google Cloud's new definition of data science inspired by MLOps?

00:28:05 - Biggest developments that Lak sees on the horizon for our field

00:33:59 - How capable do you think AutoML is? What role will it play in the future?

00:39:59 - Where Lak suggests veterans should focus when transitioning into data 

00:47:22 - Challenges faced by all non-tech organizations that are striving for an edge with AI

00:51:25 - Lak's secret to keeping his learning and growth focus on track

00:54:38 - Lak's favorite way to learn new things

00:56:12 - How Lak prefers to be contacted

00:56:34 - Farewells

Transcript

DISCLAIMER: This is a direct, machine-generated transcript of the podcast audio and may not be grammatically correct.

[00:00:07] Ted Hallum: Welcome to The Data Canteen, a podcast focused on the care and feeding of data scientists and machine learning engineers who share in the common bond of U.S. military service. I'm your host, Ted Hallum. Today, I'm chatting with Lak Lakshmanan, Director of Data Analytics and AI Solutions at Google Cloud. We hit on a range of topics in this conversation, to include Google's evolved view of data science and what that means for you, the biggest developments that Lak sees on the horizon for our field, and the biggest challenges for all of us, practitioners, businesses, and governments, as we all strive for an edge with AI. I hope you enjoy this conversation, and here we go.

Lak, thank you so much for coming on the show, man.

I have been excited about this. It took a little while to get the coordination right, because I had some stuff going on on my end and we had to reschedule.

[00:00:53] Lak Lakshmanan: Life happens, Ted. Happy to be here. Super, super dope. Very honored that you invited me over. Uh, um, I'm really looking forward to the conversation.

[00:01:02] Ted Hallum: Now, when I was looking over your background, I was just blown away. So you've written tons of papers, tons of books. I mean, speaking of books, as I was looking through your background, I realized that you actually wrote two of my favorite O'Reilly books. I've got here Practical Machine Learning for Computer Vision and Machine Learning Design Patterns.

Just absolutely fantastic. I think when I looked you up on Google Scholar, there were like over 200 citations. Does that sound right?

[00:01:31] Lak Lakshmanan: Uh, yes. I like to say that I'm a recovering academic. In some sense, I'm in my second career. I know this is very familiar to a lot of you on this podcast.

You know, you retire from the military and you go into civilian life, and that's very similar, right? So I was, uh, a researcher at, uh, NOAA doing weather research, uh, did that for 20 years and then basically found myself in a startup, uh, you know, doing precision agriculture for farming. And that's what brought me to Google.

Uh, so I'm on the second inning, second career. All of those papers are from my previous career.

[00:02:15] Ted Hallum: Wow. So then you hit on what I was going to say. I noticed a range of everything from data science for weather all the way to your current role now as Director of Data Analytics and AI Solutions at Google Cloud.

So just for folks, you know, we have people, um, every day coming into the Veterans in Data Science and Machine Learning community. And they're just thinking about getting out of the military, or maybe they transitioned out of the military to civilian life recently. And so they're very much thinking about, if I want to get into data science, what should that look like?

Can you tell us about your own preparation and what you did to get into the field of data science and then your current role?

[00:02:53] Lak Lakshmanan: Oh, absolutely. So I think mine was the very traditional entry into data science in the sense that, uh, I have an engineering background. I did my undergrad in engineering. Uh, I did my master's work at The Ohio State University working on, uh, ultrasound images.

So my work was at the Cleveland Clinic. And, uh, this tells you how old I am. Uh, so this was like 1993, '94. And I would be in Columbus, I'd be at some kind of a student party, and somebody says, what are you doing? And I'd say, well, I'm working at the Cleveland Clinic. And they say, but you're here in Columbus.

How are you working at the Cleveland Clinic? And I would explain to them about this thing called Unix and how you could log into a computer like a few hundred miles away, and you could basically get data and do things with data. Uh, so, you know, in that sense, I think I've had a lot of advantages, in the sense that it's very easy to keep up with technology as it grows when you're working with it.

Uh, so I started my career doing computer vision, doing image processing for medical images. And once I finished my master's, my first job was in a weather research laboratory called the National Severe Storms Laboratory in Oklahoma. And so my work essentially involved, uh, finding patterns in weather radar images for things like tornadoes, flash floods, lightning, hail, et cetera.

So any of you who live in the Midwest and have, uh, somebody on TV screaming about a storm that's basically gonna come to your town and hit at exactly 2:13 PM, well, all of those are ML models that basically were built by our teams. If you've been in an airport and you've seen, like, the continental map of weather over the entire U.S., that's like 140, 150 radars.

All of those have to be combined in real time, cleaned up. So that was my life's work. And I had been doing that till about, uh, 2014. And one of the things that happens when you're in academia is that you get to consult with private companies too. And one of the companies I was consulting for was called The Climate Corporation.

They're based in Silicon Valley. And so they were a precision agriculture firm, and the idea was, uh, telling farmers what to plant, when to plant it, and whatnot, depending on their soil type, depending on the type of seeds, et cetera. And, uh, they got acquired. They got acquired by Monsanto at the time. Now it's Bayer, but they got acquired by Monsanto, and they needed to build their data science team very fast.

And because I had been consulting with them, they asked me to come over and help them build that team up. So I took like one year off, uh, moved from Oklahoma to the beautiful Pacific Northwest. So I came to Seattle and, you know, built this team. And now that team is essentially doing great work, built up, like, an amazing rainfall estimation model that gets used in farming.

But the point is, at the end of that year, I had a choice: go back to academia or continue in industry. And there were two things that happened in 2014. So one was, uh, there was this, uh, very, uh, influential paper that came out of the University of Toronto, uh, on deep learning. So, you know, people had been doing convolutional learning and deep learning, but it didn't actually work in reality until, uh, this group of authors basically proved that you could run this on GPUs. They did a bunch of different things, and they essentially won the traditional contest that's used in computer vision to benchmark your applications.

It's called ImageNet. And they won it by the kind of improvement that you see over 10 years; they just got it in a year. And everybody's like, okay, this field has changed, right? This field has changed. And how had it changed? All of the stuff that I had been doing, manually designing convolution filters for radar images, the ML model just learns.

Right? Given enough data, it learns it. So that's the first change that happened. The second change that happened was that that was the first time that I used cloud. And one of the hardest things in machine learning is building a training data set. And it used to take us nearly four years to process 15 years of weather data.

But that was because we had only 17 machines to us. Move off to the cloud and you have near-infinite scale, and anything that you can do with a hundred machines in 10 hours you can do with a thousand machines in an hour. Which essentially meant that we brought that processing time from four years to two weeks, and, you know, it's the same cost. For the same amount of money, you get to process more data faster, and you can experiment with it.

And that was the realization that the world had fundamentally changed. All of these things that I had been writing these papers on, applying machine learning techniques and data science techniques. They didn't cultivate a sense at the time.

Uh, but now, uh, these techniques in science were now open to people who are not scientists, people who are not engineers, people who did not have the kind of academic training that I had had. This was now, as we would say, democratized, to the point where someone who doesn't have training in the field can pick it up.

And that's why I'm, like, super excited to be talking to this group here, because it is now ready for folks like, you know, uh, Ted, we were just talking before we got on the show about your entry into data science, and the idea here is that you were able to pick it up even though you didn't have the quantitative training.

I love hearing that story because that basically proves the power, right, of this whole thing. So, uh, my journey into this doesn't have to be your journey. Your journey is going to be a whole lot easier, and, you know, that is absolutely phenomenal if it is.
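The scaling arithmetic Lak describes (a hundred machines for ten hours versus a thousand machines for one hour, and four years on 17 machines compressed into two weeks) can be checked with a few lines of Python. This is only a rough sketch: it assumes the workload is perfectly parallel and that cost is proportional to machine-hours, neither of which is stated in the conversation, and the function name `machines_needed` is made up for illustration.

```python
def machines_needed(baseline_machines: int, baseline_weeks: float, target_weeks: float) -> float:
    """Machines required to finish the same job in target_weeks.

    Assumes the work is perfectly parallel, so total machine-weeks stay constant.
    """
    machine_weeks = baseline_machines * baseline_weeks
    return machine_weeks / target_weeks


# Equal machine-hours in both setups, so (under per-machine-hour pricing,
# an assumption) roughly equal cost.
assert 100 * 10 == 1000 * 1

# 17 machines for about four years (4 * 52 weeks), compressed into two weeks.
print(machines_needed(17, 4 * 52, 2))  # 1768.0
```

Under those assumptions, the four-years-to-two-weeks speedup corresponds to borrowing on the order of a couple of thousand machines for a short time rather than owning 17 for years.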

[00:10:06] Ted Hallum: Well, I know that's going to be an encouragement to a lot of people who are listening to this episode.

But as I was hearing you talk about your background, there was a couple of takeaways and questions that came up in my mind. And, and that is, I mean, you have tremendous history specifically with computer vision. That was one of my takeaways. And you've seen multiple major evolutions in the field of computer vision.

So I'm curious what advice you have for our listeners who are going to have, you know, 10, 15, 20 more years in this field of data science and machine learning. What are your recommendations for how to keep up with things as they change? And then also, I could imagine, from the way you described it, that it can be sometimes not just difficult to keep up, but also maybe kind of depressing when you've spent a lot of time and effort to specialize in a certain thing.

And then the technology maybe kind of leapfrogs and you're like, oh, well, I was so good at this, but now those skills aren't as relevant anymore. So how do you evolve as fast as the technology? What are your recommendations?

[00:11:12] Lak Lakshmanan: Absolutely. So let's take those two, 'cause it's a great question. Let's take them one after the other. The first one, uh, computer vision. Yes, I had a lot of experience in computer vision. Uh, but that was because I had to specialize, because it was a hard, challenging technological thing.

Now we have people on my team, they do machine learning where they can easily hop between natural language processing and computer vision and time series modeling, right? They can hop from one to the other much easier because as the abstraction levels increase, you are now able to cover more.

Okay. So again, your journey doesn't have to be my journey, right? Uh, I had to do it because I'm old and I've basically been in this field for a long time. That doesn't have to be your journey into this. You have the ability to actually cover a lot more things. And so that brings me to the second thing. My background was in computer vision, and all of the stuff that I knew about computer vision became, in some sense, irrelevant, right? The whole idea of designing a spatial filter, designing a matched filter, being able to basically, uh, you know, do pattern recognition by hand, doing feature engineering, understanding, uh, co-occurrence matrices.

Who cares? You don't need them anymore, because if you have enough data and you train a machine learning model, the first five or six layers of the model learn it by themselves. And you go look at the second layer of an ML model. It has a whole bunch of textures in it. Textures was my PhD thesis. I spent four years developing textures on satellite imagery.

Now, do I get depressed that an ML model learns that thing in 30 minutes? Well, that's one way to look at it. Like, oh my God, I spent four years, you know, uh, getting a PhD for developing this thing that a machine just learns so quickly and easily. That's one way to look at it, but I'm much more of an optimist that way.

Right? Rather than looking at it that way, instead I say, okay, I had to do computer vision. Forget about that. Now I get to do recommendations. Now I get to do, uh, time series forecasting. Now I get to do NLP. I had not done a single machine learning model on natural language until I got to Google.

I never had to, but I learned it. And I think the fact that I could learn it, even though I had no expertise in linguistics, is like, yeah, sure, I sacrificed all my, uh, esoteric knowledge in computer vision. But in return, I got the ability to basically cover a lot more ground. And that's a normal technology evolution as well.

So, uh, you know, I mean, part of being in technology is being willing to live with the rapid pace of change. What never leaves, uh, is that judgment, the intuitive understanding that you develop in one field. It will apply to other things as well, right? The fact that I know and deeply understand convolution made it easier for me to pick up recurrent neural networks.

Right? It is basically processing in a different domain, but the idea of the state for processing makes a lot of sense, right? So, uh, yeah, those conceptual models, the abstractions, they don't go away, and, uh, definitely the judgment that you develop of what is hard, what is easy.

What's going to take a long time. What is a bear trap. Those are things you can only develop through experience, and that doesn't go away. So, you know, fundamentally I'm an optimist as far as technology changes.
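Lak's point that a model can learn the filters he used to design by hand can be illustrated on a toy scale. The sketch below is not a deep network and is not taken from the conversation: it assumes a single 3x3 filter, synthetic random "images", and plain gradient descent on squared error, and it uses the cross-correlation form of convolution that CNN layers typically compute. Every name in it is made up for illustration.

```python
import random

random.seed(0)

# A hand-designed 3x3 vertical-edge filter (Sobel-style), the kind of
# kernel that used to be crafted manually.
HAND_KERNEL = [[-1.0, 0.0, 1.0],
               [-2.0, 0.0, 2.0],
               [-1.0, 0.0, 1.0]]


def patches(image):
    """Yield every 3x3 patch of a 2D list, flattened row by row."""
    rows, cols = len(image), len(image[0])
    for r in range(rows - 2):
        for c in range(cols - 2):
            yield [image[r + i][c + j] for i in range(3) for j in range(3)]


def apply_filter(kernel_flat, patch):
    """Cross-correlate one flattened patch with a flattened kernel."""
    return sum(k * p for k, p in zip(kernel_flat, patch))


# Synthetic "images": random pixel values in [-1, 1].
images = [[[random.uniform(-1, 1) for _ in range(8)] for _ in range(8)]
          for _ in range(3)]
hand_flat = [k for row in HAND_KERNEL for k in row]
data = [(p, apply_filter(hand_flat, p)) for img in images for p in patches(img)]

# Learn the 9 filter weights from examples by gradient descent on squared error.
learned = [0.0] * 9
learning_rate = 0.3
for _ in range(400):
    grad = [0.0] * 9
    for patch, target in data:
        error = apply_filter(learned, patch) - target
        for i in range(9):
            grad[i] += 2 * error * patch[i] / len(data)
    learned = [w - learning_rate * g for w, g in zip(learned, grad)]

max_diff = max(abs(a - b) for a, b in zip(learned, hand_flat))
print(f"largest difference from the hand-designed filter: {max_diff:.2e}")
```

Given only input patches and the filter's outputs, the nine weights converge to the hand-designed kernel. Real networks learn many such filters at once, stacked in layers, from labels rather than from a known filter's outputs.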

[00:15:29] Ted Hallum: So, what I heard you say was that the underlying things that you've learned, those more, those lower level, that lower level knowledge, you're still going to be able to apply it.

Most likely it'll just be in a different way than you originally applied it. And then the other thing I took away was that you recognize the new technology for what it was, that you were sort of trading a really awesome '57 Chevy that you'd worked on really hard for a brand new Ferrari, right?

And so while, you know, your heart and soul is invested in the other, you could still take the brand new Ferrari and be really happy for that and the capabilities that it gave you to do your job, right?

[00:16:11] Lak Lakshmanan: Or more like, you know, I'm just trading my '57 Chevy for a bus.

Fundamentally, something that would work only for me, into something that moves a whole lot more people much more efficiently, gets a lot more done. But I don't get the personal satisfaction of driving a Chevy or a Ferrari, right? It's become a lot more utilitarian.

[00:16:44] Ted Hallum: Sure, sure.

Well, thank you for sharing those insights on how people can cope with the pace of change, because sometimes it is blistering. Um, and that in itself is a skill, to be able to keep up. So at this point, I want to transition to a diagram that actually is what served as the catalyst for you coming on the show today.

So let me go ahead and bring that up.

All right. So I have to give credit to Adam Jennings. He found this awesome, uh, workflow diagram that you guys have created at Google Cloud. My understanding is that it captures your current and near- to mid-term vision of what data science is and how it functions. Um, he posted this in our Slack and the reaction was overwhelmingly positive.

People really liked how it captured all the moving parts. But the one thing that it did generate questions about is the overarching term that we have here for the whole thing, which is data science. And I think conventionally, a lot of people would have referred to that data analysis area and the model development kind of collectively as data science, and data engineering would have been considered, you know, its own separate field, and then ML engineering on the other end would have been considered its own separate thing.

So, as you might imagine, there were a number of questions that kind of spiraled out of this, things like: Google probably made this nomenclature choice for a reason, so does it point to specific things that Google Cloud views about where our field is headed? There were also questions about, well, if all of this is data science, that's more than what any one person can do.

So is the title "data scientist" going to go away, because no one person can embody all of that?

[00:18:40] Lak Lakshmanan: So, uh, perfectly valid questions.

Right? Uh, and the thing that I would like to basically, uh, it's sometimes good to think of an analogy. Okay. Instead of data science, let's imagine that the analogy is medicine, right? And you say you have a medical doctor. Uh, there are a lot of different people who together form that.

You know, uh, the entirety of the patient experience. So data science is what all these people do, right? Just as patient care is what everyone in medicine does. So all of these people are doing data science, not just the most glamorous, intermediate thing of model development, because how good is your model if the data engineering people don't basically take care of data quality, right?

What if they don't make your data discoverable? So, uh, data engineering has to be done with the perspective that insights have to be drawn from the data, just as, you know, the person basically admitting patients into the system for care, right, the person at the front line, who may be the primary physician, right, has to understand all of the stuff that's going to follow afterwards.

And so that's one way to think about it, right? Data science is not an individual person. Um, it's not something that an individual does. It's a team sport, and there are multiple roles that people play in order to do data science.

There are data engineers, there are data analysts, there are model developers, right? Uh, there are, uh, ML engineers, and then there are developers who activate the insights out of data, who basically build websites that show the results of recommendation systems, et cetera. Okay. So there are multiple roles here that together, basically, um, create a data science workflow.

So when you talk about a data science workflow, what you're talking about here is: how do you get insights from data, right? That's essentially it. How do you get insights from data? How do you automate the getting of insights from data?

Now, you're absolutely right that lots of times people talk about the middle, the model development, as data science. But then you actually go to a practicing data scientist, and you ask them, what is it that you do most of your day? And what is it that they do?

[00:21:44] Ted Hallum: They get their data ready. They have to do data wrangling, data analysis, then they experiment and it doesn't do what they thought and they go back and...

[00:21:53] Lak Lakshmanan: Bingo. Right? So what a data scientist does 80% of the time is data analysis and data engineering. So, I mean, then doesn't it make sense that data science is what data scientists do? It does. Absolutely. Right. So that's the other way to think about this, right? This is what data scientists in industry do.

They do data engineering, they do data analysis, they develop models, they put models into production. They basically help generate insights. Now, does any one person do it? No, it's a team sport, but which things people do depends on the industry, depends on the person, right? So what you just said, Ted, was that you have data scientists who do data wrangling, data analysis, and model development.

I get it, that's one view of it. But look at another view of it. Imagine that you're in a highly regulated industry, right? You're in finance, right? Uh, X, et cetera. The data engineering has been done, right? The data have been put in a governed form in a golden source. And then the data scientists in those industries, they start from there.

They start with the data that they have available to them. And then they build ML models. They productionize them. They basically use them to make trading decisions. If you ask them, what do you do most of the time? They say, 80% of the time I'm out there fiddling with my time models.

I don't really get to do model development, because all the time I'm basically fixing, uh, bugs in my models and basically adapting my models to what's happening in the markets, et cetera. So you have that, maybe the other view of it. So the idea is that these are all the things that need to get done. What any one person does is going to depend on the company and the industry.

It's going to be a slice of that. So that's in a large company. Now go to a really small firm, right? You're a team of five and you're a startup and you basically have to get going. You have to basically build your minimum viable product. Who's going to do the ML engineering? Who's going to do the ML activation?

Who's going to do the data engineering? It's you, right? And yes, you've got to do everything. But the good news is that everything here is getting easier because of the higher levels of abstraction. Will you be able to do data modeling to the extent of somebody who works day in and day out on data modeling?

Probably not, but will you have a good enough data model to get your MVP out the door? Yeah. Same with ML engineering. Will you basically be able to detect drift immediately? Will you be able to adapt your model? Maybe not to the level of somebody who specializes in that. But will you then just put a model out and never change it?

No, you would probably put the model on a schedule, and drift or no drift, you're going to retrain the model every week.

[00:25:24] Ted Hallum: It's fine on a, on a temporal basis.

[00:25:26] Lak Lakshmanan: On a temporal basis, right. It's fine. It's not as efficient, but it's fine. It's good enough. It gets the job done. And that kind of, uh, adaptation is very, very, very common.

So you need to do all these things, but you don't need to be the world's expert in all of these things.
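The two retraining policies contrasted in this answer, retraining on a fixed schedule "drift or no drift" versus retraining only when drift is detected, can be sketched as two small predicates. Everything here is hypothetical: the `ModelStatus` fields, the `drift_score` metric, and the thresholds are stand-ins for illustration, not part of any Google Cloud API or of the diagram being discussed.

```python
from dataclasses import dataclass


@dataclass
class ModelStatus:
    days_since_training: int
    drift_score: float  # hypothetical metric: 0.0 means no measured drift


def retrain_on_schedule(status: ModelStatus, every_days: int = 7) -> bool:
    """Small-team policy: retrain every week, drift or no drift."""
    return status.days_since_training >= every_days


def retrain_on_drift(status: ModelStatus, threshold: float = 0.2) -> bool:
    """Specialist policy: retrain only once measured drift crosses a threshold."""
    return status.drift_score >= threshold


stable_but_old = ModelStatus(days_since_training=7, drift_score=0.01)
drifting_but_fresh = ModelStatus(days_since_training=2, drift_score=0.35)

print(retrain_on_schedule(stable_but_old), retrain_on_drift(stable_but_old))          # True False
print(retrain_on_schedule(drifting_but_fresh), retrain_on_drift(drifting_but_fresh))  # False True
```

The scheduled policy sometimes retrains a model that did not need it and can be late on sudden drift, which is the "not as efficient, but good enough" trade-off described above. In exchange it needs no drift monitoring at all.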

[00:25:48] Ted Hallum: Well, I think that's a perfect segue to the next question that I want to ask you. And that is, not to muddy the waters with even more terminology, 'cause there's certainly plenty, um, in this diagram, but as I looked at the diagram, and you see all the arrows and everything very much has a flow, um, I was curious how much of this was inspired or driven by the push towards MLOps.

Because to me, while the title says data science, what this really is, is a mapping of a full-featured MLOps platform flow, building those automated, reproducible pipelines.

[00:26:29] Lak Lakshmanan: Yeah, totally. I mean, these days, a lot of data science is moving from being descriptive to being predictive, and a lot of ML models are moving from train once and run forever to basically adapting, uh, to the environment. And the methods are moving from being bespoke to being fully automated. Put those three things together, and that's what MLOps is, right? So definitely this diagram is showing where the field is headed. And the good thing is that if you never retrain your models, it means that a few of these things, like model monitoring, drop out, maybe.

So you may not have to do a few of these things if you're basically doing things not in an MLOps way. But our hope is, you know, overall, if you're going to do the most challenging kinds of problems and you want to do it as fully automated as possible, everything being predictive, then this is what you would need.

If you're doing things, you know, where you're basically saying, I'm not going to worry about these aspects, there are things in here that you don't have to do.

[00:27:50] Ted Hallum: So at this point, I definitely want to go back. I know I mentioned that I got this graphic from Adam Jennings, who works there at Google, but I also have to thank Adam, because if it hadn't been for him running quite a bit of interference, we wouldn't have been able to have you here on the show.

So definitely thanks to Adam for doing all that communication, emailing to help make this happen. So before we move away from this slide, given everything that you just said, when you look out over the next five years, you know, and you have a really nice altitude to give us a perspective, what are the two major implications or changes that you think will happen, that this graphic is maybe trying to get at, that people should be aware of and have on their radar?

[00:28:41] Lak Lakshmanan: Number one, right? Uh, you notice that, uh, the order here is data engineering followed by data analysis. That actually is not what happens in a lot of businesses today. It is data analysis followed by data engineering, because you're basically looking at the data before you decide how you're going to preprocess it and how you're going to store it. What we're capturing here is looking ahead a little bit, what we're seeing the most advanced customers of ours doing, which is landing the raw data, right, or the data in as raw a form as possible, and basically opening it up for analysis.

So one of the things that you see here is that trend from ETL, right, where you're basically transforming before you're loading the data, changing to ELT, where you're loading the data before you're transforming it on the fly.

Right. So you see that as one of the trends that you're kind of seeing here. And the other one that you rightly pointed out is that if we had done this diagram two years ago, the whole ML engineering thing would not have even existed, uh, because the primary mode of people basically deploying machine learning models to production was you train a model and you deploy it.

Right? And you deploy it embedded in your app. But now you basically see the whole idea of, uh, reuse, and you see reuse everywhere in this diagram. You see reuse with the feature store. You see reuse with the model registry. You see reuse with data cataloging. You see reuse with data insights, right? So you see reuse not being called reuse, but you see reuse in every part of this diagram, because just as with traditional software, there is this growing emphasis on not doing things as a one-off, but basically building things in a reusable, componentized manner.

In traditional software, these are libraries, these are microservices. And in data, these are, uh, you know, data catalogs, this is metadata. This is, uh, you know, semantic layers for data insights. This is registered models. This is reusable features. This is about basically, you know, uh, even embedded analytics or insights activation, where you can basically build a widget and that widget can be embedded in multiple websites.

So that idea of reuse was not something that was at the forefront of people's minds when they thought about data two years ago. And we really see this starting to become a big thing. So in the next two to three years, I think that idea of expanding flexibility, making things more agile, making things more reusable is going to be a big thing.

And you were asking about five years? I have no crystal ball to five years. I have a crystal ball to two years, because I can see, uh, how long it takes for the really advanced set of customers of ours, of Google Cloud, right, to be doing things before we start to see it basically being more broadly embraced by, uh, the, uh, not early adopters, but the.
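The ETL-to-ELT shift described earlier in this answer (transforming before loading versus landing the data in as raw a form as possible and transforming on the fly) can be sketched with Python's built-in `sqlite3` module. The table names, the sample readings, and the cleaning rule are all invented for illustration; a real warehouse would differ in scale and tooling.

```python
import sqlite3

raw_readings = [("station_a", "21.5"), ("station_b", "n/a"), ("station_a", "23.0")]

conn = sqlite3.connect(":memory:")

# ETL: transform (parse and filter) in application code, then load the result.
conn.execute("CREATE TABLE etl_readings (station TEXT, temp_c REAL)")
cleaned = [(s, float(t)) for s, t in raw_readings if t != "n/a"]
conn.executemany("INSERT INTO etl_readings VALUES (?, ?)", cleaned)

# ELT: load the data as raw as possible, then transform at query time.
conn.execute("CREATE TABLE raw_readings (station TEXT, temp_text TEXT)")
conn.executemany("INSERT INTO raw_readings VALUES (?, ?)", raw_readings)
conn.execute(
    """
    CREATE VIEW elt_readings AS
    SELECT station, CAST(temp_text AS REAL) AS temp_c
    FROM raw_readings
    WHERE temp_text != 'n/a'
    """
)

etl_rows = conn.execute("SELECT station, temp_c FROM etl_readings ORDER BY temp_c").fetchall()
elt_rows = conn.execute("SELECT station, temp_c FROM elt_readings ORDER BY temp_c").fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_readings").fetchone()[0]

print(etl_rows == elt_rows)  # True
print(raw_count)             # 3
```

Both paths give analysts the same cleaned rows, but only the ELT path still holds the unparseable reading, so the cleaning rule can be revisited later without re-ingesting anything.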

[00:32:31] Ted Hallum: No. I think that you have provided our guests with phenomenal insight here. And it very much is in line with what we've heard in some previous episodes. I had a guest on the show before and I asked him, and he's a hiring manager for data science and machine learning teams. And I said, what is the hardest skill for you to hire for so that our folks will know where to put their upscaling investments where there'll be most valuable.

And he said, right now, absolutely ML engineering and MLOps. He said, it's just almost impossible to find the people that I'm looking for with these skills. And, to the point that you just made, it's critical because everybody wants to introduce more reproducibility, more automation, to scale that human capability, right?

Because if you don't have that, if everything's still manual and one-off, then a data scientist builds, what would you say, maybe 10 models? And then from that point, it's just trying to monitor and maintain what he or she has built. Whereas if you have a system like what's laid out in this diagram, then you can stay focused on solving the business's next problem.

And your model that's out there in production, the monitoring, well, it's being handled. If it drifts, it'll be detected. Probably, you know, if the right data pipeline is there to continue growing the training set, it might just automatically retrain and then redeploy, you know, with nothing but a dashboard notification.
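
[Editor's note: the monitor, detect, retrain, redeploy, notify loop described above can be sketched roughly as follows. This is a minimal illustration, not the system shown in the episode's diagram. The drift measure (shift of the mean in units of the training standard deviation), the threshold of 1.0, and the names `drift_score`, `monitor_and_retrain`, and `retrain` are all assumptions made for the example.]

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Callable, List


def drift_score(reference: List[float], recent: List[float]) -> float:
    """Shift of the recent mean, measured in reference standard deviations.

    Assumption: a single numeric feature and a deliberately crude measure.
    """
    spread = stdev(reference)
    if spread == 0:
        return 0.0 if mean(recent) == mean(reference) else float("inf")
    return abs(mean(recent) - mean(reference)) / spread


@dataclass
class MonitorResult:
    drifted: bool
    score: float
    notification: str  # what a dashboard would display


def monitor_and_retrain(
    reference: List[float],
    recent: List[float],
    retrain: Callable[[List[float]], None],  # hypothetical retrain-and-redeploy hook
    threshold: float = 1.0,  # assumed cutoff, not a recommended value
) -> MonitorResult:
    """Check for drift; if found, retrain on the grown data set and report it."""
    score = drift_score(reference, recent)
    if score <= threshold:
        return MonitorResult(False, score, "no drift detected")
    # The training set keeps growing: old data plus what production has seen.
    retrain(reference + recent)
    return MonitorResult(
        True, score, f"drift score {score:.2f}: model retrained and redeployed"
    )


if __name__ == "__main__":
    training_values = [10.0, 11.0, 9.0, 10.5, 9.5]
    production_values = [14.0, 15.0, 13.5, 14.5, 15.5]
    retrained_on: List[float] = []
    result = monitor_and_retrain(training_values, production_values, retrained_on.extend)
    print(result.notification)
```

[In this sketch the data scientist only ever sees the notification string; the retraining itself is delegated to whatever function is passed in as `retrain`.]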

And then you just continue with developing your next model. I mean, it really is a beautiful thing. So I can absolutely see why you said all of those things, and I'm so glad that you captured that for our listeners. Now, Lak, for the next part of the conversation, I'd like to read you a quote that I recently heard from Sadie St. Lawrence.

I was listening to the Super Data Science podcast and there was a quote that really stood out to me. She said, if you're a practicing data scientist right now, and you pride yourself on the cleanliness of your code and the complexity of the models that you develop, you might want to take a little bit of a different look and approach to that.

I'm not saying your skills are going out of date, but at the end of the day, we're automating our job at the same time as we're doing it. And I think this perfectly dovetails with what you just said about the reproducibility and the automation in MLOps pipelines. It also dovetails nicely with what you said of your own experience of having to stay up with the times as things change.

And you can't just continue to do what maybe you originally were trained to do. So, I guess, with the context of her quote in mind, specifically when you think about AutoML, which seems to be getting more and more capable, I think it's extremely capable with tabular data. But I'm interested in your perspective and things you may know from where you sit as far as computer vision and NLP.

Do you think that AutoML on the horizon is going to be extremely capable in those arenas?

[00:35:37] Lak Lakshmanan: Yeah. I mean, if anything, AutoML is even more capable on image problems, like object detection, image classification, et cetera. Uh, it is super hard to beat an AutoML model these days. Now, it reminds me of playing chess against computer programs. You know, once upon a time, right,

uh, you had a shot, and now the only shot you have is to basically go in and change the settings of the computer program so that it doesn't have any time to think anymore. Right. So that tends to be the thing with AutoML. If you give it only an hour, only two hours, yeah, you may be able to beat it. But give an AutoML program a budget of 24 hours and, sorry, it's going to be so good that it's going to be very, very, very, very hard for you to beat.

Especially when you take into account the cost of your time. Right? And you've got to remember that, uh, at the end of the day, uh, your time is valuable and you want to spend it on things that cannot be automated, because if something can be automated, you should let the automation take care of it. Because as I said, there's a lot of human judgment involved.

There's a lot of decision-making involved. You want to basically allow yourself to play in the places where you bring unique value, right? So, you know, sometimes a lot of the discussions about AutoML turn out to be you versus AutoML. Can AutoML do your job better than you? Well, that usually means that you're defining your job very, very, very narrowly, right?

The only thing AutoML does is that, given a data set, it will create an ML model, figure out the right hyperparameters, figure out the right architecture, usually by doing an exhaustive search. Sometimes they're doing an architecture search, building architecture components one by one, but regardless, it's a search through a catalog, and it basically gives you the best thing.

So what it has done is that it's automated your experimentation. That's all it's done. And if you define your job as basically doing that routine changing of parameters and rerunning the model, you're looking at your job very, very, very narrowly. But if you think about that broad data science background that we talked about, if you think about your job as deciding what data to collect, deciding which teams to go work with, deciding which problems to solve, deciding what metrics to use to measure the performance of your models, deciding how to communicate the results of your models to decision-makers, as changing the workflow of end users by providing them guidance along the way,

these are not things that AutoML does. And these are the things that humans are extremely good at. We're social creatures, and we should be looking at what we can do to take advantage of our communication skills, of our judgment, of our problem-solving skills, and not think of ourselves as calculators.

Right? I don't think you and I measure ourselves against, like, a Texas Instruments calculator on whether you can multiply a five-digit number faster than your Texas Instruments calculator. We don't. Why are we measuring ourselves against AutoML?
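
[Editor's note: the "search through a catalog" under a time budget that Lak describes can be sketched as below. This is a toy illustration under stated assumptions, not how Google Cloud AutoML is implemented. The one-parameter model, the tiny data set, the catalog values, and the names `train`, `validation_error`, and `search_catalog` are all hypothetical.]

```python
import itertools
import time
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]

# Toy data following y = 3x (an assumption made only for this example).
TRAIN: List[Point] = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
VALID: List[Point] = [(4.0, 12.0), (5.0, 15.0)]

# The "catalog": every combination below is one experiment.
CATALOG: Dict[str, Sequence] = {
    "learning_rate": [0.001, 0.01, 0.1],
    "epochs": [10, 100],
}


def train(data: List[Point], learning_rate: float, epochs: int) -> float:
    """Fit y = w * x by plain gradient descent and return the weight w."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    return w


def validation_error(w: float, data: List[Point]) -> float:
    """Mean squared error of the fitted weight on held-out points."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def search_catalog(
    train_data: List[Point],
    valid_data: List[Point],
    catalog: Dict[str, Sequence],
    budget_seconds: float,
) -> Optional[Tuple[float, Dict[str, float], float]]:
    """Try every configuration until the budget runs out; keep the best one.

    Returns (validation error, configuration, weight), or None if the budget
    expired before any configuration was tried.
    """
    deadline = time.monotonic() + budget_seconds
    names = list(catalog)
    best = None
    for values in itertools.product(*(catalog[name] for name in names)):
        if time.monotonic() > deadline:
            break  # out of budget: return the best found so far
        config = dict(zip(names, values))
        w = train(train_data, **config)
        error = validation_error(w, valid_data)
        if best is None or error < best[0]:
            best = (error, config, w)
    return best


if __name__ == "__main__":
    print(search_catalog(TRAIN, VALID, CATALOG, budget_seconds=5.0))
```

[Everything in the loop is routine: change a parameter, rerun, compare. Choosing the data, the metric, and the problem all happens outside of it, which is the point being made in the conversation.]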

[00:39:54] Ted Hallum: It's a great question. When you put it that way, it seems absolutely absurd.

So with everything that you said in view, and trying to focus on those things that can't be automated, for folks that might be transitioning out of the military right now, as they hear this episode, in terms of skills, courses, degree programs, where do you recommend that people in 2022 invest their effort so that it will have the best return for the foreseeable future?

[00:40:25] Lak Lakshmanan: Okay. First of all, for people, uh, you know, transitioning out of the military, thank you very much for your service, right. Uh, you know, we're recording this on the week that, uh, Russian tanks just rolled into Ukraine, and, uh, you know, in a week like this, you totally, totally appreciate that, you know, we would not be a country without the folks in service.

So, uh, you know, huge, huge thanks for, uh, like, everything that you do. Having said that, right, you're coming out of service and you're transitioning out. And the first thing that you want to look at is where are you strong? And I've worked with a lot of ex-Marines. I've worked with a lot of people who are in the Air Force, because I worked with folks from the Air Force Weather Agency.

And the thing that always struck me was the military builds leaders, right? They're leaders at a much younger age than anywhere else. Right. So people come in with their head squarely on their shoulders, with that ability to basically be calm under pressure, to be able to basically lead groups. I mean, where else

do you get to lead a group of, like, 15 people when you're in your early thirties? Right? So the military gives you that amazing, uh, leadership experience and an amazing ability to basically have situational awareness. So remember that those are your strengths. Those are the strengths the military has given you, the leg up,

if you will, that people in civilian life just don't have. Right. It's really hard to get that leg up anywhere else. So what you should be looking at is, how do I take these skills that I have and apply them to this new domain that I want to enter? Okay. So there's two ways that you should be looking at it. One way is, uh, technologically, right.

You don't want to basically, uh, you know, be thinking that you have to be technically as capable as everybody on your team. Right? You want to think in terms of, again, data science, software engineering, all of these are team sports, and being a team sport, you don't have to be the absolute best at everything. You want to basically complement the team.

And the way you complement this team is with your leadership, with your situational awareness, with your ability to basically get a lot of people behind you. Right? So given that, what roles do you play in software? You know, I find that people in the military are amazing product managers, right?

They're amazing at basically go-to-market because, you know, cold calling is one of the hardest things in the world. Right. And, uh, sellers do this all the time, but people with a technology background find it super hard to do. But if you're building a go-to-market, and every product, every software product, needs go-to-market.

So if you're basically on the go-to-market side of a technology company, now your skills will be phenomenal and you will shine. Right? So choose where it is that you want to play, a place where you're absolutely going to just kill it, right, in terms of being, like, amazing at it. Now, what do you have to learn? Now that you know where you want to go, you understand what it is that you want to learn.

So if I said, okay, it's data science, I'm going to say you should go listen to, like, the Andrew Ng Coursera courses, and you should go read the PhD treatises on, uh, learning from data, like Trevor Hastie's books, et cetera. You can. But I think that is fundamentally, like, trying to be what you're not, right? That's not your strength. Your strength is in product.

Your strength is go-to-market. And given that, figure out, um, more of the MBA courses, figure out more of the business courses, figure out more of how data gets used to make decisions, how data gets used to do marketing campaigns, how data gets used to do sales lead scoring. Those are the things that you should learn.

I don't know what other guests have told you, right? I'm maybe, like, completely off the one end here, but it's not about basically learning image classification. It's not about learning, uh, you know, RNNs. It's not about learning how to do ML. It's about learning how to take the results of ML models and apply them in a business context, which is where, you know, you're going to be phenomenally successful.

[00:45:46] Ted Hallum: So, we haven't had a previous guest say that, but I actually really like what you said, because it's very complementary. I think it kind of validates some sidebar advice I've given. I don't think I've given it explicitly on this podcast, but I personally did a master of science in business analytics, and I picked the most rigorous MSBA program

I could find, where, sure, there's lots and lots of model building. You know, we got all the way into computer vision and NLP and everything in that program, but I wanted the business acumen as well. And I felt like marrying those two up was very valuable for all the reasons that you said. And also, many military folks don't necessarily have, like, a bachelor's in mathematics or physics or something like that.

Usually those programs, if you pick a rigorous program, they're still going to have some solid prerequisites, but they're going to be a little bit more accessible than, like, if you just did a straight-up master's in computer science, where you have to have multiple calculus courses and...

[00:46:48] Lak Lakshmanan: Yeah, where essentially the first five classes are weed-out classes.

Exactly. Yeah. You don't need that. No. Because again, that's not playing to your strength. And I'm a big believer in, uh, like, knowing what the first 35 years of your life have been like and taking advantage of it, these skills that you've built, the opportunities that the military has given you to get ahead, rather than basically, uh, like, starting behind everybody else and,

uh, having to play a lot of catch-up.

[00:47:22] Ted Hallum: Well, you get momentum, right? And when you get extreme momentum, doing a U-turn is complicated, but if you can shift your direction, then that's a little bit more feasible. And I think that that's a great way for veterans to look at it. So, changing gears a little bit, from your perspective, you know, when you look across the globe, there's a race everywhere.

There's a race in the private sector, a race in the public sector, for people who want to do data-driven decision-making and do it well. So when you think about non-technical companies, because I think that the really big tech companies, of course, which you're very familiar with there at Google, they've very much got this ironed out pretty well.

I think when you're talking non-tech midsize American businesses, there's probably a lot of people out there that still have a lot of learning to do about how to do that process well, the one that was on the diagram earlier. And I think in government, that's kind of the same thing. So thinking about the U.S. government, thinking about non-tech midsize American businesses, what do you think are the biggest challenges to remaining competitive on a global scale and getting that ability to do data-driven decision-making in a way that's going to be on par with everywhere else in the world?

[00:48:41] Lak Lakshmanan: Yeah. You know, the funny thing is, I work at Google. Yes, I know everyone else says, oh, Google has got the data thing completely figured out. And I know that my weeks are filled with all of the places where we haven't gotten it figured out. And basically, our goal is to figure it out. But I get where you're coming from,

which is that the areas that we're addressing are more of the corner cases. The low-hanging fruit is mostly all taken care of. Now we're basically looking at corner cases. We're looking at unusual situations. We're looking at, uh, two or three things happening together that make, uh, the routine automated stuff not work well.

Uh, or we're basically looking at places where the business is so new that we don't have any data. Those are things that we're looking at, but that also points at the challenge that, uh, the middle, uh, if you will, is facing, because if you're a bank, uh, you cannot say, I'm a bank, and let me look at what the other banks are doing.

You have to look at what Square is doing, what Stripe is doing, what PayPal is doing. Because, you know, uh, there are disruptors. I mean, just imagine what happened to the telecom industry with the advent of smartphones, right? Uh, all it takes is a disruptor who basically builds a product that ends up basically invalidating your entire roadmap.

And, uh, now, you know, there's a complete sea change in the whole communication industry because of smartphones. And that's, you know, the danger that payments firms, right, are basically posing in the financial space.

That's the, uh, existential threat, if you will, in industry after industry: that there is a disruptor out there. And if that disruptor is able to basically take advantage of data, is able to take advantage of AI, is able to take advantage of technology in general to experiment faster,

uh, there is no moat anymore, right, that's going to protect your business.

[00:51:25] Ted Hallum: All right. So we were talking about the biggest challenges to, like, midsize U.S. businesses and the U.S. government, but then taking it to the other end of the spectrum, back to you personally, and then thinking about the conversation that we've had up to this point.

So it's been clear, you've had a career where your whole career encompassed just learning, and things changing, and then learning more. And then you just talked to us about how you have to be ready when disruptive competitors are out to eat your lunch, like, every day. So for you personally, I'm sure that you're still learning.

And I'm curious to know, what is your learning focus today to be ready for tomorrow?

[00:52:10] Lak Lakshmanan: Uh, so, uh, it's just my personal philosophy, I guess, but every two to three years I look up from wherever I am, whatever I'm doing, and say, what is the thing that I don't know that I should be learning?

And I basically make it my goal to not let the day-to-day get in the way of me learning that new thing. So sometimes it means changing jobs, changing roles within the company, going and figuring out new things. But, you know, uh, definitely, uh, taking a week of reflection, uh, to figure out what it is that you need to learn is an absolutely important practice.

Uh, what is it for me? I don't know quite yet, but, uh, vacation's coming up.

[00:53:04] Ted Hallum: Okay. Well, there was power in what you said, though. And everybody, I think, could stand to write that down, maybe put it on a Post-it note at the bottom of their monitor: don't let the day-to-day choke out your vision for what you need to be learning, because that's really easy.

Oh man, it's happened to me many times where months go by and then I come up for air and I think, gosh, I've been so busy, I haven't learned anything in a while.

[00:53:32] Lak Lakshmanan: Yeah. I think everyone learns differently. And for me, the way I've learned, the reason you see all those books, is because I learn when I explain things to people, and I can't get people to listen to me long enough.

So I, you know, write.

[00:53:49] Ted Hallum: Well, so I've kind of taken the opposite side. I've started to do these little, um, two-to-three-minute video clips weekly. We put them out from the Veterans in Data Science and Machine Learning community. The snappy little name is Better Way Wednesday. And it's where we take and give a morsel of efficiency, and I distill it as concisely as I can.

Let me tell you, getting all of your knowledge straight enough that you can condense it down like that is a challenge, and I'm learning a lot from doing that on a weekly basis.

[00:54:23] Lak Lakshmanan: Again, I think you're discovering the same thing, right? That if you can explain it easily and simply, you've really internalized it. And that's an amazing way.

It's an amazing way of sharing with other people, but it's also an amazing way of making sure that you're continuing to learn.

[00:54:38] Ted Hallum: Absolutely. So, on that topic of learning, I'd love to hear, you know, obviously I'm sure that The Data Canteen is your new favorite podcast, but aside from The Data Canteen, what other learning resources, in terms of courses, podcasts, books, just your favorites, would you like to point our listeners to, so that they can consume those as well and get all the value that those resources have to offer?

[00:55:05] Lak Lakshmanan: Okay. I've got to admit something. It's going to be horrible to admit. Uh, but I read for pleasure. So pretty much the only reading that I do is fiction, right? Uh, that's the only thing I do. Uh, all of the learning that happens, I do in a problem-based way. Here's a problem that I need to solve, and I will go do research and find all of the articles, all of the books, all of the things that have been written about it.

So I'm not really, like, subscribing to, uh, articles. I'm not reading the next new book that comes out. Instead I say, here's a problem that I have, and I would basically go and, you know, read books that were published, like, three years ago, four years ago. That's because they address the problem that I want to solve.

[00:56:03] Ted Hallum: Well, Lak, I appreciate so much you coming on the show, providing all your insight to our listeners, and humoring and fielding all of my questions.

On the screen throughout the whole show, we've had your LinkedIn username. Is that how you prefer to be contacted?

[00:56:18] Lak Lakshmanan: The best way to contact me is on Twitter. It's, uh, you know, lak underscore GCP. Uh, so go ahead and find me on Twitter.

That's the best way to reach out to me. Uh, I rarely post anything.

[00:56:34] Ted Hallum: Oh, okay. Well, we'll make sure that your Twitter handle that you just gave us is in the show notes below. So if you want to reach out to Lak on Twitter, you'll be able to do that. And Lak, again, from my heart, thanks so much for making the time for us.

[00:56:48] Lak Lakshmanan: Yeah, you're welcome. And yeah. Thank you all for listening. Absolutely.

[00:56:52] Ted Hallum: Thank you for joining Lak and me for this conversation about how data science is evolving, what that means for you, and how we can be best prepared for what lies ahead. Lak's written a number of great books on machine learning, two of which live on the bookshelf next to my desk. If you'd like to check out one of those, there'll be links in the show notes below. With that, until the next episode, I bid you clean data, low p-values, and Godspeed on your data journey.