Season 4, Episode 3
Guillaume Calmettes is CTO of Fieldbox.ai, a French-based company specialized in developing data-driven solutions for the industrial sector. He has a Ph.D in Biophysics with specialization in Systems Biology. Prior to joining Fieldbox.ai, Guillaume had been leading a wet laboratory studying heart bioenergetics at UCLA for 9 years, where, assisted by Brooke, he also developed the first resampling-based statistical class of the university. In his spare time, Guillaume is an accomplished ultra-trail runner, enjoying running very long distances ranging from 100km to 200+mi, and is always looking for a good excuse to apply data-science approaches to any sort of data.
Welcome to Data Brew by Databricks with Denny and Brooke. This series allows us to explore various topics in the data and AI community. Whether we’re talking about data engineering or data science, we’re going to interview subject matter experts to dive deeper into these topics. In this season, we’re going to focus on connected health and how data and AI augment and improve our daily health. And while we’re at it, we’re going to enjoy our morning brew. My name is Denny Lee, I’m a Developer Advocate at Databricks and one half of Data Brew.
Hello everyone. My name is Brooke Wenig, Machine Learning Practice Lead at Databricks and the other half of Data Brew. And today I am thrilled to introduce Guillaume Calmettes, who is a senior ML Ops Engineer at FieldBox.ai and an insanely long distance runner and winner of the Last Man Standing race. Welcome Guillaume.
Thank you. Thank you Denny, thank you Brooke, very excited to be with you guys and talk about science and running.
So to kick it off Guillaume, what is this Last Man Standing race? Can you share a bit more about what this race entails?
Yeah, of course. So the Last Man Standing race started from the principle that in the actual competition, in running, worldwide, only the very fast runners are rewarded. And in fact, a lot of people, maybe your neighbor, your friend, your family, they’re doing incredible things, because they’re very, very resistant, it’s just that they don’t run as fast as those elite runners. So there is one guy who decided to make a race dedicated for those people.
And so the principle is very simple. You do have a loop, it’s about seven kilometers, so it’s 4.166667 miles, so a distance you can run in an hour. And in fact, you guys do have one hour to complete the loop. And once you complete this loop, you just wait for the next hour to start, and there is basically a new start every hour. And the race finishes and stops when there is only one runner left willing to go for another loop. So it’s more like a resistance event, and nowadays it goes for like days, but we could talk about that a bit.
So how long exactly did you go for Guillaume?
So myself, in 2017, I ran for 59 hours, so it was pretty good. So two and a half days without sleep.
Whoa. Okay, I got to ask. So wait, how do you squeeze in eating and going to the bathroom and all these other things? Is it just you finish the loop faster, and then you have that little bit of time to squeeze that, is that the concept?
Exactly. You basically have an hour to do the running of the loop, plus everything you need, maybe I don’t know, you have to change your socks, you have to go to the bathroom, you need to drink a soup or anything, so you basically need to do your own strategy. Of course, you don’t want to run the loop too fast, because then maybe you get tired and also you will have to wait for 25 minutes, so you will get cold or something like that. So you need to basically time your loop perfectly to just do whatever you need in that allotted time.
And did you sneak in any cat naps or anything there?
So I tried to. The problem is that when I’m running, I’m very, very excited, so it’s very hard for me to nap. But some people can do it, and actually last year, the world record got beaten, now it’s 85 hours, by Harvey Lewis, the guy who actually finished just after me. But for those kind of events, you do need to have a very good sleep strategy. Unfortunately, I don’t have one, but that’s fine too.
Okay. As the Last Man Standing, for sake of argument, you were willing to run the 59th loop, the 59th mile, do you actually have to complete it in order to be Last Man… sorry, not miles, the seven kilometers, do you have to run the full seven kilometers in order to win? Or for the sake of argument, nobody else is willing to run, you ran like 20 meters and you’re good to go? Or did you actually have to run the full seven kilometers?
Oh, you still have to finish the loop. Basically, the goal is to do one more loop than everyone else, so you still have to complete it. So in the end, yeah, it was 246 miles, which was a good day.
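The arithmetic behind these numbers can be checked with a short sketch. It assumes a loop of 100/24 miles (about 6.7 km), which is not stated exactly in the conversation but is consistent with 59 hourly loops adding up to roughly 246 miles. The function names are illustrative only.

```python
# Assumed loop length: 100 miles spread over 24 hourly loops (about 6.7 km).
LOOP_MILES = 100 / 24
KM_PER_MILE = 1.609344


def total_miles(loops: int) -> float:
    """Distance covered after completing the given number of hourly loops."""
    return loops * LOOP_MILES


def rest_minutes(pace_min_per_mile: float) -> float:
    """Minutes left in the hour after running one loop at a steady pace."""
    return 60 - pace_min_per_mile * LOOP_MILES


if __name__ == "__main__":
    print(f"Loop length: {LOOP_MILES * KM_PER_MILE:.2f} km")
    print(f"59 loops: {total_miles(59):.1f} miles")
    print(f"Rest at 12 min/mile: {rest_minutes(12):.0f} minutes")
```

Under these assumptions, the 25-minute wait mentioned earlier corresponds to finishing each loop in 35 minutes, a pace of 8.4 minutes per mile.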
Were you able to run the next day? How long did you need to take off after running that many miles?
Well yeah, you are a bit in trouble for the next week after a race like that, because obviously, your legs are a bit beaten. So it takes seven to 10 days to feel okay to run. But also it’s like motivation wise, because you have run so much in so little time, you need a mental break too from that. So yes, seven to 10 days, and you can go for a little run around your park.
So switching gears a little bit, we’ll come back to running in a second, but you have a PhD in Cell Biology and also Biophysics. How exactly did that culminate into you becoming a long distance runner?
In fact, it’s tightly linked. So I did my PhD in Bordeaux, and then obviously in academia, you need to prove that you can adapt to another scientific environment, so you go for a post-doc. I had the chance to get a Fulbright Scholarship to join UCLA, in the Heart Lab at UCLA. And so there, I didn’t know anyone, it was a new country, so I didn’t have a car for LA, which is a big city without a car. So you need to go to places, so if you don’t have a car, you can bike or you can run too.
So that’s basically how I got really into running. I started to go from places to other places, then I wanted to discover my neighborhoods, there was some mountain nearby, so I just started to go to the trail head and then go up the mountain and down. So the miles piled up, and then I just signed up for the marathons, that’s how I got into it. So basically, just because I wanted to discover my area, I got into running, and running got to me.
And just for some context, while Guillaume was a postdoc at UCLA, I had TA’ed one of the stats classes that Guillaume had taught. And I can vouch that he is a very good runner as well, definitely kicked my butt on a few runs.
But while you were at UCLA, you also got into the field of data science. How exactly did that transition happen?
So the project I was working on was basically in the context of heart attacks, and what are the different cellular modifications that happen at the bioenergetic level in your cells. And so you do have a lot of data, it could be for example images because you’re using fluorescent probe, and so you want to see for example, how protein moves from a cell location to another in real time. So you acquire all those data. You also have realtime data of metabolites concentration, because of course as your cell contracts, the energy is consumed and energy also has to be produced, so you do have a fluctuation and time series data of the different metabolites. So all those data needs to be analyzed, and data science comes very, very handy for that.
And as an example of how data science can help researchers, we were developing a very specific virus to modify a specific protein, and so the idea was basically to deliver a medication to your cells. And of course, to know if the virus changed the protein we wanted, one option was to send the cells to be sequenced, to get the genome of it, and it could take days to weeks to get back the results. So we decided to develop a model to classify the cell based on the metabolic traits, or different like features that we had and were able to measure in real time. And so that way we could from this model, know like practically live, which cells were correctly infected and had the medication active and the cells that didn’t have it, so it was also a lot of gain of time for us.
So all those data, and it was also fun to visualize the data. So I guess as long as you do have data, you have to do data science.
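The kind of model described here, one that labels a cell from features measured in real time, can be illustrated with a very small sketch. The conversation does not say which algorithm was used, so this swaps in a nearest-centroid classifier purely for illustration. The feature values and labels below are synthetic placeholders, not lab data.

```python
from statistics import mean


def fit_centroids(features, labels):
    """Average the feature vectors of each class to get one centroid per label."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*rows)] for y, rows in grouped.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


if __name__ == "__main__":
    # Synthetic two-feature measurements (hypothetical metabolic readouts).
    X = [[1.0, 0.2], [0.9, 0.3], [0.2, 0.9], [0.1, 1.0]]
    y = ["modified", "modified", "unmodified", "unmodified"]
    model = fit_centroids(X, y)
    print(predict(model, [0.8, 0.4]))
```

A real pipeline would need labeled cells confirmed by sequencing to train on, plus a held-out set to check how often the live prediction agrees with the sequencing result.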
So that’s actually a pretty interesting segue, so basically since you have the data, you might as well go do data science.
I guess. I mean, what else?
No, no, that’s fair. But then I guess what I was curious about is that, like for example, I’d be lazy, when it comes to doing the analysis of my data. So considering how much of a runner you are, I probably would’ve just set my process running, wouldn’t care, and then go for a run. So I’m just curious, how did you figure out the right balance of actually balancing out the running that you did, your passion for it obviously, but also to go ahead and become the expert in data science that you are now? How did you find a nice balance between those two worlds?
That’s the key. So let me explain to you in a few images. You do have a lab where you want to live your passion about the data and data science. You do have a home, you need to go to the lab, so you just run, so you already have your first passion to go to the other one. So you see, it’s a very easy next step.
I feel like you’re going to continually move your home farther and farther from the lab to increase the amount of miles you have as part of your commute.
Yeah, I guess. But nowadays, the company I work on, FieldBox.ai, is just actually like half a mile from my place. So I don’t have this strategy to go for a run now.
You have to run away from work to get to work.
Exactly, exactly. You actually have to go across the mountains, through some vineyards, before you come back. Your route just has to be that much more complicated, that’s all it has to be.
So given your experience at FieldBox.ai and your passion for running, how exactly do you apply data science to running?
It’s a very good question, and I’m sure people are also experiencing a lot of different use of their own data using data science. So on my side, what I do with that, so first of all, I do love to know when I go for running training or something like that, I like to know what happened during this training. And I obviously record everything, so I do have a GPS watch for example. I also have heart rate monitoring that I do not use all the time, but for very specific training, like interval training, I do also record my heart rate. But anyway, collecting all those data, all those personal data, allows you to quickly visualize what you have done during the week, during the months.
And because I’m also passionate with visualization, for example D3.js, I made several little personal apps and custom applications to visualize this training, it could be like a yearly training, and compare it from one year to another or from one week to another. So that would be one way, because data science, of course there is a cleaning of the data and visualization of the data, it’s an important part before actually applying the modeling and everything like that. So that could be a way you could apply data science to your running experiments, we could say.
The other thing is that at one point, I was really into trying to optimize my training based on science. So I did a lot of research about it, and I wanted to try the Banister model on myself. So the Banister model is a way of basically modeling through data science your fitness and your fatigue, depending on the training that you do before. So it was basically on an impulse-response kind of framework, meaning that each time you go for a training run, you will get some fatigue, but your physiology, your body will respond to that stress and will have self-compensation mechanisms that will make you better. So you always have those two players into the game, the fitness and the fatigue. And so the Banister model tries to model that, and you could use this model on your own personal data.
So I was basically running this model on my training data to know, okay, what should I do next week? Should I go for a long run, should I go for an interval training run, and stuff like that? And so for several months, I actually drove my training using that model, and in the end, it went well. I actually won Angeles Crest 100 Mile, so it was a successful experiment. Obviously, it’s just n=1, but it’s the one that counts.
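For readers who want to experiment, here is a minimal sketch of the impulse-response idea described above. It assumes a common textbook form of the Banister model, where fitness and fatigue are exponentially decaying sums of daily training loads and predicted performance is their weighted difference. The gains (`k1`, `k2`) and time constants (`tau1`, `tau2`) are illustrative defaults, not values fitted to anyone's data, and this is not necessarily the exact variant used in the experiment described here.

```python
import math


def banister(loads, k1=1.0, k2=2.0, tau1=42.0, tau2=7.0, p0=0.0):
    """Return (fitness, fatigue, performance) lists, one value per day.

    loads: daily training load (for example a training impulse score).
    k1, k2: assumed gains for fitness and fatigue.
    tau1, tau2: assumed decay time constants in days (fitness decays slower).
    """
    fitness, fatigue, performance = [], [], []
    g = h = 0.0
    for w in loads:
        # Yesterday's state decays by one day, then today's load is added.
        g = g * math.exp(-1.0 / tau1) + w
        h = h * math.exp(-1.0 / tau2) + w
        fitness.append(g)
        fatigue.append(h)
        performance.append(p0 + k1 * g - k2 * h)
    return fitness, fatigue, performance


if __name__ == "__main__":
    # One hard session on day 0, then rest for two weeks.
    loads = [100.0] + [0.0] * 14
    _, _, perf = banister(loads)
    for day, p in enumerate(perf):
        print(day, round(p, 1))
```

With these assumed parameters, a single hard session first pushes predicted performance below the starting level, because fatigue dominates, and then above it once fatigue has decayed faster than fitness. In practice the parameters would be fitted to an individual's own training and race results.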
I wish I could start a sentence off with, it was successful, I ran 100 miles and I won. I don’t know if I’ll ever be able to say that sentence Guillaume. Exactly. But how do you continue to find the motivation to run? When you’re at mile 80, how do you dig deeper to continue to make it through to mile 100? What are you thinking about for those very long runs?
So it’s actually a question a lot of people ask me, and I always respond that, so for me running, obviously it’s fun, and I always see a long run as a way to discover an area or discover the world. So for me, running is not really a problem, I don’t have like a losing motivation. Actually, sometimes when I go for a very long run, I don’t even use music or anything like that, because I’m just enjoying myself, surrounded by I don’t know, trees and leaves and stuff like that.
So, it’s hard for me to give you a very simple answer, how do I do, I don’t know, for me, I don’t have the problem of not being motivated by it. Obviously, sometimes you have a lot of pain, like you feel it in your legs, and you could be a bit done, but it’s always a very short period of time for me. So I don’t have like a demotivation problem at all, especially because I enjoy running outside and discovering my surrounding, is what drives me.
So, okay, I want to switch back to the data part, especially that visualization part. And thank you by the way for teaching me up with the Banister model, I was looking it up, I was like, oh, this is really cool. But I’m just curious, the geeky me agrees, yes, I want to just build my own models and visualize it myself with D3.js. I actually grok you completely. But I’m just curious, why do it yourself, why not track it using the available apps? Is it because the model isn’t as good? I’m just curious, it sounds to me like you’re basically using the models as just tracking it yourself in an Excel spreadsheet or something like that. I’m just curious, what is your approach here?
So you mean, why do like a custom DIY solution, instead of using? I mean, because it’s always fun to understand how things are working behind, right? So obviously, I could do Strava directly, and they actually do have the performance and fatigue curves inside the application. But I find it personally always more motivating and fun to also do it by yourself so you can better understand the system as well.
And nowadays, it’s very also important in the data science world, because as you know, data science is only a small part of the full ecosystem. You need to be able to obviously build those models, but then also deploy them. So you need to understand how the cloud works, how to do a deployment or a container, what is it, et cetera? And it’s only by playing with those tools yourself and trying to do it yourself from scratch that you learn. And sometimes it’ll be very difficult, but at the end, you will always get out with a new knowledge, a new understanding, of what you’re doing, and that will help you to like what you do even more.
So that’s always been my approach, so that will be the response I give to you on that question.
Oh no, that’s a great approach. And I love it actually, because now you are inspiring me, because I freely admit, I’m lazy, I am using Strava and things of that nature. But since I do know the stuff and the Banister formula actually looks really interesting, I’m like, yeah, you’re right, maybe I should just hack my devices and start building it out myself. So no, fair enough, fair enough.
So you mentioned in terms of devices, sorry, I’m going to go on the hardware side a little bit, you said you have a heart monitor and you have the watch to track these things. I take it, the heart monitor, that’s separate mainly because of the imprecision of those wearables typically, when it comes to tracking your heart. I presume it’s for that reason?
Yeah. I mean, I’m not using it a lot. So first of all, I’m not a very fast runner, I know how to run long, but I’m not very, very fast. So for me, when I go for a long run, I don’t feel the need to necessarily record my heart rate, because I’m just going out to enjoy my run or something like that. But then for the interval training, that data becomes very important to me, because for each speed interval, you can then see your recovery rate, like when is my heart rate returning to baseline, with what rate, et cetera, and then you can see the curves. And from one training to another, you can see if your training worked, because maybe your recovery speed increases and things like that. So I’m only using this data for very, very specific workouts, not necessarily for a very long run.
Also, the heart rate that I have is from the wrist, so it’s not obviously very, very accurate, less accurate than the one you can put on your chest. But for what I do, I guess it’s enough.
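The recovery measurement described above, how long the heart rate takes to come back to baseline after a speed interval, can be sketched as follows. This is an illustrative approach only: the sample format, the `tolerance` threshold and the function name are assumptions, and wrist-based readings would add noise that this sketch does not handle.

```python
def recovery_time(samples, baseline, tolerance=5.0):
    """Seconds from peak heart rate until it falls back near baseline.

    samples: list of (seconds, bpm) pairs in time order.
    baseline: resting or pre-interval heart rate in bpm.
    tolerance: how close to baseline counts as recovered (assumed value).
    Returns None if the heart rate never gets back within tolerance.
    """
    # Locate the first occurrence of the highest reading.
    peak_idx = max(range(len(samples)), key=lambda i: samples[i][1])
    peak_t = samples[peak_idx][0]
    for t, bpm in samples[peak_idx:]:
        if bpm <= baseline + tolerance:
            return t - peak_t
    return None


if __name__ == "__main__":
    # Synthetic readings for one interval, not real data.
    interval = [(0, 120), (10, 170), (20, 160), (30, 140), (40, 124), (50, 121)]
    print(recovery_time(interval, baseline=120))
```

Comparing this number across sessions with similar intervals gives one simple way to see whether recovery is getting faster from one training to another.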
Speaking of tools that you use, you’ve mentioned heart rate, you mentioned the watch, you mentioned that sometimes you would analyze your runs using D3.js visualizations, what are some of the non-tech tools that you use when you go out on such long runs? How do you ensure you have enough water for a 200 mile run, like what do you carry with you? It’s not quite hardware, but what other tools do you bring with you when you run?
So I try to always have a flask. Nowadays, what’s great is that with the democratization of running, you do have a lot of materials that can help you for very long endeavors. And especially for carrying water, you don’t have to carry a very bulky bottle anymore, you have those very soft flasks, kind of like a container, and so I always start with that. And then where do I refill? It depends. So if I go in nature, I do have little capsules for bacteria. And so if I find a river, that’s good, if I don’t find a river, well, I’m in trouble, and it happened before.
I remember a long run in the Santa Monica mountains, where it was during the night and I was doing the Backbone Trail. And so on the map of this Backbone Trail, you actually see toilets and you think, okay, there must be water over there, and so you plan your run like that. And then when you arrive at the bathroom, well, it’s closed and there is no water. So it was past midnight, I had to actually knock on the door of someone because I was completely out of water.
So yes, carrying water is very important, that’s the first thing. And then obviously, if you go for like six, seven hours, you need food too, so I sometimes have a belt where I put energy bars in it. But apart from that, water and food, that’s practically all you need. A bit of sunscreen of course, sometimes.
Obviously, you can run all hours of the day, you’re doing these 24 hours, multiple day races, do you have a preferred hour of the day that you run?
I mean, at sunrise, it’s always beautiful, just because of the light and everything. And also during multi day events, the sunrise is always when you feel like new energy coming in, because you’ve passed long hours in the dark and then you see the light, you see those colors in the sky, so you get that re-energization kind of. So I would say that around sunrise is my preferred time of the day to run.
Oh, that’s so awesome actually, I’m completely with you. While I’m certainly not even remotely the same level as you in terms of sports, and I’m not even going to bother with the pretense, I do agree with you, I love the mornings, especially in my case, I typically ride across the 520 bridge in Seattle. And with the sunrise off the mountains, especially if Mount Rainier is out, oh yeah, it’s easily the best way to keep me going, the nature itself is what keeps me driving and motivated. So completely grok where you’re coming from.
I’m just curious, talking about motivations, which is not just related to how beautiful nature is, what is one of the hardest running events you’ve actually taken part of? How do you keep motivated for those things?
So of course, as I said, I’m doing very long races. So some of the hardest 100 miles I’ve done for example, could be HURT100 in Hawaii. The cool part of that race is that you are in nature, but you are in the Hawaiian nature, which means a lot of roots, a lot of obstacles. So that was very hard because it was also a lot of elevation gain, I think the full race is 28000 feet of climbing during those 100 miles. So this one was hard, but I went there three times, so I guess it was still good.
Obviously, the Big Backyard event, so the Last Man Standing event, because it was long. And when I came into this race, the hardest part was not physically I guess, it was the mental part, because you are arriving to a race where you actually don’t know when it’ll finish. So it’s like, okay, how do I prepare for that? How much food do I need? There are a lot of unknowns. And so you arrive at this race with a mental state like, I don’t know, it could last one hour, it could last more, so there is this unknown part that is very exciting too. So that was hard on the mental challenge, I would say. And also at the end, we were only two for like 36 hours with my buddy Harvey Lewis. So it was kind of a strange mental parameter, so that was hard on that level.
Oh, okay. I got to roll back, you throw out so many things, did you just initially say, was it the Barkley one, you had 28000 feet of elevation?
Yes, the HURT100 is 28000 feet of elevation in the race. So, yeah.
Okay. Now of course, I’m using a Seattle reference, for everybody to understand, that’s two Mount Rainiers of elevation. This man just [inaudible 00:23:50]. Okay, wow. All right.
But also it means that at one point, you had a very high point and you can see everything that is around, so that’s really cool.
Yes, yes, in a past life, I was a former mountain climber, which is why I’m referring to Mount Rainier here, so I grok that. But still, I’m like, okay, I guess do you get sponsored for these things then? Because considering how much you’re doing, it seems to be almost like you need, like race car driving, you need a sponsorship flag, because this is some pretty amazing stuff. I’m just curious, do you do those as well?
So I have the chance to get sponsored for those races. And mostly when you do those kind of events, you need shoes, a lot of them, because you go through a lot of pairs obviously. So yeah, I had the chance, in the US, of being there I guess at the right moment, so I met some great runners and they introduced me to the sport. And then I got noticed by the local vendor of some brands, especially Salomon, and then after the 59 hour race, they basically called me, they said, hey, do you want to join the team. So it was funny because I was a Frenchman on the US team, so it was very fun.
So how often do you go through pairs of running shoes?
I try to be like respectful of the shoe too, it’s not because you do have free shoes that you want to just barely run with them and then throw them away. So usually, I tend to only switch shoe when it’s barely usable anymore. So I would say, I don’t know, three, four years ago, when I was running at my running peak, I could go two pairs a month. So yeah, like 20, 25 pairs a year.
That’s incredible. Did you get a chance to break them in?
Oh yeah, of course.
Okay, so that’s how you do it. And so Denny had mentioned a race I don’t think you’ve had a chance to mention just yet, which is the Barkley Marathon. Could you share a bit more about what that race is and why it’s such a gruesome race?
Of course. So the Barkley might be the most famous race that I have run, I’m not a finisher. So it’s a race that started more than 20 years ago, actually 30 years ago, and there have only been 15 finishers of the full race. So this race is very historic, because basically the murderer of one of your presidents escaped from prison, and 60 hours later, so it’s a prison in Tennessee, in the woods, and it’s surrounded by mountains too, so the guy tried to escape the prison and he did. And then 60 hours later, he was found at only eight miles from this prison. So there is a very famous guy now in Tennessee called Lazarus Lake, his real name is Gary Cantrell, but like every movie star, he has a specific name, so we call him Lazarus Lake. He said, oh, what? I know those mountains, in 60 hours I could have done 100 miles. He was just laughing because this guy only ran eight miles.
And so that’s how the race was born. He basically defined a path in those mountains, which is 100 miles, and that you have to run in 60 hours. Except that there is no trail, basically it’s a race you go from book to book. So basically to prove that you went on the real path of the race, the guy hides books, and it’s a loop race, and every loop, you do have a bib number. So each time you arrive at a book, you tear out the page of your bib number. So you collect those pages in the woods night and day, and every loop you give him the pages to prove that you have done the course.
So this race is very difficult because there is no trail, it’s off trail, and so you have to go over fallen trees, you have to climb some little cliff and stuff like that. Plus, you do have the navigational challenge too, because it’s not marked, so you have your map and a compass, and you have to find those books in the forest.
So it’s a very cool event actually, and every year only like 40 people have the right to run it. And that’s why the process for the registration is very mysterious, you do have to know when to send a letter to a specific address, and you have to know what to write on this letter. So it’s kind of very cool challenge just to get into the race, and so once you’re into the race, it’s a challenge to run the race. So I guess it’s a challenging race.
And is the reward at the end being able to say that you’re a finisher, are there any prizes if you finish?
That’s it. No, that’s it. And so what happens is that each time there is a new finisher, the race becomes harder, because then it adds another part to it. The idea is to keep the race at the limits of human potential, so each time you are a finisher, you make the race harder, so then the next guy hates you for that. So no reward at this race.
I love the fact that basically the entire premise of this race is how we can basically continuously torture the next person trying to do the race.
Exactly. So yeah, look it up, the Barkley Marathon, there are different documentaries that have been also done about it, it’s kind of interesting.
Okay. But then how do you train for something like that and not get injured? Because first of all, you just mentioned the fact that it’s unmarked, you might be climbing cliffs. Seriously, maybe it’s just because I’m old, but I feel like just training for this thing would injure me, let alone actually doing it.
Oh, it does. It does. So actually, once you are into the race, you are allowed to join a specific Facebook group, and on this Facebook group, there is also a wait list, a wait lister. So you do have 40 people entering the race, and then you do have 50 people in the wait list. And the reality is that, because when you train for this race as you said, you might get injured, that’s why there is a wait list. And every year, 10, 15 people get injured training for it, and are replaced by a wait-lister.
So yeah, it’s a real problem, but then that’s where you have to be smart, maybe use a Banister model to make sure your training is on point, so that’s what you can do. But yeah, it could be a problem, you could get injured, so obviously, you want to do smart things preparing for this race, definitely.
Denny, I think the moral of the story today is, if we use data science, we can train smarter and avoid injury. That’s the takeaway I have.
Well, no, no, I was actually hoping that I could use data science to make myself that much better of a runner. But fair enough, I think you actually hit the point better than I did. Okay, fair enough.
And so Guillaume, what does your normal training regimen look like? Do you do any form of cross training, do you pretty much just stick to running? What does your schedule look like for training?
So I’m very bad. You know, you read everywhere and everybody will tell you it’s very good to cross train, it’s very good to do some stretching and stuff like that, I’m very bad at it. So I just run actually, I just do some abs and pushups, but that would be the most cross training that I’ve done. But before, when I was in my peak, yeah, I was doing a bit of biking, a bit of climbing, a bit of swimming too. So before, when I was younger, I was doing triathlons, so I do have this experience too, but nowadays I just run.
So what a typical week looks like? So a few years ago, again at my peak, I was waking up around 4.30 to 5, go for two hours or three hours of running. And so I was seeing this beautiful sunrise every morning, taking a picture, it was beautiful, and then was going to the lab. Nowadays, my life is a bit busier, I mean work wise, so I usually go after work. We do live in Bordeaux, so it’s very flat, and we do have several bridges around the Garonne, which is a river in Bordeaux. So there is this very famous run which is called Around the Bridges, so you go from a bridge to another, and it’s very beautiful, so I do that after work. So I try to run four to five times a week basically, I allow myself two days off. And during the weekend, I try to enjoy as much the free time that I have during the weekend to go for a longer run. It could be in the Pyrenees, so south of Bordeaux, or in the little mountains nearby.
Definitely the volume, when you go for those kind of races, doing a lot of volume can help you. Obviously, you don’t want to overdo it. But yeah, what’s important is consistency, so if you can run four times a week instead of just one time a week where you do the same distance, that would be better to split it up a bit. So that would be my advice.
I feel there’s almost like a data science ML model segue in all this, you keep on talking about running and then splitting it up, like partitioning your data, or you’re basically doing model training, so you’re training every day before you go to your production of the Pyrenees where you actually apply the training for your runs. I feel that there’s an analogy that we’re missing here somewhere in the process.
Yeah, exactly. This is like a good project, you have the build phase, then you do have the run phase in production. I totally agree, I think you do have a point Denny.
So then we’re going to segue back, I know we asked this before, but still, it’s amazing the level of effort that you have to do in order to be able to maintain this balance of the work that you’re doing and the running that you’re doing. Again, there’s that split that you’re doing, but how are you able to get that balance? I guess that’s more or less the question I’m trying to ask. Because it’s difficult, to be able to be top of your game in data science and top of your game when it comes to running, outside of the analogy.
No, definitely. More seriously, yes, it’s very challenging because you also have family commitments. I have a very understanding girlfriend, that’s for sure. She doesn’t see me a lot during the week, because I do long hours at work, and then I have to go for a run, so she sees me even less. So yes, that’s part of how I can do that, is that people around me are very supportive of those different passions. But I totally agree that it can be a challenge and I don’t have kids or anything like that, but I know that I have friends that are doing the same thing that I do and they also have more responsibility to other family, because they do have kids. So I really don’t know how they do it because I know what it involves.
So the point you are talking about is very key in doing that kind of stuff, you need to be able to balance your work, your family, and then your running. And so obviously, you will have to make compromises in some of those areas, because time is incompressible. People will handle that differently, I’m very lucky that my family is very supportive in that, so that’s how I do it. It’s basically thanks to other people.
I was just saying, it’s a great analogy from the standpoint that you need a support team, you need a team of people for your data science team, to support you for everything. So I’m playing up the segues way too much, back to you Brooke.
No, I could add that also for the running at those big events, you do have a crew of people also helping you. So yeah, this sport, it looks like a solo sport, but it’s actually really a team effort every time. So I totally follow you on that, it’s like a good data science team. If you want to go for the stars and deploy some new Delta Lake somewhere on an arcane cloud cluster, you need a team. You need DevOps, you need data scientists, you need software engineers, so totally agree on that.
Exactly, it’s all hands on deck for the go live. Once it’s successful, that’s when you take your PTO, that’s when you take your week off from running.
And so I know you’ve got a race tomorrow, can you share a bit more about what that race is?
Yes. So actually two weeks ago, a guy contacted me on Messenger. Didn’t know that guy, but he was like, hey, I heard you’re doing Barkley in a few months this year again, and I’m actually organizing this little race in my garden. So the guy has different vineyards and it’s a 24-hour race, and he just asked me if I wanted to join. That’s also what’s very cool with running, is that people are not afraid, like you might be a very good runner, people are not afraid to contact you, it’s a huge community. So that’s the same as in data science, if you need help, if you need more information about something, you can still reach out to someone and get an answer.
So this guy contacts me and I’m very glad he did, because his race sounds very fun, it’s a 24-hour race. It’s basically the same system as the Last Man Standing race, except that even if you are the last one, you can continue running, the race just stops at 24 hours. But it’s a loop with some elevation in the vineyards around Bordeaux, and it’s going to be fun. Starts tomorrow at 11:00 AM.
Well, best of luck with that Guillaume. And I love how you just gave the analogy of the Stack Overflow of data science to the huge community of runners. And so just to close out the session, one last question for you, what advice or tips do you have for people that want to get into either the field of data science or into the field of running?
So for data science, I would say that obviously, if you are very new to the field, you might want to go read some blogs and go through some tutorials, but that would be very, very boring I think, because it’s always the same dataset, and usually, the datasets that you get are already cleaned up. And in reality, in the real world, datasets are not very clean, so you need to clean your dataset, extract your features, do the full-feature modeling, et cetera.
So it’s better to start with a little project of your own. For example, I don’t know, nowadays we do have data coming from everywhere in our house, it could be the temperature, it could be the cycle of your refrigerator or whatever. And you could say, okay, I want to make a prediction of, I don’t know, how I could optimize my energy expenditure based on that little bit of data. So you start to collect that data, you open your computer, write a bit of Python, and then you fire up [inaudible 00:39:37], and here you go. So data science is more fun when you are actually dealing with datasets that you love and that can basically help you, instead of dealing with datasets like airplane data or whatever, or something that is very outside of your life.
So yeah, the tips for data science, start with a little project that you care about and that matters to you, that’s always fun.
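As a rough illustration of the kind of home project described here, the following is a minimal sketch that cleans a few raw readings and fits a straight line to them. Everything in it is an assumption made for the example: the readings are invented, the names `raw_readings`, `clean`, `fit_line`, and `predict` are hypothetical, and a simple least-squares line stands in for whatever model and tooling one would really choose.

```python
from statistics import mean

# Hypothetical raw readings: (outdoor temperature in °C, daily energy use in kWh).
# The values are made up for illustration; None marks a missed sensor reading.
raw_readings = [
    (2.0, 18.0),
    (5.0, 15.0),
    (None, 14.0),
    (8.0, 12.0),
    (11.0, None),
    (14.0, 6.0),
]


def clean(readings):
    """Drop rows where either value is missing."""
    return [(t, e) for t, e in readings if t is not None and e is not None]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, temperature):
    """Estimate energy use (kWh) for a given temperature (°C)."""
    slope, intercept = model
    return slope * temperature + intercept


cleaned = clean(raw_readings)   # step 1: clean the dataset
model = fit_line(cleaned)       # step 2: fit a simple model
```

Calling `predict(model, 10.0)` then gives an estimated energy use for a 10 °C day. With real sensor data, the cleaning step is usually where most of the effort goes.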
And then for running, I would say go one step at a time. Running, if you want to go too long, too early, could be very, very painful. So just start with a run around the block, and don’t do that every day, maybe every third day or something like that. And then if you want to go longer, use running not as a way to lose weight or for something like that, but use it as a way to discover the world, to discover what’s around you, or to meet people. Keep a positive attitude, and you will also feel better and more motivated to go for your run.
And I think one word that could link together both of the pieces of advice you gave Guillaume, is find something that you’re passionate about. So find a project you’re passionate about, do some data analysis there. For running, find whatever it is that motivates you about running and really find your passion there.
Totally agree on that.
So I just want to say thank you so much Guillaume, this was such a fun session, always great catching up with you. And thanks again for joining us on Data Brew.
Thank you very much to you guys for the invitation, it was very fun talking to you about those passions. And stay connected and stay data sciencey.