Caitlin talks to a Site Reliability Engineer
May 21, 2010
Hi guys! Allow me to introduce my buddy Marc. As a Site Reliability Engineer, he and his team are responsible for keeping Google going, rain or shine, 24-7! You can read all about it below - but first...We want your help! Who should Caitlin talk to next? Want to learn more about specific Google roles and offices? What questions would you ask an engineer? Use the ‘comments’ field to submit ideas, or email cttae@google.com!

Caitlin: Hey there Marc! So, how long have you been at Google?
Marc: It’ll be 4 years in July. I started out in the Mountain View office, then moved to San Francisco. I originally interviewed for a job in the Cambridge, MA office, but then this position opened up and I’ve been living in sunny California ever since!
C: What are you working on right now and why is it cool?
M: I’m a member of the Site Reliability Engineering team, otherwise known as Software Engineer – Google.com. There are Site Reliability Engineers (or SREs) working on most Google applications, but my team specifically deals with mobile. We’re responsible for a number of Google’s mobile properties including Ads and Mobile Search. Any time you’re using Search on your phone, it’s my team responsible for keeping the train running on time. We release new versions of the Mobile Search front end—and we’re rapidly expanding as new apps grow in popularity and stabilize on the backend. We really are growing...I’ve seen my team triple in the past year!
C: So, are you one of those engineers who is ‘on call’?
M: Yes—but not all the time. All SREs spend a period of time being ‘on call’—monitoring various applications and responding immediately if something goes wrong. My team, like most, is split into two shifts—we have a group in California working daytime in the US and other groups in London, Sydney, Taipei, and Tokyo working daytime in Europe, Asia and Australia. So, on call works out rather conveniently. In general, you’re on call one out of every four to six weeks.
C: I’ve heard some stories about you SREs. Apparently, you’re a pretty wacky bunch.
M: Yeah...we have a few characters on the team! I had an interesting experience yesterday, actually. So, if something goes wrong when you’re on call, your mobile phone or pager will go off. This alerts you to get on your computer and check everything out. After a few rounds of on call, you become conditioned to respond to your mobile phone automatically, without even thinking about it. I have a few very ‘clever’ SRE friends who decided to take advantage of this. One of them got married a few days ago and decided to send me some videos from the wedding...by emailing them to my work pager inbox at 4:00 AM. So, I woke up in the middle of the night thinking something was wrong with one of our services, and it turned out to be happy wedding videos!
C: Wow. Sounds like you guys are used to being on your toes.
M: Yeah, you’re on your toes a lot and it’s always really interesting. When you’re paged, it’s never because of a simple problem. We take care of those with automation. So, when your pager goes off, you know it’s something important and really challenging. SREs do a lot of fairly high-pressure troubleshooting and critical thinking on the spot. It makes for some exciting work.
C: What do you think is the coolest thing happening in the tech world right now?
M: I put a lot of thought into this question a few years ago when I transferred to the mobile SRE team. Mobile is a really fast-moving, interesting area of development that I wanted to be involved in. I wanted to help make sure that people out there in the world have access to information when and where they need it and, these days, that’s all being done through mobile development.
C: What do you do when you’re not being an SRE?
M: I write a few Android apps in my spare time—one of them is an open source application which assists other engineers who are on call. It’s called Klaxon. I also enjoy juggling and spending sunny afternoons in Dolores Park.
C: And finally, what is the sound of one hand clapping?
M: I'd imagine that the sound of one hand clapping is a lot like the sound of my pager not going off at 4:00 AM!
C: That’s incredibly zen. Thanks for talking to me, Marc!
M: No problem!

Marc and Caitlin sit in front of Google London’s new photomosaic wall. Check it out on the Official Google Blog!