In our second episode of Breaking 404, we caught up with Johan Andersen, Engineering Director, Citadel (former Google SRE Manager), to understand the best practices of managing distributed systems as well as distributed systems engineers.
Arbaz: Hello everyone and welcome to the second episode of Breaking 404 by HackerEarth, a podcast for all engineering enthusiasts, professionals, and leaders to learn from top influencers in the engineering and technology industry. This is your host Arbaz and today I have with me Johan Andersen, an ex-Googler (or Xoogler as they call it) and currently the Director of Engineering at Citadel, a global financial institution headquartered in Chicago, United States, with offices throughout North America, Asia, and Europe.
Johan: Hey Arbaz! I’m glad to be here. Thanks for inviting me.
Arbaz: So let’s get this episode up and running by giving our audience a little sneak peek into your professional journey. So what has your professional journey been like?
Johan: Varied. I started as a systems and networks engineer in academia, worked at a university for ~5 years trying to build cheap versions of products that were too expensive to buy. I moved from there into finance, worked as an IT security engineer at a major investment bank. I learned a lot about large systems and how to work effectively across teams, as security was supposed to be a part of any large project being developed at the bank. Switched to Google around 2009, and stayed there for 10 years working on a wide variety of applications and infrastructure projects. I learned a lot about SRE best practices and scaling things both for traffic and for an operational load. Last year I moved to Citadel to help drive the SRE team here and spread some of that culture.
Arbaz: Now that our audience knows you much better, it’s time to get into the technicalities. You have previously been a Senior Engineering Manager at a tech giant, Google and now you are with Citadel, a top company in the financial space. What was the biggest challenge for you during this transition? As in how different has your experience been working in the engineering teams of two different industries (Tech and FinTech)?
Johan: The major change I noticed was more related to the relative sizes of the two companies. When I left Google, there were roughly 100x as many full-time engineers as there are at Citadel. This means that it’s much faster to get effective changes rolled out, but that there’s less pre-built infrastructure for teams to leverage. So more work is spent on establishing best practices, but also it’s easier to get consensus on those practices and get them into production. Another change was the difference in the regulatory climate between the two. Google had lots of regulations on safeguarding user privacy and data, but fewer concerns around things like retaining communications and desktop technology. I really miss Google Docs.
Arbaz: Well, Google Docs is very close to everyone using the GSuite globally, so we can totally understand your pain here. Moving on, it’s said that as one grows as a professional they tend to develop a greater fear of things going wrong. So what is the biggest fear that you have, being the Director (Engineering) at Citadel?
Johan: My biggest fears are not really Citadel-specific; anyone building an engineering organization today has to think about them. One is the competition for talent: strong engineers have never been in more demand than they are today. Another is keeping up with a changing ecosystem. SRE in particular, being partnered with multiple engineering teams, really ends up having to have a breadth-first approach to learning, and ends up being a conduit for best practices throughout the organization. Finally, and somewhat topical, is preparing for the unexpected. How well have you load tested the services you use to support remote workers? With the recent news, a number of “baseline assumptions” around both technology and support models are being tested.
Arbaz: Very well said, Johan. All the 3 points here are bang on point and very relatable for all those working in engineering teams globally. And as you rightly pointed out, the competition for talent is fierce and it’s really important for all companies to build great engineering teams. We, at HackerEarth, are proud to help companies in getting top technical talent. Just deviating a little from your professional life and getting to know you more as a person, what would be your favorite leisure-time activity that you love to do when not working?
Johan: I read a lot of science fiction. I play some video games. I like to sail, but don’t get a chance very often! And I like to bicycle around New York City.
Arbaz: That’s really interesting. A mix of reading, playing video games and sailing is a pretty unique combination. I believe having a hobby is much needed for everyone to keep calm and motivated. Now that we are talking about hobbies and interests, we often see engineers (at least I do at HackerEarth) lost in their laptop/computer screens, writing lengthy code. All this while, they have their headphones plugged in, listening to music. What songs or music genre best describes your work ethic?
Johan: Wow, this is tough for me! Maybe classical, Baroque stuff that moves quickly through different movements. My day is rapidly changing, and I like to think it has a similar underlying order.
Arbaz: Coming back to Johan, the Engineer at work, considering the current scenario around the COVID-19 outbreak where companies have asked their employees to work remotely, what do you think is the biggest problem/challenge with remote work for an engineering team?
Johan: I think the biggest challenge nowadays is kind of maintaining the sense of team and camaraderie that you had during normal operations. It’s really easy in an environment where you spend all of your time at home and only communicate via instant messenger or email or the occasional video conference to get lost in work and to not have a good way to separate your personal time from your work time. It’s really important for leaders to reach out to the people on their teams to make sure that people are doing well in their assigned projects and also in their home lives and try to make accommodations as this is a challenging period for everyone.
Arbaz: The outbreak is pretty serious and we don’t know when it’s gonna end. Wishing all our listeners the best of health and please stay safe. Now comes a question that I love asking all my guests on this podcast. Code quality and technical debt are two terms that we often hear from engineering leaders. Keeping them in mind, how do you maintain a balance of technical stability (minimize technical debt) while still delivering high-quality code?
Johan: This is a really great question. A lot of people think that SRE is the team that exists to say no. And certainly, no system is more stable and reliable than one that never changes. But systems like that are seldom very useful. I think that SRE exists to make changes as easy and fast as possible without the wheels flying off the car. So if you have robust testing, a release system that lets you canary changes effectively, and can roll back changes quickly and easily, deliver as fast as you can. It’s only when you start to see gaps in these areas that SRE starts to recommend being more cautious. I’d much rather own a project where we made frequent changes through a well-understood and tested process than own a more “stable” service that only released quarterly.
A colleague of mine who I respect a lot once said that a service can’t have “haunted graveyards”. If there’s a place or thing you’re afraid to touch or change, it is your responsibility to exercise it until it is understood well enough to change it safely. Otherwise, you risk having to make such a change when you are least prepared to do so, under fire.
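As a rough illustration of the canary-and-rollback idea Johan describes, here is a minimal Python sketch of a gate that compares a canary release against the current baseline. Everything in it is an assumption made for illustration and not something from the conversation: the `ReleaseStats` type, the `canary_decision` function, the error-rate comparison, and the default thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    """Request counts observed for one version of a service (hypothetical type)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # A version that has served no traffic is treated as having a 0% error rate.
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(baseline: ReleaseStats,
                    canary: ReleaseStats,
                    max_extra_error_rate: float = 0.01,
                    min_requests: int = 100) -> str:
    """Return "wait", "rollback", or "promote" for a canary release.

    The thresholds are illustrative assumptions, not recommended values.
    """
    # Too little canary traffic to judge: keep observing.
    if canary.requests < min_requests:
        return "wait"
    # The canary is noticeably worse than the baseline: roll back.
    if canary.error_rate - baseline.error_rate > max_extra_error_rate:
        return "rollback"
    # Otherwise the change looks safe to roll out further.
    return "promote"


if __name__ == "__main__":
    baseline = ReleaseStats(requests=10_000, errors=50)
    print(canary_decision(baseline, ReleaseStats(requests=500, errors=3)))
    print(canary_decision(baseline, ReleaseStats(requests=500, errors=25)))
    print(canary_decision(baseline, ReleaseStats(requests=20, errors=0)))
```

A real release system would likely look at more signals than error rate (latency, resource usage, business metrics), but the shape is the same: a cheap, automatic check that makes frequent releases safe to attempt and quick to undo.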
Arbaz: With all the new advancements in technology, the introduction of concepts like Machine Learning and Artificial Intelligence, how do you see the technical landscape changing over the next few years and how will you prepare engineering for that?
Johan: Oh, man, I’m terrible at this. I mean, obviously, things continue to move to the cloud. Maintaining either expensive on-premises data centers or expensive offices for engineers is going to be seen as more of an unnecessary cost. It’s currently justified by “We’ve always done it this way” or “there’s no other way to meet our regulatory burdens”, but if you imagine starting a new business today, how much would you invest in building your own on-premises services for things like the SDLC, authentication, email, etc.? I know I’m not saying anything novel or profound here, but trying to keep a grasp on what the state of the market is and moving away from capital-intensive “build our own” is probably what’s in the cards for anyone not in a gigantic cloud provider.
Arbaz: Taking you back in time, around 20-25 years, just a fun question here, what was the first programming language you started to code in?
Johan: I did some Pascal very early on, and my first “real” code was in C.
Arbaz: A few minutes ago we talked about getting the right talent for building a strong and performing engineering team. What, according to you, is the most challenging part of any technical assessment/interview from a Hiring Manager’s perspective?
Johan: Assessment of a new system, or of an engineer? For a system, probably trying to determine dependencies as well as establish the best SLIs.
For a person, I am most interested in trying to evaluate how well they learn, rather than their expertise in any particular technology. Technology, as you talked about before, is constantly changing, and a good engineer will be able to pick up what is needed. This means less “what are the various list functions in Python” and more “in whatever language you are familiar with, help me solve this general problem”.
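On the systems side of that answer, a service level indicator (SLI) is commonly expressed as a ratio of good events to total events. The Python sketch below shows that convention for availability and latency. The function names, the sample numbers, and the 300 ms threshold are assumptions chosen for illustration; Johan did not describe any specific SLI.

```python
from typing import Sequence


def ratio_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that were "good"; 1.0 when there were no events."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def latency_sli(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Fraction of requests served at or under the latency threshold."""
    fast = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return ratio_sli(fast, len(latencies_ms))


if __name__ == "__main__":
    # Availability: 999 successful requests out of 1000.
    print(ratio_sli(999, 1000))
    # Latency: share of requests answered within a hypothetical 300 ms threshold.
    print(latency_sli([100, 200, 300, 400], threshold_ms=300))
```

Choosing which events count as "good", and where to measure them, is the hard part Johan alludes to; the arithmetic itself is simple.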
Arbaz: If not engineering, what alternate profession would you have seen yourself excel in?
Johan: Maybe teaching? I really enjoy explaining how things work to people and working through problems with them. You learn a thing best by teaching it to others.
Arbaz: Finally, before we sign off and end today’s episode, one last question for you: what would be your one tip for all Developers, Engineering Managers, VPs, and Directors of Engineering for being the best at what they do?
Johan: One piece of advice for everyone? Probably something about humility/willingness to listen. People who ask questions or express reservations aren’t attacking you. It’s easy, especially for leaders, but even smart engineers on teams, to be super-confident in your own abilities and ideas. But even if you ARE 100% right, if you are shooting other people down when they bring you questions or concerns, it trains the people around you to not bring you new information, and that’s the last thing someone who is trying to be the best wants.
Arbaz: It was a pleasure having you as a part of today’s episode, Johan. It was really informative and insightful to hear from a leader like yourself.
Johan: Arbaz, it was a real pleasure. Thank you for inviting me. I had a really good time.
Arbaz: This brings us to the end of today’s episode of Breaking 404. Stay tuned for more such awesome enlightening episodes. Don’t forget to subscribe to our channel ‘Breaking 404 by HackerEarth’ on iTunes, Spotify, Google Podcasts, SoundCloud and TuneIn. This is Arbaz, your host signing off until next time. Thank you so much, everyone!
About Johan Andersen:
Johan Andersen is an engineering leader with broad experience creating and developing teams that focus on improving the reliability and scalability of large distributed systems. Johan is currently an Engineering Director at Citadel, where he manages several teams with responsibility for the middle and back-office operations for the firm. Before that, he was a senior SRE manager at Google, where he worked on a wide variety of infrastructure and application teams from Storage to Docs to Search Indexing. Prior to Google, Johan led the Security Architecture team at Morgan Stanley. He has a BS and MS in Computer Science from Columbia University.