The Podlets: Learning Distributed Systems (Ep 12)

castrojo · January 13, 2020, 4:41pm

In this episode of The Podlets Podcast, we welcome Michael Gasch from VMware to join our discussion on the necessity (or not) of formal education in working in the realm of distributed systems. There is a common belief that studying computer science is a must if you want to enter this field, but today we talk about the various ways in which individuals can teach themselves everything they need to know. What we establish, however, is that you need a good dose of curiosity and craziness to find your feet in this world, and we discuss the many different pathways you can take to fully equip yourself. Long gone are the days when you needed a degree from a prestigious school: we give you our hit-list of top resources that will go a long way in helping you succeed in this industry. Whether you are someone who prefers learning by reading, attending Meetups or listening to podcasts, this episode will provide you with lots of new perspectives on learning about distributed systems.

Follow us: https://twitter.com/thepodlets

Website: https://thepodlets.io

Feeback:

info@thepodlets.io

https://github.com/vmware-tanzu/thepodlets/issues

Hosts:

Key Points From This Episode:

• Introducing our new host, Michael Gasch, and a brief overview of his role at VMware.
• Duffie and Carlisia’s educational backgrounds and the value of hands-on work experience.
• How they first got introduced to distributed systems and the confusion around what it involves.
• Why distributed systems are about more than simply streamlining communication and making things work.
• The importance and benefit of educating oneself on the fundamentals of this topic.
• Our top recommended resources for learning about distributed systems and their concepts.
• The practical downside of not having a formal education in software development.
• The different ways in which people learn, index and approach problem-solving.
• Ensuring that you balance reading with implementation and practical experience.
• Why it’s important to expose yourself to discussions on the topic you want to learn about.
• The value of getting different perspectives around ideas that you think you understand.
• How systems thinking is applicable to things outside of computer science.
• The various factors that influence how we build systems.

Quotes:

“When people are interacting with distributed systems today, or if I were to ask like 50 people what a distributed system is, I would probably get 50 different answers.” — @mauilion [0:14:43]

“Try to expose yourself to the words, because our brains are amazing. Once you get exposure, it’s like your brain works in the background. All of a sudden, you go, ‘Oh, yeah! I know this word.’” — @carlisia [0:14:43]

“If you’re just curious a little bit and maybe a little bit crazy, you can totally get down the rabbit hole in distributed systems and get totally excited about it. There’s no need for having formal education and the degree to enter this world.” — @embano1 [0:44:08]

Learning resources suggested by the hosts:

Book, Designing Data-Intensive Applications, M. Kleppmann
Book, Distributed Systems, M. van Steen and A.S. Tanenbaum (free with registration)
Book, Thesis on Raft, D. Ongaro. - Consensus - Bridging Theory and Practice (free PDF)
Book, Enterprise Integration Patterns, B.Woolf, G. Hohpe
Book, Designing Distributed Systems, B. Burns (free with registration)
Video, Distributed Systems
Video, Architecting Distributed Cloud Applications
Video, Distributed Algorithms
Video, Operating System - IIT Lectures
Video, Intro to Database Systems (Fall 2018)
Video, Advanced Database Systems (Spring 2018)
Paper, Time, Clocks, and the Ordering of Events in a Distributed System
Post, Notes on Distributed Systems for Young Bloods
Post, Distributed Systems for Fun and Profit
Post, On Time
Post, Distributed Systems @The Morning Paper
Post, Distributed Systems @Brave New Geek
Post, Aphyr’s Class materials for a distributed systems lecture series
Post, The Log - What every software engineer should know about real-time data’s unifying abstraction
Post, Github - awesome-distributed-systems
Post, Your Coffee Shop Doesn’t Use Two-Phase Commit
Podcast, Distributed Systems Engineering with Apache Kafka ft. Jason Gustafson
Podcast, The Systems Bible - The Beginner’s Guide to Systems Large and Small - John Gall
Podcast, Systems Programming - Designing and Developing Distributed Applications - Richard Anthony
Podcast, Distributed Systems - Design Concepts - Sunil Kumar

Links Mentioned in Today’s Episode:

The Podlets on Twitter — https://twitter.com/thepodlets

Michael Gasch on LinkedIn — https://de.linkedin.com/in/michael-gasch-10603298

Michael Gasch on Twitter — https://twitter.com/embano1

Carlisia Campos on LinkedIn — https://www.linkedin.com/in/carlisia

Duffie Cooley on LinkedIn — https://www.linkedin.com/in/mauilion

VMware — https://www.vmware.com/

Kubernetes — https://kubernetes.io/

Linux — https://www.linux.org

Brian Grant on LinkedIn — https://www.linkedin.com/in/bgrant0607

Kafka — https://kafka.apache.org/

Lamport Article — https://lamport.azurewebsites.net/pubs/time-clocks.pdf

Designing Date-Intensive Applications — https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable-ebook/dp/B06XPJML5D

Designing Distributed Systems — https://www.amazon.com/Designing-Distributed-Systems-Patterns-Paradigms/dp/1491983647

Papers We Love Meetup — https://www.meetup.com/papers-we-love/

The Systems Bible — https://www.amazon.com/Systems-Bible-Beginners-Guide-Large/dp/0961825170

Enterprise Integration Patterns — https://www.amazon.com/Enterprise-Integration-Patterns-Designing-Deploying/dp/0321200683

Transcript:

EPISODE 12

[INTRODUCTION]

[0:00:08.7] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your cloud native journey. This space moves fast and we shouldn’t reinvent the wheel. If you’re an engineer, operator or technically minded decision maker, this podcast is for you.

[EPISODE]

[00:00:41] CC: Hi, everybody. Welcome back. This is Episode 12, and we are going to talk about distributed systems without a degree or even with a degree, because who knows how much we learn in university. I am Carlisia Campos, one of your hosts. Today, I also have Duffie Cooley. Say hi, Duffie.

[00:01:02] DC: Hey, everybody.

[00:01:03] CC: And a new host for you, and this is such a treat. Michael Gasch, please tell us a little bit of your background.

[00:01:11] MG: Hey! Hey, everyone! Thanks, Carlisia. Yes. So I’m new to the show. I just want to keep it brief because I think over the show we’ll discuss our backgrounds a little bit further. So right now, I’m with VMware. So I’ve been with VMware almost for five years. Currently, I'm in the office of the CTO. I’m a platform architect in the office of the CTO and I mainly use Kubernetes on a daily basis from an engineering perspective. So we build a lot of prototypes based on customer input or ideas that we have, and we work with different engineering teams.

Kurbernetes has become kind of my bread and butter but lately more from a consumer perspective like developing with Kurbenetes or against Kubernetes, instead of the formal ware of mostly being around implementing and architecting Kubernetes.

[00:01:55] CC: Nice. Very impressive. Duffie?

[00:01:58] MG: Thank you.

[00:01:59] DC: Yeah.

[00:02:00] CC: Let’s give the audience a little bit of your backgrounds. We’ve done this before but just to frame the episodes, so people will know how we come in as distributed systems.

[00:02:13] DC: Sure. In my experience, I spent – I don’t have a formal education history. I spent most of my time kind of like in a high school time. Then from there, basically worked into different systems administration, network administration, network architect, and up into virtualization and now containerization. So I’ve got a pretty hands-on kind of bootstrap experience around managing infrastructure, both at small-scale, inside of offices, and the way up to very large scale, working for some of the larger companies here in the Silicon Valley.

[00:02:46] CC: All right. My turn I guess. So I do have a computer science degree but I don’t feel that I really went deep at all in distributed systems. My degree is also from a long time ago. So mainly, what I do know now is almost entirely from hands-on work experience. Even so, I think I'm very much lacking and I’m very interested in this episode, because we are going to go through some great resources that I am also going to check out later. So let’s get this party started.

[00:03:22] DC: Awesome. So you want to just talk about kind of the general ideas behind distributed systems and like how you became introduced to them or like where you started in that journey?

[00:03:32] CC: Yeah. Let’s do that.

[00:03:35] DC: My first experience with the idea of distributed systems was in using them before I knew that they were distributed systems, right? One of the very first distributed systems as I look back on it that I ever actually spent any real time with was DNS, which I consider to be something of a distributed system. If you think about it, they have name servers, they have a bunch of caching servers. They solve many of the same sorts of problems.

In a previous episode, we talked about how networking, just the general idea of networking and handling large-scale architecting networks. It’s also in a way very – has a lot of analogues into distributed systems. For me, I think working with and helping solve the problems that are associated with them over time gave me a good foundational understanding for when we were doing distributed systems as a thing later on in my career.

[00:04:25] CC: You said something that caught my interest, and it’s very interesting, because obviously for people who have been writing algorithms, writing papers about distributed systems, they’re going to go yawning right now, because I’m going to say the obvious.

As you start your journey programming, you read job requirements. You read or you must – should know distributed systems. Then I go, “What is distributed system? What do they really mean?” Because, yes, we understand apps stuck to apps and then there is API, but there’s always for me at least a question at the back of my head. Is that all there is to it? It sounds like it should be a lot more involved and complex and complicated than just having an app stuck on another app.

In fact, it is because there are so many concepts and problems involved in distributed systems, right? From timing, clock, and sequence, and networking, and failures, how do you recover. There is a whole world in how do you log this properly, how do you monitor. There’s a whole world that revolves around this concept of systems residing in different places and [inaudible 00:05:34] each other.

[00:05:37] DC: I think you made a very good point. I think this is sort of like there’s an analog to this in containers, oddly enough. When people say, “I want a container within and then the orchestration systems,” they think that that's just a thing that you can ask for. That you get a container and inside of that is going to be your file system and it’s going to do all those things. In a way, I feel like that same confusion is definitely related to distributed systems.

When people are interacting with distributed systems today or if I were to ask like 50 people what a distributed system is, I would probably get 50 different answers. I think that you got a pretty concise definition there in that it is a set of systems that intercommunicate to perform some function. It’s like found at its base line. I feel like that's a pretty reasonable definition of what distributed systems are, and then we can figure out from there like what functions are they trying to achieve and what are some of the problems that we’re trying to solve with them.

[00:06:29] CC: Yeah. That’s what it’s all about in my head is solving the problems because at the beginning, I was thinking, “Well, it must be just about communicating and making things work.” It’s the opposite of that. It’s like that’s a given. When a job says you need to understand about distributed systems, what they are really saying is you need to know how to deal with failures, not just to make it work. Make it work is sort of the easy part, but the whole world of where the failures can happen, how do you handle it, and that, to me is what needing to know distributed system comes in handy.

In a couple different things, like at the top layer or 5% is knowing how to make things work, and 95% is knowing how to handle things when they don’t work, because it’s inevitable.

[00:07:19] DC: Yeah, I agree. What do you think, Michael? How would you describe the context around distributed systems? What was the first one that you worked with?

[00:07:27] MG: Exactly. It’s kind of similar to your background, Duffie, which is no formal degree or education on computer science right after high school and jumping into kind of my first job, working with computers, computer administration.

I must say that from the age of I think seven or so, I was interested in computers and all that stuff but more from a hardware perspective, less from a software development perspective. So my take always was on disassembling the pieces and building my own computers than writing programs. In the early days, that just was me.

So I completely almost missed the whole education and principles and fundamentals of how you would write a program for a single computer and then obviously also for how to write programs that run across a network of computers. So over time, as I progress on my career, especially kind of in the first job, which was like seven years of different Linux systems, Linux administrations, I kind of – Like you, Duffie, I dealt with distributed systems without necessarily knowing that I'm dealing with distributed systems. I knew that it was mostly storage systems, Linux file servers, but distributed file servers. Samba, if some of you recall that project.

So I knew that things could fail. I know it could fail, for example, or I know it could not be writable, and so a client must be stuck but not necessarily I think directly related to fundamentals of how distributed systems work or don’t work. Over time, and this is really why I appreciate the Kubernetes project in community, I got more questions, especially when this whole container movement came up. I got so many questions around how does that thing work. How does scheduling work? Because scheduling kind of was close to my interest in the hardware design and low-level details. But I was looking at Kubernetes like, “Okay. There is the scheduler.”

In the beginning, the documentation was pretty scarce around the implementation and all the control as for what’s going on. So I had to – I listen to a lot of podcasts and Brian Grant’s great talks and different shows that he gave from the Kubernetes space and other people there as well.

In the end, I had more questions than answers. So I had to dig deeper. Eventually, that led me to a path of wanting to understand more formal theory behind distributed systems by reading the papers, reading books, taking some online classes just to get a basic understanding of those issues. So I got interested in results scheduling in distributed systems and consensus. So those were two areas that kind of caught my eyes like, “What is it? How do machines agree in a distributed system if so many things can go wrong?”

Maybe we can explore this later on. So I’m going to park this for a bit. But back to your question, which was kind of a long-winded answer or a road to answering your question, Duffie. For me, a distributed system is like this kind of coherent network of computer machines that from the outside to an end-user or to another client looks like one gigantic big machine that is [inaudible 00:10:31] to run as fast. That is performing also efficient. It constitutes a lot of characteristics and properties that we want from our systems that a single machine usually can’t handle. But it looks like it's a big single machine to a client.

[00:10:46] DC: I think that – I mean, it is interesting like, I don’t want to get into – I guess this is probably not just a distributed systems talk. But obviously, one of the questions that falls out for me when I hear that answer is then what is the difference between a micro service architecture and distributed systems, because I think it's – I mean, to your point, the way that a lot of people work with the app to learn to develop software, it’s like we’re going to develop a monolithic application just by nature. We’re going to solve a software problem using code.

Then later on, when we decide to actually scale this thing or understand how to better operate it under a significant load, then we started thinking about, “Okay. Well, how do we have to architect this differently in such a way that it can support that load?”

That’s where I feel like the beams cut across, right? We’re suddenly in a world where you’re not only just talking about microservices. You’re also talking about distributed systems because you’re going to start thinking about how to understand transactionality throughout that system, how to understand all of those consensus things that you're referring to. How do they affect it when I add mister network in there? That’s cool.

[00:11:55] MG: Just one comment on this, Duffie, which took me a very long time to realize, which is coming – From my definition of what a distributed system is like this group of machines that they perform work in a certain sense or maybe even more abstracted like at a bunch of computers network together.

What I kind of missed most of the time, and this goes back to the DNS example that you gave in the beginning, was the client or the clients are also part of this distributed system, because they might have caches, especially in DNS. So you always deal with this kind of state that is distributed everywhere. Maybe you don't even know where it kind of is distributed, and the client kind of works with a local stale data.

So that is also part of a distributed system, and something I want to give credit to the Kafka community and some of the engineers on Kafka, because there was a great talk lately that I heard. It’s like, “Right. The client is also part of your distributed system, even though usually we think it's just the server. That those many server machines, all those microservices.” At least I missed that a long time.

[00:12:58] DC: You should put a link to that talk in our [inaudible 00:13:00]. That would be awesome. It sounds great. So what do you think, Carlisia?

[00:13:08] CC: Well, one thing that I wanted to mention is that Michael was saying how he’s been self-teaching distributed systems, and I think if we want to be competent in the area, we have to do that. I’m saying this to myself even.

It’s very refreshing when you read a book or you read a paper and you really understand the fundamentals of an aspect of distributed system. A lot of things fall into place in your hands. I’m saying this because even prioritizing reading about and learning about the fundamentals is really hard for me, because you have your life. You have things to do. You have the minutiae in things to get done. But so many times, I struggle.

In the rare occasions where I go, “Okay. Let me just learn this stuff trial and error,” it makes such a difference. Then once you learn, it stays with you forever. So it’s really good. It’s so refreshing to read a paper and understand things at a different level, and that is what this episode is. I don’t know if this is the time to jump in into, “So there are our recommendations.” I don't know how deep, Michael, you’re going to go. You have a ton of things listed. Everything we mention on the show is going to be on our website, on the show notes. So nobody needs to be necessarily taking notes.

Anything thing I wanted to say is it would be lovely if people would get back to us once you listened to this. Let us know if you want to add anything to this list. It would be awesome. We can even add it to this list later and give a shout out to you. So it’d be great.

[00:14:53] MG: Right. I don’t want to cover this whole list. I just wanted to be as complete as possible about a stuff that I kind of read or watched. So I just put it in and I just picked some highlights there if you want.

[00:15:05] CC: Yeah. Go for it.

[00:15:06] MG: Yeah. Okay. Perfect. Honestly, even though not the first in the list, but the first thing that I read, so maybe from kind of my history of how I approach things, was searching for how do computers work and what are some of the issues and how do computers and machines agree. Obviously, the classic paper that I read was the Lamport paper on “Time, Clocks, and the Ordering of Events in a Distributed System”.

I want to be honest. First time I read it, I didn’t really get the full essence of the paper, because it doesn't prove in there. The mathematic proof for me didn't click immediately, and there were so many things and concepts and physics and time that were thrown at me where I was looking for answers and I had more questions than answers. But this is not to Leslie. This is more like by the time I just wasn't prepared for how deep the rabbit hole goes.

So I thought, if someone asked me for – I only have time to read one book out of this huge list that I have there and all the other resources. Which one would it be? Which one would I recommend? I would recommend Designing Data-Intensive Apps by Martin Kleppmann, which I’ve been following his blog posts and some partial releases that he's done before fully releasing that book, which took him more than four years to release that book.

It’s kind of almost the Bible, state-of-the-art Bible when it comes to all concepts in distributed systems. Obviously, consensus, network failures, and all that stuff but then also leading into modern data streaming, data platform architectures inspired by, for example, LinkedIn and other communities. So that would be the book that I would recommend to someone if – Who does have time to read one book.

[00:16:52] DC: That’s a neat approach. I like the idea of like if you had one thing, if you have one way to help somebody ramp on distributed systems and stuff, what would it be? For me, it’s actually I don't think I would recommend a book, oddly enough. I feel like I would actually – I’d probably drive them toward the kind of project, like the kind [inaudible 00:17:09] project and say, “This is a distributed system all but itself.” Start tearing it apart to pieces and seeing how they work and breaking them and then exploring and kind of just playing with the parts. You can do a lot of really interesting things.

This is actually another book in your list that was written by Brendan Burns about Designing Distributed Systems I think it’s called. That book, I think he actually uses Kubernetes as a model for how to go about achieving these things, which I think is incredibly valuable, because it really gets into some of the more stable distributed systems patterns that are around.

I feel like that's a great entry point. So if I had one thing, if I had to pick one way to help somebody or to push somebody in the direction of trying to learn distributed systems, I would say identify those distributed systems that maybe you’re already aware of and really explore how they work and what the problems with them are and how they went about solving those problems. Really dig into the idea of it. It’s something you could put your hands on and play with. I mean, Kubernetes is a great example of this, and this is actually why I referred to it.

[00:18:19] CC: The way that works for me when I’m learning something like that is to really think about where the boundaries are, where the limitations are, where the tradeoffs are. If you can take a smaller system, maybe something like The Kind Project and identify what those things are. If you can’t, then ask around. Ask someone. Google it. I don’t know. Maybe it will be a good episode topic for us to do that. This part is doing this to map things out. So maybe we can understand better and help people understand things better.

So mainly like yeah. They try to do the distributed system thesis are. But for people who don’t even know what they could be, it’s harder to identify it. I don’t know what a good strategy for that would be, because you can read about distributed systems and then you can go and look at a project. How do you map the concept to learning to what you’re seeing in the code base? For me, that’s the hardest thing.

[00:19:26] MG: Exactly. Something that kind of I had related experience was like when I went into software development, without having formal education on algorithms and data structures, sometimes in your head, you have the problem statement and you're like, “Okay. I would do it like that.” But you don't know the word that describes, for example, a heap structure or queue because you’ve never – Someone told you that is heap, that is a queue, and/or that is a stick.

So, for me, reading the book was a bit easier. Even though I have done distributed systems, if you will, administration for many years, many years ago, I didn't realize that it was a distributed system because I never had this definition or I never had those failure scenarios in mind and it never had a word for consensus. So how would I search for something like how do machines agree? I mean, if you put that on Google, then likely they will come – Have a lot of stuff. But if you put it in consensus algorithm, likely you get a good hit on what the answer should be.

[00:20:29] CC: It is really problematic when we don't know the names of things because – What you said is so right, because we are probably doing a lot of distributed systems without even knowing that that’s what it is. Then we go in the job interview, and people are, “Oh! Have you done a distributed system?” No. You have but you just don’t know how to name things. But that’s one –

[00:20:51] DC: Yeah, exactly.

[00:20:52] CC: Yeah. Right? That’s one issue. Another issue, which is a bigger issue though is at least that’s how it is for me. I don’t want to speak for anybody else but for me definitely. If I can’t name things and I face a problem and I solve it, every time I face that problem it’s a one-off thing because I can’t map to a higher concept.

So every time I face that problem, it’s like, “Oh!” It’s not like, “Oh, yeah!” If this is this kind of problem, I have a pattern. I’m going to use that to this problem. So that’s what I’m saying. Once you learn the concept, you need to be able to name it. Then you can map that concept to problems you have.

All of a sudden, if you have like three things [inaudible 00:21:35] use to solve this problem, because as you work with computers, coding, it’s like you see the same thing over and over again. But when you don’t understand the fundamentals, things are just like – It’s a bunch of different one-offs. It’s like when you have an argument with your spouse or girlfriend or boyfriend. Sometimes, it’s like you’re arguing 10 times in a month and you thought, “Oh! I had 10 arguments.” But if you’d stop and think about it, no. We had one argument 10 times. It’s very different than having 10 problems versus having 1 problem 10 times, if that makes sense.

[00:22:12] MG: It does.

[00:22:11] DC: I think it does, right?

[00:22:12] MG: I just want to agree.

[00:22:16] DC: I think it does make sense. I think it’s interesting. You’ve highlighted kind of an interesting pattern around the way that people learn, which I think is really interesting. That is like some people are able to read about patterns or software patterns or algorithms or architectures and have that suddenly be an index of their heads. They can actually then later on correlate what they've read with the experience that they’re having around the things they're working on.

For some, it needs to be hands-on. They need to actually be able to explore that idea and understand and manipulate it and be able to describe how it works or functions in person, in reality. They need to have that hands-on like, “I need to touch it to understand it,” kind of experience. Those people also, as they go through those experiences, start building this index of patterns or algorithms in their head. They have this thing that they can correlate to, right, like, “Oh! This is a time problem,” or, “This is a consensus problem,” or what have you, right?

[00:23:19] CC: Exactly.

[00:23:19] DC: You may not know the word for that saying but you're still going to develop a pattern in your mind like the ability to correlate this particular problem with some pattern that you’ve seen before. What's interesting is I feel like people have taken different approaches to building that index, right? For me, it’s been troubleshooting. Somebody gives me a hard problem, and I dig into it and I figure out what the problem is, regardless of whether it's to do with distributed systems or cooking. It could be anything, but I always want to get right in there and figure out what that problem and start building a map in my mind of all of the players that are involved.

For others, I feel like with an educational background, if you have an education background, I think that sometimes you end up coming to this with a set of patterns already instilled that you understand and you're just trying to apply those patterns to the experience you’re having instead. It’s just very – It’s like horse before the cart or cart before the horse. It’s very interesting when you think about it.

[00:24:21] CC: Yes.

[00:24:22] MG: The recommendation that I just want to give to people that are like me who like reading is that I went overboard a bit in the beginnings because I was so fascinated by all the stuff, and it went down the rabbit hole deeper, deeper, deeper, deeper. Reading and reading and reading. At some point, even coming to weird YouTube channels that talk about like, “Is time real and where does time emerge from?” It became philosophical even like the past where I went to.

Now, the thing is, and this is why I like Duffie’s approach with like breaking things and then undergo like trying to break things and understanding how they work and how they can fail is that immediately you practice. You’re hands-on. So that would be my advice to people who are more like me who are fascinated by reading and all the theory that your brain and your mind is not really capable of kind of absorbing all the stuff and then remembering without practicing. Practicing can be breaking things or installing things or administrating things or even writing software. But for me, that was also a late realization that I should have maybe started doing things earlier than the time I spent reading.

[00:25:32] CC: By doing, you mean, hands-on?

[00:25:35] MG: Yeah.

[00:25:35] CC: Anything specific that you would have started with?

[00:25:38] MG: Yes. On Kubernetes – So going back those 15 years to my early days of Linux and Samba, which is a project. By the time, I think it was written in C or C++. But the problem was I wasn’t able to read the code. So the only thing that I had by then was some mailing lists and asking questions and not even knowing which questions to ask because of lack of words of understanding.
Now, fast-forward into Kubernetes’ time, which got me deeper in distributed systems, I still couldn't read the code because I didn't know [inaudible 00:26:10]. But I forced myself to read the code, which helped a little bit for myself to understand what was going on because the documentation by then was lacking. These days, it’s easier, because you can just install [inaudible 00:26:20] way easier today. The hands-on piece, I mean.

[00:26:23] CC: You said something interesting, Michael, and I have given this advice before because I use this practice all the time. It's so important to have a vocabulary. Like you just said, I didn't know what to ask because I didn’t know the words. I practice this all the time. To people who are in this position of distributed systems or whatever it is or something more specific that you are trying to learn, try to expose yourself to the words, because our brains are amazing. Once you get exposure, it’s like your brain works in the background. All of a sudden, you go, “Oh, yeah! I know this word.”

So podcasts are great for me. If I don't know something, I will look for a podcast on the subject and I start listening to it. As the words get repeated, just contextually. I don’t have to go and get a degree or anything. Just by listening to the words being spoken in context, absorb the meaning of it. So podcasting is great or YouTube or anything that you can listen. Just in reading too, of course. The best thing is talking to people. But, again, it’s really – Sometimes, it’s not trivial to put yourself in positions where people are discussing these things.

[00:27:38] DC: There are actually a number of Meetups here in the Bay Area, and there’s a number of Meetups – That whole Meetup thing is sort of nationwide across the entire US and around the world it seems like now lately. Those Meetups I feel like there are a number of Meetups in different subject areas. There’s one here in the Bay Area called Papers We Love, where they actually do explore interesting technical papers, which are obviously a great place to learn the words for things, right? This is actually where those words are being defined, right?

When you get into the consensus stuff, they really get into – One even is Raft. There are many papers on Raft and many papers on multiple things that get into consensus. So definitely, whether you explore a meetup on a distributed system or in a particular application or in a particular theme like Kubernetes, those things are great places just to kind of get more exposure to what people are thinking about in these problems.

[00:28:31] CC: That is such a great tip.

[00:28:34] MG: Yeah. The podcast is twice as good as well, because for people, non-natives – English speaker, I mean. Oh, people. Not speakers. People. The thing is that the word you’re looking for might be totally different than the English word. For example, consensus in Germany has this totally different meaning. So if I would look that up in German, likely I would find nothing or not really related at all. So you have to go through translation and then finding the stuff.

So what you said, Duffie, with PWL, Papers We Love, or podcasts, those words, often they are in English, those podcasts and they are natural consensus or charting or partitioning. Those are the words that you can at least look up like what does it mean. That’s what I did as well thus far.

[00:29:16] CC: Yes. I also wanted to do a plus one for Papers We Love. It’s – They are everywhere and they also have an online. They have an online version of the Papers We Love Meetup, and a lot of the local ones film their meetups. So you can go through the history and see if they talked about any paper that you are interested in.

Probably, I’m sure multiple locations talk about the same paper, so you can get different takes too. It’s really, really cool. Sometimes, it’s completely obscure like, “I didn’t get a word of what they were saying. Not one. What am I doing here?” But sometimes, they talk about things. You at least know what the thing is and you get like 10% of it. But some paper you don’t. People who deal with papers day in and day out, it’s very much – I don’t know.

[00:30:07] DC: It’s super easy when going through a paper like that to have the imposter syndrome wash over you, right, because you’re like –

[00:30:13] CC: Yes. Thank you. That’s what I wanted to say.

[00:30:15] DC: I feel like I’ve been in this for 20 years. I probably know a few things, right. But in talking about reading this consensus paper going, “Can I buy a vowel? What is happening?”

[00:30:24] CC: Yeah. Can I buy a vowel? That’s awesome, Duffie.

[00:30:28] DC: But the other piece I want to call out to your point, which I think is important is that some people don't want to go out and be there in person. They don’t feel comfortable or safe exploring those things in person.

So there are tons of resources like you have just pointed out like the online version of Papers We Love. You can also sign into Slack and just interact with people via text messaging, right? There’s a lot of really great resources out there for people of all types, including the amount of time that you have.

[00:30:53] CC: For Papers We Love, it’s like going to language class. If you go and take a class in Italian, your first day, even though that is going to be super basic, you’re going to be like, “What?” You’ll go back in your third week. You start, “Oh! I’m getting this.” Then a month, three months, “Oh! I’m starting to be competent.”

So you go once. You’re going to feel lost and experience imposter syndrome. But you keep going, because that is a format. First, you start absorbing what the format is, and that helps you understand the content. So once your mind absorbs the format, you’re like, “Okay. Now, I have – I know how to navigate this. I know what’s coming next.” So you don’t have to focus on that. You start focusing in the content. Then little but little, you become more proficient in understanding. Very soon, you’re going to be willing to write a paper. I’m not there yet.

[00:31:51] DC: That’s awesome.

[00:31:52] CC: At least that’s how I think it goes. I don’t know.

[00:31:54] MG: I agree.

[00:31:55] DC: It’s also changed over time. It’s fascinating. If you read papers from like 20 years ago and you read papers that are written more recently, it's interesting. The papers have changed their language when considering competition.
When you're introducing a new idea with a paper, frequently that you are introducing it into a market full of competition. You're being very careful about the language, almost in a way to complicate the idea rather than to make it clear, which is challenging. There are definitely some papers that I’ve read where I was like, “Why are you using so many words to describe this simple idea?” It makes no sense, but yeah.

[00:32:37] CC: I don’t want to make this episode all about Papers We Love. It was so good that you mentioned that, Duffie. It’s really good to be in a room where we’ll be watching something online where you see people asking questions and people go, “Oh! Why is this thing like this? Why is X like this,” or, “Why is Y doing like this?” Then you go, “Oh! I didn’t even think that X was important. I didn’t even know that Y was important.”

So you stop picking up what the important things are, and that’s what makes it click is now you’ve – Hooking into the important concepts because people who know more than you are pointing out and asking questions. So you start paying attention to learning what the main things it should be paying attention to, which is different from reading the paper by yourself. It’s just a ton of content that you need to sort through.

[00:33:34] DC: Yeah. I frequently self-describe it as a perspective junkie, because I feel like for any of us really to learn more about a subject that we feel we understand, we need the perspective of others to really engage, to expand our understanding of that thing. I feel like and I know how to make a peanut butter and jelly sandwich. I’ve done it a million times. It’s a solid thing. But then I watch my kid do it and I’m like, “I hadn’t thought of that problem.” [inaudible 00:33:59], right? This is a great example of that.

Those communities like Papers We Love are great opportunity to understand the perspective of others around these hard ideas. When we’re trying to understand complex things like distributed systems, this is where it’s at. This is actually how we go about achieving this. There is a lot that you can do on your own but there is always going to be more that you can do together, right? You can always do more. You can always understand this idea faster. You can understand the complexity of a system and how to break it down into these things by exploiting it with other people. That's I feel like –

[00:34:40] CC: That is so well said, so well said, and it’s the reason for this show to exist, right? We come on a show and we give our perspectives, and people get to learn from people with different backgrounds, what their takes are on distributed systems, cloud native. So this was such a major plug for the show. Keep coming back. You’re going to learn a ton.

Also, it was funny that you – It was the second time you mentioned cooking, made a cooking reference, Duffie, which brings me to something I want to make sure I say on this episode. I added a few things for reference, three books. But the one that I definitely would recommend starting with is The Systems Bible by John Gall. This book is so cool, because it helps you see everything through systems. Everything is a system. A conversation can be a system. An interaction between two people can be a system. I’m not saying this book says that. It’s just like my translation and that you can look –

Cooking is a system. There is a process. There is a sequence. It’s really, really cool and it really helps to have things framed in this way and then go out and read the other books on systems. I think it helps a lot. This is definitely what I am starting with and what I would recommend people start with, The Systems Bible. Did you two know this book?

[00:36:15] MG: I did not. I don’t.

[00:36:17] DC: I’m not aware of it either but I really appreciate the idea. I do think that that's true. If you develop a skill for understanding systems as they are, you’ll basically develop – Frequently, what you’re developing is the ability to recognize patterns, right?

[00:36:32] CC: Exactly.

[00:36:32] DC: You could recognize those patterns on anything.

[00:36:37] MG: Yeah. That's a good segue for just something that came to my mind. Recently, I gave a talk on event-driven architectures. For someone who's not a software developer or architect, it can be really hard to grab all those concepts on asynchrony and eventual consistency and idempotency. There are so many words of like, “What is this all – It sounds weird, way too complex.” But I was reading a book some years ago by Gregor Hohpe. He’s the guy behind Enterprise Integration Patterns. That’s also a book that I have on my list here. He said, “Your barista doesn't use two-phase commit.” So he was basically making this analogy of he was in a coffee shop and he was just looking at the process of how the barista makes the coffee. You pay for it and all the things that can go wrong while your coffee is brewed and served to you.

So he was making this relation between the real world and the life and human society to computer systems. There it clicked to me where I was like, “So many problems we solve every day, for example, agreeing on a time where we should meet for dinner or cooking, is a consensus problem, and we solve it.”

We even solve it in the case of failure. I might not be able to call Duffie, because he is not available right now. So somehow, we figure out. I always thought that those problems just exist in computer science and distributed systems. But I realized actually that's just a subset of the real world as is. Looking at those problems through the lens of your daily life and you get up and all the stuff, there are so many things that are related to computer systems.

[00:38:13] CC: Michael, I missed it. Was it an article you read?

[00:38:16] MG: Yes. I need to put that in there as well. Yeah. It’s a plug.

[00:38:19] CC: Please put that in there. Absolutely. So far from being any kind of expert in distributed systems, but I have noticed. I have caught myself using systems thinking for even complicated conversations. Even in my personal life, I started approaching things in the systems oriented and just the – just a high-level example.

When I am working with systems, I can approach from the beginning, the end. It’s like a puzzle, putting the puzzle together, right? Sometimes, it starts from the middle. Sometimes, it starts from the edges. When I‘m having conversations that I need to be very strategic like I have one shot. Let’s say maybe I’m in a school meeting and I have to reach a consensus or have a solution or have a plan of action. I have to ask the right questions.
My private self would do things linearly. Historically like, “Let’s go from the beginning and work out through the end.” Now, I don’t do that anymore. Not necessarily. Sometimes, I like, “Let me maybe ask the last question I would ask and see where it leads and just approach things from a different way.” I don’t know if this is making sense.

[00:39:31] MG: It does. It does.

[00:39:32] CC: But my thinking has changed. The way I see the possibilities is not a linear thing anymore. I see how you can truly switch things. I use this in programming a lot and also writing. Sometimes, when you’re a beginner writer, you start at the top and you go down to the conclusion. Sometimes, I start I the middle and go up, right? So you can start anywhere. It’s beautiful or it just gives you so many more options. Or maybe I’m just crazy. Don’t listen to me.

[00:40:03] DC: I don’t think you’re crazy. I was going to say, one of the funny things about Michael’s point and your point both, it’s like in a way that they have kind of referred to Conway's law, the idea that people will build systems in the way that they communicate. So this is actually – It totally brings it back to that same point of thing, right? We by nature will build systems that we can understand, because that is the constraint in which we have to work, right? So it’s very interesting.

[00:40:29] CC: Yeah. But it’s an interesting thing, because we are [inaudible 00:40:32] by the way we are forced to work. For example, I work with constraints and what I'm saying is that that has been influencing my way of thinking.

So, yes, I built systems in the way I think but also because of the constraints that I’m dealing with that I have to be – the tradeoffs I need to make, that also turns around and influences the way I think, the way I see the world and the rest of the systems and all the rest of the world. Of course, as I change my thinking, possibly you can theorize that you go back and apply that. Apply things that you learn outside of your work back to your work. It’s a beautiful back-and-forth I think.

[00:41:17] MG: I had the same experience with some – When I had to design kind of my first API and think of, “Okay. What would the consumer contract be and what would a consumer expect me to deliver in response and so on?” I was forcing myself and being explicit in communicating and not throwing everything at the client back to confusing but being very explicit and precise. Also on communication every day when you talk to people, being explicit and precise really helps to avoid a lot of problems and trouble. Be it partnership or amongst friends or at work.

This is what I took from computer science actually back into my real world in order to taking all those perceptions, perceiving things from a different perspective, and being more precise and explicit in how I respond or communicated.

[00:42:07] CC: My take on what you just said, Michael, is we design systems thinking how is this going to fail. We know this is going to fail. We’re going to design for that. We’re going to implement for that.

In real life, for example, if I need to get an agreement from someone, I try to understand the person's thinking and just go, “I just had this huge thing this week. This is in my mind.” I’m not constantly thinking about this, I’m not crazy like that. Just a little bit crazy. It’s like, “How does this person think? What do they need to know? How far can I push?” Right? We need to make a decision quickly, so the approach is everything, and sometimes you only get one shot, so yeah. I mean, correct me if I’m wrong. That's how I heard or I interpreted what you just said.

[00:42:52] MG: Yeah, absolutely. Spot on. Spot on. So I’m not crazy as well.

[00:42:55] CC: Basically, I think we ended up turning this episode into a little bit of like, “Here are great references,” and also a huge endorsement for really going deep into distributed systems, because it’s going to be good for your jobs. It’s going to be good for your life. It’s going to be good for your health. We are crazy.

[00:43:17] DC: I’m definitely crazy. You guys might be. I’m not. All right. So we started this episode with the idea of coming to learning distributed systems perhaps without a degree or without a formal education in it. We talked about a ride of different ideas on that subject. Like different approaches that each of us took, how each of us see the problem. Is there any important point that either of you want to throw back into the mix here or bring up in relation to that?

[00:43:48] MG: Well, what I take from this episode, being my first episode and getting to know your background, Duffie and Carlisia, is that whoever is going to listen to this episode, whatever background you have, even though you might not be in computer systems or industry at all, I think we three all had approved that whatever background you have, if you’re just curious a little bit and maybe a little bit crazy, you can totally get down the rabbit hole in distributed systems and get totally excited about it. There’s no need for having formal education and the degree to enter this world. It might help but it’s kind of not a high bar that I was perceiving it to be 10 years ago, for example.

[00:44:28] CC: Yeah. That’s a good point. My takeaway is it always puzzled me how some people are so good and experienced and such experts in distributed systems. I always look at myself. It’s like, “How am I lacking?” It’s like, “What memo did I miss? What class did I miss? What project did I not work on to get the experience?” What I’m seeing is you just need to put yourself in that place. You need to do the work. But the good news is achieving competency in distributed systems is doable.

[00:45:02] DC: My takeaway is as we discussed before, I think that there is no one thing that comprises a distributed system. It is a number of things, right, and basically a number of behaviors or patterns that we see that comprise what a distributed system is.

So when I hear people say, “I’m not an expert in distributed systems,” I think, “Well, perhaps you are and maybe you don’t know it already.” Maybe there's some particular set of patterns with which you are incredibly familiar. Like you understand DNS better than the other 20 people in the room. That exposes you to a set of patterns that certainly give you the capability of saying that you are an expert in that particular set of patterns.

So I think that to both of your points, it’s like you can enter this stage where you want to learn about distributed systems from pretty much any direction. You can learn it from a CIS background. You can come it with no computer experience whatsoever, and it will obviously take a bit more work. But this is really just about developing and understanding around how these things communicate and the patterns with which they accomplish that communication. I think that’s the important part.

[00:46:19] CC: All right, everybody. Thank you, Michael Gasch, for being with us now. I hope to –

[00:46:25] MG: Thank you.

[00:46:25] CC: To see you in more episodes [inaudible 00:46:27]. Thank you, Duffie.

[00:46:30] DC: My pleasure.

[00:46:31] CC: Again, I’m Carlisia Campos. With us was Duffie Cooley and Michael Gesh. This was episode 12, and I hope to see you next time. Bye.

[00:46:41] DC: Bye.

[00:46:41] MG: Goodbye.

[END OF EPISODE]

[00:46:43] ANNOUNCER: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the http://thepodlets.io/ website, where you'll find transcripts and show notes. We'll be back next week. Stay tuned by subscribing.

[END]

Original Source: https://omny.fm/shows/the-podlets/012-learning-distributed-systems

jamallmahmoudi · September 2, 2020, 2:22pm

Hi , castrojo
very was useful and perfect
thanks a lot

Topic	Replies	Views
The Podlets: Should I Kubernetes? (Ep 18) General Discussions podcast	1286	February 24, 2020
The Podlets: Kubernetes as per Kelsey Hightower (Ep 7) General Discussions podcast	991	December 12, 2019
The Podlets: Keeping up with Cloud Native (Ep 17) General Discussions podcast	724	February 17, 2020
The Podlets: The Past, Present and Future of Kubernetes with Craig McLuckie (Ep 13) General Discussions podcast	827	January 25, 2020
The Podlets: 009 - Stateful and Stateless Workloads (Ep 9) General Discussions podcast	707	December 23, 2019

The Podlets: Learning Distributed Systems (Ep 12)

Related topics