Shipping quality software in hostile environments

A presentation at PyCon Balkan in November 2018 in Belgrade, Serbia by Luka Kladaric

Slide 1

SHIPPING QUALITY SOFTWARE IN HOSTILE ENVIRONMENTS @kll #PyConBLKN 2018.

Slide 2

WHO?
Luka Kladaric
Chaos Manager @ Sekura Collective
recovering web developer of 10+ years
architecture, infrastructure & security consultant
also a startup founder and remote work evangelist

Slide 3

HOSTILE ENVIRONMENTS?

Slide 4

WHAT IS TECH DEBT?

Slide 5

Tech debt is the implied cost of additional rework caused by choosing an easy solution now over a better approach that would take longer.

Slide 6

Tech debt is: an API that returns a list of results without pagination
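
To make the pagination point concrete, here is a minimal sketch of a paginated response. The function name `paginate`, its parameters, and the response keys are assumptions for illustration, not taken from the talk or from any particular framework.

```python
from typing import Any, Dict, Sequence


def paginate(items: Sequence[Any], page: int = 1, per_page: int = 20) -> Dict[str, Any]:
    """Return one page of `items` plus the metadata a client needs to fetch the rest."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    total = len(items)
    return {
        "results": list(items[start:start + per_page]),
        "page": page,
        "per_page": per_page,
        "total": total,
        # None tells the client there is nothing left to request.
        "next_page": page + 1 if start + per_page < total else None,
    }
```

An endpoint that always returns a bounded page keeps working as the data grows; one that returns the whole list is fine until the day it isn't.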

Slide 7

Tech debt is: fragile code that everything runs through

Slide 8

Tech debt is: entire systems that have become too complex to change or deprecate

Slide 9

Tech debt is: parts of the codebase nobody wants to touch

Slide 10

Tech debt is:
broken development tools and processes
lack of confidence in the build and deploy process

Slide 11

Tech debt is: everything the team wishes they could change, but can't afford to

Slide 12

WHERE DOES IT COME FROM?

Slide 13

Insufficient up-front definition
Tight coupling of components
Lack of attention to the foundations
Evolution over time

Slide 14

WHAT'S THE HARM?

Slide 15

DEVELOPERS ARE JUST SPOILED...

Slide 16

FALSE.

Slide 17

Unaddressed tech debt breeds more tech debt

Slide 18

"We'll get back to that later"
"Why does X have to be clean, when Y isn't?"

Slide 19

Productivity decreases over time
Deadlines slip

Slide 20

CASE STUDY

Slide 21

So I get a call one day.

Slide 22

Within a few days, my alarms start going off

Slide 23

Massive monolithic git repo

Slide 24

No concept of stable

Slide 25

Hand-crafted build server

Slide 26

No local dev environments
Everyone works directly on production systems

Slide 27

No db schema migration system or versioning
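
For contrast, a schema versioning scheme does not have to be elaborate. The sketch below is an assumption for illustration only (the talk does not say which database or migration tool was involved): it uses Python's built-in `sqlite3` module and SQLite's `PRAGMA user_version` to record which migrations have run. The `MIGRATIONS` list and the `users` table are hypothetical.

```python
import sqlite3

# Hypothetical ordered migrations; entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN created_at TEXT",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply migrations that have not run yet and return the resulting schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # PRAGMA does not accept bound parameters; `version` is an int generated here.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Running `migrate` twice on the same database is a no-op the second time, which is the property that lets every environment converge on the same schema.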

Slide 28

Over 1/2 of the servers not deployable from scratch

Slide 29

Code review tool is self-hosted abandonware

Slide 30

Outages a daily occurrence

Slide 31

Everyone focused on shipping features

Slide 32

How do you even begin to fix this?

Slide 33

It took over a year and a half.

Slide 34

HOW?

Slide 35

build server rebuilt from scratch

Slide 36

build and deploy jobs defined 100% in code

Slide 37

monolithic git repository split up into 40 smaller repositories

Slide 38

all servers rebuilt and redeployed with Ansible

Slide 39

better code review tool

Slide 40

most dev work doesn't require VPN any more

Slide 41

etc.

Slide 42

JOB WELL DONE!

Slide 43

The moral of this story is: don't wait for permission to do your job right.

Slide 44

1. If you see something broken, fix it
2. If you don't have time to fix it - write it down
3. But do come back to it when you can steal a minute
4. Even if it takes months to make progress

Slide 45

The team was well aware of how broken things were. If we had pushed for it to be a single massive project, it would never have happened.

Slide 46

EXCEPT...

Slide 47

That's not how things should be.

Slide 48

How do we do better?

Slide 49

"Tech debt" work is difficult to sell

Slide 50

It's not like paying off your credit card

Slide 51

It's incredibly difficult to schedule work to address tech debt

Slide 52

If You Don’t Schedule Time for Maintenance, Your Equipment Will Schedule It for You

Slide 53

I recently came across an article that changed the way I think about this

Slide 54

Sprints, marathons and root canals
by Gojko Adzic
HTTPS://GOJKO.NET/2018/08/30/SPRINTS-MARATHONS-ROOT-CANALS.HTML

Slide 55

New name: sustainability work

Slide 56

Budget vs planning

Slide 57

Helps with morale

Slide 58

QUESTIONS?

Slide 59

THANK YOU!
Luka Kladaric
twitter: @kll
luka@sekura.io
www.sekura.io