This book was an idea Andrew and I had back when we were working together at BuzzFeed around 2014. I had started as a data scientist straight out of my PhD in physics, and he had been working for years as a software developer. There was so much we had learned on the job as data scientists that we had never come across in the classroom. Although that knowledge was experiential, it was also explainable. We thought it would be easy enough to turn into a book, and that we could save others the time it took us to gather all these lessons.
This book isn't finished. There's a great deal on applying the more theoretical insights in the latter half of the book that we didn't have the space or time to include this time around. We look forward to revising and expanding for a second edition, where we hope to give a more thorough treatment of the components of distributed systems, like REST APIs, queue readers, and schedulers, and of how these can be combined as building blocks of the machine learning platforms powering modern data products.
Despite its incompleteness, we hope you'll find this to be the resource you need early in your career, when you're entering the workforce from an academic background and looking for the missing link between academic pedagogy and practice. We hope it will help you in the transition from machine learning in an academic setting to machine learning in production.