{"version":"https://jsonfeed.org/version/1","title":"Pipeline Conversations","home_page_url":"https://podcast.zenml.io","feed_url":"https://podcast.zenml.io/json","description":"Pipeline Conversations is a fortnightly podcast bringing you interviews and discussion with industry leaders, top technology professionals and others. We discuss the latest developments in machine learning, deep learning, artificial intelligence, with a particular focus on MLOps, or how trained models are used in production.","_fireside":{"subtitle":"A Machine Learning Podcast by ZenML","pubdate":"2022-11-10T17:00:00.000+01:00","explicit":false,"copyright":"2024 by ZenML GmbH","owner":"ZenML GmbH","image":"https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/4/4d525632-f8ef-47c1-9321-20f5c498b1ac/cover.jpg?v=2"},"items":[{"id":"f0ecb9ec-1eeb-4ef3-829d-0afe43ee70e9","title":"ML at the British Library with Daniel van Strien","url":"https://podcast.zenml.io/daniel-van-strien-british-library","content_text":"This week I spoke with Daniel van Strien, a digital curator working at the British Library. Daniel has worked on a number of projects at the intersection of archives, libraries and machine learning and I was really happy to have the chance to get to unpack some of the ways he's finding to apply these techniques and tools.\n\nIn particular, I found it interesting how important the annotation process is as part of many overall workflows, as well as how simple out-of-the-box techniques like image classification using a fine-tuned model could satisfy many low-hanging fruit-type use cases.Special Guest: Daniel van Strien.","content_html":"
This week I spoke with Daniel van Strien, a digital curator working at the British Library. Daniel has worked on a number of projects at the intersection of archives, libraries and machine learning and I was really happy to have the chance to get to unpack some of the ways he's finding to apply these techniques and tools.
\n\nIn particular, I found it interesting how important the annotation process is as part of many overall workflows, as well as how simple out-of-the-box techniques like image classification using a fine-tuned model could satisfy many low-hanging fruit-type use cases.
Special Guest: Daniel van Strien.
","summary":"This week I spoke with Daniel van Strien, a digital curator working at the British Library. Daniel has worked on a number of projects at the intersection of archives, libraries and machine learning and I was really happy to have the chance to get to unpack some of the ways he's finding to apply these techniques and tools.","date_published":"2022-11-10T17:00:00.000+01:00","attachments":[{"url":"https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/f0ecb9ec-1eeb-4ef3-829d-0afe43ee70e9.mp3","mime_type":"audio/mpeg","size_in_bytes":41376868,"duration_in_seconds":3448}]},{"id":"253cd080-cfca-4b29-9a53-1641ec9b384b","title":"Questioning MLOps with Lak Lakshmanan","url":"https://podcast.zenml.io/lak-lakshmanan","content_text":"This week I spoke with Lak Lakhshmanan, who worked for years at Google on ML and AI projects and products at a senior level and he also brings years of experience working on meteorology and other scientific projects previously.\n\nLak brings a ton of experience to the table and it was interesting to hear his suggestions around when it is and isn't appropriate to bring the full set of MLOps tools to the table, for example. 
We also discussed the fundamentals of doing ML-backed projects as well as the teams needed to make those projects succeed.Special Guest: Lak Lakshmanan.Links:Lak on LinkedInlak lakshmanan (@lak_luster) / TwitterValliappa Lakshmanan (Lak) - HomeLak Lakshmanan – MediumAmazon.com: Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps: 9781098115784: Lakshmanan, Valliappa, Robinson, Sara, Munn, Michael: BooksAmazon.com: Practical Machine Learning for Computer Vision eBook : Lakshmanan, Valliappa, Görner, Martin, Gillard, Ryan: Kindle StoreAmazon.com: Google BigQuery: The Definitive Guide: Data Warehousing, Analytics, and Machine Learning at Scale eBook : Lakshmanan, Valliappa, Tigani, Jordan: Kindle StoreAmazon.com: Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning: 9781098118952: Lakshmanan, Valliappa: Books","content_html":"This week I spoke with Lak Lakshmanan, who worked for years at Google on ML and AI projects and products at a senior level and he also brings years of experience working on meteorology and other scientific projects previously.
\n\nLak brings a ton of experience to the table and it was interesting to hear his suggestions around when it is and isn't appropriate to bring the full set of MLOps tools to the table, for example. We also discussed the fundamentals of doing ML-backed projects as well as the teams needed to make those projects succeed.
Special Guest: Lak Lakshmanan.
Links:
This week I spoke with Charles Frye. Not only has Charles volunteered to be a judge on our Month of MLOps competition happening right now, he's part of the core team working on the Full Stack Deep Learning course.
\n\nNaturally, we get into education for practitioners as well as the things that Charles has seen in his own prior background working on production use cases. We also discuss the ways that tooling to support education as well as productive machine learning can be, and is being, improved.
Special Guest: Charles Frye.
Links:
In today's conversation, I'm speaking with Goku Mohandas, founder and creator of the amazing online resource MadeWithML. Goku has a bunch of practical experience, from working with Apple to a startup in the oncology space and much more.
\n\nIn this conversation we continued to unpack the theme of education in ML, the challenges when it comes to working across the full stack of ML applications, and what he's seen work in his experience working on MadeWithML.
\n\nWe also discuss some of the patterns he's seen in production stacks while consulting with various ML teams, as well as where he sees room for improvement in the abstractions that we all rely on to do our work.
\n\nGoku has generously agreed to be an external judge for our Month of MLOps competition that starts on October 10. If you haven't signed up yet, or want to learn more, please visit zenml.io/competition.
Special Guest: Goku Mohandas.
Links:
So excited to be able to announce our 🔥 AMAZING 🔥 external judges for the ZenML Month of MLOps competition! We have a stellar panel of ✨ ML and MLOps heroes ✨ to help select the best pipelines from all of your submissions!
\n\n💥 Charles Frye, core instructor at the amazing Full Stack Deep Learning course
\n💥 Anthony Goldbloom, co-founder and former CEO of Kaggle
\n💥 Chip Huyen, author of 'Designing Machine Learning Systems' and co-founder of Claypot AI
\n💥 Goku Mohandas, founder of MadeWithML, another essential course in production ML
We're honoured to have them on board for the ride, and we can't wait to see all the amazing ML use cases and problems our competitors solve along the way!
\n\nTo learn more about the competition and to sign up, visit https://zenml.io/competition
Links:
","summary":"So excited to be able to announce our :fire: AMAZING :fire: external judges for the ZenML Month of MLOps competition! We have a stellar panel of :sparkles: ML and MLOps heroes :sparkles: to help select the best pipelines from all of your submissions! ","date_published":"2022-09-26T14:00:00.000+02:00","attachments":[{"url":"https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/20b2e352-4565-487d-ad6e-e0f865c75da5.mp3","mime_type":"audio/mpeg","size_in_bytes":5925023,"duration_in_seconds":493}]},{"id":"f7d61b52-02c8-4401-894b-92110dde2267","title":"Data-centric Computer Vision with Eric Landau","url":"https://podcast.zenml.io/data-centric-computer-vision-eric-landau-encord","content_text":"This week I spoke with Eric Landau, co-founder of Encord, a platform for data-centric computer vision. This podcast contains a lot of geekery about annotation, and even though Encord aren't an annotation tool per se, Eric and his team have tackled a bunch of quite complicated problems relating to that domain.\n\nWe also discuss the much-used term 'data-centric AI' and consider where it's useful and where perhaps there's a little bit of hype. We also get into some of the technical tradeoffs and decisions that come when building a platform. I'm really excited to get to present this episode to you today as I really enjoyed the discussion.Special Guest: Eric Landau.Links:Eric Landau (LinkedIn)Encord | The platform for data-centric computer visionEncord blogEncord (Github)Encord (@encord_team) / Twitter","content_html":"This week I spoke with Eric Landau, co-founder of Encord, a platform for data-centric computer vision. This podcast contains a lot of geekery about annotation, and even though Encord aren't an annotation tool per se, Eric and his team have tackled a bunch of quite complicated problems relating to that domain.
\n\nWe also discuss the much-used term 'data-centric AI' and consider where it's useful and where perhaps there's a little bit of hype. We also get into some of the technical tradeoffs and decisions that come when building a platform. I'm really excited to get to present this episode to you today as I really enjoyed the discussion.
Special Guest: Eric Landau.
Links:
This week we dive into the abstractions that we're all trying to layer on top of the core ML processes and workflows. I spoke with Phil Howes, co-founder and chief scientist at BaseTen. BaseTen is a platform that allows data scientists to go from an initial model to an MVP web app quickly.
\n\nWe got into some of the big challenges he had working to build out the platform, as well as the core issue of iteration speed that motivates why they're building BaseTen.
\n\nPhil has experienced quite a few of the industry's end-to-end patterns in the years that he's been working on machine learning and it was great to have that context inform the conversation, too.
Special Guest: Phil Howes.
Links:
This week I spoke with Savin Goyal and Hugo Bowne-Anderson from Outerbounds. They both work on leading, building and helping people put models into production through Metaflow, and I'm sure current users of ZenML will be interested to hear how they think through the broader questions and engineering problems involved with MLOps.
\n\nAbove all, we spoke about the challenges involved in building a tool that handles the whole machine learning story, from collecting data to training models, to deployment and back again. In many ways it's great that there are lots of smart people thinking about this really hard problem, and even though it is by no means 'solved', conversations like this make me feel cautiously optimistic about the space.
Special Guests: Hugo Bowne-Anderson and Savin Goyal.
Links:
This week I spoke with Mateo Rojas-Carulla, the CTO and a co-founder of Lakera and Matthias Kraft, also a co-founder and the CPO there. Lakera is an AI safety company that does a lot of work in the computer vision domain, building a platform and tools for users to gain more confidence in the output and functionality of their models.
\n\nWe discuss how they think about the testing of machine learning models, and about how having this safety element upfront has implications for how you go about the testing and ensuring robustness. We specifically dive into how to go about testing computer vision models and the various pitfalls that are to be found in that domain.
Special Guests: Mateo Rojas-Carulla and Matthias Kraft.
","summary":"This week I spoke with Mateo Rojas-Carulla, the CTO and a co-founder of Lakera and Matthias Kraft, also a co-founder and the CPO there. Lakera is an AI safety company that does a lot of work in the computer vision domain, building a platform and tools for users to gain more confidence in the output and functionality of their models.","date_published":"2022-08-04T10:00:00.000+02:00","attachments":[{"url":"https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/6300d5ea-04f5-45a5-8c81-ca184b3d5bd4.mp3","mime_type":"audio/mpeg","size_in_bytes":41436610,"duration_in_seconds":3452}]},{"id":"bc28e259-b867-43c8-96c4-24d355b78903","title":"Satellite Vision with Robin Cole","url":"https://podcast.zenml.io/satellite-vision-robin-cole","content_text":"This week I spoke with Robin Cole, a senior data scientist at Satellite Vu, a company that's about to launch a thermal imaging satellite into space in order to provide new ways of seeing the earth from above.\n\nRobin generously took the time to discuss his day to day work involving satellite data, the stack they work with at Satellite Vu as well as some of the difficulties that come up in the domain. We also discuss the extremely popular satellite-image-deep-learning GitHub repo that presents resources for those working with or seeking to learn about this kind of data.Special Guest: Robin Cole.Links:About Us — Satellite VuSatellite Vu (LinkedIn)Satellite Vu prepares to launch its thermal imaging satellite constellation with $21M A round | TechCrunchrobmarkcole/satellite-image-deep-learning: Resources for deep learning with satellite & aerial imageryRobin Cole (LinkedIn)GeoTIFF - Wikipedia","content_html":"This week I spoke with Robin Cole, a senior data scientist at Satellite Vu, a company that's about to launch a thermal imaging satellite into space in order to provide new ways of seeing the earth from above.
\n\nRobin generously took the time to discuss his day to day work involving satellite data, the stack they work with at Satellite Vu as well as some of the difficulties that come up in the domain. We also discuss the extremely popular satellite-image-deep-learning GitHub repo that presents resources for those working with or seeking to learn about this kind of data.
Special Guest: Robin Cole.
Links:
This week on the podcast I spoke with Gerard Kruisheer, the CTO and co-founder of Captain AI, a company based in the Netherlands working on autonomous shipping out of the busy Rotterdam port.
\n\nWe discussed the unique problems that come with building autonomous vehicles, the extent to which the latest and greatest research informs their work, their production stack and how they handle deployment for their particular setup.
\n\nAs always, please let us know if you have guests you'd like me to speak to by sending us a message on Slack or by emailing podcast@zenml.io.
Special Guest: Gerard Kruisheer.
Links:
I'll be having some conversations with the people behind the tools that ZenML offers as integrations. We spoke with Ben Wilson a few weeks back, and today I'm pleased to publish this conversation with Emeli Dral, co-founder and CTO of Evidently, an open-source tool tackling the problem of monitoring of models and data for machine learning.
\n\nWe discussed the challenges around building a tool that is straightforward to use while also being customisable and powerful. We also got into the thinking behind how they grew their community and blog along the way.
Special Guest: Emeli Dral.
Links:
This week I spoke with Karthik Kannan, cofounder and CTO of Envision, a company that builds on top of Google Glass and uses the augmented reality features of phones to allow visually impaired people to better sense the environment or objects around them.
\n\nTheir software and devices are pretty popular and as you'll hear in this conversation, they've been on a real journey to get to where they are now.
\n\nIn particular, I really enjoyed the parts where Karthik explained their development and deployment process in detail. It's not too often that you get a deep dive into the workflows and stacks of an embedded computer vision company and tool, so I think you're going to really enjoy this one.
Special Guest: Karthik Kannan.
Links:
In this episode, I'm really happy to be able to continue the dialogue we've been having with our users and community around the role of data annotation and labeling in MLOps.
\n\nWe were lucky to get to talk to Iva Gumnishka, the founder of Humans in the Loop, an organisation that provides data annotation and collection services. Their teams are primarily made up of people who have been affected by conflict and are now asylum seekers or refugees.
\n\nIva has a ton of experience working with annotation and has seen how different companies build this into their production machine learning lifecycles. We're continuing to work on a feature that will allow you to do this as part of your MLOps workflow when using ZenML, and I welcome any feedback you might have on the back of this podcast or the articles we've been publishing on the ZenML blog.
Special Guest: Iva Gumnishka.
Links:
We took a few weeks' break to reach out to some new guests, so I think we can go so far as to declare this next series of episodes season 2 of Pipeline Conversations.
\n\nToday, I'm extremely excited to present this conversation I had with Ben Wilson, who works over at Databricks and who has also just released a new book called 'Machine Learning Engineering in Action'. It's a jam-packed guide to all the lessons that Ben has learned over his years working to help companies get models out into the world and run them in production.
\n\nI was really lucky to get to talk to Ben about his new book and also about the mental models he thinks are useful to bring to bear on this complicated problem many of us are working on.
Special Guest: Ben Wilson.
Links:
Adam and Hamza return for a short discussion of what we've been busy working on during the previous few months, where we're going with ZenML and why it's so amazing to be building an open-source tool.
","summary":"Adam and Hamza return for a short discussion of what we've been busy working on during the previous few months, where we're going with ZenML and why it's so amazing to be building an open-source tool.","date_published":"2022-04-28T12:00:00.000+02:00","attachments":[{"url":"https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/8ce789d5-23c4-4251-933d-c4797ea40684.mp3","mime_type":"audio/mpeg","size_in_bytes":18373251,"duration_in_seconds":1531}]},{"id":"3b306917-5653-40d1-b3c7-85c92ac80ad3","title":"Trustworthy ML with Kush Varshney","url":"https://podcast.zenml.io/trustworthy-ml-kush-varshney","content_text":"I enthusiastically read Kush Varshney's book when it was released for free to the world several months back. Trustworthy Machine Learning is a concise and clear overview of many of the ways that machine learning can go wrong, and so I was especially keen to get Kush on to talk more about his work and research.\n\nI also got a stronger sense of appreciation for how good MLOps practices and workflows offered a clear path to ensuring that your machine learning models and behaviours could become more trustworthy. Kush has done a lot of interesting work, particularly with the AI Fairness 360 and AI Explainability 360 toolkits that I'm sure listeners of this podcast would find worth checking out.Special Guest: Kush Varshney.Links:Trustworthy Machine Learning by Kush R. VarshneyHome - AI Explainability 360Home - AI Fairness 360Kush VarshneyKush Varshney (@krvarshney) / TwitterKush Varshney | LinkedInTrustworthy Machine Learning: Varshney, Kush R.: 9798411903959: Amazon.com: Books","content_html":"I enthusiastically read Kush Varshney's book when it was released for free to the world several months back. Trustworthy Machine Learning is a concise and clear overview of many of the ways that machine learning can go wrong, and so I was especially keen to get Kush on to talk more about his work and research.
\n\nI also got a stronger sense of appreciation for how good MLOps practices and workflows offered a clear path to ensuring that your machine learning models and behaviours could become more trustworthy. Kush has done a lot of interesting work, particularly with the AI Fairness 360 and AI Explainability 360 toolkits that I'm sure listeners of this podcast would find worth checking out.
Special Guest: Kush Varshney.
Links:
This week I spoke with Matt Squire, the CTO and co-founder of Fuzzy Labs, where they help partner organisations think through how best to productionise their machine learning workflows.
\n\nMatt and Fuzzy Labs are also behind the Awesome Open Source MLOps GitHub repo, where you can find all the options for the open-source MLOps stack of your dreams.
\n\nMatt has been an enthusiastic early supporter of the work we do at ZenML so it was really amazing to get to talk to him and get his take based on the many experiences he's had seeing how ML is done out in the field.
Special Guest: Matt Squire.
Links:
This week I spoke with Emmanuel Ameisen, a data scientist and ML engineer currently based at Stripe. Emmanuel also wrote an excellent O'Reilly book called "Building Machine Learning Powered Applications", a book I find myself often returning to for inspiration and that I was pleased to get the chance to reread in preparation for our discussion.
\n\nEmmanuel has previously worked at Insight Data Science where he was involved in mentoring and guiding dozens of data scientists who were working on building their ML portfolio projects. He brings a wealth of experience to the table and I'm really excited to present our conversation to you.
Special Guest: Emmanuel Ameisen.
Links:
This week I spoke with Johnny Greco, a data scientist working at Radiology Partners. Johnny transitioned into his current work from a career as an academic working in astronomy, where he also worked in the open-source space to build a really interesting synthetic image data project.
\n\nWe get into that project in our conversation but we also discuss his experience of crossing over into industry, the skills that have served him in his new job, and his experience of working in a world where the stakes around models in production are much higher.
Special Guest: Johnny Greco.
Links:
This week I spoke with Tristan Zajonc, the CEO and cofounder of Continual, a company that provides an AI layer for enterprise companies or, as we'll get into in the podcast, the so-called 'modern data stack'.
\n\nHe previously worked at Cloudera as a CTO for machine learning and as the head of the data science platform there, and he holds a PhD in public policy from Harvard University.
\n\nIn our conversation we discussed the different levels of abstraction one can take when dealing with the MLOps problem. We spoke about all the different ways that machine learning can fail in production settings and of course we discussed the concept of the 'modern data stack' and what that means.
Special Guest: Tristan Zajonc.
Links:
Our guest this week was Mohan Mahadevan, a senior VP at Onfido, a machine-learning powered identity verification platform. He has previously worked at Amazon heading up a computer vision team working on robotics applications as well as for many years at KLA, a leading semiconductor hardware company. He holds a doctorate in theoretical physics from Colorado State University.
\n\nMohan had mentioned that he thought it might be interesting to discuss neurosymbolic AI, and the implications of a shift towards that as a core paradigm for production AI systems. In particular, we discuss the practical consequences of such a shift, both in terms of team composition as well as infrastructure requirements.
Special Guest: Mohan Mahadevan.
Links:
Our guest this week is Ines Montani, co-founder and CEO of Explosion, a company based out of Berlin that produces tools that you probably know and love, like spaCy, a Python natural language processing library, and Prodigy, a data annotation tool.
\n\nI've always found Ines to be personally inspiring in the work that she and her team produce as well as how they present themselves to the world, so it was a real pleasure to get to dive into the weeds as to exactly how that happens. We also discuss how NLP works in production, what reproducibility means for ML projects and much more.
Special Guest: Ines Montani.
Links:
This week, we spoke with Danny Leybzon, currently working with WhyLabs to help data scientists monitor their models in production and prevent model performance from degrading. He previously worked as a kind of roving data scientist and engineer, helping companies put their models into production.
\n\nAs such, we had a really interesting discussion of some of the ways that tooling and the general context for data science sometimes let practitioners down.
\nAnd of course we also discussed why monitoring and logging are actually a kind of baseline practice that should be part of any and every data scientist's toolkit. Luckily for us, Danny added in a bunch of examples from his wide experience doing all this in the real world.
Special Guest: Danny Leybzon.
Links:
Noah Gift is the founder of Pragmatic A.I. Labs and author of 'Practical MLOps'. We discuss the role of MLOps in an organisation, some deployment war stories from his career as well as what he considers to be 'best practices' in production machine learning.
\n\nRead the summary blogpost on the ZenML blog.
Special Guest: Noah Gift.
Links:
Adam and Hamza introduce themselves for the first episode of Pipeline Conversations. They discuss the world of MLOps, where ZenML sits within this space, and why it's such a complicated problem to solve.
","summary":"Adam and Hamza introduce themselves for the first episode of Pipeline Conversations. They discuss the world of MLOps, where ZenML sits within this space, and why it's such a complicated problem to solve.","date_published":"2021-11-19T12:00:00.000+01:00","attachments":[{"url":"https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/094212f9-84ef-4cb3-8c2a-87230f3cef0a.mp3","mime_type":"audio/mpeg","size_in_bytes":21411204,"duration_in_seconds":1338}]}]}