<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:fireside="http://fireside.fm/modules/rss/fireside">
  <channel>
    <fireside:hostname>web02.fireside.fm</fireside:hostname>
    <fireside:genDate>Sun, 19 Apr 2026 14:27:01 -0500</fireside:genDate>
    <generator>Fireside (https://fireside.fm)</generator>
    <title>Pipeline Conversations - Episodes Tagged with “Evaluation”</title>
    <link>https://podcast.zenml.io/tags/evaluation</link>
    <pubDate>Sun, 15 Dec 2024 21:00:00 +0100</pubDate>
    <description>Pipeline Conversations brings you interviews with platform engineers, ML practitioners, and technical leaders building production AI systems. We dig into the real challenges of MLOps and LLMOps: orchestrating complex workflows on Kubernetes, fine-tuning and evaluating models at scale, and shipping AI that actually works. From ZenML.
</description>
    <language>en-us</language>
    <itunes:type>episodic</itunes:type>
    <itunes:subtitle>MLOps and LLMOps, from the trenches</itunes:subtitle>
    <itunes:author>ZenML GmbH</itunes:author>
    <itunes:summary>Pipeline Conversations brings you interviews with platform engineers, ML practitioners, and technical leaders building production AI systems. We dig into the real challenges of MLOps and LLMOps: orchestrating complex workflows on Kubernetes, fine-tuning and evaluating models at scale, and shipping AI that actually works. From ZenML.
</itunes:summary>
    <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/4/4d525632-f8ef-47c1-9321-20f5c498b1ac/cover.jpg?v=3"/>
    <itunes:explicit>no</itunes:explicit>
    <itunes:keywords>machine-learning, machinelearning, mlops, llmops, deeplearning, ai, artificialintelligence, artificial-intelligence, technology, tech</itunes:keywords>
    <itunes:owner>
      <itunes:name>ZenML GmbH</itunes:name>
      <itunes:email>podcast@zenml.io</itunes:email>
    </itunes:owner>
    <itunes:category text="Technology"/>
    <item>
  <title>The Evaluation Playbook: Making LLMs Production-Ready 🧪📈</title>
  <link>https://podcast.zenml.io/llmops-db-evaluation</link>
  <guid isPermaLink="false">8254e8a5-306a-46c6-9695-ecd0daea4150</guid>
  <pubDate>Sun, 15 Dec 2024 21:00:00 +0100</pubDate>
  <author>ZenML GmbH</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/8254e8a5-306a-46c6-9695-ecd0daea4150.mp3" length="21055840" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>3</itunes:season>
  <itunes:author>ZenML GmbH</itunes:author>
  <itunes:subtitle>A comprehensive exploration of real-world lessons in LLM evaluation and quality assurance, examining how industry leaders tackle the challenges of assessing language models in production.</itunes:subtitle>
  <itunes:duration>32:43</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/4/4d525632-f8ef-47c1-9321-20f5c498b1ac/episodes/8/8254e8a5-306a-46c6-9695-ecd0daea4150/cover.jpg?v=2"/>
  <description>A comprehensive exploration of real-world lessons in LLM evaluation and quality assurance, examining how industry leaders tackle the challenges of assessing language models in production. 
Through diverse case studies, we cover the transition from traditional ML evaluation, establishing clear metrics, combining automated and human evaluation strategies, and implementing continuous improvement cycles to ensure reliable LLM applications at scale.
Please read the full blog post here (https://www.zenml.io/blog/the-evaluation-playbook-making-llms-production-ready) and the associated LLMOps database entries here (https://zenml.io/llmops-database). 
</description>
  <itunes:keywords>llmops, llms, ai, mlops, genai, evaluation</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>A comprehensive exploration of real-world lessons in LLM evaluation and quality assurance, examining how industry leaders tackle the challenges of assessing language models in production. </p>

<p>Through diverse case studies, we cover the transition from traditional ML evaluation, establishing clear metrics, combining automated and human evaluation strategies, and implementing continuous improvement cycles to ensure reliable LLM applications at scale.</p>

<p>Please read the full blog post <a href="https://www.zenml.io/blog/the-evaluation-playbook-making-llms-production-ready" rel="nofollow">here</a> and the associated LLMOps database entries <a href="https://zenml.io/llmops-database" rel="nofollow">here</a>.</p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>A comprehensive exploration of real-world lessons in LLM evaluation and quality assurance, examining how industry leaders tackle the challenges of assessing language models in production. </p>

<p>Through diverse case studies, we cover the transition from traditional ML evaluation, establishing clear metrics, combining automated and human evaluation strategies, and implementing continuous improvement cycles to ensure reliable LLM applications at scale.</p>

<p>Please read the full blog post <a href="https://www.zenml.io/blog/the-evaluation-playbook-making-llms-production-ready" rel="nofollow">here</a> and the associated LLMOps database entries <a href="https://zenml.io/llmops-database" rel="nofollow">here</a>.</p>]]>
  </itunes:summary>
</item>
  </channel>
</rss>
