<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:fireside="http://fireside.fm/modules/rss/fireside">
  <channel>
    <fireside:hostname>web01.fireside.fm</fireside:hostname>
    <fireside:genDate>Wed, 22 Apr 2026 12:24:26 -0500</fireside:genDate>
    <generator>Fireside (https://fireside.fm)</generator>
    <title>Pipeline Conversations - Episodes Tagged with “Optimization”</title>
    <link>https://podcast.zenml.io/tags/optimization</link>
    <pubDate>Mon, 13 Jan 2025 08:00:00 +0100</pubDate>
    <description>Pipeline Conversations brings you interviews with platform engineers, ML practitioners, and technical leaders building production AI systems. We dig into the real challenges of MLOps and LLMOps: orchestrating complex workflows on Kubernetes, fine-tuning and evaluating models at scale, and shipping AI that actually works. From ZenML.
</description>
    <language>en-us</language>
    <itunes:type>episodic</itunes:type>
    <itunes:subtitle>MLOps and LLMOps, from the trenches</itunes:subtitle>
    <itunes:author>ZenML GmbH</itunes:author>
    <itunes:summary>Pipeline Conversations brings you interviews with platform engineers, ML practitioners, and technical leaders building production AI systems. We dig into the real challenges of MLOps and LLMOps: orchestrating complex workflows on Kubernetes, fine-tuning and evaluating models at scale, and shipping AI that actually works. From ZenML.
</itunes:summary>
    <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/4/4d525632-f8ef-47c1-9321-20f5c498b1ac/cover.jpg?v=3"/>
    <itunes:explicit>no</itunes:explicit>
    <itunes:keywords>machine-learning, machinelearning, mlops, deeplearning, ai, artificialintelligence, artificial-intelligence, technology, tech, llmops</itunes:keywords>
    <itunes:owner>
      <itunes:name>ZenML GmbH</itunes:name>
      <itunes:email>podcast@zenml.io</itunes:email>
    </itunes:owner>
    <itunes:category text="Technology"/>
<item>
  <title>Optimizing LLM Performance and Cost in Production</title>
  <link>https://podcast.zenml.io/llmops-db-performance-and-cost-optimization</link>
  <guid isPermaLink="false">850c441c-eb0b-4d22-ad8d-3da4224b35b6</guid>
  <pubDate>Mon, 13 Jan 2025 08:00:00 +0100</pubDate>
  <author>ZenML GmbH</author>
  <enclosure url="https://aphid.fireside.fm/d/1437767933/4d525632-f8ef-47c1-9321-20f5c498b1ac/850c441c-eb0b-4d22-ad8d-3da4224b35b6.mp3" length="21385889" type="audio/mpeg"/>
  <itunes:episodeType>full</itunes:episodeType>
  <itunes:season>3</itunes:season>
  <itunes:author>ZenML GmbH</itunes:author>
  <itunes:subtitle>A deep dive into LLM optimization and cost management: a critical challenge facing AI teams today.</itunes:subtitle>
  <itunes:duration>33:49</itunes:duration>
  <itunes:explicit>no</itunes:explicit>
  <itunes:image href="https://media24.fireside.fm/file/fireside-images-2024/podcasts/images/4/4d525632-f8ef-47c1-9321-20f5c498b1ac/episodes/8/850c441c-eb0b-4d22-ad8d-3da4224b35b6/cover.jpg?v=3"/>
  <description>In this episode, we dive deep into LLM optimization and cost management: a critical challenge facing AI teams today. Join us as we explore real-world strategies from companies like Dropbox, Meta, and Replit that are pushing the boundaries of what's possible with large language models. From clever model selection techniques and knowledge distillation to advanced inference optimization and cost-saving strategies, we unpack the tools and approaches that are helping organizations squeeze maximum value from their LLM deployments. Whether you're dealing with runaway API costs, struggling with inference latency, or looking to optimize your model infrastructure, this episode offers practical insights you can apply to your own AI initiatives. Perfect for ML engineers, technical leads, and anyone responsible for maintaining LLM systems in production.
Please read the full blog post here (https://www.zenml.io/blog/optimizing-llm-performance-and-cost-squeezing-every-drop-of-value) and the associated LLMOps database entries here (https://zenml.io/llmops-database).
</description>
  <itunes:keywords>llmops, llms, ai, mlops, genai, optimization, performance, cost</itunes:keywords>
  <content:encoded>
    <![CDATA[<p>In this episode, we dive deep into LLM optimization and cost management: a critical challenge facing AI teams today. Join us as we explore real-world strategies from companies like Dropbox, Meta, and Replit that are pushing the boundaries of what&#39;s possible with large language models. From clever model selection techniques and knowledge distillation to advanced inference optimization and cost-saving strategies, we unpack the tools and approaches that are helping organizations squeeze maximum value from their LLM deployments. Whether you&#39;re dealing with runaway API costs, struggling with inference latency, or looking to optimize your model infrastructure, this episode offers practical insights you can apply to your own AI initiatives. Perfect for ML engineers, technical leads, and anyone responsible for maintaining LLM systems in production.</p>

<p>Please read the full blog post <a href="https://www.zenml.io/blog/optimizing-llm-performance-and-cost-squeezing-every-drop-of-value" rel="nofollow">here</a> and the associated LLMOps database entries <a href="https://zenml.io/llmops-database" rel="nofollow">here</a>.</p>]]>
  </content:encoded>
  <itunes:summary>
    <![CDATA[<p>In this episode, we dive deep into LLM optimization and cost management: a critical challenge facing AI teams today. Join us as we explore real-world strategies from companies like Dropbox, Meta, and Replit that are pushing the boundaries of what&#39;s possible with large language models. From clever model selection techniques and knowledge distillation to advanced inference optimization and cost-saving strategies, we unpack the tools and approaches that are helping organizations squeeze maximum value from their LLM deployments. Whether you&#39;re dealing with runaway API costs, struggling with inference latency, or looking to optimize your model infrastructure, this episode offers practical insights you can apply to your own AI initiatives. Perfect for ML engineers, technical leads, and anyone responsible for maintaining LLM systems in production.</p>

<p>Please read the full blog post <a href="https://www.zenml.io/blog/optimizing-llm-performance-and-cost-squeezing-every-drop-of-value" rel="nofollow">here</a> and the associated LLMOps database entries <a href="https://zenml.io/llmops-database" rel="nofollow">here</a>.</p>]]>
  </itunes:summary>
</item>
  </channel>
</rss>
