
Chris Daigle sits down with Hernan Lardiez, COO of RagMetrics, to break down AI evaluations (evals) and why monitoring matters when you put GenAI into production, especially in regulated or high-risk environments. Hernan explains what “good evals” actually look like without getting lost in the technical weeds: building test datasets, measuring accuracy and consistency, and then continuously re-testing so you can catch drift before it becomes a business problem. They compare the “spreadsheet + spot check” approach to automated eval pipelines that can run fast, repeatable tests at scale. The conversation also covers a practical way to think about pre-production testing vs. in-production monitoring, why token usage and cost should be part of evaluation, and how small RAG tuning decisions (like how many Top-K chunks to retrieve) can improve accuracy while cutting token consumption. If you’re leading AI adoption and you want confidence, not guesswork, this episode will help you build the control points and guardrails to scale GenAI safely.
🔎 Find Out More About Hernan Lardiez
Hernan Lardiez on LinkedIn - https://www.linkedin.com/in/hlardiez/
RagMetrics - https://ragmetrics.ai/

🛠 AI Tools and Resources Mentioned
RagMetrics - https://ragmetrics.ai
The AI Exchange (Rachel Woods) - https://www.theaiexchange.com/
Chief AI Officer - https://www.chiefaiofficer.com/

📌 Chapters
00:00 Why regulated industries can’t “hope” with AI
02:04 What model evaluations (evals) actually are
05:08 The two audiences: business owner vs builders
08:52 Pre-production testing vs in-production monitoring
14:23 Why “monitoring is required” to reduce risk
16:14 Manual spreadsheet grading vs automated evals
18:01 Building test datasets + injecting through the pipeline
31:21 Measuring accuracy AND token consumption (cost)
34:01 Continuous evals to catch drift over time
42:11 RAG tuning: Top-K chunks, accuracy vs noise, token savings
49:21 Evals as “low-cost insurance” for production AI
50:27 Closing advice: control points + IT boundaries

In this clip from the Using AI at Work podcast, we explore the challenges of AI implementation, particularly for organizations in regulated markets. The discussion highlights the critical role of effective risk management in navigating potential outcomes. We identify the key stakeholders, the business owner and the development team, who are crucial for understanding AI requirements and ensuring compliance. This session emphasizes the importance of strategic AI leadership and how businesses adopting AI can build these considerations into operations management.