This Day in AI Podcast

Michael Sharkey, Chris Sharkey

Technology, News

Join Michael and Chris Sharkey, two proudly average tech enthusiasts, as they stumble through the world of artificial intelligence with all the grace of a robot learning to dance. This (sometimes weekly*) podcast delivers an hour-long conversation about their thoroughly middle-of-the-road adventures with AI. No PhDs. No Silicon Valley insights. Just two guys with enough technical knowledge to be dangerous, sharing their unexceptional yet entertaining experiences with AI tools and technology.

Subscribe now to hear:
• Mediocre hot takes on AI developments
• Stories of AI experiments gone adequately okay
• The most average advice you'll ever need
• Two Sharkeys trying their best to sound smart about algorithms
• Childish AI prank calls that somehow fool everybody
• Attempts at using AI for phishing attacks on their mother
• "Chart-topping" AI songs according to the brothers

Join our perfectly mediocre community where being average at AI is celebrated, questions are encouraged, and learning through mistakes is our specialty. Because let's face it - most of us are figuring this out as we go along. New episodes drop whenever we remember to record them.

🎙️ Proudly supported by Simtheory.ai

Episodes

We Built Microsoft Teams in 23 Minutes (And You Can Use It) & GPT 5.4 Impressions - EP99.37

We've been having way too much fun with the AI again. OpenAI just dropped GPT-5.4 and 5.4 Pro, and holy shit - we finally have a ball game. This might be the first OpenAI model that genuinely competes with Opus 4.6 for agentic work. But here's where it gets wild: we rebuilt Trello AND Microsoft Teams from scratch using single prompts. Not mockups. Fully deployed, working apps with authentication, video chat, the works. You can literally sign up and use them right now. Plus: We roast Gemini 3.1 (it's a disgrace for agentic workflows), break down the insane $30/$180 per million pricing on 5.4 Pro (who is this for??), and discuss why every $99/month SaaS tool might be about to die. Chris declares his programming skills "useless" and honestly... he might be right. We also demo our actual workflow - running 5 agent tabs simultaneously, delegating everything, and why we barely visit websites anymore. The AI workspace IS the operating system now.

CHAPTERS:
0:00 - Intro & Housekeeping (We Screwed Up the Link)
1:26 - GPT-5.4 First Impressions & Specs
3:12 - Chris's Testing: 40 Minutes to Solve a Problem
4:51 - Knowledge Work Improvements (Catching Up to Anthropic)
6:38 - Computer Use vs Browser/Terminal Debate
8:07 - Why We Don't Need Computer Use Anymore
9:53 - Teaser: We Built Full SaaS Apps Today
11:19 - Tool Search API & Skills Integration
13:20 - The Speed Problem (It's a Plodder)
15:12 - GPT-5.4 Pro Pricing Reaction ($30/$180 WTF)
18:14 - Someone Rebuilt Minecraft in 24 Minutes
19:46 - Gemini 3.1 Roast: "It's a Disgrace"
22:36 - DEMO: Trallo (Full Trello Clone)
29:03 - DEMO: Macrosoft Teams (Working Video Chat!)
33:30 - The SaaS Collapse Theory
36:42 - AI Workspace as the New Operating System
38:57 - Forcing Features onto Entrenched Software
43:32 - "My Programming Skills Are Useless" - Chris
46:06 - The $12 Million Legacy Software Opportunity
51:06 - Beyond Code: Forms, PDFs, Knowledge Work
55:28 - How Fast Will This Change Everything?
56:31 - Gemini 3.1 Flash Lite Quick Take
59:36 - The Delegation Lifestyle (5 Agent Tabs Running)
1:01:24 - Mike's Workflow Demo
1:04:31 - Cognitive Overload Problem
1:06:04 - Release Date: 2 Weeks (Drop Punishment Ideas!)

Thanks for listening, like and sub xoxo

1h 8min•Mar 6, 2026

Nano Banana 2 is Here! Gemini-3 Shutdown & The AI Layoff Myth | EP99.36

We're diving into Google's new Nano Banana 2 image model - 50% cheaper and supposedly faster (when the servers aren't melting). We put it through its paces with annotation-based editing, slide generation, and yes, the return of the legendary horse egg experiment. Plus: Google quietly kills Gemini-3 after just a few months (good riddance?), we discuss why the model was "dead on arrival" for agentic workflows, and break down the real story behind those massive AI layoff announcements from Block and WiseTech. Spoiler: it's probably not actually about AI. We also get into the current state of the model wars (Opus 4.6 vs Codex 5.3), why smaller models like GLM-5 might be the future for enterprise agentic tasks, and Chris's wife teaching Claude to literally speak to her using Mac's text-to-speech. The models are getting creative.

---
0:00 - Intro
0:36 - Nano Banana 2: Price, Speed & First Impressions
3:19 - The Compositing Problem & Last Mile Design
5:41 - Annotation-Based Editing (This Changes Everything)
9:52 - Slide Editing & Real-World Use Cases
12:34 - The Horse Egg Experiment Returns
14:30 - Image Degradation & Cost Breakdown
17:47 - Text-to-Image Leaderboard Discussion
20:01 - Why Nano Banana Dominates for Work
22:07 - Codex 5.3 vs Opus 4.6
22:54 - Google Kills Gemini-3 (What Went Wrong?)
26:48 - Google's Agentic Problem
30:08 - The Model Loyalty Cycle
34:22 - Why Opus 4.6 is Still the Best
37:05 - Cost Optimization & Smart Model Routing
43:30 - When Models Get Stuck on the Wrong Path
45:36 - Nicole's AI Learns to Talk Back
46:54 - Can Anyone Build Software Now?
52:26 - Anthropic's Legal/Finance Plugins & Market Panic
57:08 - Block Lays Off 4,000: AI or Excuse?
1:00:05 - The AI Job Apocalypse Isn't Real

Thanks for listening, like and sub xoxo

1h 2min•Feb 27, 2026

Gemini 3.1 Pro, Claude Sonnet 4.6 & The OpenClaw Hire That Killed the Chatbot Era - EP99.35

We're struggling to care. In this episode, we break down why Gemini went from being our daily driver to a model we barely touch, the "tunnel vision" hallucination problem that killed the Gemini 3 series for us, and whether 3.1 Pro actually fixes it. We put Gemini 3.1 Pro head-to-head against Claude Opus building a Geoffrey Hinton Doom Center, debate whether anyone can actually tell the difference between Sonnet 4.5 and 4.6, and make the case that smaller models running in agentic loops are secretly beating the frontiers. Plus: OpenAI acquires OpenClaw and we ask why a $100B company couldn't just build it themselves, DHH calls out the AI pricing bubble, Mike compares AI models to cheap wine hangovers, and Sam Altman refuses to hold Dario's hand at the India AI Summit. The model wars are getting weird.

CHAPTERS:
0:00 Intro & "Is This The End" Now on Spotify
1:10 Gemini 3.1 Pro: Thinking Controls & The Medium Mode Fix
3:14 The Speed vs Intelligence Trade-Off in Agentic Work
5:10 Why Multitasking With AI Agents Made Us Anxious
6:34 Solid Updates: The Real Goal of Agentic Coding
7:45 Gemini's Fall From Grace: From Daily Driver to Dead Model
10:08 The Tunnel Vision Problem That Killed Gemini 3
13:35 Mixed Reactions: Fanboys vs Reality on Gemini 3.1 Pro
15:06 Side-by-Side Test: Gemini 3.1 Pro vs Claude Opus (Hinton Doom Center)
17:39 Why File Manipulation Accuracy Matters More Than Context Windows
19:27 The Context Window Debate: 1M Tokens vs Smart Sub-Agents
22:05 DHH on Token Pricing: "If There's a Bubble, It's This"
24:11 Should Models Ship as Agent vs Chat Variants?
28:43 Claude Sonnet 4.6: A $2 Discount on Opus?
31:44 The Model Mix: Why One Model Won't Rule Them All
34:40 Anthropic Is Winning - But Can Anyone Tell the Difference?
38:58 OpenAI Acquires OpenClaw: Why Couldn't They Just Build It?
44:18 The Silicon Valley Moment: Sam vs Dario at India AI Summit
47:05 Will Smaller Models Win the Enterprise? The Cost Reality Check
51:27 The End of Single-Shot: Why Agentic Loops Change Everything
55:48 Final Thoughts & Gemini 3.1 Pro Gets One More Week

Thanks for listening. Like & Sub. Links above for the Still Relevant Tour signup and Simtheory. Two models dropped in a week again. What a time to be alive. xoxo

58min•Feb 20, 2026

Am I Even Needed Anymore? GLM-5, Agentic Loops & AI Productivity Psychosis - EP99.34

Meanwhile, we're having existential crises about whether we're even needed anymore. In this episode, we break down China's new frontier model that's competing with Opus 4.6 and Codex at a fraction of the price, why agentic loops are making 200K context windows the sweet spot (sorry, million-token dreams), and the very real phenomenon of AI productivity psychosis. We dive into why coding-optimized models are secretly winning at everything, the Harvard study confirming AI doesn't reduce work – it intensifies it, and the exodus of safety researchers from XAI, Anthropic, and OpenAI (spoiler: they're not giving back their shares). Plus: Mike's arm is failing from too much mouse usage, we debate whether the chatbot era is actually fading, and yes – there's a safety researcher diss track called "Is This The End?"

CHAPTERS:
0:00 Intro - Is This The End? (Song Preview)
0:11 Still Relevant Tour Update & NASA Listener Callout
1:42 AI Productivity Psychosis: The Pressure of Infinite Capability
4:25 GLM-5 Breakdown: China's New Frontier Model on Huawei Chips
7:24 First Impressions: GLM-5 in Agentic Loops
9:48 Why Cheap Models Matter & The New Model War
14:09 Codex Vibe Shift: Is OpenAI Winning?
16:24 Does Context Window Size Even Matter Anymore?
22:27 The Parallelization Problem & Cognitive Overload
27:27 Mike's Arm Injury & The Voice Input Pivot
31:17 Single-Threaded Work & The 95% Problem
35:06 UX is Unsolved: Rolling Back Agentic Mistakes
38:45 Harvard Study: AI Doesn't Reduce Work, It Intensifies It
44:01 How AI Erodes Company Structure & Why Adoption Takes Years
50:14 My AI vs Your AI: Household Debates
50:43 The Safety Researcher Exodus: XAI, Anthropic, OpenAI
56:49 Final Thoughts: Are We All Still Relevant?
59:04 BONUS: Full "Is This The End?" Diss Track

Thanks for listening. Like & Sub. Links above for the Still Relevant Tour signup and Simtheory. GLM-5 is here, your productivity psychosis is valid, and the safety researchers are becoming poets. xoxo

1h 3min•Feb 13, 2026

Is the ChatGPT Era Over? Opus 4.6 & The Shift from Chat to Delegation - EP99.33

Opus 4.6 and Codex 5.3 dropped within minutes of each other, and we're breaking down what this means for the future of AI work. In this episode, we unpack Opus 4.6's million-token context window (if you've got billies in the bank), why Codex's pricing makes it nearly impossible to ignore for agentic loops, and the real cost of running agents for 24 hours ($10K, apparently). We dive deep into why coding-optimized models are secretly crushing it at non-coding tasks, the mental fatigue of managing AI workers, and whether the chatbot era is actually fading or just evolving. Plus: Chris accidentally books three real pig grooming appointments, we debate whether you need a "life coach agent" to manage your agent swarm, and yes – there's an Opus 4.6 diss track that goes unreasonably hard.

CHAPTERS:
0:00 Intro - Opus 4.6 Diss Track Preview
0:09 The Model Same-Day Showdown: Opus 4.6 vs Codex 5.3
0:50 Opus 4.6 Breakdown: Million Token Context & Premium Pricing
2:31 Token Bill Shock: $10K Research Bills & Extended Context Costs
5:04 Codex Pricing: Why It's Nearly Free for Agentic Loops
6:42 Why Coding Models Are Secretly Crushing Non-Coding Tasks
10:14 Tool Fatigue: Too Many Models, Too Many Workflows
12:47 Opus 4.6 First Impressions: "Solid" and "Faultless"
13:48 Chris Accidentally Books Three Real Pig Grooming Appointments
16:01 Unix Tools & Why Code-Optimized Models Win at Everything
19:59 The Agentic Retraining Imperative: Chat to Delegation
22:16 Agent Swarms & The Master Thread Architecture
24:51 OpenAI vs Anthropic: The Enterprise Battle
27:09 Corporate Espionage 2.0: Stealing Skills & The Open Source Threat
31:19 The UX Problem: Why Delegation Isn't Solved Yet
34:24 The Stress of Hyper-Productivity & Managing Agent Swarms
37:07 Coordination: The Next Layer of Abstraction
40:09 The Fantasy vs Reality of Autonomous AI Businesses
44:37 Is the Turn-by-Turn Chatbot Era Actually Fading?
49:23 Tokens as Spice: Turning Compute Into Money
52:08 Reduce Cognitive Overload: The Real Goal of AI
55:07 Still Relevant Tour Announcement
55:39 BONUS: Full Opus 4.6 Diss Track

Thanks for listening. Like & Sub. Links below for the Still Relevant Tour signup and Simtheory. The model wars are heating up, and your token bill is about to get interesting. xoxo

1h 1min•Feb 6, 2026

Did Clawdbot Just Show Us the Future of AI Workers? & Kimi K2.5 Dis Track Tested - EP99.32

In this episode, we unpack the viral open-source AI assistant that's taken over the internet: what it actually does, why everyone's losing their minds, and whether it's worth the $750/day token bills some users are racking up. We dive deep into why locally-run skills and CLI tools are beating computer-use clicking, how smaller models like GPT-5 Mini are crushing it in agentic workflows, and why the real magic is in targeted context - not massive swarms. Plus: Kimi K2.5 drops as a near-Sonnet-level model at 1/10th the price, we debate whether SaaS is dead, and yes – there are TWO Kimi K2.5 diss tracks. One made by Opus pretending to be Kimi. It might just slap?

CHAPTERS:
0:00 Intro - Still Relevant Tour Update
0:48 What is Moltbot? The Viral AI Assistant Explained
3:57 Token Bill Shock: $750/Day and Anthropic Bans
5:00 The Dream of Digital Coworkers on Mac Minis
6:52 Why CLI Tools & Skills Beat Computer-Use Clicking
10:57 Why This Way of Working Is Genuinely Exciting
14:47 Smaller Models Crushing It: GPT-5 Mini & Targeted Context
17:30 Wild Agentic Behavior: Chrome Tab Hijacking & Auto-Retries
20:10 Security Architecture: Locked-Down Machines & Enterprise Use
24:01 AI Building Its Own Tools On-The-Fly
27:08 The Fear & Overwhelm of Rapid Progress
29:10 2026: The Year of Agent Workers
31:43 The Challenge of Directing AI Work (Everyone's a Manager Now)
37:24 Skills Will Take Over: Why MCPs & Atlassian Can't Stop Us
40:38 Real-World Use Cases: Doctors, Lawyers & Accountants
46:28 Cost Solutions: Build Workflows Around Cheaper Models
52:58 Kimi K2.5: Sonnet-Level Performance at 1/10th the Price
1:00:55 The "1,500 Tool Calls" Claim: Marketing vs Reality
1:05:23 The Kimi K2.5 Diss Tracks (Opus vs Kimi)
1:08:08 Demo: Black Hole Simulator & Self-Trolling CRM
1:12:55 Is SaaS Dead?
1:14:30 BONUS: Full Kimi K2.5 Diss Tracks

Thanks for listening. Like & Sub. Links below for the Still Relevant Tour signup and Simtheory. The future is open source, apparently. xoxo

1h 20min•Jan 30, 2026

The AI Productivity Paradox: Why Doing More Feels Like Burnout: EP99.31

We're either above average or completely unhinged. In this one, we dive deep into the new phenomenon of "AI exhaustion" – that fried feeling you get after multitasking across six agent tabs all day. We share our breakthroughs with AI-assisted presentations (20 minutes vs several hours), why browser-use on your local machine bypasses every anti-scraping technique known to man, and how enterprise context sharing could be the real unlock for organizations. Plus: OpenAI announces ads for ChatGPT (even on paid tiers), their CFO floats taking cuts from drug discoveries (seriously), and Google publicly dunks on them for it. Also – the Still Relevant Australia Tour is coming, and our LinkedIn group hit 200 members (we're basically LinkedIn influencers now too).

CHAPTERS:
0:00 Intro - Still Relevant Tour Announcement + LinkedIn Milestone
2:08 AI Exhaustion: The Cognitive Overload of Multitasking with Agents
4:14 Why Single-Tasking with AI Beats Parallel Agent Chaos
7:02 The Problem with "I Spun Up 70,000 Sub-Agents" Twitter Posts
10:03 Mike's Presentation Workflow: From Hours to 20 Minutes
14:06 Why Isn't Copilot Doing This Already?
16:54 Old Models + Great Context = Still Amazing Results
21:14 What's Actually Changed? It's the Software Layer
25:22 Enterprise Context Sharing & Organizational IP
31:22 Skills, Sub-Agents, and Role-Based Knowledge
35:22 Security Concerns: Can You Hack an Agent with Malicious MD Files?
38:23 Cloud Providers Have a Bigger Moat Than the Labs
43:16 Browser Use: The Ultimate Context Gathering Weapon
48:25 Rethinking SaaS: Software That Actually Thinks
53:08 Smart Paste, Smart CC – Why Isn't All Software Like This?
56:32 OpenAI's Desperate Moves: Ads, Age Verification & Drug Royalties
1:03:03 Google Says "No Plans for Gemini Ads" (Shots Fired)
1:07:24 Is OpenAI Okay? The Vibes Are Definitely Off
1:10:35 Capitalism Won't Give You Free Time, Just More Demands
1:11:20 Outro + Still Relevant Tour Details

Thanks for listening. Like & Sub. Links below for the Still Relevant Tour signup and Simtheory. xoxo

1h 12min•Jan 23, 2026

2026 Existential Crisis, Claude Code Hype & Is SaaS Dead? EP99.30-WIZARDS

In this episode, we unpack the two camps dominating AI X/Twitter: hype boys claiming "Claude Code can do my washing" vs. software developers doom-scrolling themselves into career panic. We put the agentic hype to the test and discover that no, you can't actually run 8 agents recreating your local business ecosystem while you sleep. Plus, we reflect on why MCP is exhausting, why Gemini 3 Pro is somehow worse than Gemini 2.5 Pro, and why Geoffrey Hinton would rather write his book than answer questions in Tasmania. Also featuring: the $200,000/month enterprise AI problem, why SaaS isn't dead (but it's scared), and our prediction that AI workspaces will become the everything app.

CHAPTERS:
00:00 Intro - Unpacking the 2026 AI Vibes
02:21 Putting Claude Code and Agentic Hype to the Test
05:57 Why Twitter AI Demos Never Show the Receipts
07:03 Honest Assessment of Where Frontier Models Are At
11:19 Building the Everything App with Email, Calendar and Files
16:47 Collaborative Mode vs Agentic Delegation in Practice
21:29 The Real Cost of Enterprise AI at Scale
24:32 Why Cheaper Models Like Haiku and Gemini Flash Matter
29:25 Is SaaS Actually Dead or Just Disrupted
38:11 The Future of AI Platforms, SDKs and App Stores
43:35 The Untapped Opportunity in Paid Proprietary MCPs
51:21 Geoffrey Hinton Refuses to Take Questions in Tasmania
55:05 2026 Plans and the Still Relevant Tour Announcement

Thanks for listening. Like & Sub. xoxox

1h 9min•Jan 19, 2026

Gemini 3 Flash, GPT-Image-1.5, Skills vs MCPs, and Our 2025 Model Reviews - EP99.29

The Gift of Simtheory: https://simtheory.ai
---
2025 Model Timeline: https://simulationtheory.ai/5fd0e964-4c41-4f9a-bbb3-2a398d8500f0

It's the long-anticipated holiday special... except Mike and Chris forgot to prepare so it's just a normal episode. 🎅 This week: Gemini 3 Flash drops and it's actually incredible - cheap, fast, and weirdly smarter than Gemini 3 Pro at tool calling. We put GPT Image 1.5 head-to-head against Nano Banana Pro using hobo photos (spoiler: Google wins again). Plus, FireCrawl Agent is the research tool we've been waiting for, Anthropic launches Skills as an open standard, and we do a full 2025 model timeline recap. Also featuring: Best and Worst Model of the Year awards, 2026 predictions where Mike bets on OpenAI (controversial), and the full holiday musical outro where AI sings about what an "average" year it's been.

CHAPTERS
00:00 Intro - Holiday Special That Isn't
00:55 Shipping Gemini 3 Flash While Looking Like a "Sophisticated Programming Hobo"
02:52 Gemini 3 Flash Review: Cheap, Fast, Surprisingly Smart
06:31 The Unreliable Frontier Model Problem
10:45 GPT Image 1.5 vs Nano Banana Pro Showdown
17:04 FireCrawl Agent: Research That Actually Works
25:56 Gemini Deep Research Agent Deep Dive
31:57 Skills vs MCPs: The New Paradigm
43:35 Enterprise Skills: Codifying Business Procedures
49:57 2025 Model Timeline Recap
59:53 Best & Worst Model of 2025 Awards
1:04:58 2026 Predictions: Mike Bets on OpenAI
1:14:09 Final Thoughts & Holiday Thank Yous
1:19:35 🎄 Holiday Musical: "A Very Average Christmas"

Have a great Christmas/Holiday/New Year, see you in 2026! xox

1h 22min•Dec 23, 2025

GPT-5.2 Can't Identify a Serial Killer & Was The Year of Agents A Lie? EP99.28-5.2

It's not great. In this episode, we put OpenAI's latest model through its paces and discover it can't even identify a convicted serial killer when the text literally says "serial killer." We compare it head-to-head with Claude Opus and Gemini 3 Pro (spoiler: they win). Plus, we reflect on the "Year of Agents" that wasn't, why your barber switched to Grok, Disney's billion-dollar investment to use Mickey Mouse in Sora, and why Mustafa Suleyman should probably be fired. Also featuring: the GPT-5.2 diss track where the model brags about capabilities it doesn't have.

CHAPTERS:
00:00 Intro - GPT-5.2 Drops + Details
01:25 First Impressions: Verbose, Overhyped, Vibe-Tuned
02:52 OpenAI's Rushed Response to Gemini 3
03:24 Tool Calling Problems & Agentic Failures
04:14 Why Anthropic's Models Just Work Better
06:31 The Barber Test: Real Users Are Switching to Grok
10:00 The Ivan Milat Vision Test (Serial Killer Edition)
17:04 Year of Agents Retrospective: What Went Wrong
25:28 The Path to True Agentic Workflows
31:22 GPT-5.2 Diss Track (Yes, Really)
43:43 Why We're Still Optimistic About AI
50:29 Google Bringing Ads to Gemini in 2026
54:46 Disney Pays $1B to Use Mickey Mouse in Sora
56:57 LOL of the Week: Mustafa Suleyman's Sad Tweets
1:00:35 Outro & Full GPT-5.2 Diss Track

Thanks for listening. Like & Sub. xoxox

1h 3min•Dec 12, 2025

ChatGPT is Dying? OpenAI Code Red, DeepSeek V3.2 Threat & Why Meta Fires Non-AI Workers | EP99.27

In this episode, we break down OpenAI's 6% market share decline, why their ad strategy is on hold, and what they need to do to reclaim the AI crown. We also explore DeepSeek V3.2's impressive capabilities as a cheap open-source alternative, Meta's new policy grading employees on AI skills, and the crisis facing higher education as AI fluency becomes essential. Plus, Fatal Patricia hits #1 on our Spotify charts, and Tesla's Optimus robot is running like a slightly unfit human.

CHAPTERS:
00:00 Intro - OpenAI Code Red & Market Share Crisis
07:03 ChatGPT's Failure to Go Deeper Into Users' Lives
16:33 What OpenAI Needs to Win Back the Crown
26:46 Chris's Wishlist for an OpenAI Comeback
31:22 DeepSeek V3.2 - The Open Source Threat
39:34 Meta Grading Workers on AI Skills
46:29 The University & Education AI Crisis
56:25 Fatal Patricia Hits #1 & WTF of the Week

Thanks for listening. Like & Sub. xoxox

1h 3min•Dec 4, 2025

Claude 4.5 Opus Shocks, The State of AI in 2025, Fara-7B & MCP-UI | EP99.26

31:17 Computer Use API Updates
36:14 Will AI Replace 57% of Jobs? (McKinsey Report)
1:00:52 Claude 4.5 Opus Demos (Christmas Hut & Diss Track Preview)
1:07:13 Microsoft Fara-7B - Moose Porn Refusals
1:21:51 Why ChatGPT's MCP-UI Apps Are a Bad Idea
1:42:01 🎵 Claude 4.5 Opus Diss Track (Full Song)
---
Thanks for listening. Like & Sub. xoxox

Anthropic just dropped Claude 4.5 Opus and it might be the best AI model of 2025. In this episode, we compare Claude 4.5 Opus vs Gemini 3 Pro vs GPT-5.1, breaking down the new API features including effort parameters, context management, and computer use updates. We also test Microsoft's new Fara-7B model for computer use - with hilarious refusal results. Plus, we react to McKinsey's controversial report claiming AI agents could automate 57% of US jobs by 2030. We dive deep into Anthropic's pricing (3x cheaper than Opus 4.1), why Claude is now beating Google and OpenAI on agentic coding benchmarks, and whether MCP-UI apps in ChatGPT are a step backwards for AI workflows. Is Claude 4.5 Opus the new king of AI coding assistants? Should enterprises be worried about AI job replacement? And why did Microsoft's Fara model refuse to draw a moose? All this plus an AI-generated diss track roasting Sam Altman, Elon Musk, and Sundar Pichai.

1h 45min•Nov 28, 2025

Is Gemini 3 Really the Best Model? & Fun with Nano Banana Pro - EP99.25-GEMINI

Where is This Going?
1:26:20 - OpenAI's Reaction to Gemini 3 Pro & Nano Banana with GPT-5.1-Pro and Codex model updates
1:32:38 - Final Thoughts & Sam Altman Sad Song
1:38:41 - FATAL PATRICIA SONG
1:42:12 - Gemini 3.0 Pro Diss Track
----
Thanks for your support plz like and sub xoxo

1h 44min•Nov 21, 2025

Are We In An AI Bubble? In Defense of Sam Altman & AI in The Enterprise | EP99.24

& What is Working in The Enterprise?
43:58 - Anthropic's Code Execution with MCP: Problems with MCP Context
52:44 - Kimi-K2 Thinking Model Release
1:00:45 - "In the Middle of a Bubble" Song
----
Thanks for your support and listening, we appreciate you!
Join our Discord: https://discord.gg/TVYH3HD6qs

1h 5min•Nov 7, 2025

Why Sam Altman is Scared & Why People Are Giving Up on MCP | EP99.23

Our Thoughts on State of MCP and Why The Client Implementations are the Problem
1:07:53 - 1X NEO The Home Robot LOLZ
1:28:05 - Greg Brockman, A Sad Song.
----
Thanks for listening and your continued support. We appreciate you.

1h 33min•Oct 31, 2025

Do We Need AI Browsers? What Are Claude Skills? - EP99.22

34:49 - Claude Skills: What Are Claude Skills? What is the Difference Between MCP and Skills?
1:04:05 - Vibe Code Fashion: Oakley Meta Vanguards + Use Cases of AI Glasses
1:15:05 - Top Models Used on Simtheory & Final Thoughts
------
Thanks for listening and your support xoxo

1h 26min•Oct 24, 2025

Is Haiku 4.5 really THIS good? OpenAI's Erotic Mode & Are MCP Apps the Right Approach? EP99.21

1:09:25 - Final thoughts, Polymarket ---- Thanks for your support and listening to the show xox

1h 13min•Oct 16, 2025

What did OpenAI Announce at DevDay? Apps SDK, MCP UI & Impact to SaaS - EP99.20-APPS

50:41 - GPT-5-pro in API
53:15 - gpt-realtime-mini
56:53 - Sora 2 & Sora 2 in API Vs Veo3
1:01:43 - Final thoughts & This Day in AI albums now on Spotify!

Thanks for your support and listening xoxo

1h 5min•Oct 10, 2025

Doom Scrolling SORA2, Claude 4.5 Sonnet & Are Agents Coming for our Jobs? EP99.19

1:00:25 - Claude 4.5 Sonnet Dis Track
1:06:24 - "Real AI Agents and Real Work" & Enterprise Agent / MCP workflows
1:31:41 - LOL of the week Sora2 Steve Irwin Video
1:35:07 - Full Claude Sonnet 4.5 Dis Track
----
Thanks for listening and your support, we really appreciate it! xoxox

1h 39min•Oct 3, 2025

lolz with Omnihuman, Agentic Gemini 2.5 Flash, Grok 4 FAST & ChatGPT Pulse - EP99.18-v5-FLASH

Code: STILLRELEVANT
---
Links:
https://worksinprogress.co/issue/the-algorithm-will-see-you-now/
https://developers.googleblog.com/en/continuing-to-bring-you-our-latest-models-with-an-improved-gemini-2-5-flash-and-flash-lite-release/
---
CHAPTERS:
00:00 - Gemini 2.5 Flash Agentic Tests with Omnihuman, Suno v5 and Research Tools
06:29 - Dis Track AI Music Video (Made by Gemini 2.5 Flash)
07:06 - Thoughts on Suno v5, More Agentic Model Discussion
29:10 - Are we all sleeping on Grok 4 FAST with 2M context?
41:46 - Radiologists are STILL RELEVANT & Is AI Going to Take Our Jobs?
44:46 - The need to use multiple specialist models
1:01:20 - Is ChatGPT Pulse To Just Sell Ads?
1:08:46 - Final thoughts for the week
1:11:54 - Gemini Flash 2.5 Dis Track
1:15:08 - Love Rat Suno v5 The Midnight Inspired Test

Thanks for all of your support and listening to the show we really appreciate it! xoxo

1h 18min•Sep 26, 2025
