Nicholas

An exclusive inside look at GPT-5

In this episode, I share my hands-on experience with OpenAI’s GPT-5, the company’s new frontier model. As one of the first users outside of OpenAI to test the model, I put GPT-5 head-to-head with GPT-4.1 across real-world product use cases, from writing PRDs to generating code to assisting with visual design work. This is my unfiltered look at what GPT-5 can (and can’t) do, and how it changes the game for builders.

What you’ll learn:
1. How GPT-5 differs from previous models with its engineering-focused approach to problem-solving and tendency to prioritize technical details over business context
2. A comparative analysis of how GPT-5 and GPT-4.1 generate different types of product requirement documents and prototypes for the same prompt
3. Why GPT-5 excels at technical writing, functional requirements, and code generation while potentially skipping important business discovery questions
4. The model’s impressive spatial awareness capabilities when generating images for interior design and other visual tasks
5. Practical considerations for choosing the right model based on your specific use case and audience
6. How GPT-5’s extensive tool-calling behavior and bullet-point communication style reflect its engineering-oriented design

Brought to you by ChatPRD, an AI copilot for PMs and their teams: https://www.chatprd.ai/howiai

25k giveaway: To celebrate 25,000 YouTube followers, we’re doing a giveaway. Win a free year of my favorite AI products, including v0, Replit, Lovable, Bolt, Cursor, and, of course, ChatPRD, by leaving a rating and review on your favorite podcast app and subscribing to the podcast on YouTube.
To enter: https://www.howiaipod.com/giveaway

Where to find Claire Vo:
ChatPRD: https://www.chatprd.ai/
Website: https://clairevo.com/
LinkedIn: https://www.linkedin.com/in/clairevo/
X: https://x.com/clairevo

In this episode, we cover:
(00:00) Introduction to GPT-5
(04:34) Testing GPT-5 in ChatPRD for document generation
(07:10) Comparing GPT-5 and GPT-4.1 on business vs. technical orientation
(11:22) Side-by-side comparison of PRDs generated by both models
(15:23) Where GPT-5 excels: Technical considerations and documentation quality
(17:35) Comparing prototypes generated from different model outputs
(19:57) Testing homepage critique capabilities between models
(23:14) OpenAI’s strengths in API design and developer support
(25:37) GPT-5’s performance as a coding assistant
(27:26) Examining GPT-5 in ChatGPT’s interface
(28:50) Testing GPT-5’s front-end design capabilities
(31:17) Personal use case: bathroom remodel planning
(33:45) Comparing GPT-5 vs. GPT-4 for interior design visualization
(38:10) Summary of key findings and recommendations

Tools referenced:
• OpenAI: https://openai.com/
• ChatGPT: https://chat.openai.com/
• Claude: https://claude.ai/
• Gemini: https://gemini.google.com/
• Cursor: https://cursor.sh/
• v0: https://v0.dev/
• Lovable: https://lovable.dev/
• Bolt: https://bolt.com/
• LaunchDarkly AI Configs: https://launchdarkly.com/docs/home/ai-configs

Other reference:
• Benjamin Moore paints: https://www.benjaminmoore.com/

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [redacted email].

Published Aug 7, 2025
Uploaded Jun 13, 2026
File type: Podcast

Full transcript

AI-generated transcript with timestamped sections.

0:00-1:48

[00:00] GPT-5 is the newest model released from OpenAI. And from my very first interaction, I felt like this was an engineer built by engineers [00:08] for engineers. It writes good code. It refactors. It's thoughtful. And girlfriend loves to call a tool. If you have a good idea and you really just need to get down to what is the technical implementation of this feature, I think GPT-5 is tremendously better at that than GPT-4, which again is actually pretty light on functional requirements. If your use case is getting things to [00:32] business users or stakeholders, you might like a GPT-4.1 or o3 output. A little bit more business oriented. Really no complaints. It's exceptional at coding. This is a highly technical model. I think it's going to be a daily driver for lots of folks. [00:49] Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today, I'm doing something a little bit different. I'm walking you through the newly released [01:02] GPT-5 model from OpenAI and giving you my honest takes on a couple workflows [01:07] that I personally use. [01:09] We're going to look at GPT-5 for product managers and engineers, [01:14] investigate some stylistic choices that the model has made, and also go through a couple personal workflows that I find useful and see if, side by side, [01:23] GPT-5 outperforms other models. [01:26] Let's get to it. [01:27] To celebrate 25,000 YouTube followers on How I AI, we're doing a giveaway. You can win a free year of my favorite AI products, including v0, Replit, Lovable, Bolt, Cursor, and of course, ChatPRD, by leaving a rating and review on your favorite podcast app and subscribing on YouTube.

1:49-3:18

[01:49] To enter, simply go to howiaipod.com/giveaway, read the rules, leave us a review, and subscribe. Enter by the end of August and we will announce our winners in September. Thanks for listening. [02:05] So before we get into how this model performs, let's talk about what the model is. GPT-5 is the newest model released from OpenAI, and they were generous enough to give me a little bit of early access to play with the model and really start to understand its strengths and weaknesses. And from my very first interaction with GPT-5, I felt like this was an [02:25] engineer built by engineers for engineers. This is a highly technical model, both in capabilities and style. And this is going to be one that you're really going to reach for on a daily basis if you are coding, testing the technical bounds of these LLMs, or solving deeply complex problems. But it might have some pieces for the business thinkers out there, the product owners out there, that might not work for your use case. And we're going to show exactly what I mean by that in just a second. [02:54] Now, I have been pretty familiar with the OpenAI ecosystem for quite some time and have been using the OpenAI models almost exclusively for my own product, ChatPRD. That being said, I do work with a variety of models and model providers in my day-to-day workflows. So when I'm coding using Cursor, I'm often using Claude 4, Claude Sonnet 4, Gemini 2.5, o3 from OpenAI.

3:24-5:09

[03:24] 4.1, even did a little test with 4.5 when that first came out. [03:29] And I use a variety of different out-of-the-box AI tools as well. So I'm using ChatGPT relatively often, occasionally go into Claude, and have my whole stable of different AI coding tools, which again choose and fine-tune their own models. So I do feel like I'm pretty familiar with the model ecosystem, at least the commercial model ecosystem, and have really developed a sense of where these models perform well for specific use cases and where they don't. [03:59] user and AI power user that really selects the model for the use case. So I was really excited to get access to GPT-5 because I wanted to know the answer to the question, which is, [04:11] where does this model fit on my team? I don't think of myself as a single-model employer. I really think of models as part of a team and tools as part of a team. Each model has its own personality and capabilities. Each tool has its own personality and capabilities. And rather than think, is this an upgrade? I think, is this an addition to my team? And where would I put them into play? So the first thing that I did when I got access to GPT-5 is I went [04:41] about the most, which is actually ChatPRD and our core chat and document generation implementation. [04:48] It's a common use case for product managers using AI to generate product requirements documents. It's a place where I've spent a lot of time prompt testing, model testing, and really optimizing the experience for both matching the stylistic tone I want for the product as well as getting great user feedback on products.

5:09-6:41

[05:09] outputs. And we've really A/B tested this pretty significantly and in depth in ChatPRD, and landed most recently on GPT-4.1 and a variety of tools and prompts being the best stack for our users. [05:22] And in July, we had a 96% satisfaction rate with our documents. So that's how I'm really thinking about it. I'm thinking, what model is highest performance (cost really doesn't come into play, but it will later), and then, do users love it? And I consider myself a proxy for the product manager and engineering user. So I feel like I have a pretty good sense of what will perform well in this use case and what won't. So when I got access to GPT-5, what I did is I went ahead and used LaunchDarkly AI Configs, [05:52] in local or production. [05:54] And I started testing GPT-5. And what I'm going to show you on my screen right now is really a side-by-side [06:01] representation of the results. [06:04] So GPT-4.1, our core model that we use on ChatPRD, is on the left and GPT-5 is on the right. And a couple of things right out the gate that I noticed, and in fact I had to prompt around: GPT-5, when I first tested it, [06:20] spoke like a developer. This is actually tuned a little bit for prompt on the right side. It just wanted to write me markdown bullet point lists. And I gave that feedback to the OpenAI team, did a little bit of prompt engineering, and I think it's a little bit more natural language when you speak to it, but you're definitely gonna see GPT-5

6:41-8:30

[06:41] she loves a bullet point list. So we're going to get lots of bullets and we're going to call lots of tools. That's something you're definitely going to see in this episode. [06:48] But if you look at it side by side, to start off, they're pretty similar responses. And I think that's really a representation of the fact that they share the same system prompt and context in ChatPRD. So this is the exact same system prompt, [07:01] exact same context. It's coming back and it's just really asking me questions about what I want to achieve with my product when I ask it to brainstorm new features. Now, where you start to see it diverge is what it starts to focus on when asked to brainstorm new features. And so if you look at [07:22] GPT-4.1's response here, the questions are really about [07:28] business impact. You get a lot of discovery around: [07:32] What metric do you want to change? Who is your persona? What is your business goal? And I've noticed that throughout my side-by-side evaluation. This is just one example. GPT-4.1 and some of the older models [07:44] just came at the problem from a more general, but more business-oriented lens. But GPT-5 on the right really came to features quickly. And I think this is an important point for product managers to note because you know us product managers. [08:04] We love to ask a good why, and we really love to understand the problem. And what you see in GPT-5 is a jumping to the solution. And I think that's a reflection of the way it was trained and the place that GPT-5 fits in the sort of ecosystem of OpenAI models. It's very clear that the coding model wars are heating up, that the IDE wars are heating up, that the coding tool wars are heating up.
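
The side-by-side setup described above (same system prompt, same context, two models) could be sketched roughly like this. This is a minimal illustration, not how ChatPRD is built: `complete` is a hypothetical callable standing in for whatever model client you use, the model names `model-a` and `model-b` are placeholders, and counting question marks and bullet lines is only a crude proxy for "discovery questions" versus "bullet-point lists".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signature: (model, system_prompt, context, user_message) -> reply text.
CompleteFn = Callable[[str, str, str, str], str]


@dataclass
class SideBySide:
    user_message: str
    replies: Dict[str, str]  # model name -> reply text


def run_side_by_side(complete: CompleteFn, models: List[str],
                     system_prompt: str, context: str,
                     user_message: str) -> SideBySide:
    """Send the identical prompt and context to every model and collect replies."""
    replies = {m: complete(m, system_prompt, context, user_message) for m in models}
    return SideBySide(user_message, replies)


def count_questions(reply: str) -> int:
    """Crude proxy for discovery: how many question marks the reply contains."""
    return reply.count("?")


def count_bullets(reply: str) -> int:
    """Crude proxy for list-heavy style: lines that start with a bullet marker."""
    markers = ("-", "*", "•")
    return sum(1 for line in reply.splitlines() if line.lstrip().startswith(markers))


def fake_complete(model: str, system_prompt: str, context: str, user_message: str) -> str:
    """Stand-in for a real model call, with canned replies for illustration only."""
    canned = {
        "model-a": "Who is your persona?\nWhat metric do you want to change?",
        "model-b": "- Add an upgrade widget\n- Lock premium spaces\n- Offer a free trial",
    }
    return canned[model]


result = run_side_by_side(fake_complete, ["model-a", "model-b"],
                          "You are a product copilot.", "Company context here.",
                          "Brainstorm new features.")
for model, reply in result.replies.items():
    print(model, "questions:", count_questions(reply), "bullets:", count_bullets(reply))
```

Swapping `fake_complete` for a real client call keeps everything else unchanged, which is the point of holding the prompt and context fixed while only the model varies.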

8:34-10:18

[08:34] engineering use cases more than anything. And what I thought was interesting is, we'll get to those engineering use cases. I think it's quite exceptional at writing code. [08:44] But that sort of angle into execution of engineering tasks even bleeds into the conversational aspect [08:52] of the model. And so you can even see the point of view of the model, if you can call it that, is really different from [09:00] 4.1, which we're using on the left, which really comes from a business point of view. You'll see very quickly GPT-5 is getting to an execution, engineering point of view. So it's just something to consider as you look at these models side by side, what you're really going to get out of them and where they might be most applicable in your use case. And so right off the gate, we're seeing 4.1 be more business oriented, 5.0 be a little bit more technically oriented. [09:24] And then I ask it to focus on free to paid conversion. And again, I'm not. [09:30] We get pretty... [09:32] similar ideas. So again, this isn't the most radical product area to focus on. It's well-trodden, well-documented. Both of these models probably have access to best-in-class growth tactics. So you'll see the... [09:47] kind of features be very similar across the two. But if you really inspect, you will see that the descriptions of the features for 4.1 on the left [09:59] are much more user centric and much more business centric. So it's really like a who, why question. If you look at GPT-5, again, I find this so fascinating. It's really a what, how answer. And I think that really sums up how I would say my interactions with this model have been.

10:19-11:55

[10:19] you still get a little bit more of that business user discovery from, you know, 4.1 or 4o, [10:25] o3 even. [10:27] GPT-5 is like, tell me what to build. Tell me exactly how the features work. Give me numbers. Give me user stories. Give me something to code. And so I just thought it was really interesting to see that the ideas themselves, again, are pretty similar, but the way those ideas are executed is very different. And you'll start to see the chats branch here, and you'll start to see the GPT-5 chat really branch into wanting to get into technical code, which has its pros and cons. [10:57] you [10:58] model really stay in this business, kind of high-level mindset. And so as an app builder focused on product managers, what am I thinking to myself? I'm thinking, [11:06] well, my product's a product manager. It needs to talk to engineers, but it's a product manager. And so I'm unsure if my users are going to love GPT-5 because it skips that step of product management thinking and gets right to what to build, which, again, the engineering side of my brain loves. So I'm going to pull these docs up side by side and really show you what the PRDs [11:28] that got generated from each of these models look like. And again, [11:32] pretty similar prompts, pretty similar inputs. You can see right out the gate, I mean, I told you it's an engineer for engineers. It tried to put this code block comment at the top of the document. Again, just a pure signal. This is, you know, trained to write technical documents and trained to write code. Even when you tell it to write a prose document, like a PRD, it's

11:55-13:25

[11:55] you see artifacts like this, which are code based, which I find very, very [11:59] very interesting. And so if I'm looking at these PRDs side by side, a couple things that you're going to notice. [12:07] GPT-5 writes more. It is significantly more detailed in its content. And I think there are pros here [12:16] and cons to that. [12:17] I think when you're trying to define something for an engineer or a coding agent to execute, the more detailed you can get, the better. When you are trying to align stakeholders, as product managers or other business users might need to do, sometimes a level of detail too far can actually obscure the primary message that you're trying to get across. And so I'm looking at these side by side strategies. [12:42] And I'm really thinking, [12:44] do I want five business goals for this product? [12:47] Are these the right business goals? And are they artificially too precise [12:53] on the GPT-5, or are they perfectly precise? And so it was just something that I observed in looking at these side by side. [13:01] Now, [13:02] if we scroll down, really interesting. Again, the personas... [13:07] are a lot more detailed. There are more of them, and the use cases are very specific. But [13:15] on the GPT-5 model, the use cases are very feature centric. And on the GPT-4 model, they're very much what I'm trying to achieve as a user.

13:25-14:55

[13:25] specific. And so I thought it was really interesting to just kind of compare and contrast both of these. Again, GPT-5, very detailed. Where I love [13:36] GPT-5 and prefer it over the 4.1 model [13:41] is the functional requirements, which are exceptional. The formatting got a little weird, but you can see here there's a prioritized list in a table. There's lots of details about soft warnings, hard warnings. I mean, these are the kinds of things that the [13:55] best engineers are going to ask you about how this stuff works. And so if you have a good idea and you really just need to get down to what is the technical implementation of this feature, [14:05] I think GPT-5 is... [14:07] tremendously better at that than GPT-4, which again is actually pretty light on functional [14:13] requirements. I think you could say the same for user experience. Again, you're just going to get a lot more detail out of GPT-5 in terms of describing the user experience in prose. And so if you are using any of the prototyping models, like a v0, a Lovable, a Bolt, a Magic Patterns, whatever those might be, [14:33] the more specific you can be about describing the user experience in prose, the happier you're going to be with your prototype. [14:39] And I think 4.1 is actually pretty high level. And 5 is pretty exceptional at that. Now, the narrative is an interesting one. You know, GPT-5 is a little longer. I will say, like, it's not a terrible...

14:56-16:28

[14:56] writer. So I don't think that its prose is necessarily... [15:00] cold or not compelling or not lyrical, which are things that, as somebody who has a liberal arts degree, I really care about. It's just a little bit more detailed. And I think, you know, [15:11] writing shorter prose is also a virtue. And so you really need to think about, do you need as many words? Is simpler better? Are the details really valuable here versus in another version? Now, again, another place where I think GPT-5 [15:29] obviously outperforms 4.1 in a side by side is technical considerations. So if you are an engineer [15:35] and you need to write a tech spec, I would highly recommend GPT-5 over any of the other models that I tested. [15:42] It is just very specific. It speaks in the language that an engineer would understand. It's really detailed in its analysis of requirements. And so I do think it is a really nice technical writer. And I think engineering teams, docs teams are going to be quite happy with it. [16:01] I honestly think product managers might not need to be writing this part of a PRD. So maybe there's a division of labor here that happens naturally or in your AI tools. [16:10] But again, GPT-5 is really gonna outperform on technical considerations and detail across the board. So that's a side by side, but these PRDs don't operate in a vacuum. [16:22] They are artifacts generated for another purpose. And so what I wanted to do is actually generate...

16:28-17:59

[16:28] a prototype based on those different PRDs. So if we go back to my general analysis, I thought that GPT-4.1 was business-oriented, higher level, maybe easier to read as a reader because it's not so dense, not as technical, not as detailed. GPT-5, [16:45] engineer, engineer, engineer, very detailed, perhaps overly so. But the real question is, do I get a better prototype? Yes. [16:52] one shot out of those prompts versus another. And this is where I think things get interesting, because I would say to you, [17:00] if your use case is getting things to humans, [17:04] you might not want to, and those humans are not engineers. Engineers, I love you. You're humans. But I'm going to put you in a different category for just the sake of this argument. [17:12] If you are trying to get this to business users or other stakeholders in your company, [17:19] you might like a GPT-4o, [17:21] 4.1, or o3 output. [17:23] A little bit more business-oriented, slightly more condensed, easier to read, not so much excessive detail. If you're trying to get this to an engineer... [17:32] I think you're going to be happier with a GPT-5. And so what's interesting about the side-by-side is, honestly, for a prototype and visual style, I like what 4.1 prompting did into, this is our v0 integration. I like what 4.1 prompted into v0 and the outcome here. It's colorful. It's clear. [17:53] I understand, you know, what's happening here. I think this looks nice. [17:58] Meta observation,

17:59-19:31

[17:59] I could not get... [18:01] v0 via GPT-5 to generate color. It's all very gray and blue, but you can see on the left side with 4.1, for whatever reason, whatever prompt was behind the scenes, which I'll have to go look at, [18:14] we got a little bit more color and a little bit more design. [18:17] It's much simpler. [18:19] It looks nice, it's visually appealing. [18:21] But I feel like GPT-5 over here on the right [18:25] gave me, and I'm just going to make it a little bigger so you can see, [18:28] gave me a lot more to work with. And what I mean is I tend to think of these prototypes as inspiration for implementation, not implementation itself. So I'm never going to ship this. This is not what ChatPRD looks like. It's not what our product looks like. But I'm really looking for ideas on upsells and free to paid [18:46] ideas. And I just think the fact that they put so much [18:50] detail into the PRD means they put so much [18:54] into the prototype, which means I have a lot of components to choose from [18:59] when I really want to... [19:01] make my product better. And so [19:03] I have locked spaces, I have upgrade widgets, [19:08] I have free trial details. I have "I'll try it later." I have "upgrade now." But I mean, I just have, there is just... [19:16] as much in here as I want to pick. And when you're looking at prototypes as an ideation space, honestly, I think taking an... [19:23] abundance mindset and generating as much as possible and being like, "I'll never use that. Oh, I like this," is a lot better. And so I think the verbosity, um,

19:31-21:04

[19:31] of GPT-5 in terms of technical specifications and user experience actually output more interesting ideas when given to a prototyping tool. So that was a really interesting observation for me. I [19:46] on first pass. But once I started to click through, I was like, man, it really [19:51] thought of a lot here. And I think that's because it was given... [19:55] quite a bit of detail. [19:58] So that's just one little side by side on prototype generation. I want to give you one last observation in the specific ChatPRD use case, which I found quite interesting, which is... [20:10] I gave it a copy of our homepage. [20:12] And I asked it to change things. [20:15] And this is what I find interesting. As much as I thought that GPT-5 was a... [20:23] pretty cold, straightforward, detailed engineer, [20:28] GPT-4, 4.1 was much meaner to me. It was much more critical. And I thought that was kind of interesting. GPT-4.1 starts out, and this makes me feel bad about my homepage, but just says, [20:39] "Not up to standard." Very straightforward. [20:42] GPT-5 was like, "That's pretty good. Areas to improve." And what's interesting about the instructability and promptability of... [20:51] the model is I actually went back and gave it another pass and said, could you be a little bit more critical of my homepage? Same prompt. [20:59] And again, [21:01] GPT-4.1 was legitimately...

21:04-22:35

[21:04] legitimately critical, cruelly critical, if you look at it. And GPT-5 really, again, started with like... [21:12] the shit sandwich, excuse, pardon my French. [21:15] But it really started with [21:17] here's what's not working or here's what's working. [21:20] Here's what's not working, but you can make it better. And I think this is interesting. One of the things that you really have to test as an application builder [21:29] working with LLMs is, can you tune it via prompts [21:34] effectively? Now, again, these two side-by-sides are using the exact same prompts. I have not prompted to the strengths and/or weaknesses of GPT-5. I've just simply been giving it similar side-by-side content, context, and prompting. And it was just really interesting to see how you can massage the LLM responses to meet your needs. [21:53] So my general conclusion remains the same through the side by side, which is functionally, [22:00] this thing is built to code and this thing is built to help you code. And you're going to be very happy with the strengths of that. But it might have some drawbacks on the other side, especially as an application developer, a business user. And then we'll get to it. I actually think it's got some strengths from the consumer perspective. [22:17] Today's episode is brought to you by ChatPRD. I know that many of you are tuning into How I AI to learn practical ways you can apply AI and make it easier to build. That's exactly why I built ChatPRD. ChatPRD is an AI co-pilot that helps you write great product docs,

22:35-24:10

[22:35] automate tedious coordination work, and get strategic coaching from an expert AI CPO. And it's loved by everyone, from the fastest growing AI startups to large enterprises with hundreds of PMs. [22:47] Whether you're trying to vibe code a prototype, teach a first-time PM the ropes, or scale efficiently in a large organization, ChatPRD helps you do better work fast. [22:57] And we're integrated with the tools you love: v0.dev, Google Drive, Slack, Linear, Confluence, and more. So you don't have to change your workflow to accelerate with AI. [23:07] Try ChatPRD free at chatprd.ai/howiai. And let's make product fun again. [23:15] So let's go really quickly into coding and then I'll zip back around to a couple of personal use cases and we will get you to using GPT-5. [23:23] So let's talk about coding for just a little bit. And before I get to that, I do have to give OpenAI true [23:29] and unsponsored props here. I think that the OpenAI team continues to outperform [23:36] on API design capabilities and developer support. [23:41] One of the reasons that for ChatPRD, honestly, I have centralized on a lot of the OpenAI models is not that the models themselves are exceptional compared to ones by Anthropic or other providers. It's really not that. It is quite simply that the API designs, developer tools, ecosystems, and essential primitives that get exposed on top of these models are just much easier to work with as a software engineer developing LLM-backed tools.

24:11-25:44

[24:11] I've been very happy with many of the upgrades, not just to the GPT-5 model, but with the GPT-5 model, [24:18] some increased improvements in tool calling, reasoning, all these sort of parameters and controls that you have over the model that, as an application developer, [24:28] make me very happy. So I'm not going to go into that too deeply. If anybody wants to talk about it, I'll chat with you all day about it. But I think the API improvements here are worth taking a look at and you should check out the documentation. Now, [24:41] using GPT-5 to code. [24:45] I'm gonna just show you two things. [24:48] It's my favorite right now. And I am a model switcher. Nothing stresses me out more than someone selecting auto in Cursor. Like, auto model select. I cannot... [25:00] I cannot imagine. It really stresses me out. Like, you just leave it to the forces that be to choose your model. No, no, no, no. You have to be very opinionated with your model. And so I, historically, using Cursor just as an example, was... [25:13] I'm really prescriptive with what model I choose. And you can say this is all made-up stuff. I use Sonnet 4 a lot for front-end work. I think it does pretty good front-end work. [25:22] I used 2.5, o3 quite a bit in the past for deeper technical work. Been pretty happy with it. I do think 2.5 is clinically depressed. It's always so sad and it's thinking so... [25:33] Google friends out there, please just cheer it up a little bit. I don't mean my mean prompts. And then I have recently been testing GPT-5 here for a couple weeks.

25:44-27:18

[25:44] And it's been really interesting because I got access to GPT-5 [25:48] when I was shipping a very major feature, I mean, thousands and thousands of lines. And I will tell you, one, the performance of the model is very fast. So I've been very happy with the performance of the model. It's allowed me to do a lot very quickly. [26:02] Two, it's, I mean, it's good. It writes good code. It refactors. It's thoughtful. And let's take that word thoughtful and talk about one of my primary observations on this model. Okay. [26:14] Girlfriend loves to call a tool. So if you look over here on the right, man, I have rarely hit Cursor's 25 tool call limit in a single call [26:26] in many, many moons. I have not hit that in a long time and I hit it really consistently with GPT-5. It will take advantage of tools. It is a tool-calling beast. And so [26:38] you can see here on the left side, [26:41] it's reading, it's searching, it's reading, it's searching, it's reading, it's searching. Honestly, sometimes it felt a little inefficient and ineffective, and this will be one of my questions as these get rolled out into production in these coding tools. [26:54] Will token usage, will tool calling and performance start to become an issue? [26:58] But man, she loves a tool call. The second thing you'll see here [27:04] is it loves bullet points. It will talk to you in bullet points all day and all night. It loves, loves, loves bullet points. And so [27:13] you'll see it talk to you like an engineer might talk to you in Slack.
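
To make the tool-call limit concrete, here is a rough sketch of an agent loop with a per-request tool budget. It is an illustration only and says nothing about how Cursor actually enforces its limit: `next_action` is a hypothetical callable standing in for the model deciding its next step, and the two toy agents below are invented to show a loop that finishes early versus one that keeps reading and searching until it is cut off.

```python
from typing import Callable, List, Optional, Tuple

MAX_TOOL_CALLS = 25  # the per-request cap mentioned in the episode

# Hypothetical: given the tool calls made so far, return the next tool name,
# or None when the agent considers the task finished.
NextAction = Callable[[List[str]], Optional[str]]


def run_with_budget(next_action: NextAction,
                    max_calls: int = MAX_TOOL_CALLS) -> Tuple[List[str], bool]:
    """Run the agent until it finishes or asks for a call beyond the budget.

    Returns (tool calls made, whether the budget cut the agent off).
    """
    history: List[str] = []
    while True:
        tool = next_action(history)
        if tool is None:
            return history, False  # agent finished on its own
        if len(history) >= max_calls:
            return history, True   # agent wanted more, but the budget is spent
        history.append(tool)


def greedy_agent(history: List[str]) -> Optional[str]:
    """Toy agent that alternates reading and searching and never stops."""
    return "read" if len(history) % 2 == 0 else "search"


def modest_agent(history: List[str]) -> Optional[str]:
    """Toy agent that reads three files and then stops."""
    return "read" if len(history) < 3 else None


greedy_calls, greedy_cut_off = run_with_budget(greedy_agent)
modest_calls, modest_cut_off = run_with_budget(modest_agent)
print(len(greedy_calls), greedy_cut_off)
print(len(modest_calls), modest_cut_off)
```

Logging the length of the returned history per request is one simple way to watch whether tool-heavy behavior starts to affect token usage or latency.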

27:18-28:59

[27:18] Lots of bullet points, but that being said, [27:20] the code I am happy with, the quality I'm happy with. It's a great engineering partner. [27:25] As I said, you want one of these on your team. So we didn't go too deep into coding, but again, GPT-5 [27:32] is now my daily driver. I love it. And it's really great when you're actually using the code in production. So again, [27:39] I'm going to repeat myself, I really do think this is a great engineer's model and you're going to really like it for that use case. But let's switch over and look at ChatGPT and how GPT-5 actually operates in their core product. [27:53] Okay, so one thing you'll note is you'll have two options here, at least I had two options here, GPT-5 and GPT-5 thinking. I've used thinking for specifically prototyping and design in ChatGPT. So [28:06] I think that with GPT-5 thinking, it is possible that ChatGPT really becomes a viable option for folks trying to do some high-level prototyping inside an AI tool. I love the specialty tools. I love v0, Lovable, Bolt, all those. Of course, I work in Cursor. [28:25] But if you're just trying to design something, one of the things I noticed about GPT-5 is it's got [28:30] great front-end design taste and actually makes things that look pretty good. So I'm going to go ahead and turn on canvas, which allows ChatGPT to generate some images, and I'm going to drop in a copy of the ChatPRD homepage. So you can see it's very pink. We love her. And I'm actually going to write just a really simple prompt here. I'm going to say, design and prototype a blog for ChatPRD matching our style.

28:59-30:35

[28:59] Okay, that's it. So... [29:01] GPT-5 is going to use that reference image. It's going to think. It loves to think. We can actually expand this thinking [29:08] right now and see how it thinks through generating this. [29:14] It's got good front-end design guidelines, and then it's going to actually generate the code here inline [29:19] in canvas. And I've done this a couple times with GPT-5 in ChatGPT, and the thing that I've been most impressed with is it's classy. She's classy. And I think a lot of the prototyping tools sometimes have a pretty standard, boring, and repetitive style for their AI-generated front end. And I would just say that GPT-5, in my, you know, anecdotal experience, has had a little bit more [29:49] right out the box. Now, they all have their strengths. I'm certainly gonna keep them in my rotation, but it was a nice observation to say, in particular on front end and user experience design, this was particularly nice. So let's take a look at it and see if I actually got that right. [30:04] And what do we have? Oh, let's just... [30:07] Allow, okay, allow access. [30:10] You know, it's not terrible. I think we're struggling with a couple issues here. I actually raised this to the OpenAI team. Struggles a little bit with background and text color contrast. It could be an issue with the code in CSS. It could be an issue with the model. [30:27] It really replicated my gradient that I like to use. Didn't quite do the logo, but I didn't expect it to, but kind of got to a good sense

30:35-32:20

[30:35] of what my header looks like, and then again came in here and generated what I think is just a generally nice component here. And then this I really like. I think this looks [30:47] quite lovely for a blog post. Again, not pixel perfect, but I think a little bit nicer than you might see out of the box previously with some of the other models from OpenAI and in Canvas. So I've been relatively happy with that and think that, [31:06] you know, for somebody looking to do some front-end prototyping, it can be pretty nice. But again, we've got to solve this text-on-background issue, so, OpenAI team, get to that fix quickly. Now, a couple other things I want to show you before we wrap up the episode. One is just a personal use case where I actually did another side-by-side of GPT-5 and GPT-4, [31:29] and I really saw GPT-5 shine. [31:33] So you all may have your evals and benchmarks that you're evaluating the technical and mathematical strengths of your models against. And I have my own benchmark that I am testing all models against. And that benchmark is... [31:50] Can it reasonably help with my bathroom remodel? Yes, you heard it here. Can it reasonably help [31:56] with my bathroom [31:58] remodel? [32:00] Now, I've been doing a lot of things with GPT-4 on my bathroom remodel, including experimenting with whether or not different layouts will be up to code, what I could possibly do, generating screenshots of what my bathroom might look like. It's all very thrilling. And I've actually been

32:20-33:41

[32:20] okay happy with what 4o has done for me. So if you want to see what kind of high-quality AI-powered work I'm doing with ChatGPT right now, I'm really trying to explain to my contractor exactly how I want my new bathroom laid out. And so I have been prompting 4o with these prompts like, [32:39] "I need a bathtub with fixtures at one end, a level tile ledge at the other with 8 inches and 4-inch tile shelves on the wall. Picture. [32:48] Generate." Very good prompting here. And halfway through this chat, I really switched to GPT-5. And I will tell you, I can show you [32:56] exactly where I did. Right around here, I was switching to GPT-5, and I was very happy with the actual outcome and layout that the image generation did in this instance. I've actually struggled a lot with image generation of room layouts. I think that interior design is such a fun use case of AI, and I have actually had a really challenging time [33:19] getting AI to interpret my prompting correctly: where things are on the left wall versus the right wall versus the back wall, up, down, left, right, what's inside the room, what's outside the room. And I will say I think that GPT-5 did a quite lovely job of it. Had to ask it for a couple do-overs, but if you are curious, this is a little bit of what my new tiny San Francisco bathroom might look like.

33:49-35:26

[33:49] Yeah. [33:49] GPT-5. And if we all remember, [33:52] we love 4o's image generation capabilities. When this first came out, everybody was thrilled with the performance of the 4o image gen model. It could write text. It was really instructable. The image generations were beautiful. It was very, very fun. [34:11] Very memeable, super exciting. And I will say my experience with GPT-5 plus image generation has been exceptional. And it's actually gotten better at all those things we know and love [34:22] in 4o. [34:23] Text generation, good. And one of the things that I really noticed about GPT-5 is it has a much better spatial awareness in both code, so when you're instructing it to lay out things, as well as in image generation. So it was something that really came across to me as spatial awareness, and you'll see that in this side-by-side I'm about to show you. So [35:05] I said, "What Benjamin Moore paints," because I like a Benjamin Moore paint, "will this green tile wall match? And can you help me with this?" Now, this is actually a pretty hard [35:15] task. I wasn't sure how the model had indexed the sense of color. Honestly, this is a new use case for me. And what was so fascinating is I not only got

35:26-37:00

[35:26] colors that matched each of the tiles, I got specific names of those colors. The text is very crisp, very clear, and spelled correctly. And even the paint codes for those detailed [35:38] paint samples. [35:40] I was not expecting this at all. I was, in fact, not expecting an image at all. I was expecting them to just give me a couple, like, green-colored paint samples, and instead they actually mapped it out here. And I just asked it what it would recommend. It gave me some options, and then it said, "Do you want to do a full mock-up?" And I said, "Yep, do a full mock-up with High Park." And I was really blown away by this, and you'll even see the sense of it side by side when I show you what 4o generated. [36:10] In this kind of plain mock-up, it really followed the instructions of where these tile samples are going to go and where the paint was going to go, and gave me sort of a 3D rendering [36:19] that I could look at. And this is the version I love the most, which is it actually followed my instructions. It said, [36:26] "Half wall of tile, black on the floor, marble on the walls, High Park." And it gave me this beautiful... [36:32] layout of exactly what my walls and floors and stuff would look like. I was [36:37] really impressed with this. Now, I asked it to paint the wall. It did an okay job. It didn't know what wall I was talking about, but again, this gave me a really good sense of what my bathroom remodel was going to look like. And now I'm going to go to the Benjamin Moore paint store and ask them to pull High Park 467. Um, actually, I should check. It has been consistently 467 throughout? Oh, um...

37:00-38:32

[37:00] Yeah, throughout, so it seems like a consistent reference for the paint number. I thought this was really interesting, and I just want to go to a side-by-side of what... [37:09] GPT-4 generated with the same prompts. So I'm going to show you that quickly, and then [37:15] we will wrap up. [37:16] So if you look on the left, I did the same prompt into GPT-4. And you can see just the mock-up that it did was a little less... [37:26] sensical, honestly, and didn't actually match what my description was of the uses of these tiles and paints. And so again, I gave you this as a use case that I think is pretty practical, applicable to [37:39] other use cases a common consumer might think about: How do I design my room? How do I pick an outfit? [37:45] How do I lay out my backyard? You know, how do I organize my books? [37:50] And I really do think GPT-5's sense of space plus improved image generation options [37:56] might be a reason that consumers reach for it. [38:00] It's just yet to be seen how they train the in-chat model to have a little bit less of that developer bent and a little bit more friendly consumer orientation. [38:10] So to sum everything up with a high-level takeaway about GPT-5... [38:15] For engineers, by engineers, as an engineer. This is a technical thinker, a technical writer, an exceptional coder. [38:21] You know, for a product person, it may give you more features, how [38:26] and what, as opposed to who and why. So you'll have to really think about what kind of asset you're generating

38:32-40:09

[38:32] or why you might use this model in production or in your day-to-day workflows. And make sure that it's just the appropriate tool for the job. [38:41] For coding, really no complaints. It's exceptional at coding. I've been very happy with it. I've shipped tons of stuff using this model. I think it's exceptional. My only complaint is... [38:51] you know, try something other than a bullet point, and maybe call, like, one fewer tool if you don't really need it. So we'll see how ultimately the coding tools optimize around the strengths and weaknesses of this model. But I think it's going to be a daily driver for lots of folks, [39:06] depending on cost and access. And then the final thing: I think ChatGPT is going to get a major upgrade in specific areas, especially Canvas, [39:16] front-end design, as well as image generation, good sense of spatial awareness. And let's just make sure it has a cute personality to go with all those technical chops. [39:24] So that is my summary of GPT-5. This is our first deep dive episode of How I AI. Please let us know in the comments if you like and want more content like this. I'm happy to walk through my favorite models, my favorite tools, and my favorite creators in more detail. Thanks, and we'll talk to you soon. Thanks so much for watching. If you enjoyed the show, please like and subscribe here on YouTube, or even better, leave us a comment with your thoughts. [39:51] You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiaipod.com. See you next time.
