“This is the best f*cking time ever, in the history of technology, ever, period, to start a company.” - Sam Altman
AI is changing how we build startups. That much is obvious. The hard part is telling the difference between the hype and what actually works today.
To cut through the noise, I’ve been chatting with founders of AI-native products. First up: Allen Naliath — Stanford CS dropout, YC alum, and one of the most thoughtful AI builders I’ve met.
Allen is the founder of Friday, an AI-powered email client designed to help you reach inbox zero at the speed of thought. Think Superhuman, rebuilt for the AI age.
Not only is Allen building an AI product, he’s also building an AI-first company. In some cases, he’s using LLMs to code 50–100x faster 🤯. Every day, he’s navigating tough product decisions that come with building in this new world.
In our convo, Allen breaks down what he learned as founder of Friday in two parts:
How to code faster with AI - including the four levels of AI coding autonomy.
How to find PMF as an AI-native startup - including how Allen thinks about speed and accuracy.
If you’re curious how an AI-native startup operates in 2025, you’re in the right place.
How to code faster with AI
Cursor and Claude Code are all the rage right now, and for good reason. These AI-assisted coding agents have the potential to speed up what’s often the slowest part of building a startup: writing code.
But while the promise is exciting, for most people the details are still fuzzy. That’s why I was so excited to chat with Allen, who shared his detailed process for using coding agents. Using these tools smartly, he’s been able to build Friday without hiring a single employee.
So what’s his secret? AI is involved in 99% of the code he produces, but used in different ways depending on the situation.
“I would say 99% of the code I write is in some way completed by AI… it's a poor decision to write lines of code yourself unless they're very high impact code that need a lot of thinking that the AI is proven to be bad at.” - Allen
Allen’s been experimenting with AI coding tools since 2022. Over time, he’s developed a clear system that adapts to the complexity of the task he’s working on. He breaks it down into four different modes of working with AI, from simple Q&A to full-on agent-style development.
Let’s walk through each level, how it works, and when to use it.
Level 1: Thought partner
Even before Cursor and Windsurf, Allen was an early adopter of AI-assisted programming. As early as 2022, he was using ChatGPT as his programming thought partner. Before diving into the code, he goes back and forth with ChatGPT for tips on how to approach the problem. If he’s integrating a new technology, Allen skips the week of reading docs and instead gets an overview directly from the AI.
In Level 1, AI doesn’t actually write any of Allen’s code, yet the time savings can still be enormous. Days of research are often compressed to hours, especially with newer concepts. Pretty much anyone can benefit from this Q&A style usage of AI.
“I spend 90% of my time just thinking, I read the code. I think, okay, what should it look like? How do I implement it? Actually talk to the AI. I'm like, here's my ideas for implementation. Suggest alternatives or evaluate it.” - Allen
When to use Level 1
Before a new project.
Before a new feature (with some complexity).
Before using a new technology/language.
Level 2: Pair programmer (human-led)
For Level 2, think GitHub Copilot or Cursor tab autocomplete. You, as the developer, are still very much leading the way, but the AI accelerates you by predicting where you’re going. Allen uses this religiously as part of his development process. What was once a novelty is now just a standard part of his toolkit.
“I build Friday with a lot of the tab line completes. Overall, I would say I'm 2-3x faster with AI. Figuring out a bug takes about the same time, but time iterating on a feature tends to be a lot faster.” - Allen
When to use Level 2
Pretty much all the time! 😄 (production-level features can be built this way)
Level 3: Pair programmer (AI-led)
Level 3 is where the balance shifts on the autonomy slider from human to AI. Instead of writing code directly in the IDE, Allen chats with Cursor in English. Cursor takes Allen’s ideas and turns them into code suggestions. Allen's job is to prompt it clearly, then accept or deny the suggestions.
Early on, this workflow was clunky. LLMs would often rewrite huge chunks of code, forcing you to spend more time reviewing changes than if you’d just written it yourself. But with the latest release of Claude 4, that problem has largely gone away.
The key for Allen is keeping requests small. If you ask the LLM for large-scale changes, it’s difficult to review and understand its work. When you keep requests tight, you get a feedback loop that works.
“Claude 4 has changed my programming experience. Before, AI would make these big edits and you'd have to spend so much time looking at it and most of the time it would never get it perfectly right. Claude 4 will make a two line change, think, make another two line change. You know immediately whether it's right or wrong, so I trust it a lot more.” - Allen
When to use Level 3
When building new features.
When working on less critical parts of the codebase.
Level 4: Managed employee
Finally, there’s Level 4, where AI effectively becomes a true junior developer. Think of those viral Twitter demos with AI cranking out an Airbnb clone from scratch in ten minutes. The first time you see it, it’s genuinely mind-blowing.
But as easy as it is to write off these demos as Twitter hype, Allen has been able to use this type of coding to accelerate Friday. While Friday’s core codebase was written using levels 1-3, Allen turns to Level 4 for side projects.
Whether it’s a personal project or a little growth hack, Allen is able to build up to 100x faster. And I’m not just saying that for dramatic effect.
“For my Chrome extension, there's probably 10,000 lines of code in that codebase. Just to figure out permissions, rerouting, etc., would have taken me a week. Just solid, dedicated work. AI knows this stuff, so it can do it maybe 50-100x faster.” - Allen
When to use Level 4 (for now)
Side-projects.
Throwaway code.
Maintaining control
Across all four levels of AI-assisted programming, one core principle ties everything together:
“It’s your responsibility to understand every line of code that’s being written and why it’s there.” - Allen
Even with tools like Cursor handling the heavy lifting, Allen always keeps a firm grip on his codebase. That’s what ensures things stay stable and scalable. If you give up too much control, you eventually lose track of how your system works.
How to find PMF as an AI-native startup
It’s amazing how fast you can write code now. But building faster doesn’t mean you’re any closer to making something people want. If anything, with AI it’s easier to fool yourself. People are curious about the tech, so you can get what looks like traction that isn’t real.
With Friday, Allen is taking a more thoughtful approach to finding PMF. In our conversation, he shared how he keeps Friday honest and nimble as they iterate.
You need speed and accuracy for PMF
Currently, AI’s most obvious value prop is that it saves time. LLMs write code in seconds that would have taken a human minutes. ChatGPT digests PDFs in seconds that would have taken a human hours.
But speed only works if you can trust the output. If you always have to double-check an AI’s work, you don’t really save time (which, again, is the core value prop). According to Allen, Cursor took off not because of the rate of its code generation but because of its accuracy. When Cursor suggests an edit, you know it’s going to be right most of the time.
This same dynamic plays out in building an AI email assistant like Friday. While Friday can write email responses almost instantly, it’s only useful if users consistently accept its responses.
“If it’s 10% faster for an AI to draft your email than to write your own, but only works 50% of the time, then on average the AI actually makes you 45% slower.” - Allen
Therefore, great AI products aren’t just fast. They’re fast and accurate. As you’re optimizing your AI-native product, keep this formula in mind: speed × accuracy. The higher that number, the better the product.
“If the new speed times your accuracy is higher than the existing process, people are going to use your product. That's the thing you have to optimize. That is PMF.” - Allen
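To make that arithmetic concrete, here’s a minimal sketch of the expected-time math in Python. The numbers and the cost model are my illustrative assumptions (a rejected draft costs you the AI’s time plus a full rewrite), not Friday’s real metrics, which is why the 50%-accuracy case lands near, rather than exactly on, Allen’s 45% figure.

```python
# A minimal sketch of the speed x accuracy arithmetic. All numbers are
# illustrative assumptions, not Friday's real metrics.

def expected_time(t_human: float, t_ai: float, accuracy: float) -> float:
    """Expected time per email, given how often the AI draft is accepted."""
    success = t_ai            # draft accepted as-is
    failure = t_ai + t_human  # draft rejected: you write it yourself anyway
    return accuracy * success + (1 - accuracy) * failure

t_human = 1.0  # baseline: writing the email yourself
t_ai = 0.9     # the AI draft is 10% faster

for accuracy in (0.5, 0.8, 0.95):
    t = expected_time(t_human, t_ai, accuracy)
    print(f"accuracy {accuracy:.0%}: {t / t_human - 1:+.0%} vs. doing it yourself")
# accuracy 50%: +40%  -> low accuracy makes the "faster" tool slower
# accuracy 95%: -5%   -> only high accuracy actually saves time
```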
AI is engineering, not rocket science
Building an AI product that’s both fast and accurate can feel daunting. While Allen’s team certainly has technical chops, he’s found their process is closer to rapid iteration than groundbreaking research. Here are his three key tips:
Make problems determinate
Give the model the right context
Keep iterating on evals
1) Make problems determinate
It’s tempting to throw a complicated task at the AI and just say “solve it”. While the models are so smart that this might work for a demo, it won’t lead to reliable results in production. The job of an AI development team, therefore, is to break a problem down into small pieces. Each individual piece can then be measured and optimized to create a reliable whole.
The more clearly each part of your product has a specific job, one it can either succeed or fail at, the better. In fact, Allen recommends avoiding AI wherever possible. Compared to traditional engineering approaches, relying on LLM calls is brittle and costly. By breaking problems down into discrete, non-AI components and limiting yourself to a single LLM call at the end, you’ll significantly improve accuracy and move closer to PMF.
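As a rough illustration of that decomposition (my sketch, not Friday’s actual pipeline), imagine email triage: deterministic checks handle the easy cases, and only the ambiguous remainder reaches the single LLM call at the end.

```python
# A hypothetical sketch of "determinate first, LLM last" for email
# triage. None of this is Friday's real code; it just shows the shape.
from dataclasses import dataclass, field

@dataclass
class Email:
    sender: str
    subject: str
    body: str
    headers: dict = field(default_factory=dict)

def classify(email: Email, vip_senders: set[str]) -> str:
    # Deterministic pieces first: each check is cheap, testable, and
    # either clearly succeeds or clearly fails.
    if email.sender in vip_senders:
        return "important"
    if "List-Unsubscribe" in email.headers:
        return "bulk"
    if email.headers.get("Precedence") == "bulk":
        return "bulk"
    # Only the genuinely ambiguous remainder reaches the one LLM call.
    return classify_with_llm(email)

def classify_with_llm(email: Email) -> str:
    # One prompt, one call, one measurable output. Plug in whichever
    # provider SDK you actually use.
    raise NotImplementedError("wire up your LLM provider here")
```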
“You should be selling an experience, a feeling, an improvement. If AI is a means of achieving that, you should use AI. If not, you shouldn't.” - Allen
2) Give the model the right context
As smart as the current models are (and GPT-5/6/7 will be), their output is only as good as the context they receive. Ultimately, LLMs are stochastic token generators. They are doing their best to produce tokens they think are likely to fulfill your request. If you don’t provide them with the right context about the specific situation, you’re not going to get useful responses.
How difficult it is to provide the model with the right context depends on your use case. For example, customer support is an area Allen describes as having “low entropy”: based on a customer’s previous emails and knowledge of your product, an AI has a good chance of understanding the customer’s state. Email, on the other hand, is a use case with higher entropy.
“For 99% of people, getting a coupon email from Domino’s is spam. But when I ran the Startup Society at Stanford, I would order a dozen boxes of Domino's every weekend. So those emails were actually important to me. Even with the most super intelligent model we'll ever create, it's going to get that wrong every single time.” - Allen
In Allen’s view, the way to solve these types of problems is with context. Provide the model with the right information and it can make the right decisions - even in ambiguous situations.
“Now with the context, if you see this person actually clicks into Domino's coupons every single time they get one, it must mean this coupon is actually important.” - Allen
What “providing the right context” looks like will vary widely depending on your product and use case, but it’s worth taking the time to nail it.
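As one hypothetical illustration (the signal names are mine, not Friday’s), here’s how behavioral context could be folded into a triage prompt for Allen’s Domino’s example:

```python
# A hypothetical sketch of feeding behavioral context to the model.
# The engagement signals would be computed deterministically upstream.

def build_triage_prompt(sender: str, subject: str,
                        open_rate: float, click_rate: float) -> str:
    context = (
        f"The user has opened {open_rate:.0%} and clicked {click_rate:.0%} "
        f"of past emails from {sender}."
    )
    return (
        f"{context}\n"
        f"New email from {sender}: {subject!r}\n"
        "Given this user's history, is this email important to them? "
        "Answer 'important' or 'not important'."
    )

# For most users, Domino's coupons sit near 0% engagement -> spam.
# For Allen's weekly pizza orders, the click rate is high -> important.
print(build_triage_prompt("dominos.com", "20% off large pizzas",
                          open_rate=0.95, click_rate=0.90))
```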
3) Keep iterating on evals
While breaking problems down and providing context will undoubtedly improve model performance, the only way to know for sure is with evals. This is where the rapid, engineering-style iteration really shows up. It’s easy to fall into one of two traps when getting started with AI. The first is thinking that evals don’t really matter. The second is thinking that evals are too time-consuming.
Allen dispelled both of these myths while working on Friday. By tweaking his prompts, following prompting guides, and adjusting the order of information, he’s seen massive improvements in performance. In certain cases, Gemini 2.0 Flash’s accuracy jumped from 10% to 90% just from those small changes.
“20% of the effort into evals will get you 80% of the results. And you can just rinse and repeat that.” - Allen
Hearing about those massive improvements, I assumed Allen had built an advanced process with fancy eval tooling. Surprisingly, his whole setup is pretty basic.
“I literally took my prompt, wrote a Python script, and made API calls to a bunch of different models. Then I would go through the responses in a Google Sheet and say, did it respond to the person? Did it follow up? Did it suggest that it should reply? I would check things off manually by hand. It took maybe an hour to go through this and tweak it a bunch of times and get that 9x improvement.” - Allen
When it comes to evals, Allen adopted the classic YC mindset of doing things that don’t scale. If you’ve been hesitant to dive into the world of evals, Allen suggests just getting started.
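If you want a starting point, here’s a bare-bones version of the loop Allen describes: one prompt, a handful of models, and a CSV you grade by hand in a spreadsheet. The model IDs and the call_model helper are placeholders for whichever provider SDKs you actually use.

```python
# A bare-bones eval harness in the spirit of Allen's setup. Everything
# here is a placeholder sketch, not his actual script.
import csv

MODELS = ["model-a", "model-b", "model-c"]  # placeholder model IDs

def call_model(model: str, prompt: str) -> str:
    # Wrap your provider's API call here (OpenAI, Anthropic, Gemini, ...).
    raise NotImplementedError("wire up your provider SDK")

def run_eval(prompt_template: str, test_emails: list[str]) -> None:
    with open("eval_results.csv", "w", newline="") as f:
        writer = csv.writer(f)
        # Manual grading columns, checked off by hand like Allen's sheet.
        writer.writerow(["model", "email", "response",
                         "replied?", "followed_up?", "suggested_reply?"])
        for model in MODELS:
            for email in test_emails:
                response = call_model(model, prompt_template.format(email=email))
                writer.writerow([model, email, response, "", "", ""])

# Tweak the prompt, rerun, regrade: 20% of the effort, 80% of the results.
```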
Summary
Three key takeaways from Allen's experience building Friday:
1. AI supercharges development speed. From simple code completion to full project generation, AI tools can dramatically accelerate your development process. The key is matching the right level of AI assistance to each task.
2. Product-market fit requires both speed and accuracy. While AI can work quickly, users only stick around if they can trust the results. Focus on this formula: Speed × Accuracy = Value. If either component is weak, your product won't succeed.
3. Building with AI is iterative engineering. Success comes from:
Breaking complex problems into measurable pieces
Providing rich context to guide the AI
Running consistent evals to improve accuracy
Using AI only where it truly adds value