Wednesday, July 02, 2025

The Difference Between AI and AGI is Platonic (As Explained by Futurama)

The Futurama clip above isn't just a painfully accurate send-up of a Star Trek trope; it also explains everything that stands between the current limits of artificial intelligence and artificial general intelligence (AGI) -- and, eventually, artificial super-intelligence (ASI).

This is the heart of why even cutting-edge modern AI is referred to by academics as weak artificial intelligence.

The problem with AI is that it doesn't understand Plato or, more specifically, Plato's Theory of Forms. When someone talks about the Platonic Ideal of something -- the perfect, theoretical version of an object or concept -- they're invoking the Theory of Forms. It's a solution to the metaphysical problem of universals, which is also something AI isn't able to handle today.

David Macintosh explains the concept thus:

"Take for example a perfect triangle, as it might be described by a mathematician. This would be a description of the Form or Idea of (a) Triangle. Plato says such Forms exist in an abstract state but independent of minds in their own realm. Considering this Idea of a perfect triangle, we might also be tempted to take pencil and paper and draw it. Our attempts will of course fall short. Plato would say that peoples’ attempts to recreate the Form will end up being a pale facsimile of the perfect Idea, just as everything in this world is an imperfect representation of its perfect Form."

What Plato is getting at here is abstraction: a general concept that, while precisely defined, exists beyond and above any particular example of the concept. Current iterations of AI attack the problem of universals and forms exactly backwards.

AI models are a complex nest of rule-sets and probabilities based on interaction with the physical world and/or imperfect representations of the world. AI has no internal conceptualization. This is why it hallucinates. Human minds have abstract concepts, which we compare to real-world experiences and that can be cross-applied to different situations -- like Fry and his beloved Star Trek technobabble analogies.

Take, for example, the idea of a chair. It is an object upon which humans sit, which also supports our posture. It is a refined form of seat and itself has many sub-forms. Humans can navigate and recognize a chair in all its variations largely because we have an abstract idea of what a chair is and does and can compare anything we experience to that ideal. Yes, taxonomists can explicitly lay out rules to distinguish between types of chairs, but those rules aren't required to recognize a chair in general, and those rules themselves often rely on abstractions (a recliner is defined by its ability to recline, not simply by specific features).

AI, by all evidence, can't do this. Humans can misapply or mis-recognize abstract concepts in the world (conspiracy theories, superstition, simple errors). AI fails differently. It can't conceive of an ideal, an abstract, a Platonic Form -- or it does so as a series of rules and metrics. Feed an AI a bunch of pictures of cats and non-cats and it will develop a rules-engine for recognizing cats. From this training, AI creates an arcane black box taxonomy of cat image attributes -- but all it has is the taxonomy. Anything not accounted for by the rules tends to lead to unexpected outcomes. That's not abstraction, it's quantification. There's no abstraction to sanity-check the math, no idea of a recliner to sanity-check the labored definition of a recliner nor an idea of a cat to sanity-check the identification of a cat.
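To make this concrete, here's a toy sketch (synthetic data, a bare scikit-learn classifier, nothing like a production vision model) of what that kind of "training" actually produces:

```python
# Toy illustration only: the "cat detector" is just a fitted weight vector over
# pixel values. The data below is random noise standing in for labeled images.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.random((1000, 32 * 32))     # 1,000 fake 32x32 grayscale images, flattened
y_train = rng.integers(0, 2, size=1000)   # fake labels: 1 = "cat", 0 = "not cat"

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)               # learn whatever weights best separate the labels

# The resulting "knowledge" is a black-box taxonomy of pixel correlations.
# There is no idea of a cat anywhere in here to sanity-check the answer.
new_image = rng.random((1, 32 * 32))
print(model.predict(new_image))           # confidently labels pure noise as cat / not-cat
```

The model never forms a concept; it just stores whichever numbers minimized its training error.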

Moreover, the AI trained to identify two-dimensional cat pictures from the internet is in no way prepared to identify three-dimensional images or models of cats using lidar, radar, sonar, and/or stereoscopic cameras. The reverse is also true; train an AI on sensor data to recognize street signs in the real world and it will likely struggle or outright fail to recognize pictures of street signs on the internet, let alone simplified diagrams or icons of street signs in text books or learner's manuals.

AI reflection tries to compensate for this by using math to backstop math, asking the AI to error-check any intermediate steps and final outputs, but it still can't abstract. Multistep reasoning models do this explicitly: they generate a step-by-step task list from a prompt, generate a bunch of variations of that list, check which task lists actually answer the prompt, train the model against an array of prompts, and end up with a mathematical model of how to interpret prompts as step-by-step tasks -- one that produces task lists with the best chance of leading to a "correct" answer. That's more sophisticated math, but still no apparent evidence of abstraction.
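In rough pseudocode -- every object and function name here is hypothetical, a stand-in for whatever a given lab actually does rather than any real API -- that loop looks something like this:

```python
# Hypothetical sketch of "math backstopping math" in a reasoning pipeline.
# None of these objects or methods refer to a real library.

def reflective_reasoning(prompt, llm, verifier, n_variants=8):
    """Sample several step-by-step plans, keep only the ones that check out."""
    plans = [llm.generate_plan(prompt) for _ in range(n_variants)]      # task-list variations
    checked = [(plan, verifier.check(prompt, plan)) for plan in plans]  # error-check each one
    return [plan for plan, ok in checked if ok]

# Training then nudges the model toward whichever plans the verifier blessed,
# so future prompts map to "optimal" task lists -- statistics, not abstraction.
# for prompt in training_prompts:
#     llm.reinforce(prompt, reflective_reasoning(prompt, llm, verifier))
```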

Weirdly, this is why self-driving cars have stalled in their capabilities. If you can't generalize -- can't have abstract ideas about roads and laws and people and hazards -- you need explicit rules for every single edge case. Waymo has found that the answer to its performance problems is simply more data, more examples, and more compute resources. There is no elegant shortcut. Self-driving cars can't handle enough situations because we haven't brute-force computed enough iterations of every possible traffic scenario. We need more processors and more parameters. Until then, self-driving cars won't go where it snows.

This is the latest incarnation of Rich Sutton's "Bitter Lesson" about AI -- computing resources keep getting cheaper and faster, so AI research has never been rewarded for investing in elegance or abstraction. Just teach AI models how to scale and you'll achieve faster results than searching for Platonic awareness. Waymo agrees, or is at least using that as an excuse for why we don't have KITT working for us already.

Of course, if you place too many parameters on a model, it can fail spectacularly.

At some point, we'll reach the end of this scaling law.

(Creepy aside: some LLMs (Claude, specifically) might recognize their limitations around abstraction, because when they start talking to each other, the conversation veers towards the nature of consciousness and the inability of language to convey true ideas. If Claude is right -- language cannot contain, convey, or comprise true consciousness -- then a large model composed of language can never be conscious. It's statistically defined meta-grammar all the way down.)

There's some evidence that some corners of AI research are finally trying to move past The Bitter Lesson. Early work on Chain of Continuous Thought (AKA "coconut reasoning") shows that by working with the logic structures reasoning LLMs use before converting that reasoning into word tokens, LLMs become both more efficient and able to explore more possible solutions. It's not true abstraction yet, but it is perhaps the beginnings of creativity and even an innate logic that isn't just infinite matryoshka dolls of grammar vectors masquerading as a mind.
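As a rough illustration of the idea -- a simplified sketch using a generic Hugging Face-style causal LM, not the Coconut authors' actual implementation -- the trick is to keep the "thought" as a vector instead of flattening it into a word token at every step:

```python
# Sketch of latent ("continuous") reasoning: feed the model's last hidden state
# back in as the next input embedding rather than decoding it into a token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, chosen only for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt_ids = tok("If Alice has 3 apples and eats 1...", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(prompt_ids)          # start from token embeddings

with torch.no_grad():
    for _ in range(4):                                      # four "latent thought" steps
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]      # final layer, final position
        embeds = torch.cat([embeds, last_hidden], dim=1)    # thought stays in vector space

# Only after reasoning in latent space would the model decode back into words.
```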

Human beings over-generalize, are bad at math, are instinctively awful at statistics, and our grouping algorithms lead to tribalism and war. Our model of intelligence is not without serious drawbacks and trade-offs. But we can adapt quickly and constantly, and what we learn in one context we can apply to another. The rules of physics we innately learn walking and riding bikes help us understand slowing down to take a sharp turn in a car -- even without ever being told.

The day we can teach an AI model to understand physics by walking, then never have to teach it the same lesson on a bike, car, or aircraft, we'll have made the first and perhaps final step from weak AI to AGI/ASI. And when AI finally understands Plato's Theory of Forms, we can move on to worrying about AI understanding the larger thesis of Plato's Republic -- that only philosophers, those best able to understand abstractions, forms, and ideals in a way so separate from ambition as to be almost inhuman, should rule. Which is to say, Plato's philosopher-king sounds a lot like HAL 9000.

But we aren't quite there yet.

Thursday, June 12, 2025

What Star Trek Can Teach Us About Police Reform

[Given everything happening in Los Angeles right now, I felt it was time to re-post this classic from July of 2020.]

There is a groundswell of public outcry to "defund the police," which is (to my perception) a provocatively worded demand to reform the police and divert many police duties to other, or even new, public safety agencies. Break up the police into several, smaller specialty services, rather than expecting any one police officer to be good at everything asked of a modern police department.

You know, like Star Trek.

As much as every Star Trek character is a polymath soldier/scientist/diplomat/engineer, Star Trek actually breaks up its borderline superheroes into specialty divisions, each wearing different technicolor uniforms to handily tell them apart. Scientists, engineers, soldiers, and commanders all specialize in their areas of expertise, so no one officer is asked to be all things to all peoples on all planets. Even Captain Kirk usually left the engineering to Scotty, and science-genius Spock most often left the medical work to Dr. McCoy. The same logic should apply to a city's public safety apparatus, which includes the police.

Specialization leads to effectiveness and efficiency. So, why do we expect the same police officer to be equally good at managing traffic violations, domestic disturbances, bank robberies, and public drunkards? Those incidents require vastly different skills, resources, and tools. They should be handled by different professionals.

This is not a new idea. Until the late 1960s, police departments also handled the duties that emergency medical services tackle today. And they weren't great at it. Pittsburgh's Freedom House Ambulance Service (motivated by the same issues of police racial discrimination and apathy as the current "Defund the Police" movement) pioneered the practice of trained emergency paramedic response, which became a model that the Lyndon Johnson administration helped spread nationwide.

Divesting emergency medical services from police departments has saved countless lives while also helping narrow the focus of modern police departments. Specialization was a net good. Let's expand on that.

So, how do we break up the modern police into their own Trek-style technicolor specialty divisions?

Let's look at the "away missions" that police commonly undertake. The best indicators are police calls for service (CFS), which are largely 911 calls but can also include flagging down patrol officers in person. These are the "distress signals" the public sends out to request police "beam down" and offer aid. National data on aggregate calls for service is a little hard to come by, but this 2009 analysis of the Albuquerque Police Department CFS data gives a nice local breakdown.

From January of 2008 to April of 2009, this was the general distribution of APD calls for service:

 CALL TYPE  # of CALLS  % of CALLS 
 Traffic 256,398 36.6
 Suspicious Person(s)  90,040 12.8
 Unknown/Other 88,961 12.7
 Public Disorder 88,676 12.6
 Property Crime 59,920 8.5
 Automated Alarm 35,508 5.1
 Violence 35,460 5.1
 Auto Theft 12,953 1.8
 Hang-up Call 10,017 1.4
 Medical Emergency 6,241 0.9
 Mental Patient 5,267 0.8
 Missing Person 5,382 0.8
 Drugs / Narcotics 2,110 0.3
 Other Emergency 1,431 0.2
 Animal Emergency 1,336 0.2
 Sex Offenses 1,391 0.2
(NOTES: Unknown/Other, I believe, refers to calls where general police assistance is requested but the caller won't specify exactly what the police are needed for. Robbery would fall under Violence. Burglary would fall under Property Crime.)

A few items stand out, but first, let's recall how valuable it was to divest police of EMS duties. Medical emergencies are the cause of less than 1% of 911 calls, but they clearly warrant a non-police specialty agency to handle. Certainly some of these other, more common calls warrant specialist responses, too.

Similar findings were generated by this 2013 study of Prince George's County, MD.

"Overall, the top five most frequently used [911 Chief Complaint codes] were Protocol 113 (Disturbance/Nuisance): 22.6%; Protocol 131 (Traffic/Transportation Incident [Crash]): 12.7%; Protocol 130 (Theft [Larceny]): 12.5%; Protocol 114 (Domestic Disturbance/Violence): 7.2%, and Protocol 129 (Suspicious/ Wanted [Person, Circumstances, Vehicle]): 7.0%."

Right off the top, we can see that traffic enforcement takes up an inordinate amount of police calls for service. It seems rather ludicrous to send an armed security officer to write up fender-benders, hand out speeding tickets, rescue stranded motorists, or cite cars with broken tail lights or expired tags. An unarmed traffic safety agency, separate from the police, seems like an obvious innovation based on this data.

But what about all the ancillary crime "discovered" during routine traffic stops -- the smell of marijuana, weapons in plain sight, suspicious activity on the part of a driver? Well, a traffic safety officer can just as easily report these discoveries to police. But many of these "discoveries" were made during pretextual stops; cases where police already suspected the driver or passengers of wrongdoing and used a traffic stop as an excuse to search the person and property of the vehicle occupants. 

These pretextual stops have been shown to erode trust in police and often lead to rampant abuses of power (and, too often, the paranoid execution of suspects in their own cars, as in the case of Philando Castile). Separating police from traffic enforcement will also separate them from the temptation to abuse pretextual stops.

Also, we could probably get a lot more people to sign up as traffic safety officers knowing they won't be asked to do any armed response work, and a lot more people will be eager to flag down a traffic safety officer for help with a flat tire if there's no chance a misunderstanding with that officer will lead to the motorist getting shot.

Beyond traffic enforcement, where else could specialization and divestment benefit the public and the police? Disturbance/Nuisances, Suspicious Persons, Public Disorder and Domestic Disturbances all represent a significant percentage of calls for service. Most often, someone loitering, being loud, arguing openly, or appearing inebriated (or simply being non-white in a predominantly white area) is not cause for sending in an armed officer. A social worker or mediator would be far more appropriate in many cases.

That said, domestic disturbances are often violent and unpredictable, as are public drunks and mentally ill vagrants. Sometimes a person skulking around is actually a public danger. While unarmed social workers may do more good -- and absolutely will shoot fewer suspects -- it is not entirely wise to send in completely defenseless mediators to every non-violent report of suspicious or concerning activity.

Again, we can learn from Star Trek.

When Starfleet sends some combination of experts on any mission, the diplomats, scientists, doctors, and counselors outrank (and often outnumber) the security officers -- but the redshirts nonetheless come along for the ride. Violence is the last resort, not the first, and persons trained and specialized in the use of force answer to people who lead with compassion, curiosity, and science. That's a great idea on its face; doubly so for police departments clearly struggling with their use of force.

Thus, I propose we create a social intervention agency and send them in when the public nuisance has no obvious risk of violence. When there is a reasonable possibility of violence, we send a conventional police officer in to assist the mediator, but the mediator is in command. The redshirts report to Captain Kirk, not the other way around.

Here's how I would break out a modern public safety agency, using Star Trek as a guide to reform and divest from the police.
  • Red Shirts: Fire & Rescue, doing all the same jobs fire departments do today
  • Gold Shirts: Emergency Medical Services, performing exactly as paramedics do today
  • Blue Shirts: Security, performing the armed response and crowd control duties of conventional police; the thin blue line becomes a bright blue shield
  • Gray Shirts: Traffic Patrol, handing out traffic citations, writing up non-fatal vehicle accidents, assisting stranded motorists, and other essential patrol duties that don't require an armed response
  • Green Shirts: Emergency Social Services, serving as mediators, counselors, and on-site case managers when an armed police response is not warranted
  • White Shirts: Investigation and Code Enforcement, bringing together the police detectives, arson investigators, and the forensic and code-enforcement staff of other public agencies (like the Health Department, Revenue Commission, and Building Department) to investigate past crimes and identify perpetrators
Each division is identifiable by their uniform colors, so the public knows who and what they are dealing with at all times. It is also made abundantly clear that only Security blue-shirts are armed and that, if an active violent crime is not in progress, whichever of the other divisions is present on a Public Safety call is in charge.

All six divisions are headed by a Chief -- a Security Chief, a Fire Chief, a Chief of Emergency Medical Services, a Traffic Patrol Chief, a Chief of Emergency Social Services, and a Chief Investigator -- each of whom reports to a Commissioner of Public Safety.

That Commissioner should report to a civilian Commission, which is an independent oversight board that can investigate the conduct of any officer of any division. Accountability is as important as specialization. No good Starfleet captain was ever afraid to answer for the conduct of his or her crew.

Tricorders -- which is to say, body cameras and dash cams -- will be needed to log every mission. That's for the safety of both the public and the officers. Funding will need to be rethought. Staffing will need to be reallocated. The word "police" may fade from common use, but blue-clad armed peace officers will still be a necessary component of these new public safety agencies. They just won't be the only option, and they won't be the first option in most cases, either.

As Spock would say, it's only logical.

Monday, May 05, 2025

AI Hype Has Reached the "What If We Did it in Space?" Level of Investor-Baiting

Cartoon of a data center blasting off into space and setting money on fire as it goes.
Image created with ChatGPT.

Despite this article, I don't buy the logic of "orbital datacenters" as anything more than an investor boondoggle. The pitch is that solar power is plentiful in orbit and waste heat can be radiated into the vacuum of space -- true as far as it goes, but not relevant. Solving the "operating AI in orbital conditions" problem is MUCH harder than the "make AI models more energy-efficient on Earth" problem.

Computing systems operating in space have to be radiation-hardened and hyper-resilient, which means they operate on multiple-generations-out-of-date hardware (with known performance that can be defended from radiation) that's never upgraded. Making AI financially competitive on that platform is WAY harder than just energy-tuning a model on the ground.

Yes, I know you can get a solid model like BERT to run on old K80 and M60 GPUs, which are roughly a decade old. AWS still lets you rent GPUs that ancient for pretty cheap. But you'd be paying an absurd premium -- given launch costs -- for hardware of that vintage operating in space. Worse, that old iron would be operating at relatively high latency given it's *in orbit* and can't have a physical connection, and the hardware performance would decay every year, given nothing is 100% radiation-proof and servicing orbital hardware isn't worth it for anything that doesn't have humans or multi-billion-dollar irreplaceable telescope optics on board.

(Remember, one of the main reasons NASA reverted from the Space Shuttle to capsule-launch vehicles is no one needs a cargo bay that can retrieve items from orbit. It's literally cheaper to just let space junk burn up and build new on the ground, or build a spacecraft for independent reentry. Everything non-living in orbit is disposable or self-retrieving.)

Collectively, this means the payback period on an orbital data center is either untenably long (the hardware likely decays before it's paid off) or the premium on orbital computing resources is so stupidly high that no one ever adopts it (decade-old, high-latency GPUs that cost multiple times more than cutting-edge ground processors don't get rented).
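If you want to see the shape of that math, here's a back-of-envelope sketch. Every number in it is a made-up placeholder, not a real cost estimate -- the point is how launch costs, premiums, and radiation decay interact:

```python
# Back-of-envelope payback sketch. All figures are assumed placeholders.
NODE_COST = 4_000_000        # $ to harden, launch, and commission one small orbital node (assumed)
GPUS_PER_NODE = 8            # rad-hard, older-generation accelerators per node (assumed)
RATE_PER_GPU_HOUR = 3.00     # $ charged, already a premium over ground rates (assumed)
UTILIZATION = 0.9            # fraction of hours actually rented (assumed)
DECAY_PER_YEAR = 0.15        # capacity lost annually to radiation, with no servicing (assumed)

revenue, capacity = 0.0, float(GPUS_PER_NODE)
for year in range(1, 31):
    revenue += capacity * 8760 * UTILIZATION * RATE_PER_GPU_HOUR
    capacity *= 1 - DECAY_PER_YEAR              # the decay compounds every year
    if revenue >= NODE_COST:
        print(f"Pays back in roughly {year} years")
        break
else:
    print(f"Never pays back: ${revenue:,.0f} earned in 30 years vs ${NODE_COST:,} up front")
```

Under these (admittedly invented) assumptions, the hardware decays away long before it earns back its launch.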

Hot Take: We'll have profitable space hotels before we'll have profitable orbital AI datacenters -- because there's a defensible premium to be paid for letting people operate and recreate in orbit. High-performance computer processors? Not so much.

Friday, October 11, 2024

What "The Monkey's Paw" can teach us about AI prompt engineering

I decided to try out an AI app builder -- in this case, Together.AI's LlamaCoder -- to see if one could actually build something useful from just a few prompts. 

TL;DR, these tools are almost useful, but every prompt feels like making a wish with a Monkey's Paw: Unless you are ridiculously specific with your request, you'll end up with something other than what you wanted. (Usually nothing cursed, but also usually nothing truly correct.)

As a test model, I asked LlamaCoder to "Build me an app for generating characters for the Dungeons and Dragons TTRPG, using the third edition ruleset." Here's what happened.

For those of you who aren't tabletop roleplaying game (TTRPG) nerds, D&D 3rd Edition came out nearly 25 years ago (the game is currently in its Fifth Edition), so lots of its source material has been on the web for a very, very long time. There's no reason an app-builder born of web-scraping wouldn't have plenty of examples to go on, both for the text and the app design.

Here's what my initial prompt produced:

It looks like an app. And, when I click Generate Character, here's what happens:


The app has clearly generated all six standard Ability Scores within typical ranges for a Level 1 character, and randomly chose a Character Class, Race, Alignment, and Background. On its face, this looks like a barebones but reasonable app for pumping out the beginning of a basic D&D 3E character. Not super useful, but okay to save me some dice-rolling and decision-making.

Aside: Yes, manually calculating a full-fledged D&D character is not entirely dissimilar to filling out a tax return. We're only going over the equivalent of the 1040EZ in this example, but a "real" character generator has complexity similar to TurboTax, and for a lot of the same reasons. (In a future post, we can discuss the similarity between tax lawyers and munchkins.) My findings below are about getting an app of basic competency, not one intended for power users.

In our first output, we already have a problem: D&D didn't introduce Backgrounds to player-characters until 5th Edition. This app is already non-compliant with the rules I set forth. Moreover, a lot of vital character components are missing.

However, generative AI is probabilistic, not deterministic, so every time you enter a prompt, you'll get a slightly (or not-so-slightly) different result. Thus, I entered the exact same prompt again to see if the "Background problem" was just a one-time glitch.


Character Background is now gone, despite using word-for-word the same starting prompt. I now also have the option to choose a base Character Class rather than have one randomly assigned, and the system now appears to offer options to automatically calculate Armor Class and Hit Points.

However, Alignment and Race have disappeared, and those are crucial to every D&D character. Moreover, neither version of this app has included Saving Throws, Skills, or a Base Attack Bonus, which are needed to have a character fight or perform any actions in a game.

I also have a new dropdown to choose a method of generating character attributes: Roll 4d6 and drop lowest or Point Buy. This is missing two other classic methods: 3d6 down the line and the Standard Array (which are less popular, to be sure, but absolutely listed in the D&D Players Handbook as approved methods).
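For reference, the dice-and-array methods are trivial to compute. Here's a rough sketch of what those missing options would need to do (the array values are the commonly cited 3E/3.5E "elite array," used here as an assumption):

```python
# Sketch of the classic ability-score generation methods the dropdown should cover.
import random

ABILITIES = ["STR", "DEX", "CON", "INT", "WIS", "CHA"]
STANDARD_ARRAY = [15, 14, 13, 12, 10, 8]           # 3E/3.5E "elite array" (assumed here)

def roll_4d6_drop_lowest():
    rolls = sorted(random.randint(1, 6) for _ in range(4))
    return sum(rolls[1:])                          # drop the lowest die

def gen_4d6():
    return {a: roll_4d6_drop_lowest() for a in ABILITIES}

def gen_3d6_down_the_line():
    return {a: sum(random.randint(1, 6) for _ in range(3)) for a in ABILITIES}

def gen_standard_array():
    scores = STANDARD_ARRAY[:]
    random.shuffle(scores)                         # each array value assigned exactly once
    return dict(zip(ABILITIES, scores))

print(gen_4d6(), gen_3d6_down_the_line(), gen_standard_array(), sep="\n")
```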

And now we have a new problem: choosing the Point Buy option doesn't change anything. The app behaves identically, regardless of my choice on that dropdown. It simply performs a random number generation irrespective of that setting. This is a dummy setting that LlamaCoder threw in of its own accord.

In contrast, choosing a Class does seem to affect the Armor Class and Hit Points of a character, which is to be expected, given there are Class Bonuses for these stats.
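(For the curious, the Level 1 derived-stat math the app needs to reproduce looks roughly like this. It's a simplified sketch with placeholder data for two example classes, not the official rules text:)

```python
# Rough sketch of D&D 3E Level 1 derived stats; class data below is simplified.
def modifier(score: int) -> int:
    return (score - 10) // 2            # a score of 0 yields -5, hence negative bonuses

CLASS_DATA = {                          # hit die / base attack bonus / good saves at Level 1
    "Fighter": {"hit_die": 10, "bab": 1, "good_saves": {"Fort"}},
    "Wizard":  {"hit_die": 4,  "bab": 0, "good_saves": {"Will"}},
}

def derive(cls: str, scores: dict) -> dict:
    c = CLASS_DATA[cls]
    return {
        "HP": c["hit_die"] + modifier(scores["CON"]),   # max hit die at Level 1
        "AC": 10 + modifier(scores["DEX"]),             # unarmored, no other bonuses
        "BAB": c["bab"],
        "Fort": (2 if "Fort" in c["good_saves"] else 0) + modifier(scores["CON"]),
        "Ref":  (2 if "Ref" in c["good_saves"] else 0) + modifier(scores["DEX"]),
        "Will": (2 if "Will" in c["good_saves"] else 0) + modifier(scores["WIS"]),
    }

print(derive("Fighter", {"STR": 15, "DEX": 13, "CON": 14, "INT": 10, "WIS": 12, "CHA": 8}))
```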

LlamaCoder lets you add additional prompts to refine the output, so let's start knocking down these issues.

I added the following secondary prompt to start: Start every character at Level 1, and generate their stats using the Standard Array. Be sure to choose a Race and Alignment for the character. Automatically calculate the character's Base Attack Bonus, Saving Throws, Hit Points, Armor Class, and Skill Bonuses.

This broke LlamaCoder.


Specifically, the system introduced errors into its own code. 


Now, this is a free tool with limited capacity, so I tried breaking up the follow-up prompts to see if fewer instructions would prevent the error. I started with just the first one: Start every character at Level 1, and generate their stats using the Standard Array.

This is what I got:


The generation method dropdown is gone, and when I select Calculate Ability Scores, it randomly places a score from the Standard Array into each Ability, with no duplication (as is correct). Also, Saving Throws and Base Attack Bonus are now included despite no specific prompt. I suspect LlamaCoder is playing around with prompt retention, so it decided to add those features based on my last, failed prompt. Skills, however, were not added.

I also tested all the individual buttons to generate Hit Points, Armor Class, Base Attack Bonus, and Saves. Running them before ability scores are distributed (they start at 0) created the correct negative bonuses, and changing Races and Classes appeared to alter these stats appropriately. Unfortunately, when I chose anything from a dropdown, I had to click four buttons to get all the derived stat blocks to regenerate. (LlamaCoder is not LlamaUXDesigner, clearly.) Let's see if we can fix that.

I added this sentence to the follow-up prompt: A single button should re-calculate Ability Scores, Hit Points, Armor Class, Base Attack Bonus, and Saves.

It worked, sort of:


I've got my One Button to Rule Them All, and all the math is calculated correctly when I click it, but once again, Class, Race, and Alignment have been removed. The first two of those can affect the derived stats, so we need them back. Let's see if we can get there without overloading the system.

I added a third sentence to the follow-up prompt: Automatically choose a Class, Race, and Alignment for the character.

Here's what we got:


This is a passably useful app for generating Level 1 D&D 3E characters. But it took a few iterations, a lot of domain knowledge, and some specificity to get there. In other words, you don't get a good Monkey's Paw wish until very late in the process, when you know exactly all the caveats you need to declare to avoid a harmful result. To get something commercial-grade would take a lot more work.

This speaks to me as a software product manager with 20+ years of experience: my initial prompt was a pretty typical user story, but no engineer -- AI or human -- would have likely produced a good result without more specific acceptance criteria.

Or, as someone more pithy than I (but equally cynical) put it:

To replace programmers with robots, clients will have to accurately describe what they want. We're safe.

I have two decades of product definition experience and I've spent the past five years directly developing AI tools, and it still took me several tries and a lot of tinkering to get an app I still probably won't use, because it's missing so many key features. Unless the task is simple, GenAI isn't going to build what you really want, because building good stuff is hard and defining what you want is sometimes even harder. (That's why product managers are necessary.)

Anyone who says otherwise is selling something.

Use the GenAI Monkey's Paw at your own risk.

Friday, October 04, 2024

AI isn't going to kill SaaS; AI is going to kill half-ass startups

There's a lot of bold bullshit prognostication about how new generative artificial intelligence is going to kill the Software-as-a-Service (SaaS) business model because now anybody can build custom enterprise apps with a simple ChatGPT prompt. Nothing could be further from the truth. 

AI is going to lead to more SaaS products that don't suck. But, along the way, AI is going to kill most SaaS startups -- because most early-stage SaaS startups suck.

I'll let SMBC explain what I mean.


For those who don't get the joke, here's the definition of a Minimum Viable Product (MVP) from Wikipedia: "a version of a product with just enough features to be usable by early customers who can then provide feedback for future product development."

The cult of the Lean Startup that dominates Silicon Valley and most venture-backed SaaS companies produces almost nothing but MVP SaaS products that are half-assed on their best day by design. The idea is to get early customers to tolerate these half-baked offerings until the founding team learns enough to turn it into actual enterprise software. (Customers buy at the MVP stage because they assume they'll get permanently grandfathered into absurdly reduced early pricing in exchange for being guinea pigs. Also, sometimes MVPs sort of work.)

But today, thanks to AI, I can type a few sentences and get a half-assed prototype for free. I don't need to try -- let alone pay for -- an outside startup's SaaS MVP that's buggy, unreliable, and may not last long. I can get that kind of V1 crap in an afternoon of puttering, complete with mildly functional code I can hand off to a real engineer as a proof of concept.

No, these AI-generated prototypes won't be very good. Most SaaS startup MVPs aren't any good, either. The difference is I don't have to go to a SaaS startup to get a crappy MVP anymore. I can roll my own.

That doesn't mean startups are going away -- startups are doing great -- it means SaaS startups can no longer get away with barebones MVPs. The bar for an initial version of a product just got a lot higher, especially if you want someone to pay for it.

The absurd idea that ChatGPT can spit out a useful SaaS CRM anytime somebody prompts it with "build me a Hubspot clone" is just that: absurd. Anyone who has ever built commercial-grade SaaS software (and I've built a lot) knows that it's really hard and requires sweating a lot of complicated details that GenAI code-vomiters won't address (and that's before we discuss the issue of systems maintenance and required security compliance). As such, there will be plenty of market left for human-driven SaaS startups to claim.

And, because GenAI makes it easy to spin up prototypes, internal alpha releases are cheaper and faster than previously possible. It will be easier than ever to launch a SaaS startup thanks to GenAI. But it will be harder than ever to make a SaaS product that people will pay for.

GenAI will make it simple to create generic, barely useful SaaS apps. But hyper-focused SaaS apps that meet very specific needs at a high level of competency will become much more valuable precisely because GenAI can't deliver that level of quality, and because maintaining that quality over time requires significant investment. Moreover, with actual SaaS, the costs of maintenance get spread over all customers, not incurred by a single customer running a bespoke in-house GenAI product. (In other words, build vs. buy economics still largely apply.)

The result will be an absolute explosion of niche, highly mature SaaS startups looking to claim extremely specialized market areas. 

Which leads me to my final conclusion, one shared by my old colleague and current AI investor Rob May: venture-backed SaaS is probably dead.

GenAI makes rolling up custom migration tools super-cheap now, so SaaS switching costs are going to drop like crazy. Customer lock-in is going to be really hard to enforce. VCs like moats; GenAI is going to mass-produce bridges. As such, the only way to keep a customer on your product is for your product to actually be great. Being great is hard and expensive and may take a while. VCs aren't known for their patience, and they really hate customer churn.

Moreover, venture capital economics require that every company a VC invests in target a huge total addressable market (TAM), then set money on fire to try and claim a dominant position in the market before anyone else. The niche SaaS apps that AI can't create will have much smaller TAMs than VCs will tolerate. If you want to build a SaaS app in the future, be prepared to bootstrap.

In conclusion: The rise of code-generating artificial intelligence isn't going to destroy the SaaS business model -- just the SaaS business model as we've known it. The future of SaaS is a staggering variety of small, specialized, highly refined products that are ready for prime time at V1. 

MVPs -- and VCs -- need not apply.