🗞️ You're Paying Hallucination Tax💥🚀

AI for eCommerce Newsletter - 66

In partnership with

If you’re new here, welcome! If you’ve been reading for a while, thank you for sticking around as we navigate this wild AI-shaped shift happening across eCommerce. Each week I share what I’m experimenting with, what’s actually moving the needle, and the trends that deserve your attention before they hit your competitors’ playbooks.

A quick heads up. I’ve organized all previous editions into one searchable hub. If you want the full journey, it’s all here.

You are Paying Hallucination Tax

You know that lovely message:
“You have reached the end of your chat limit. Conversation too long”

That is not your LLM showing you attitude. That’s your context window hitting concrete.

Roughly speaking:

🔶 Gemini 1.5 Pro gives you a context window up to around two million tokens in dev land. That is hours of audio, long video, entire codebases in one go.

🔶 GPT 5.1 gives you a very large window in the API and a generous window inside ChatGPT on the Thinking tier so you can work with serious documents instead of tiny snippets.

🔶 Claude Opus 4.5 sits on the Claude long context family around the two hundred thousand token mark for most users with special long context modes that stretch higher.

On paper it sounds like one simple story. Bigger windows. Fewer broken chats.
In practice, each lab is making a different trade.

Claude, Gemini, OpenAI and where the burden lands

All major LLMs are juggling the same three pieces:

  1. Context window: what the model can see right now

  2. Compression: what gets summarized or compacted when that fills

  3. Retrieval or memory: how old info gets pulled back in
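The three pieces above can be sketched in a toy Python class. This is a minimal illustration, not any vendor's real implementation: the "tokenizer" is a crude word count and the compression is a deliberately lossy headline, just to show where detail gets lost.

```python
# Toy sketch of the three pieces a long-context LLM juggles:
# a fixed window, lossy compression of old turns, and retrieval from an archive.

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())

class ChatContext:
    def __init__(self, window_limit: int):
        self.window_limit = window_limit  # 1. the context window
        self.recap = ""                   # 2. compressed history (lossy)
        self.turns = []                   # verbatim recent turns
        self.archive = []                 # full old turns, for 3. retrieval

    def _used(self) -> int:
        return count_tokens(self.recap) + sum(count_tokens(t) for t in self.turns)

    def add_turn(self, text: str) -> None:
        self.turns.append(text)
        # When the window overflows, archive the oldest turn and keep
        # only a headline of it in the recap. The fine print is gone.
        while self.turns and self._used() > self.window_limit:
            oldest = self.turns.pop(0)
            self.archive.append(oldest)
            headline = " ".join(oldest.split()[:4])
            self.recap = (self.recap + " " + headline).strip()

    def retrieve(self, keyword: str) -> list[str]:
        # Pull full-fidelity old turns back in on demand.
        return [t for t in self.archive if keyword in t]

ctx = ChatContext(window_limit=15)
ctx.add_turn("the Q3 margin was exactly 17.35 percent after net returns")
ctx.add_turn("ship the promo on Friday with the blue packaging variant")
print("17.35" in ctx.recap)                  # False: compressed away
print("17.35" in ctx.retrieve("margin")[0])  # True: retrieval restores it
```

Notice the asymmetry: the compressed recap remembers that a margin was discussed, but the exact number only survives in the archive. Whether the system bothers to retrieve it is exactly where the labs differ.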

Gemini 3.0 leans on retrieval. It indexes huge piles of content then pulls the right slices into that big window when you ask a question. The headline is long context. The real trick is which pages it chooses to show the model.

OpenAI GPT 5.1 pushes on size and structure. Big windows in the API and a separate memory layer in ChatGPT to save your stable preferences and instructions.

Anthropic Opus 4.5 looks fussy because it ends long chats earlier. That is by design. Claude would rather stop than squeeze too much history into the window and start guessing. It seems counter-intuitive, but they are doing the right thing.

Opus 4.5 now compresses older turns so agents can run longer, but when answers might get blurry it still hits pause. So the user stays responsible for managing long context.

Gemini and OpenAI might seem to absorb more of the pain for you, but may silently be switching to hallucination mode without you realizing it.

Anthropic protects answer quality and hands more control and more work back to the human.

Where the hallucination tax shows up

Every time a model compresses your history it keeps the headline and trims the fine print. The main story survives. The nuance does not.

Stack enough compression and you start paying the hallucination tax:

🔶 Numbers get rounded into friendlier stories
🔶 Edge cases and caveats disappear
🔶 The model confidently repeats something that was true three summaries ago and no longer is

Huge windows have their own failure mode. Treat Gemini or GPT 5.1 like an infinite dumping ground and retrieval starts to blur at the edges. The answer still sounds sharp. It is just quietly wrong in the third decimal place or the small print.

Bigger windows and smart compression stop chats from crashing.
They do not remove your responsibility to keep facts straight.

How to stop overpaying

If you are using these models for real work, treat context like budget.

🔶 Work in episodes
New project, new thread. Start with a tight recap. Update that one recap instead of expecting the model to reconstruct six weeks of side quests.

🔶 Re-anchor in source when it matters
When money, contracts, or live data are involved, put the real spec or sheet or doc back in front of the model. Ask for answers that point back to that source, not to the chat history.

🔶 Do not chain summaries
If you need a fresh summary, ask the model to pull from the original doc or a rich outline, not from yesterday’s recap. A summary of a summary is where subtle hallucinations breed.

🔶 Split profile from project
Let long term memory handle who you are, tone, and reusable frameworks. Keep live numbers, SKUs, constraints and timelines inside the current context so they cannot be quietly warped by old compression.

🔶 Ask the model to show its doubts
Tell it to flag places where it might be guessing because earlier parts of the chat were compacted away and to ask you to restate facts. You trade a little magic for a lot more reliability.
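The episode pattern above can be sketched as a small prompt builder: one living recap you keep updating, plus the real source of truth re-anchored into every request. The message shape mimics a generic chat API; the SKU data, function names, and system prompt are illustrative, not any real SDK.

```python
# Sketch of "work in episodes" and "re-anchor in source": one updated recap,
# the original source doc in every prompt, no summaries of summaries.
# Generic chat-API message shape; all names and data are illustrative.

SOURCE_DOC = "SKU A-100: landed cost $4.12, MOQ 500 units, lead time 45 days."

def update_recap(recap: str, new_fact: str) -> str:
    # Update the single recap in place instead of stacking summaries.
    return f"{recap}\n- {new_fact}"

def build_messages(recap: str, question: str) -> list[dict]:
    # Every request re-anchors the model in the recap and the real doc,
    # and asks it to flag guesses instead of papering over gaps.
    return [
        {"role": "system",
         "content": "Answer only from the source below. Flag anything "
                    "you would be guessing or need me to restate."},
        {"role": "user", "content": f"Project recap:\n{recap}"},
        {"role": "user", "content": f"Source of truth:\n{SOURCE_DOC}"},
        {"role": "user", "content": question},
    ]

recap = "- Launching SKU A-100 in Q3."
recap = update_recap(recap, "Target margin is 30 percent.")
messages = build_messages(recap, "What are the landed cost and MOQ?")
```

The point of the structure: live numbers always come from `SOURCE_DOC`, never from chat history, so compression can never quietly warp them.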

Gemini’s long context, GPT 5.1’s expanded windows, and Opus 4.5’s careful compaction are real upgrades.

The edge goes to the people who still behave as if the window is small and act like editors of the context, not just consumers of the output.

NotebookLM for Infographic-Style Explanations

You know that feeling when you finally finish a long, carefully crafted how-to guide and then realize most normal human beings will find it hard to follow?

It walked through two separate JavaScript bookmarklets, how to turn them into browser bookmarks, what each one does, and the exact sequence for expanding RUFUS questions, scraping Q and A, and grabbing everything to your clipboard.

So I tried something different.

NotebookLM by Google has a cool bunch of tools that can take any collection of documents, photos, website links, or YouTube videos and turn them into a range of other formats.

I asked NotebookLM to create an infographic-style explanation for my RUFUS extractor and basically threw my entire document at it. You are looking at the result.

It separated the workflow into two clean parts, gave each its own color band, pulled out the exact three steps for the Clicker, the exact three steps for the Extractor, and turned my doom scroll instructions into a one screen, “oh, I get it” visual.

No designer. No extra prompts. Just “here is my guide, make it visual and simple.” I like it.

Now, could you have done this with Nano Banana? Yes, of course, but you would also have to direct the look and the prompt. More work.

With NotebookLM, none of that is required. It is like an intelligent analyzer of documents that was created with the sole purpose of helping humans learn better.

AI video is not “someday” anymore

At UnBoxed, someone stopped by our booth, looked at our screens, and asked this question:

“So when do you think AI will be able to make realistic videos?”

I did not hedge.
I said, “Now.”

Not “in five years.”
Not “once compute catches up.”
Now.

(GIF Loop of a small clip from a larger video we created)

Right now you can:

🔶 Take a single product image and turn it into a fully animated video clip
🔶 Generate dozens of short hooks in different styles for Amazon and TikTok
🔶 Iterate on motion, camera angle, and pacing with text prompts instead of reshoots

Is it perfect?
No.

Will you toss 9 out of 10 videos because of weird hands, odd physics, and the occasional uncanniness?
Yes!

But it is already good enough to extract the segments that are fine, stitch them together with B-roll that already exists, throw in stills where you can, and create 15-second Sponsored Brands video ads.

You’d rather be ahead of the race than wait for things to be perfect.

The future of e-commerce requires scale and speed. PPC Ninja helps brands dominate the AI transition. We leverage AI to build stunning, high-converting images and video, efficiently scaling your content production across all channels (Amazon Ads, Social Media, Posts). Reach out to [email protected] to explore how we can immediately upgrade your content and future-proof your listings.

I jammed on this topic with Danny McMillan in a recent conversation about AI video and what it means for brands that sell physical products. If you want a deeper dive, the full chat is here:

What we kept circling back to is this:

The bottleneck is not the tech anymore.
The bottleneck is imagination, taste, and workflow.

If you are still waiting for AI video to reach some mythical “broadcast ready” milestone before you experiment, you are already behind the people who are using it today to:

🔶 Prototype concepts before they pay for a studio
🔶 Test four story angles in a week instead of one per quarter
🔶 Give their creative team starting points instead of blank timelines

AI video is not here to replace your best work.
It is here to remove all the excuses between your idea and your first testable draft.

We hope you liked this edition of the AI for E-Commerce Newsletter! Hit reply and let us know what you think! Thank you for being a subscriber! Know anyone who might be interested in receiving this newsletter? Share it with them and they will thank you for it! 😃 Ritu

Your competitors are already automating. Here's the data.

Retail and ecommerce teams using AI for customer service are resolving 40-60% more tickets without more staff, cutting cost-per-ticket by 30%+, and handling seasonal spikes 3x faster.

But here's what separates winners from everyone else: they started with the data, not the hype.

Gladly handles the predictable volume, FAQs, routing, returns, order status, while your team focuses on customers who need a human touch. The result? Better experiences. Lower costs. Real competitive advantage. Ready to see what's possible for your business?
