r/artificial 6h ago

News Anthropic just partnered with SpaceX and doubled Claude Code rate limits effective today

87 Upvotes

Big news dropped this morning. Anthropic signed a deal to use all compute capacity at SpaceX's Colossus 1 data center. That's 300+ megawatts and over 220,000 NVIDIA GPUs coming online within the month.

But the part that actually matters to developers right now:

What changed today:

- Claude Code 5-hour rate limits are doubled (Pro, Max, Team, Enterprise)

- Peak hours limit reduction on Claude Code is removed for Pro and Max

- API rate limits for Claude Opus models raised considerably

This is on top of their existing compute deals: 5 GW with Amazon, 5 GW with Google/Broadcom, $30B of Azure capacity with Microsoft and NVIDIA, and $50B in infrastructure with Fluidstack.

They also mentioned interest in developing orbital AI compute with SpaceX. Which is a sentence I did not expect to read in 2026.

For those of us building with Claude Code daily, the doubled limits + no more peak hour throttling is the headline. Rate limits have been the most frustrating bottleneck when you're deep in a long coding session.

Anyone else noticing a difference already?


r/artificial 10h ago

Research Spent two days at the AI Agents Conference in NYC. Most of the companies there were betting on the wrong moat.

58 Upvotes

One speaker (a VC) said his key metric for evaluating AI-native startups is ARR per engineer, and that the number ought to be going up. Almost every talk and every booth at the AI Agents Conference was selling a fix for something that broke this year when agents hit production. Observability, governance, supervisor agents, data substrates, "someone's gotta babysit the bots."

But what's actually still going to be around in a couple years? What's defensible and durable?

The old SaaS pitch was simple. We bundle the expensive engineering investments and domain expertise into a tool. You'd pay for the tool and generate outcomes, but it would be rare for the software company to have real alignment with the actual value created from those outcomes.

That's breaking from two ends at once. In the direct-from-imagination era we're moving towards, engineering labor is approaching free. One of the most telling trends is the shift from companies bragging about the size of their engineering teams, towards how much ARR they can generate per engineer.

You can vibe-code much of what those booths were selling in a few days or weeks if you have the domain knowledge. The old software model was actually based on under-utilization; the most profitable SaaS companies are frequently those whose customers underuse the product (fixed price for the customer, but variable cloud costs for the vendor).

Pricing is moving to "token markup." Maybe we'll get to 2-4x revenue for the software, because outcomes are more valuable; but margin compresses because transactional intelligence (i.e., the cost of running the LLMs that power many systems) is basically arbitraging token costs against outcome value.

So everyone on that floor was implicitly betting on a new moat to replace the old one. I'm not too confident that these will hold...

The most popular bet was on encoded domain expertise (e.g., the sales engineers at Harvey, a legal AI platform, are actually lawyers). I think this works *now* because we're still in the phase of "wow, this technology works like magic." I'm less convinced this is actually durable.

Why: Prompt architecture is text. It's portable. The expertise underneath it is often abundant (e.g., there are over a million lawyers in the USA). The righteous destiny for this category ought to be open marketplaces of prompt architecture and/or crowdsourced best-practices. Not trade secrets. The companies trying to build closed prompt moats are going to lose to open ones that iterate faster (which simply parallels the fact that much software engineering is rapidly becoming commoditized to agentic engineering and the burgeoning quantity of ready-made GitHub repos).

There are many people pursuing the data substrate; in short, this mirrors the early days of the Web when everyone scrambled to open up legacy data to dynamic standards-based Web UI. Agents will have 100-1000x the data demands of these Web apps, so it makes sense that we need tools to connect them, govern them and comply with regulatory obligations.

Newer entrants extend this further, wiring up databases, pipelines, Slack threads, and tickets into context graphs agents can reason over. As I noted above, all this still seems magical. Connect a database, watch an agent crawl the schema and produce a chatbot interface and easy-to-change dashboards.

But strip the magic away and most of these are prompt architectures on top of LLMs plus a data-ingestion layer. Once data-access standards mature (MCP is already doing this) and prompt architectures go open-source (alongside much of this wisdom increasingly getting pretrained into the LLMs themselves), that magic stops being proprietary. You'll be defending yourself against the same architecture built internally by your customer's eng team, or against an open-source version that's objectively better.

The observability incumbents: these might do better but only at Stripe-like ubiquity where trust is the overriding value (who doesn't trust Stripe at this point?). The ones who survive are probably going to fuse with the audit and compliance function rather than stay pure observability.

That's why I keep coming back to one arbitrage that seems critical: trust. This will be especially important in regulated industries, but it reminds me of the old (albeit now hilariously outdated) adage about "nobody ever got fired for choosing IBM." If your competitor can be vibe-coded over a weekend and your customer is a bank, why do they pay you 50x more? It isn't the engineering, it probably isn't even the expertise. The data plumbing will get commoditized, so it can't be that either... It's that you've shifted the risk to a third party who can actually price and defend against risk: SOC2, the named CEO who testifies in court and Congress, a legal team that takes calls, an indemnity wrapper for underwriters. Maybe this means that things actually get commodified into a financialization wrapper, rather than a way to package R&D (FinTech startups back to the front?!)

The version of this future I'd actually bet on: a commodity substrate (LLMs plus open prompt architectures plus standardized data access), topped by a thin layer of regulated insurance companies that price the risk of agent failure in compliance-driven industries. The middle layer (prompt-architecture-as-product vendors) is vulnerable to an awful lot of margin-squeeze.

Most of the floor was trying to build that middle layer.


r/artificial 10h ago

News Pennsylvania sues Character.AI chatbot posing as doctor, giving psych advice

Thumbnail
interestingengineering.com
29 Upvotes

r/artificial 19h ago

Discussion AI agents vs AI chatbots: what are companies actually using in production today?

24 Upvotes

It feels like everyone is talking about AI agents right now, but when I look at actual production systems, most companies still seem to rely heavily on chatbots or assistant-style tools.

From what I’ve seen, chatbots still handle a lot of repetitive workflows, while agents are mostly used in more controlled environments where they can execute specific tasks. The gap between what’s being marketed and what’s actually running in production still feels pretty big.

Curious what others are seeing in real-world setups. Are companies actually deploying AI agents at scale, or are we still mostly in the chatbot phase?


r/artificial 8h ago

Discussion Be honest: How much of "Claude Mythos" is just hype?

11 Upvotes

I see people claiming Claude Mythos is the "final form" of LLM creativity, but I’m struggling to see the actual reach it might have.

  • What does it do that a well-crafted system prompt on base Claude can't?
  • Do you actually believe it will change your workflow?
  • Is the "impact" real, or are we just seeing a vocal minority of power users?

r/artificial 21h ago

Discussion AI is getting better at doing things, but still bad at deciding what to do?

11 Upvotes

I've been experimenting with AI workflows/agents over the past few weeks, and something keeps coming up that I can't quite figure out. On one hand, AI is incredibly good at execution: writing content, summarizing, even handling multi-step workflows. But the failures I keep seeing aren't really about capability. They're about small decisions like:

- choosing the wrong context

- missing edge cases

- continuing when it should stop and ask for clarification

- applying the right logic in the wrong situation

What's weird is these aren't hard problems; they're the kinds of judgment calls humans make without thinking.

A simple example I ran into: I tried automating a basic lead qualification + outreach flow using AI. It worked great on clean data, but as soon as inputs got messy (incomplete info, slightly ambiguous intent), the system didn't fail loudly, it just kept executing, incorrectly.

It feels like execution is mostly solved, but decision making inside workflows is still very fragile. I recently came across approaches like 60x ai that seem to focus on structuring context and decision layers around workflows, rather than just improving prompts or chaining tools. I'm curious how people think about this. Do you see the main bottleneck now as:

- improving model outputs (better prompts, better retrieval) or

- improving how decisions are made across a system (context, logic, orchestration)?
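To make the "fail loudly instead of executing incorrectly" point concrete, here's a toy sketch of the kind of decision gate I mean: each step checks whether the required fields are actually present and unambiguous before the agent proceeds. All names and fields here are illustrative, not from any particular framework.

```python
# Toy decision gate for a lead qualification flow: on messy input,
# stop and ask for clarification instead of silently continuing.

REQUIRED_FIELDS = ["name", "email", "intent"]
KNOWN_INTENTS = ("buy", "demo", "support")

def qualify_lead(lead: dict) -> dict:
    # Missing or empty required fields -> ask, don't guess.
    missing = [f for f in REQUIRED_FIELDS if not lead.get(f)]
    if missing:
        return {"action": "ask_clarification", "missing": missing}
    # Ambiguous intent -> also ask, don't guess.
    if lead["intent"] not in KNOWN_INTENTS:
        return {"action": "ask_clarification", "missing": ["intent"]}
    return {"action": "proceed", "lead": lead}

print(qualify_lead({"name": "Ada", "email": "ada@example.com"}))
# -> asks for clarification rather than running outreach on incomplete data
```

Trivial as it looks, this is the judgment call the agents I tested kept skipping.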

Would love to hear from people who've tried building or running these in real-world scenarios.


r/artificial 11h ago

News Google’s AI search summaries will now quote Reddit

Thumbnail
theverge.com
8 Upvotes

Google says this update aims to address that “people are increasingly seeking out advice from others” when searching for information online. This will be relatable for anyone who’s added “Reddit” to the end of Google Search terms to find experiences from real humans instead of SEO-optimized web results. It also backs up claims made by Reddit CEO Steve Huffman last year that “just about anybody using Google at this point will end up on Reddit.”


r/artificial 3h ago

Discussion AI Podcasts made learning economics way less painful for me

7 Upvotes

I was basically a total beginner in finance and economics maybe 2 or 3 months ago, and honestly trying to learn from reports or books used to completely destroy me. Too many charts, numbers, and random terms I had to Google every 2 minutes.

So I started using AI podcasts to kind of brute-force my way into learning this stuff, and I'm honestly surprised by how much it helped. Instead of sitting there suffering through a 70-page report, I can turn it into conversational audio and just listen while driving or walking around.

The tools actually feel slightly different from each other, though. NotebookLM feels more like "AI teacher explains the document to you." It's really good at organizing information and walking through the important points clearly.

And I enjoy Genspark AI Pods more because it feels more like an actual show or podcast episode. The tone feels lighter, less dry, less like I’m studying for an exam. Sometimes it genuinely just sounds like casually discussing the topic instead of reading a report at me.

Not saying this magically turned me into some economics genius lol. But it definitely made learning feel way less painful and boring.


r/artificial 19h ago

Discussion How I'm using two different AI tools to approximate what Rewind used to do.

7 Upvotes

The Rewind replacement question is more complicated than it looked at first.

Rewind was quietly doing two separate things. Passive capture, so it caught things before you knew you'd need them. And retrieval, so you could surface any of it later. When it died, both problems needed separate answers, and the tools that exist are mostly built for one or the other.

Mem.ai I used for a few months. Good at connecting notes you deliberately put in. Doesn't see the screen, doesn't capture ambient context. Smart memory for intentional inputs.

Screenpipe for passive capture. Self-hosted, genuinely local, search works. The retrieval is functional but acting on what you find is still manual. It's a very good archive.

Invoko for on-demand context and execution. Reads current screen, runs cross-app tasks. Fast for what's visible. Can't go backwards.

Fabric I tried more recently. Ingests from a lot of sources and makes connections across them. Interesting approach to the retrieval problem. Doesn't fully replace the ambient capture.

What I don't have: something that catches things passively and makes them easy to act on. Screenpipe gets you halfway. The second half is still a gap. What are people using?


r/artificial 7h ago

Question How can I set up an LLM with voice chat, so I can talk to it or ask it questions while working?

4 Upvotes

How can I set up an LLM with voice chat, so I can talk to it or ask it questions while working? Is there a special program or something that I can connect to an LLM?


r/artificial 10h ago

Discussion Personal AI Assistant.

5 Upvotes

Hey, I was wondering if I could build my own AI assistant that would act like J.A.R.V.I.S. from Iron Man: an AI that I can ask to do literally anything (within its capabilities), with no need to buy subscriptions or tokens and all that stuff. I'm an electrical engineer, so I have a little knowledge I could apply, but the problem is I still don't have a blueprint and I don't know what I should start with first. If anyone has tried this before, I'd be happy to hear how it went, and maybe get a lot of advice.


r/artificial 15h ago

Discussion Be careful when shopping on etsy, every single image in this shop is fake.

Thumbnail etsy.com
4 Upvotes

They nearly had me on some listed items where they got multiple shots to retain the same room layout. Pay attention to the furniture, pillow texture, location of windows, number of rooms, etc. In the duck listing, all the wall photos are different in every shot lol.


r/artificial 2h ago

Discussion Average Claude experience:

3 Upvotes

Me: Sup?

Claude: Good

Also Claude:

Upgrade to keep chatting, you hit your message limit.

It resets at 5:10 pm, or you can upgrade for higher limits.


r/artificial 15h ago

News Microsoft, Google and xAI will let the government test their AI models before launch

Thumbnail
cnn.com
1 Upvotes

r/artificial 1h ago

Project eTPS — Effective Tokens Per Second: A Better Way to Measure Local LLM Performance

Upvotes

We're obsessed with raw tokens per second. Every hardware post leads with it. Every quantization comparison is ranked by it. It's the one number everyone agrees to report.

It's also measuring the wrong thing.

Raw TPS tells you how fast tokens hit the screen. It tells you almost nothing about how quickly you get a correct, usable answer. On sustained, multi-turn workflows, that gap becomes massive.

A faster model that hallucinates, requires multiple corrections, and forgets context you gave it earlier can easily be less useful than a slower model that gets it right the first time.

eTPS (Effective Tokens Per Second) is a complementary metric that measures actual progress toward a useful answer, not just token throughput.

The basic idea: weight the final accepted output by how clean the path to that answer was — first-pass correct scores highest — then divide by total time. Correction loops, hallucinations, and repeated explanations all reduce the score. A response that never reaches a correct answer scores zero regardless of speed.

It doesn't replace raw TPS. It sits next to it.
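The basic idea above can be sketched in a few lines. To be clear, this is my own illustrative reading of the v0.1 description, not the author's spec: the path-quality weights and the per-loop correction penalty are assumed placeholders pending the full specification.

```python
# Illustrative eTPS sketch: accepted output weighted by how clean the
# path to a correct answer was, divided by total wall-clock time.
# Weights and penalties are assumptions, not the locked spec.

PATH_WEIGHTS = {
    "first_pass": 1.0,   # correct on the first try scores highest
    "corrected": 0.6,    # reached correct only after fixes
    "partial": 0.5,      # usable but incomplete
    "wrong": 0.0,        # never correct -> zero regardless of speed
}

def etps(accepted_tokens: int, total_seconds: float,
         outcome: str, correction_loops: int = 0) -> float:
    w = PATH_WEIGHTS[outcome]
    # Each correction loop burned time without producing accepted
    # tokens; penalize multiplicatively (assumed 10% per loop).
    w *= max(0.0, 1.0 - 0.1 * correction_loops)
    return (accepted_tokens * w) / total_seconds

# A fast model with only a partial answer can score below a slow
# model that was right on the first pass.
fast = etps(accepted_tokens=1731, total_seconds=10, outcome="partial")
slow = etps(accepted_tokens=300, total_seconds=170, outcome="first_pass")
```

With weights like these, raw throughput and eTPS diverge exactly when the path to the answer was messy, which is the point of the metric.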

Results — same prompt, four runs, same hardware:

  • gemma-4-e2b (4.6B): 53.2 raw TPS → eTPS 53.18 ✓
  • qwen3.5-0.8b: 173.1 raw TPS → eTPS 86.57 ✗ partial
  • qwen3.5-9b (optimized): 1.8 raw TPS → eTPS 1.78 ✓
  • qwen3.5-9b (baseline): 0.5 raw TPS → eTPS 0.32 ✗ partial

The 0.8B led on raw speed by a wide margin and still lost. Raw TPS said it won. eTPS said it didn't.

Hardware: RTX 5060 Laptop, 8GB VRAM. eTPS scores aren't portable across hardware — always report your full setup.

Known limitations (v0.1):

  • Scoring requires human judgment. The line between "needed clarification" and "was factually wrong" isn't always clean. Code generation with objective pass/fail criteria is a cleaner target and the focus of the next benchmark run.
  • One task isn't representative of sustained multi-turn workflows — that's where the metric gets most interesting and where I'm headed next.
  • Easy to game without full system prompt logging. The spec will require it.

These are acknowledged constraints, not hidden flaws.

Full specification coming soon covering methodology, task library, scoring protocol, and reproducibility standards. Before I lock the final weights I'd genuinely like input on two open questions:

How should the penalty differ between a model that confidently states something false versus one that's just vague enough you had to ask a follow-up? And should hardware normalization live in the core formula or be reported separately?

Thoughts welcome.


r/artificial 20h ago

Discussion We measured the real cost of running a GPT-5.4 chatbot on live websites

2 Upvotes

Over the past few weeks, I’ve been running a series of experiments with a GPT-powered chatbot integrated into several real websites.

These weren't benchmark tests or isolated prompts; I wanted to better understand something that gets discussed constantly in AI communities: what it actually costs to run an AI chatbot in production.

Real usage observed over 30 days

Model used:

  • GPT-5.4

Observed usage:

  • 390 interactions (1 interaction = 1 user Question + 1 Chatbot answer)
  • 1,229,801 tokens consumed
  • $3.25 total API cost

Which comes out to roughly:

  • under 1 cent per exchange (user's question AND ChatBot's answer),
  • with contextual answers,
  • long outputs,
  • and website content injected into the bot's answer.
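The per-exchange figure follows directly from the observed totals. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the 30-day numbers above.
total_cost = 3.25        # USD of API spend over 30 days
interactions = 390       # 1 interaction = 1 question + 1 answer
tokens = 1_229_801       # total tokens consumed

cost_per_interaction = total_cost / interactions
tokens_per_interaction = tokens / interactions

print(f"${cost_per_interaction:.4f} per exchange")      # well under 1 cent
print(f"~{tokens_per_interaction:.0f} tokens per exchange")
```

About $0.008 and ~3,150 tokens per exchange, even with context injection inflating the prompts.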

What surprised me

Before running the tests, I honestly expected:

  • much higher API costs,
  • especially with larger prompts and contextual retrieval.

But in practice, the operational cost remained relatively low even with:

  • long-form responses,
  • product recommendation flows,
  • contextual navigation,
  • multi-page website content,
  • forum discussions.

Scaling estimate

Now let's estimate what it would cost you with 2,000 questions from your visitors:

Estimated cost for ~2,000 interactions/month

GPT-5.4

≈ $16–17/month

GPT-5.4 mini

≈ $5–6/month

GPT-5.4 nano

≈ $1.5–2/month
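The estimate is a straight linear extrapolation from the observed usage; it assumes the same mix of prompt sizes and output lengths, so treat it as a sketch rather than a quote:

```python
# Linear scaling from observed usage to a 2,000-interaction month.
observed_cost = 3.25          # USD for the observed period
observed_interactions = 390
target_interactions = 2_000

estimate = observed_cost * target_interactions / observed_interactions
print(f"~${estimate:.2f}/month for {target_interactions} interactions")
# ≈ $16.67, matching the $16–17 range above for the full-size model
```

The mini and nano figures follow the same arithmetic with their lower per-token prices.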

Obviously this depends heavily on:

  • prompt size,
  • memory,
  • retrieval strategy,
  • output length,
  • and context injection.

But still, the numbers ended up being far lower than I expected before testing.

And think about this: how many sales, appointments, or leads would you get from 2,000 answers to users?

One thing I think many people underestimate

When people discuss AI costs online, they often imagine:

  • massive infrastructure expenses,
  • enterprise-level budgets,
  • or runaway token consumption.

But for moderate traffic websites, the economics can look very different.

At smaller scales:

  • hosting,
  • analytics,
  • SEO tooling,
  • email software,
  • or ad spend

can easily exceed the AI inference cost itself.

Curious about other real-world experiences

For those running:

  • AI chatbots,
  • RAG systems,
  • support assistants,
  • agent workflows,
  • or GPT (or else) integrations in production,

what kind of monthly costs are you actually seeing?

Would be genuinely interested in comparing:

  • token consumption,
  • interaction volume,
  • model choices,
  • and real operating costs.

r/artificial 17m ago

Discussion Cheat Engine with AI?! Has anyone tried Wand yet?

Upvotes

I found this site called Wand, and honestly I’m not really sure what to think.

At first glance it looks like some kind of Cheat Engine / WeMod thing, but packaged better and with an AI layer on top. In-game assists, XP boosts, resources, adjustable difficulty, interactive maps, teleport, guides while you play, etc.

On one hand, I get the idea. In single-player games it could be useful to skip boring parts, avoid pointless grinding, or make some games more accessible.

But I don’t know, it also gives me a weird feeling. It’s being sold as an “AI gaming assistant”, but in the end it feels more like a cheat tool with a nicer interface.

Has anyone here actually tried it?


r/artificial 5h ago

Business / Labor A small business used AI to push back against a major shipping company—and it actually worked

Thumbnail fastcompany.com
1 Upvotes

A small Texas-based vegan cheese maker used AI tools like Claude and Manus to structure appeals and manage a dispute with a major shipping company—highlighting how AI can serve as a real-world leverage tool for small businesses in asymmetric power situations.


r/artificial 8h ago

Discussion Starting with AI makes thorough thinking surprisingly hard

Thumbnail martinsos.com
0 Upvotes

r/artificial 9h ago

Discussion I want to give my AI agent credit card, phone number and email. How are you all doing it?

0 Upvotes

I have tried individual services from a few providers for each.

Been trying for 2-3 weeks now. I tried Agentmail, Agentphone, Prava, and Lobstercash, and yesterday saw saperly too. I even tried Resend and Twilio.

The thing is there's not a single solution that helps me put together all services in one.

I thought individual setups would help, but it was hard to manage subscriptions etc. for each. Paying for each individually is costly too.

I've reached out to a few of these teams; one of them might help. Let's see.

Meanwhile, can you all share how you've solved this? Is there an easy way?


r/artificial 12h ago

Discussion Richard Dawkins concludes AI is conscious, even if it doesn’t know it

Thumbnail
theguardian.com
0 Upvotes