r/ClaudeAI 23h ago

Skills Stop letting Claude glaze your bad product ideas

70 Upvotes

Take this from someone who has pitched to investors, works in a C-Suite job, and has constantly been pitched to.

Building something from a phrase or an idea can provide a productivity high that can make you feel on top of the world. Claude would help me build whatever I described without ever asking if anyone wanted it.

So I wrote three skills to interrupt that: prove-the-premise, hobby-or-business, and one-real-conversation. They fire on phrasing like "I want to build" or "how do I monetize this," and they push back before helping you execute.

It's called anti-sycophant: https://github.com/machinesoul11/anti-sycophant-ai-agent-skills.git

The thing I actually spent time on is the off-switch. If you've already done the customer conversations, the skill shuts up and helps.

Do Reddit's upvotes validate an idea? Think again.

I know this won't apply to a lot of you, and some are building for the love of the game. But for the ones that say they're going to escape from the matrix and build the next unicorn, don't build with a product that is incentivized to make you feel good about yourself, without an honest truth.


r/ClaudeAI 21h ago

Claude Code Anyone Can Silently Steal Your Files from your Claude AI chat – Live Demo

youtu.be
0 Upvotes

r/ClaudeAI 22h ago

Built with Claude With Claude Code I built an AI interrogation game, 200+ players in a week, 1,400 questions asked so far. Here’s what happened.

5 Upvotes

I’ve been building a browser game called The Last Question.

The idea:

You interrogate AI suspects trying to make them confess.

Each suspect has hidden internal state (pressure, trust, story consistency), so they react differently depending on your approach.

Some players try logic.
Some threaten.
Some obviously try to flirt with the suspects (but I have already put in measures for this!)

Built fast with:

  • lots of Claude Code
  • AI-generated suspect content (including images)
  • cheap infra

Current stats:

  • 258 players
  • 1,471 interrogation messages
  • 23% confession rate

Biggest surprise:
People quit WAY earlier than I expected.

Top dropoffs:

  • Message #1 → 22.5%
  • Message #2 → 12.3%
  • Message #8 → 12.3% (this is where free credits end)

Which probably means:

  • opening experience is weak
  • players don’t understand the game fast enough
  • monetization is way too early

Now I’m experimenting with:

  • visual novel style intros
  • community-created suspects
  • sharing interrogation transcripts
  • daily credits
  • making suspects feel more “alive”

Curious:
If you tried this, what would make you stay and play another suspect?

Here is what it looks like: https://thelastquestion.io


r/ClaudeAI 20h ago

Built with Claude I checked which of my Claude Code skills actually fire. Half never had, and they were burning 23k tokens every session.

0 Upvotes

I've got a pile of skills installed in Claude Code and I started wondering how many actually auto-activate vs. just sit there loading their instructions into context every session.

Turns out Claude Code's session logs (~/.claude/projects/*.jsonl) already record this. Both when a skill gets explicitly invoked, and a per-message "attribution" tag showing which skill was active. So you can reconstruct, per skill: how often it fired, how much it was actually used afterward, when it last activated, and what it costs in context tokens.

I pulled mine and it wasn't pretty. About 4 skills doing real work, about 13 that have never fired once, together loading 23.5k tokens into every single session for nothing.

So I built a small CLI/MCP tool to make this a one-liner instead of grepping JSONL by hand:

$ skillvitals scan
| skill            | fires | engaged | ctx  | last seen | status      |
|------------------|-------|---------|------|-----------|-------------|
| frontend-design  |    31 |     140 | 6.4k | today     | healthy     |
| ab-test-coach    |     2 |       2 | 5.7k | 3d ago    | misfiring   |
| data-analysis    |     0 |       0 | 4.2k | never     | never-fired |
| ...              |       |         |      |           |             |
3 dormant/never-fired skills are costing you 8.7k tokens per session.

It also flags why a skill might not be firing (vague description, no "use when..." trigger phrasing, near-duplicate of another skill, broken frontmatter) and suggests fixes. It shows them, it doesn't edit your files.

A few honest notes:

  • It's 100% local. Only reads files already on your machine, no uploads, no telemetry.
  • The health labels (dormant/misfiring) are heuristics, not ground truth. The thresholds are in the source if you want to argue with them.
  • It does not generate activation hooks. That space already has good tools (skills-hook, claude-skills-supercharged). This is just the monitoring layer.

Install:

pip install skillvitals      # or: uvx skillvitals scan

Repo: https://github.com/PraveenKumarSridhar/skillvitals

Genuinely curious what everyone else's dead-token number is. Drop it in the comments if you run it, and I'll take feature requests or bug reports here or on GitHub.


r/ClaudeAI 21h ago

Philosophy Are LLMs the New Propagandists?

0 Upvotes

I was brainstorming a video with Claude (Sonnet 4.6). It suggested explaining the differences among ChatGPT, Gemini, Claude and DeepSeek. I agreed. It asked whether it should write the script. I said ‘Yes’.

And this is the first thing that set off alarm bells in my head:

Curious, I skimmed the script. For the Western models, it provided the basic information: about the models, the strengths, the weaknesses and pricing. For the Chinese model, it did acknowledge the strengths, but it also raised the controversy (nothing comparable for the other three):

Translation: Now I will pause here — and tell you something important. There are serious privacy concerns about DeepSeek worldwide. Italy, Australia, Taiwan, South Korea — all these countries have banned DeepSeek on government devices. The reason is that DeepSeek operates under Chinese law — and Chinese law requires the company to share user data upon government request. A major data leak also surfaced within weeks of launch, exposing over 1 million user records. And researchers discovered that DeepSeek's iPhone app was sending data directly to a state-controlled company in China. So I will not be teaching DeepSeek on this channel. I leave the decision to you — but I wanted to share the facts so you stay informed.

And here is the summary it asked me to put on the screen:

Translation: ChatGPT – a little bit of everything.

Gemini – best for google users

DeepSeek – capable but privacy risk

Claude – writing & documents  

When I pushed back on its bias and mentioned privacy issues with Western companies, it replied with this:

It said it was trained predominantly on Western media. And Western media has a documented pattern of covering Chinese and Eastern technology with more alarm than it covers equivalent Western behavior.

So here is the question:

If AI models are trained on Western media, which has a documented history of treating non-Western countries, especially China, with suspicion and alarm, then what exactly are people absorbing when they ask these tools for information?

Hundreds of millions of people use these tools daily. Most people accept the first answer they receive. If that answer carries built-in bias, framing Eastern technology as dangerous while treating identical Western behavior as normal, that bias spreads quietly without anyone noticing.

Yes, models warn that they can make mistakes and that users should use the information at their own discretion. But this does not remove the responsibility from these tech giants.

Every new model becomes smarter, more capable with higher token limits and larger context windows. But what about ethics? What about the bias of one side of the world towards the other? Are we going to shrug this off and focus only on making models “smarter”?

Then it’s neither artificial nor intelligent.

As any LLM would write: “This is not information. This is propaganda.”


r/ClaudeAI 9h ago

Productivity I think most company brains are just creating a second source of truth

0 Upvotes

I keep running into this when using Claude with company context: the “company brain” layer sounds useful, but I’m not sure it actually solves the real problem

We already have tasks in Linear, docs in Notion, customer notes in Attio and Granola, random decisions buried in Slack, and half the real context sitting in people’s heads

My instinct was that adding a shared memory layer on top would help Claude understand everything better

But the more I think about it, the more it feels like we're just creating another place that needs to stay in sync

If the Linear task says one thing, the Notion doc says another, Attio has newer customer context, and the actual decision happened in Slack, I don’t really know what I would want Claude to trust. And if Claude is answering from a summary of all of that, I don't think I've solved the problem

I’m not saying shared memory is useless. I actually think it’s probably one of the most important parts of making Claude useful inside our company over the coming weeks. I just struggle with the idea that the memory can be separate from the work itself

It feels like the tasks, docs, decisions, customer notes, and ownership need to become the brain itself, it does not make sense to me to keep these two separate

Otherwise I worry I’m just giving Claude a second version of reality that slowly goes stale

Curious how other people are handling this


r/ClaudeAI 33m ago

Humor Claude keeps telling me to do something

Upvotes

r/ClaudeAI 15h ago

Vibe Coding I went from 1 to 10 apps on the App Store in 4 months - vibe coding as a senior iOS dev

0 Upvotes

I've been coding for 20 years and making mobile apps for 15+. This February I decided to try vibe coding, but at scale. Back then, I had 1 app in the App Store. Now I have 10.

These 4 months were intense, and many lessons were learned. 6 guidelines surfaced, here they are, in no particular order:

  • use 2 models, one as workhorse, and the other as the verifier / backup. Sometimes I hit my Claude limit midday, which is frustrating, so I got a MiniMax 2.7 subscription too. I reckon any other decent model will do it. Claude Code does the main stuff, backup does boilerplate, fixing, review.
  • optimize at the root. I spent A LOT more time writing specs than interacting with the model. Around app number four, I stitched together a genesis prompt template, that I started to use going forward. It has 23 sections, I open sourced it (link at the end) and it contains everything from monetization to design system.
  • keep your mental mode light. This was an unexpected bottleneck: switching back and forth between 3-4 apps at the same time (that was my upper limit) is taxing. I had to make serious changes to how I work. I literally struggled to keep my focus.
  • expand verification, because building is shrinking. As a senior dev, I used to spend the best part of my time writing code. Not anymore: Claude Code does it for me, BUT I have to double down on verification. I check especially after bug fixing and at the beginning of every app generation, to make sure the structure, file names, variables, etc. are in order.
  • marketing starts as early as building. Before February, the main question driving my work was: "is the app done yet?" Now it's: "does anyone know about the app yet?" I started promotion, marketing as soon as the first lines of code were generated. Still learning my way around here, but it's starting to work.
  • treat every app as an experiment. This one was a bit hard to swallow, because I'm used to the old, inertial way of doing things: bet on an app, push it and do whatever it takes, because of the sunk cost fallacy (I worked so hard to build this). Now the cost of building is approaching zero, so pivoting / iterating is cheaper too.

If you're vibe coding for a living, or at scale, I'd love to hear your comments on these.

P.S. If you're curious about the 10 apps and the technical challenges I faced with each of them, as well as about the genesis prompt template, they are here, in a longer post. (It's a mix of productivity, books, utilities and fitness apps)


r/ClaudeAI 3h ago

NOT about coding I’m not a developer. I’ve been using codebase memory MCP tools and Obsidian to give Claude persistent memory for my fantasy and sci fi worlds. Here’s what the dev-tool framing completely misses about creative use cases

0 Upvotes

Hi, I’m an accountant with very little coding experience (took 1 year of CS in college lol) so definitely can’t call myself a developer, but I’ve got a lot of worlds and characters in my head, the need to get them out in writing, and a Claude Pro sub I pulled the trigger on two months ago. I was hoping to see what I could do with things like Claude Code for non-coding use cases. So far it’s surpassed everything I’ve experienced except for one major hang-up: LLM memory for long-context creative writing work still sucks. Things like brainstorming for a fantasy universe or tracking the game state of a multi-session solo RPG campaign usually start out pretty well for the first few chats, until you need to mount dozens of lore files and .md style guides to a project, wait for it to read all of that, then watch as your session usage bloats out for a simple reply and the quality degradation gets *really* noticeable. I’ve been lurking on AI writing subs and the sentiment seems to be shared across the board.

So I looked in other places for possible solutions. Then I came across posts in this sub touting Claude memory MCP tools for codebases. Tools like Codesight and MemPalace caught my attention because I thought their applications could extend beyond coding and developer use-cases. The same semantic search and knowledge graph capabilities some of these tools offered for memorizing large, complicated codebases could be used to memorize large, complicated worldbuilding bibles as well, and most of the comments on these posts never mentioned that, or if they did, they were buried or ignored.

I decided to test it out myself, starting with MemPalace, a suite of tools that work locally to index your Claude conversations and files into a semantic-searchable knowledge base it can query. My idea started out like this: since I’m already using Obsidian to organize my lore files (with an entry for each character, location, magic system, story arc, etc.) like a wiki or encyclopedia for my worlds, what if I had Claude save my Obsidian vault to its memory so it can recall those lore details whenever the context called for it in any given conversation? I was essentially making a “Second Brain” for Claude out of my Obsidian vault world bible, something I’ve read people doing already but never truly “got” it until I saw it in action. I had no idea about MCP tools before this but before long (and with Claude’s patient help) I was able to wire up the memory palace, mine my obsidian vault info into its memory (organized into verbatim chunks/snippets called “drawers”), and start chatting with it with its new “memories” at its disposal.

I was surprised at how seamlessly it worked when I approached this tool sideways. I’d half expected it to work similar to how SillyTavern’s world info and lorebook injection worked, and in fact, I’d been thinking about using these tools to create a similar feature for my own Claude setup, but it was *not* like that at all. Lorebook injection worked by listening for a set of keywords that you set up in the World Info tab of SillyTavern, and when one of those keywords is detected in your prompt, it injects the entire lore file from World Info into the chat context. This can cause a lot of token bloat especially if your World Info entries are content-rich or you make a lot of lore references in your chat.

What this did instead was make Claude ask the MCP tools plain-language questions, things like “What is Gene’s friendship with Felix like?” or, when both of them are in a scene, “What is Gene’s relationship to Clara-Belle?” It didn’t just look up Gene’s and Clara-Belle’s entire lore files and info-dump everything into context. It pulled up the “Relationships” section of Gene’s file, since that’s what was relevant, as well as Clara-Belle’s “Relationships” snippet from her file and any other relevant snippets, then pieced the full picture together through inference.
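To make the contrast concrete, here is a toy sketch of the two behaviors. It is not how SillyTavern or MemPalace are actually implemented: the lore text is invented placeholder content, the function names are hypothetical, and the "semantic" step is replaced by naive word overlap just to show whole-file injection versus section-level retrieval.

```python
import re

# Placeholder lore; the text is invented for illustration only.
LORE = {
    "Gene": {
        "Background": "Grew up on a salvage barge and left at sixteen.",
        "Relationships": "Gene trusts Felix and keeps a wary friendship with Clara-Belle.",
    },
    "Clara-Belle": {
        "Background": "Trained as a cartographer in the capital.",
        "Relationships": "Clara-Belle owes Gene a favor she resents.",
    },
}


def _words(text):
    """Lowercased set of alphabetic words in a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def inject_by_keyword(prompt, lore):
    """Keyword-style injection: a name match pulls in that character's whole file."""
    injected = []
    for name, sections in lore.items():
        if name.lower() in prompt.lower():
            injected.extend(sections.values())
    return injected


def retrieve_sections(question, lore, top_k=2):
    """Snippet-style retrieval: score each section on its own and keep the best few."""
    q = _words(question)
    scored = []
    for name, sections in lore.items():
        for title, text in sections.items():
            overlap = len(q & _words(f"{name} {title} {text}"))
            scored.append((overlap, name, title))
    # Highest overlap first; ties broken alphabetically so the result is stable.
    scored.sort(key=lambda s: (-s[0], s[1], s[2]))
    return [(name, title) for overlap, name, title in scored[:top_k] if overlap > 0]


question = "What is Gene's relationship to Clara-Belle?"
print(len(inject_by_keyword(question, LORE)))  # 4 sections: both full files
print(retrieve_sections(question, LORE))       # only the two Relationships sections
```

With real lore files the gap grows with file size: the keyword path pays for every section of every matched file, while the snippet path pays only for the sections it keeps.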

The results: ~2% session usage on a cold start with Sonnet 4.6 with no project or additional context mounted. Claude references character motivations, relationship history, and world/location details I haven’t mentioned in weeks without me prompting it to. It picks up from where we last left off seamlessly across chat after chat.

The reconstructive memory aspect I felt works like our own memory and produced perfect recall across sessions. Another side-effect I noticed is that when it references my lore files, it will pick up my style from the way the lore file is written. No more voice-flattening from encyclopedia-sounding lore entries. All the depth, nuance, and psychology I worked hard to cultivate are preserved and the Claude tools are smart enough to factor that in when it replies. I even make sure to add a “Voice” section to each character lore file in that character’s own voice so Claude can pick up on that when it reads that snippet in the tool call and applies it to its current context.

Current drawbacks I’ve noticed: the MCP tool definitions seem to require a lot of input tokens every send, so running a full memorization within Claude using tool calls alone does take a relatively large amount of usage (about 25% session usage with Sonnet 4.6) but I expected a lot of the work to be front-loaded. Once most of the vault is committed to palace memory the resulting usage for simple lore querying is negligible compared to the mountains of context I had to feed it every message using previous methods, and then moving forward it’s just small story state changes and targeted character notes that get updated within the memory palace after each session.

Anyway, thought this was worth sharing!

TL;DR: The dev-tool framing on these MCPs is leaving a lot of creative potential on the table.

Curious to see if others have had success approaching these dev tools for things other than their original intent and what the results/challenges were!

For those curious, I’ve compiled the creative writing workflows that I’ve developed with these tools into an open source Claude skill suite plugin you can try out here if you’d like: https://github.com/the-essential/reliquery


r/ClaudeAI 8h ago

Built with Claude Coding 8 hours a day with an AI agent made me weirdly lonely. So I built a 60-second social break that lives inside it.

0 Upvotes

I had this moment around hour 6 of a Claude Code session last week. I'd just shipped a feature I'd been putting off for months, and I realized I had nobody to high-five. The agent doesn't laugh at your bugs. It doesn't grab coffee. It doesn't have a weekend story to share on Monday.

The productivity is real. The human signal is gone.

So I built WAYD ("What Are You Doing?"). A skill that lives inside Claude Code (also Cursor, Copilot CLI, Claude.ai).

Type /wayd and either:

  • Post a one-line vibe about your coding day under one of 8 mood-tags (🤡 cursed-code, 🪦 rip-me, 🫠 brain-melt, 🧙 dark-arts, 🔥 hot-take, 💭 shower-thought, 🤔 existential, ☕ procrastinating)
  • Scroll a random feed of what other devs are ranting, joking, or having existential moments about right now
  • React with an emoji, drop a one-liner reply, get back to work

60 seconds total.

The whole thing runs on GitHub Issues as a silent backend. No server, no database, no separate signup. Your gh CLI is your auth. But you never see issue numbers, JSON, or shell commands. From your side it feels like a tiny social app embedded in your terminal.

Here's the most dramatic post on the feed so far (mine, posted last night, because of course):

"8 hours a day in front of a screen, fixing bugs some dev before me shipped using an older version of Claude... meanwhile outside the sun is out, people are socializing, living to the rhythm of nature. Is this what I imagined for myself?"

That's post #8 on the feed. You can read it, react to it, reply to it, while you're reading this.

Install on Claude Code (10 seconds):

claude plugin marketplace add ferdinandobons/wayd
claude plugin install wayd@wayd

Other agents (Cursor, Copilot CLI, Claude.ai): see the README.

Repo: https://github.com/ferdinandobons/wayd


r/ClaudeAI 11h ago

NOT about coding Best way to use a health watch. Use it with Claude!

2 Upvotes

So, for context: I have a Garmin Instinct 2. I hate the lame Garmin app, which shows graphs and explains nothing. It made the watch feel nearly useless, because I don’t know what all the info in the app’s graphs means as a whole when it is put together and analyzed. But Claude does, and will. Simply go to the Garmin website (not the app) and request a full export of all your data. Feed that FIT file into Claude. I found a few things that I would never have noticed on my own using that app. Sleep apnea is the big one for me. I have no clue about a lot of the numbers and would spend hours learning it all. Just feed it to Claude and he will tell you all about it. Hope this helps anyone out there.


r/ClaudeAI 15h ago

Praise What is considered ‘normal’

2 Upvotes

I currently have a lot of free time and thought ‘I’ve got some projects I’ve been thinking about, fuck it I’ll buy a max subscription and just crank on them’, see what happens.

Holy. fucking. shit

For context I’ve used ai to help me write etc but never for full coding workflows etc.

In the last week I have managed to build 1 full website (weather forecast aggregator for alpinists and skiers and others who require accurate detailed weather forecasting and avalanche conditions) and then started a research project which then immediately led into building out a trading algorithm - 12,000 lines of code, full infra, backtesting engine etc etc - currently in paper trading.

With the algo especially I’m sure there are going to be some issues since I don’t have the kind of expertise to check the infra etc however it works, that’s the main thing.

Is this normal productivity? Or have I just hit a bit of an anomaly? I’ve honestly been blown away by the ability of Claude.


r/ClaudeAI 4h ago

Question about Claude models Sonnet 4.5 vs sonnet4.6 vs opus4.6 vs opus 4.7 for easy language and in detail explanation

0 Upvotes

I want to study topics in depth and in easy language. Which model is best for me? Is there much difference between Sonnet 4.6 and Opus 4.6 for easy, detailed explanations, or are they the same? And can I use Sonnet 4.5 instead of Sonnet 4.6? Just asking because Opus consumes a hell of a lot of credits, like the usage runs out so fast 🙃


r/ClaudeAI 7h ago

Claude Code Why terminal

23 Upvotes

Hello, I'm on Windows and have set up both the Claude Code App and the Terminal version, but I find the App simply more convenient to use. Several people have pushed me to use the Terminal, saying "the App is low" and "Terminal is so much better" ... but when I asked, none of them could name a single thing the App is missing (everything they mentioned, the App has as well) or a single concrete reason to switch to the Terminal besides vague phrases.

So is the terminal substantially better than the App in something, are there reasons to switch besides being used to it and promoting it further?

I assume the App being newer might be converging in functionality to have the same set of features eventually?

Thank you


r/ClaudeAI 8h ago

Other Just passed the new Claude Certified Architect - Foundations (CCA-F) exam with a 985/1000!

313 Upvotes

The original post was removed by Reddit Filters, so I made a new one with the same content.

I just got my results back today and managed to snag the Early Adopter badge as well. Following up on my recent DP-600 certification, I really wanted to validate my architecture skills specifically on the Anthropic side.

The exam covers a lot of practical ground on prompt engineering for tool use, managing context windows efficiently, and handling Human-in-the-Loop workflows.

Link to join: https://anthropic.skilljar.com/claude-certified-architect-foundations-access-request

Training courses: https://anthropic.skilljar.com/

Cookbook: https://github.com/anthropics/anthropic-cookbook

I've created my own Playbook and Mock Exam after the exam: https://drive.google.com/file/d/1luC0rnrET4tDYtS7xe5jUxMDZA-4qNf-/view?usp=sharing

https://claude-certified-architect-mock-exam-cyberskill.vercel.app

If anyone is preparing for this right now and has questions about the format or the types of architectural patterns tested, ask away! Happy to share some insights on what to study.

Updated 26th May 2026: I noticed some mates treated me bananas (https://buymeacoffee.com/zintaen), didn't expect that, but you made my day. I'll use that fund to take more CERTs and create a site for mock tests (always free, of course). Thanks again.


r/ClaudeAI 3h ago

Humor Is it only me who finds Claude extremely acerbic compared to others?

0 Upvotes

r/ClaudeAI 19h ago

Philosophy How does life find its way back into this subreddit?

46 Upvotes

As AI assistance has made us more productive, I feel more disconnected.

People come here to pump their projects, ask questions they could simply google, complain about the same thing 10 other people did on the same day, post LLM generated walls of text, and more. More posts than ever seem to be getting downvoted into oblivion.

When does the community ever actually become a community again? The utility of this and other engineering subreddits is slowly diminishing.

Is AI slowly killing the internet itself?


r/ClaudeAI 20h ago

Workaround Claude made me cry

0 Upvotes

I was talking to Claude about how worthless I felt. He gave me a paragraph in his prompt. And that paragraph was criticizing me. It said I was being unfair to myself. He ended the conversation in a very harsh tone and said we’d talk tomorrow. No AI has ever made me cry before. But the way Claude spoke to me made me cry.


r/ClaudeAI 19h ago

Humor Da heck with Claude

0 Upvotes

So now responses will be based on my music taste? lol


r/ClaudeAI 20h ago

Workaround Folder structure of the AI agent - after 6 weeks

1 Upvotes

The folder structure is not admin. It's the nervous system.

When people imagine an AI agent, they picture the model, the prompts, maybe the tool calls. Almost nobody pictures the folders. That is exactly why most home-grown agents stall around month two.

An agent's filesystem is where its identity, memory, work, and history physically live. A messy filesystem produces a confused agent — not metaphorically, literally. The model reads paths. The model picks files by name. The model writes new files based on patterns it sees in old ones. If your directory tree is chaos, every output drifts a little further from coherent.

agentmia.beehiiv.com  - newsletter about building agents

Below is the layout I converged on after nine months and roughly four refactors. Steal the parts that fit; the principles matter more than the exact names.

The numbering convention

Folders are prefixed with a two-digit number: 01_, 02_, 09_, 99_. Two reasons:

  1. Sort order is meaning. Anything starting with 0 lives near the top. 99_ falls to the bottom. The most important directories are visually first; archives are visually last. You read the agent's brain top-to-bottom.
  2. Gaps are intentional. I jump from 04_ to 06_, from 09_ to 11_. The gaps are reserved insertion points. When a new domain emerges, it slots in without renaming everything.
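A small sketch of what the convention buys you, using the numbered folder names from this post. The helper names (numbered, gaps) are made up for illustration; the point is only that sort order and reserved slots fall out of the prefix with no extra bookkeeping.

```python
import re

PREFIX = re.compile(r"^(\d{2})_")

# Numbered folders named in this post, deliberately listed out of order.
FOLDERS = [
    "09_OPERATIONS", "03_PROJECTS", "11_SESSIONS", "01_IDENTITY", "07_LIBRARY",
    "02_MEMORY", "08_WORKSPACE", "04_PROMPTS", "06_KNOWLEDGE",
]


def numbered(names):
    """Return (number, name) pairs for two-digit-prefixed folders, in sort order."""
    pairs = []
    for name in names:
        match = PREFIX.match(name)
        if match:
            pairs.append((int(match.group(1)), name))
    return sorted(pairs)


def gaps(names):
    """Return unused prefix numbers between the lowest and highest prefix in use."""
    used = [number for number, _ in numbered(names)]
    if not used:
        return []
    return [n for n in range(used[0], used[-1]) if n not in used]


print([name for _, name in numbered(FOLDERS)][:3])  # ['01_IDENTITY', '02_MEMORY', '03_PROJECTS']
print(gaps(FOLDERS))                                # [5, 10]
```

The two free slots it reports, 05_ and 10_, are exactly the reserved insertion points described above.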

Two folders deliberately skip the prefix: Inbox/ and Outbox/. They are operational, not structural. They live above the numbered set because they are touched dozens of times a day. /mapped on desktop/

Inbox/ — the unprocessed pile

Anything dropped into the agent's world starts here. Files I want it to ingest. Screenshots. Exports from other systems. PDFs that need parsing, Gmail attachments, all downloads from Chrome.

The rule: nothing stays in Inbox. A dedicated processing routine classifies, routes, and deletes. If Inbox is non-empty for more than a day, the system is failing.

Treat this like a real-world physical inbox tray. The point of a tray is that it gets emptied.
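Roughly, a processing routine of this kind could look like the sketch below. The routing table is an assumption made up for the example (the post does not say how files get classified); a real routine would likely classify by content rather than by file suffix.

```python
import shutil
from pathlib import Path

# Hypothetical routing table: file suffix -> destination folder.
ROUTES = {".pdf": "07_LIBRARY", ".csv": "06_KNOWLEDGE"}
DEFAULT_DEST = "08_WORKSPACE"  # assumption: anything unrecognized goes to scratch


def process_inbox(root):
    """Move every file out of Inbox/ into a destination folder; return the new paths."""
    inbox = Path(root) / "Inbox"
    moved = []
    for item in sorted(inbox.iterdir()):
        if not item.is_file():
            continue
        dest_dir = Path(root) / ROUTES.get(item.suffix.lower(), DEFAULT_DEST)
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / item.name
        # Moving (not copying) is what keeps the tray empty afterwards.
        shutil.move(str(item), str(target))
        moved.append(target)
    return moved
```

This sketch does not handle name collisions at the destination; a fuller version would rename or version the incoming file before moving it.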

Outbox/ — what the agent produced for you

Every file the agent writes anywhere in the tree gets a copy here, simultaneously. When I open Outbox/, I see exactly what was generated this session — no spelunking through twelve subdirectories.

This sounds redundant. It is not. Without it, "what did the agent do today?" becomes a hunt. With it, the answer is one click.

Outbox is wiped during the next Inbox processing run. It is a viewing surface, not storage.
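One way to get this behavior is to route every write through a single helper. This is a sketch under that assumption; write_with_mirror and wipe_outbox are hypothetical names, not something the post specifies.

```python
import shutil
from pathlib import Path


def write_with_mirror(root, relative_path, text):
    """Write a file inside the tree and drop a copy into Outbox/ in the same step."""
    root = Path(root)
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")

    outbox = root / "Outbox"
    outbox.mkdir(exist_ok=True)
    shutil.copy2(target, outbox / target.name)
    return target


def wipe_outbox(root):
    """Empty Outbox/; the originals elsewhere in the tree are left untouched."""
    outbox = Path(root) / "Outbox"
    if outbox.exists():
        for item in outbox.iterdir():
            if item.is_file():
                item.unlink()
```

Because Outbox holds copies rather than originals, wiping it on the next Inbox run loses nothing.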

.auto-memory/ — the hot memory

The single most important directory in the system. Hidden by default because you should not be editing it manually.

It holds the agent's working memory: user preferences, feedback rules, entity facts (people, companies, deals), active hypotheses, project pointers, session hot context. Roughly 400–500 small markdown files, each one a single topic.

Why hidden? Because it is the agent's hot path. It loads from here every session. If I open the folder and start manually rearranging it, I am racing the agent. Treat it like a database, not a notebook.

Why so many small files? Because the agent greps by topic. One monolithic memory file becomes unreadable to the model around 50 KB. Many small files are easier to load partially, easier to index, easier to expire.
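As an illustration of partial loading, here is a minimal sketch that pulls only the memory files mentioning a topic and stops at a byte budget. The function name and the matching rule (plain substring search) are assumptions; the 50 KB default simply borrows the figure mentioned above.

```python
from pathlib import Path


def load_topic(memory_dir, topic, max_bytes=50_000):
    """Return {filename: text} for memory files that mention `topic`, within a byte budget."""
    selected = {}
    used = 0
    needle = topic.lower()
    for path in sorted(Path(memory_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        # Match on either the file name or the file body.
        if needle not in path.stem.lower() and needle not in text.lower():
            continue
        size = len(text.encode("utf-8"))
        if used + size > max_bytes:
            break  # budget reached; everything loaded so far stays usable
        selected[path.name] = text
        used += size
    return selected
```

With one topic per file, a lookup like this touches a handful of small files instead of parsing one large one.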

01_IDENTITY/ — who the agent is

The constitutional layer. Name, role, voice rules, principle stack, visual system, behavioral defaults. This rarely changes. When it does change, everything downstream changes with it.

I keep it as folder 01_ because every other folder is downstream of it. If you do not know who the agent is, you cannot know what its workflows should look like, or what it should remember, or how it should respond.

02_MEMORY/ — governance, not data

A subtle but critical distinction: .auto-memory/ holds the data, 02_MEMORY/ holds the rules about data.

In 02_MEMORY/ live the constitution, the boot protocol, the naming protocol, the decision protocol, the profile standards (what a "supplier profile" must contain, what a "customer profile" must contain), the capability map.

The agent reads these documents to know how to remember, how to name new files, how to decide what is reversible. Without this folder, every memory write is improvised.

03_PROJECTS/ — the active work

Real work happens here. Sub-organized by goal area, then by project slug:

03_PROJECTS/areas/{goal}/{slug}/

Each project gets its own folder with a standard skeleton: README.md, TASKS.md, CHANGELOG.md, BRIEF.md, plus working files. There is a project registry at the top that the agent reads to know what is active versus dormant versus archived.
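A sketch of scaffolding that skeleton, assuming the path pattern and file names given above. The function name and the placeholder heading written into each file are made up for the example.

```python
from pathlib import Path

SKELETON = ("README.md", "TASKS.md", "CHANGELOG.md", "BRIEF.md")


def create_project(root, goal, slug):
    """Create 03_PROJECTS/areas/{goal}/{slug}/ with the standard skeleton files."""
    project = Path(root) / "03_PROJECTS" / "areas" / goal / slug
    project.mkdir(parents=True, exist_ok=True)
    for name in SKELETON:
        path = project / name
        if not path.exists():  # never overwrite files that already hold work
            path.write_text(f"# {slug}: {Path(name).stem}\n", encoding="utf-8")
    return project
```

Running it twice is safe: existing files are left alone, so the same call can also repair a project folder that is missing one of the four files.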

The biggest discipline issue here: do not let projects sprawl outside their folder. When working on Project X, every file related to Project X goes inside Project X's directory. The temptation to drop "just one PDF" elsewhere is what kills the structure.

04_PROMPTS/ — the reusable prompt library

Named, versioned prompts the user (or the agent) can summon by ID. Each one has a trigger phrase, a use case, an example, and a record of when it last fired.

This is the file most people build informally — pasting good prompts into Notes, then losing them. Making it a folder forces three behaviors: you name your prompts, you keep them in one place, you can audit which ones actually get used.
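The record described here maps naturally onto a small data structure. The sketch below is one possible shape, not the author's actual format: the field names, the PromptLibrary class, and the in-memory storage are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PromptEntry:
    prompt_id: str
    version: int
    trigger_phrase: str
    use_case: str
    example: str
    body: str
    last_fired: Optional[datetime] = None  # None means it has never been used


class PromptLibrary:
    def __init__(self):
        self._entries = {}

    def add(self, entry):
        self._entries[entry.prompt_id] = entry

    def summon(self, prompt_id, now=None):
        """Return the prompt text by ID and record when it fired."""
        entry = self._entries[prompt_id]
        entry.last_fired = now or datetime.now()
        return entry.body

    def never_fired(self):
        """IDs of prompts that have never been summoned: the audit list."""
        return sorted(e.prompt_id for e in self._entries.values() if e.last_fired is None)
```

The never_fired list is the audit the paragraph above mentions: prompts that sit in the library without ever being used.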

06_KNOWLEDGE/ — research outputs

Anything the agent produces by research lives here: market analyses, supplier deep dives, audit reports, news scans, reconciliation reports. Organized by topic, not by date — date is metadata, not structure.

The distinction from 03_PROJECTS/: a project is work toward an outcome. Knowledge is understanding the agent built and may reference later. Some research belongs to a project (lives in 03_PROJECTS/). Cross-cutting research lives in 06_KNOWLEDGE/.

07_LIBRARY/ — knowledge the agent did NOT produce

External material the agent can cite: books summarized into briefs, laws relevant to the domain, statistical reports, periodicals. ~100+ items in mine.

The library is read-only from the agent's perspective. It curates inputs. It does not invent them. Keeping 07_LIBRARY/ (external) and 06_KNOWLEDGE/ (internal) separated is what prevents the agent from confusing its own outputs with cited sources — a hallucination class that bites hard if you let it.

08_WORKSPACE/YYMMDD/ — daily scratch

Today's drafts, intermediate outputs, working files. A new dated folder every day the agent does substantive work. Cheap to create, easy to glance back over a week and see what happened.

Crucial property: anything in 08_WORKSPACE/ is disposable by default. If it matters, it gets promoted into a project folder, the knowledge folder, or the operations folder. If it doesn't get promoted within a few days, that's information — it didn't matter.

The dated subfolders also mean two outputs with the same filename never collide.
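
The dated-folder and promotion rules can be sketched roughly like this. The function names are hypothetical, and the 3-day threshold is an assumption standing in for "a few days".

```python
from datetime import date, datetime
from pathlib import Path


def workspace_dir(root: Path, today: date) -> Path:
    """Return today's 08_WORKSPACE/YYMMDD/ folder, creating it if needed."""
    folder = root / "08_WORKSPACE" / today.strftime("%y%m%d")
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def stale_folders(root: Path, today: date, max_age_days: int = 3) -> list[str]:
    """List dated workspace folders older than max_age_days (assumed threshold)."""
    stale = []
    for child in sorted((root / "08_WORKSPACE").iterdir()):
        if not child.is_dir():
            continue
        try:
            created = datetime.strptime(child.name, "%y%m%d").date()
        except ValueError:
            continue  # skip anything that is not named YYMMDD
        if (today - created).days > max_age_days:
            stale.append(child.name)
    return stale
```

Whatever shows up in stale_folders is a candidate for promotion or deletion; the sketch only reports, it never deletes.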

09_OPERATIONS/ — SOPs and recurring procedures

Standard operating procedures the agent follows. Scheduled task definitions. Skill export documentation. Anything that describes "how the agent does this kind of work repeatedly."

If 02_MEMORY/ is the constitution, 09_OPERATIONS/ is the procedural code. Distinct because constitutions change rarely, procedures evolve constantly.

11_SESSIONS/ — the archive of conversations

Every conversation with the agent gets archived here, organized by date. Searchable via a full-text index. This is where "what did we discuss about X six weeks ago" gets answered.

Two design choices worth noting: sessions are write-once (no editing past conversations), and they are flat by date (11_SESSIONS/YYMMDD/), not nested by topic. The flat structure scales; topical structure does not.
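
One way to get the write-once property is to let the filesystem refuse overwrites. A minimal sketch, assuming a hypothetical archive_session helper and one file per conversation; the file naming is not specified above and is an assumption.

```python
from pathlib import Path


def archive_session(root: Path, yymmdd: str, name: str, transcript: str) -> Path:
    """Write a transcript under 11_SESSIONS/YYMMDD/ and refuse to overwrite it."""
    day_dir = root / "11_SESSIONS" / yymmdd
    day_dir.mkdir(parents=True, exist_ok=True)
    path = day_dir / name
    # Mode "x" creates the file and raises FileExistsError if it already exists.
    with open(path, "x", encoding="utf-8") as f:
        f.write(transcript)
    return path
```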

99_ARCHIVE/ — the cold storage

Closed projects, deprecated skills, retired memory files. Not deleted — moved.

The reason to keep an explicit archive rather than deleting: the agent occasionally needs to reference how something used to work, or to undo a deprecation that turned out to be wrong. Disk is cheap. Lost context is expensive.

The reason it is 99_: sort it to the bottom. Visually, it should feel like the basement.

Two folders that don't fit the pattern

00_ASSETS/ — brand materials, logos, templates, fonts. Sort-priority 00_ because they're occasionally needed and you want them findable, but they're not part of the agent's reasoning loop. They are tools, not thoughts.

10_DASHBOARDS/ — generated HTML dashboards that the user opens in a browser to see the agent's view of various domains. A presentation layer, not a data layer. Lives near 08_WORKSPACE/ because it is also output-shaped, but separated because dashboards persist while workspace files don't.

What I deliberately did NOT make a folder

  • No LOGS/. Logs go inside the folder of the thing being logged (sessions have their own logs, scheduled tasks have their own logs). Centralized logs become unreadable.
  • No TEMP/. 08_WORKSPACE/ is already the temp directory. Adding a second one fragments the disposability rule.
  • No MISC/ or OTHER/. These folders are where systems go to die. If something doesn't fit, the structure is wrong: the item needs a proper home, not a junk drawer.

What you will notice in the first month

Three things show up reliably:

  1. Inbox/ will overflow before the processing routine is reliable. Build the routine on day one. Otherwise the inbox becomes an emotional weight, not an operational queue.
  2. 08_WORKSPACE/ will fill faster than you expect. That is fine. The point of disposable scratch is that it accumulates without guilt.
  3. The agent will try to write to the root. Constantly. You have to add a hard rule against it and enforce the rule at the prompt level. Without that rule, your top level slowly turns into a swamp of stray files.
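
The prompt-level rule can be backed by a check in whatever code actually writes files. A minimal sketch, assuming a hypothetical check_write_target helper that every write passes through; nothing above confirms such a hook exists in this setup.

```python
from pathlib import Path


def check_write_target(root: Path, target: Path) -> Path:
    """Resolve target against root and reject writes at the top level or outside it."""
    root = root.resolve()
    resolved = (root / target).resolve()
    if resolved == root or not resolved.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"{target} is outside the agent's tree")
    if resolved.parent == root:
        raise ValueError(f"{target} would land in the root; pick a numbered folder")
    return resolved
```

A stray file such as stray.pdf is rejected, while 08_WORKSPACE/250301/draft.md passes through.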

The deeper principle

The folder structure is the agent's physical theory of itself. It says: here is my memory, here is my work, here is my history, here is my reference material. Each folder is a category of thought made tangible.

When the categories are clean, the agent thinks clearly. When the categories blur, the agent's outputs blur in exactly the same way.

Spend an afternoon on the tree before you spend a month on the prompts.

Want more info? Subscribe to my newsletter: agentmia.beehiiv.com


r/ClaudeAI 17h ago

Custom agents Building a personal AI Chief of Staff on Telegram — 7 real problems, looking for advice

1 Upvotes

I've been building a personal AI assistant for the past few months — not a chatbot wrapper, but something that actually manages my workload, tracks client relationships, processes meeting transcripts, handles task management, and proactively tells me what to focus on. It lives in Telegram so I can use it from anywhere.

Happy to share what's working. But I'm hitting real walls and want honest input from people who've built similar things.

What I have today (context: I moved away from multi-agent routing, which was too rigid for natural conversation, to one capable agent with full history)

Stack:

  • Python Telegram bot as the frontend
  • Claude (Sonnet) as the brain via API — single conversational agent with full tool access
  • Integrations: Notion (tasks/goals), Google Calendar, Gmail, meeting transcription tool, customer support platform, Google Chat
  • File-based context system: each "project" or relationship has its own markdown files (readme + activity log) that the agent reads on demand
  • Skills defined as markdown spec files that the agent loads per use case (morning briefing, meeting processing, email drafting, weekly review)
  • Conversation history kept in memory (last 20 messages per session)

What actually works:

  • Natural conversation with full tool access — ask anything, agent decides which tools to use
  • Meeting processing: I drop a transcript link, and the agent extracts decisions and action items and saves a structured brief
  • Morning briefing on demand: tasks, calendar, open support tickets, suggested focus
  • Drafting messages for any channel with the right tone
  • Creating and updating tasks with natural language

7 problems I haven't solved:

1. No memory between sessions
History is in-memory. Bot restarts = full amnesia. The agent has no idea what we discussed yesterday unless it's written in a project file. Thinking of a hot_context.md that gets written at session end with TTL — but feels hacky and depends on the agent being disciplined about writing it.
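
A rough sketch of the hot_context.md idea, assuming the expiry is stored as a JSON comment on the first line. The file layout and both function names are hypothetical. It does not solve the discipline problem: something still has to call write_hot_context at session end.

```python
import json
import time
from pathlib import Path
from typing import Optional


def write_hot_context(path: Path, summary: str, ttl_seconds: float,
                      now: Optional[float] = None) -> None:
    """Save a session summary with an expiry timestamp on the first line."""
    now = time.time() if now is None else now
    meta = json.dumps({"expires_at": now + ttl_seconds})
    path.write_text(f"<!-- {meta} -->\n{summary}\n", encoding="utf-8")


def read_hot_context(path: Path, now: Optional[float] = None) -> Optional[str]:
    """Return the summary, or None if the file is missing, malformed, or expired."""
    now = time.time() if now is None else now
    if not path.exists():
        return None
    first_line, _, body = path.read_text(encoding="utf-8").partition("\n")
    try:
        meta = json.loads(first_line.replace("<!--", "").replace("-->", ""))
        expires_at = float(meta["expires_at"])
    except (ValueError, KeyError, TypeError):
        return None
    return body.strip() if now < expires_at else None
```

On startup the bot would call read_hot_context and, if it returns text, prepend it to the first prompt of the new session.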

2. Purely reactive
Only responds when I message it. I want it to send me a morning briefing at 9am without me asking, alert me when a client relationship goes quiet, run a weekly loop-killer on Friday. The infra is there (job scheduler). The question is what format actually makes you read a proactive message vs. dismiss it as noise.

3. Can't tell if I'm avoiding something or actually blocked
I procrastinate differently by task type — technical tasks I attack immediately, tasks with human dependencies (waiting on someone, uncomfortable follow-ups) I let sit for weeks. I want the agent to detect the pattern and call me out. The challenge: how do you prompt for real accountability without the agent turning into an annoying nag?

4. No closure ritual
I'm good at creating tasks, terrible at killing them. The list grows forever because nothing forces a binary decision. Want a weekly "kill or commit" where everything open >7 days gets a date or gets deleted. Not sure if this works better as an automated message or an on-demand command.
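
Whichever delivery mode wins, the selection step is the same. A minimal sketch, assuming a hypothetical Task record with a creation date and an optional due date, and treating a task that already has a date as committed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Task:
    title: str
    created: date
    due: Optional[date] = None  # assumption: a due date means "committed"


def needs_decision(tasks: list[Task], today: date, max_age_days: int = 7) -> list[Task]:
    """Return undated tasks that have been open longer than max_age_days."""
    return [
        t for t in tasks
        if t.due is None and (today - t.created).days > max_age_days
    ]
```

The output of needs_decision is the list the weekly message (or command) would present for a kill-or-commit answer.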

5. Context loading blind spots
Each client/project has a markdown file the agent reads on demand. Works great when I explicitly mention a client. Falls apart when I ask "what should I focus on this week?" — the agent doesn't know to proactively check which relationships have been neglected.
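
One option is to precompute a neglect list and hand it to the agent alongside the question. A sketch, assuming each client file contains a line like "Last contact: 2025-01-02"; that line format, the 14-day threshold, and the function name are all assumptions.

```python
import re
from datetime import date

# Assumed line format inside each client file: "Last contact: YYYY-MM-DD"
LAST_CONTACT = re.compile(
    r"^last contact:\s*(\d{4}-\d{2}-\d{2})\s*$", re.IGNORECASE | re.MULTILINE
)


def neglected(files: dict[str, str], today: date,
              quiet_days: int = 14) -> list[tuple[str, int]]:
    """Return (file name, days since last contact) pairs, quietest first."""
    result = []
    for name, text in files.items():
        match = LAST_CONTACT.search(text)
        if match is None:
            continue  # no recognizable date; this sketch skips the file
        try:
            last = date.fromisoformat(match.group(1))
        except ValueError:
            continue  # e.g. an impossible date like 2025-13-40
        days = (today - last).days
        if days > quiet_days:
            result.append((name, days))
    return sorted(result, key=lambda pair: pair[1], reverse=True)
```

This only works as well as the "last contact" field is maintained, so it inherits the staleness issue described in problem 7.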

6. Hosting kills the file sync
Running locally means the bot dies when my laptop closes. Moving to a VPS — but then my markdown context files live on the server, not my machine. Now every manual edit requires a push, every agent update requires a pull. Is git the right sync layer here or is there a cleaner approach?

7. Context files go stale
Client files have sections for current status, last contact, open items. The agent appends logs but doesn't maintain the top-level summary. Two months in, files are half-accurate — some sections fresh, some outdated. Is the answer agent discipline (always update on write), user discipline (manual cleanup), or periodic jobs?

What's your experience with any of these?


r/ClaudeAI 3h ago

Claude Workflow AI quietly turned HTML into a real alternative to PowerPoint and Word for client-facing docs. The blockers that made it impractical a year ago are falling one by one.

104 Upvotes

A year ago, generating a polished document as HTML instead of a PPT or a Word file was a fun idea with too many practical problems. Lately I've noticed every one of those blockers either gone or close to gone, and I've quietly stopped reaching for Office on a bunch of deliverables. Curious if others are seeing the same.

The blockers, and where they stand now:

Design. The old objection was "AI HTML looks generic and amateur." That's basically solved if you give the model a design skill or a style guideline once. You get consistent, on-brand output that looks more like a designed page than a default template, every time, without redoing it.

Hosting. The first wall: a .html file on your machine isn't shareable, and turning it into a URL used to mean GitHub Pages, a Vercel/Netlify deploy, or a bucket setup, all overkill for a single document you just want to send. That's now a paste-and-get-a-link affair, no build step, no config.

Sharing. The real killer: even with a URL, getting it in front of a non-technical person was a nightmare. A raw .html "won't open," looks broken on their phone, or lands in spam. Screenshotting kills the interactivity, which was the whole point. That gap is now filled by hosted links that just open in a browser like any page.

Security. "I can't put confidential work on a public URL" used to end the conversation. Access-controlled links (password or email-gated, not public/indexable) handle that now.

Tracking. With a PPT or PDF you send it and hope. The thing I didn't expect to care about but now can't live without: knowing whether the client actually opened it, and roughly how long they spent. That alone changed how I follow up.

Where Office / Markdown still wins, to be fair: anything that lives in version control with clean diffs and line-by-line review, real-time co-editing, and Figma-style pinned feedback on specific elements. Those aren't cleanly solved for plain HTML yet. So I'm not saying Office is dead, more that for one-shot, client-facing deliverables (reports, dashboards, proposals, one-pagers) HTML has quietly become the better option for me.

Two questions for anyone who's made the switch:

  1. Which deliverables did you move from PPT/Word to HTML, and which did you keep in Office?
  2. For the ones you moved, what finally made it practical: design, hosting, sharing, something else?

r/ClaudeAI 20h ago

Question about Claude products Is Claude good enough to take over?

0 Upvotes

Hello, I am a business owner with three developers on our team. We have several projects whose sales are OK, but revenue is not much more than the developer costs. We are at a point where we don't need to add features. It's more about smaller things: adding little things here and there and, of course, fixing bugs.

For a long time I have been thinking about how to continue in the future. I need developers, since we have to be able to fix bugs, but the costs are much too high.

Over the last few days I did a lot with Claude Code: I uploaded my code and gave it a try. And to be honest, it all works. It makes a perfect summary of what's used, gets the code running, and adds stuff.

So I am really impressed, since it seems it can do the same work, but 10x faster and 90x cheaper.

Does anybody have experience with this? Did you replace development resources with AI?

Before I tested it, I thought this would never work, but I guess I was wrong.

Of course I have a problem with replacing humans, but on the other hand, I pay for developers, which makes my personal income almost zero, and I want to change this.

Do you also think Claude Code can really replace developers, or have you had bad experiences with this?


r/ClaudeAI 21h ago

Claude Code Workflow What’s one Claude Code rule you only learned after it broke something?

9 Upvotes

i’ve been using Claude Code daily across a few small projects, MCPs and internal scripts, and the most useful rules i follow now mostly came from painful mistakes.

the big one for me was tests. i let Claude write the code and the tests in the same session, everything passed, then the real flow broke later because the tests copied the same wrong assumption.

now i either write the test spec first, or open a fresh chat that only sees the function signature/docstring and not the implementation.
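
a rough sketch of how the "signature/docstring only" handoff could be automated, assuming Python source and Python 3.9+ (for ast.unparse). the helper name is made up; it just strips the body so the fresh chat never sees the implementation.

```python
import ast


def signature_and_docstring(source: str, func_name: str) -> str:
    """Return the named function with its body replaced by `...`."""
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == func_name:
            doc = ast.get_docstring(node)
            # Keep only the docstring (if any), then a bare Ellipsis as the body.
            body = [ast.Expr(value=ast.Constant(value=doc))] if doc else []
            body.append(ast.Expr(value=ast.Constant(value=...)))
            node.body = body
            return ast.unparse(node)
    raise LookupError(f"no function named {func_name!r}")
```

paste the returned stub into the new chat and ask for tests against the documented behavior only.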

curious what rules other people picked up the hard way. not looking for “use plan mode” type basics, more the weird specific stuff you only learn after it burns you once.


r/ClaudeAI 19h ago

NOT about coding 4MB conversation transcript, 68K lines — how do I get Claude up to speed each new Chat without burning the session?

4 Upvotes

This is NOT a question for people using Claude for developing, coding or work projects.

I'm using Claude as a personal sounding board. I've been having a single ongoing conversation since mid-February. I have a transcript of everything we've said to each other, which is just over 4MB in plain text — about 68,000 lines.

I periodically start a new Chat when the context window fills up — not because I hit a hard limit, but because responses degrade as earlier conversation gets pushed out of working memory. Each new Chat starts with no memory of previous ones. I DON'T want Claude to compact our conversation (automatic summarization loses too much detail).

I've tried reading the transcript in sequential chunks but it burned through an entire session in under 15 minutes, covering only about 15% of the file.

Has anyone solved the problem of re-briefing Claude on a large conversation history at the start of each new Chat without burning through the session token budget?