Looking for vibe coder with vibe management skills

tracyspcy@lemmy.ml · 5 hours ago

Looking for vibe coder with vibe management skills

pixxelkick@lemmy.world · edit-2 5 hours ago

It’s serious, and this is going to become more and more normal.

My entire workflow has become more and more Agile Sprint TDD (but with agents) as I improve.

Literally setting up agents to yell at each other genuinely improves their output. I have created and harnessed the power of a very toxic robot work environment. My “manager” agent swears and yells at my dev agent. My code review agent swears at the dev agent and calls their code garbage and shit.

And the crazy thing is it’s working: the optimal way to genuinely prompt-engineer these stupid robots is by swearing at them.

It’s weird, but it overrides the “maybe the human is wrong/mistaken” stuff they’ll fall back to if they run into an issue, and instead they’ll go “no, I’m probably being fucking stupid” and keep trying.

I create “sprint” markdown files that the “tech lead” agent converts into technical requirements, then I review that, then the manager+dev+tester agents execute on it.

You do, truly, end up focusing more on higher level abstract orchestration now.

Opus 4.6 is genuinely pretty decent at programming now if you give it a good backbone to build off of.

LSP MCPs so it gets code feedback
Debugger MCPs so it can set breakpoints and inspect call stacks
Explicit whitelisting of the CLI commands it can run, to prevent it from chasing rabbits down holes and getting lost
Test-driven development to keep it on the rails
A “manager” orchestrating agent to avoid context pollution
A designated reviewer agent that has a shit list of known common mistakes the agents make
A benchmark project to get heat traces of problem areas in the code (if you care about performance)

This sort of stuff can carry you really far in terms of improving the agent’s efficacy.
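
For the CLI whitelisting item, a minimal Python sketch of the idea might look like this. The allowlist contents and the `is_allowed` helper are hypothetical, made up for illustration; this is not the API of any real agent framework, just one way to gate commands before they run.

```python
import shlex

# Hypothetical allowlist (illustrative only, not from any real tool).
# Maps a program name to the subcommands it may run; None means any arguments.
ALLOWED_COMMANDS = {
    "git": {"status", "diff", "log"},
    "pytest": None,
    "ls": None,
}

# Reject anything that could chain, pipe, or redirect into another command.
SHELL_METACHARACTERS = set(";|&><`$")


def is_allowed(command_line: str) -> bool:
    """Return True only if the command matches the allowlist exactly."""
    if any(ch in SHELL_METACHARACTERS for ch in command_line):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes and similar parse errors are refused outright.
        return False
    if not tokens:
        return False
    program, args = tokens[0], tokens[1:]
    if program not in ALLOWED_COMMANDS:
        return False
    allowed_subcommands = ALLOWED_COMMANDS[program]
    if allowed_subcommands is None:
        return True
    return bool(args) and args[0] in allowed_subcommands
```

The point of the design is that the default answer is "no": the agent can only run what you have listed, so it cannot wander off installing packages or rewriting git history while chasing a failing test.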

Bane_Killgrind@lemmy.dbzer0.com · 19 minutes ago

Dude this boils down to “moving a hundred people is simple, I am a trained pilot and I used this 747 to move them”

Like great, you have the thousands of hours of training time required to understand a machine of that complexity and produce results.

Joe dirt has 8000 hours in his puddle jumper, and that’s the majority of the people these 747s are being foisted upon. They know how to fly, and they provide that service reliably.

Telling them to move 5 people with a machine whose capacity and range they don’t need is irresponsible.

MangoCats@feddit.it · 1 hour ago

What I have found: all that stuff that was evolving over the last 30 years: roadmap definition, sprint planning, unit tests, regular independent code reviews, etc. etc. etc. that those of us who “knew what we were doing” mostly looked down on as the waste of time that it was (for us), well… now you’ve got these tools that spew out 6 man-months of code in a few hours, and all those time-wasting code quality improvement / development management techniques… yeah, they apply, in spades. If you do all that stuff, and iterate at each quality gate until you’ve got what you’re supposed to have before proceeding, those tools actually can produce quality code - and starting around Opus 4.6 I’m not feeling the sort of complexity ceiling that I was feeling with its predecessors.

Transparency is key. Your code should provide insights into how it is running: insights the agent can understand (log files) and insights you can understand (graphs and images, where applicable). If it’s just a mystery box it’s unlikely to ever do anything complex successfully, but if it’s a collection of highly visible white boxes in a nice logical hierarchical structure - Opus 4.6 can do that.
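
As a rough illustration of "insights the agent can understand", here is a minimal Python sketch that writes one JSON object per log line, so an agent can read the log back without guessing at the format. The `JsonLineFormatter` class, the `make_logger` helper, the `context` field, and the logger name are all assumptions for this example; only the standard `logging` and `json` modules are real.

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Format each record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # "context" is a hypothetical field passed through `extra=`.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name: str, stream) -> logging.Logger:
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


# Usage: an in-memory buffer stands in for a real log file here.
buffer = io.StringIO()
log = make_logger("pipeline.resize", buffer)
log.info("resized image", extra={"context": {"width": 640, "height": 480}})
```

In a real project the stream would be a file the agent is allowed to read, with one logger per "white box" so the hierarchy of the code shows up in the logger names.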

Unit tests seem to be well worth the extra time invested - though they do slow down progress significantly, they’re faster than recovering from off-rails adventures.

Independent reviewer agents (a clear context window, at a minimum) are a must.

If your agent can exercise the code on the target system, and read all the system log files as well as the log files it generates, that helps tremendously.

My latest “vibe tool” is the roadmap. It used to be “the plan” - but now the roadmap lays out where a series of plans will be deployed. As the agent works through a plan, each stage of the plan seems to get a to-do list… Six months ago, it was just to-do lists, and agents like Sonnet 3.5 would sometimes get lost in those. Including documentation, both developer facing architecture and specifications (for the tests), and user facing, and including updating of the documentation along with removal of technical debt in the code at the end of each roadmap plan stage also slows things down, and keeps development on track much better than just “going for delivery.” So, instead of 6 months of output in a day, maybe we’re making 2 months of progress, in a day, and generating about 10x the tests and documentation as we would have in those 2 months traditionally - in a day of “realtime” with the tool. 40:1 speedup, buried under 500:1 volume of documents created.
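
Spelling out the arithmetic behind those ratios, as a back-of-the-envelope Python sketch: the 20 working days per month is an assumption added here, not a number from the comment, and the 500:1 figure above is a rough estimate rather than something this calculation reproduces exactly.

```python
# Assumption: roughly 20 working days in a month.
WORKING_DAYS_PER_MONTH = 20

# From the comment: about 2 months of traditional progress per day with the tool.
months_of_progress_per_day = 2

# From the comment: about 10x the tests and documentation for the same work.
doc_multiplier = 10

# Days of traditional work delivered per day with the tool.
speedup = months_of_progress_per_day * WORKING_DAYS_PER_MONTH

# Documentation produced per day relative to a traditional day.
doc_volume_ratio = speedup * doc_multiplier

print(f"speedup: {speedup}:1")
print(f"document volume: {doc_volume_ratio}:1")
```

Under these assumptions the speedup comes out at 40:1 and the document volume at 400:1, which is the same order of magnitude as the 500:1 quoted above.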

Oriel Jutty :hhHHHAAAH:@infosec.exchange · 53 minutes ago

roadmap definition, sprint planning, unit tests, regular independent code reviews, etc. etc. etc. that those of us who “knew what we were doing” mostly looked down on as the waste of time that it was

You sound insane.

MangoCats@feddit.it · 38 minutes ago

Insane, yet reliably employed in the field for 30+ years - first and current job for more than a decade.

cub Gucci@lemmy.today · 4 hours ago

I am genuinely trying to keep up with things, but what I see is completely different from what you’ve been describing

My recent experience with launching a swarm (3-4 Claude Opus agents) ended in a fiasco: a simple task ate $15-20 Claude credits in less than ten minutes. It does indeed look like science fiction, but it doesn’t produce anything.
In my current role as a team lead, I had to review a lot of code and I do what I haven’t ever done: decline the whole PRs as they contain a lot of architectural changes that complexify the system in order to achieve the goal.
I write much less code with Claude code these days, mostly because I don’t trust it and have to recheck every single scenario. I trust junior engineer in our team more than I trust this instrument.

MangoCats@feddit.it · 54 minutes ago

ate $15-20 Claude credits in less than ten minutes.

Lay off of MAX mode.

Also, if you’re paying API rates, look into the subscription options - I can’t burn the $200 subscription plan down much below 50% without pushing prompts into Claude every waking hour (unless I turn on MAX mode). At API rates? I can burn $50 in a few hours.

do what I haven’t ever done: decline the whole PRs as they contain a lot of architectural changes that complexify the system in order to achieve the goal.

If you’re accepting the first thing the agent gives you, you’re almost certainly “doing it wrong” - gate it before it goes down a bad rabbit hole and redirect it, in writing, in architecture documents (which it can draft for you, and correct based on your guidance) - and when it ignores those architecture documents, which it will do when things get big and complex, break the architecture documents down into smaller chunks that apply to the various tasks at hand - yes, it can do this breakdown for you too and that’s another opportunity for you to guide the process. I try to frame the output I get from AI in my mind as: usually about 80% correct / useful, and it’s my job to identify that other 20% (which, in reality, is getting a lot smaller lately), and beef up the specifications and descriptions of the job until it can get everything to an acceptable state.

I don’t trust it and have to recheck every single scenario. I trust junior engineer in our team more than I trust this instrument.

That would depend entirely on which junior engineer you are talking about, for me. I don’t trust Claude, either. But for the most part I have Claude check itself, at an appropriately granular level. If you’ve got more than 2000 lines of Claude’s code that doesn’t have good visibility into what it’s doing, why it’s doing it, and what the outputs should look like… you’re trusting it too much. But it can write that documentation and testing for you, you just have to review it - at an appropriate level. If you’re trying to do it line by line of code for a big project, maybe you should still be writing it yourself instead.

UnderpantsWeevil@lemmy.world · 4 hours ago

Opus 4.6 is genuinely pretty decent at programming now if you give it a good backbone to build off of.

Soup from a Stone.

MangoCats@feddit.it · 40 minutes ago

Opus 4.6 is genuinely pretty decent at programming now if you give it a good backbone to build off of.

Soup from a Stone.

To an extent, yes. The more “broth base” I feed Claude, the better it does. If I just vaguely describe a program, I get a vague implementation of my description. If I have a big, feature-rich example (or better, examples) of what I want the program to do, Claude can iterate until the output of the program it makes actually matches the examples.

tracyspcy@lemmy.ml · 4 hours ago

nah such narratives are mostly pushed by Ai companies (it is obvious they need to sell it as a business tool, not a personal buddy). Of course some managers/companies are buying into this narrative, and that is also understandable because the idea sounds like a panacea, especially if you sell it further to investors :) and so we see the whole circle of snake oil sales

FishFace@piefed.social · 1 hour ago

It’s not a “narrative”; it’s their experience. I don’t have the same experience, but do have experience of myself and colleagues using LLM agents effectively and doing more work reviewing their output than writing lines of code. Some colleagues are pretty much AI boosters, but most are very aware of its limitations.

PabloSexcrowbar@piefed.social · 4 hours ago

nah such narratives are mostly pushed by Ai companies

Someone’s personal experience is an AI company narrative now?

tracyspcy@lemmy.ml · 4 hours ago

it always was. look at people trying to automate everything with the help of ai bots. before ai companies started pushing this, none of these folks spoke about it or tried to reach the same goal with IFTTT or other tools that have been here for decades.

limer@lemmy.ml · 4 hours ago

Some people do stuff the ai is good for, simple tasks that have been done a lot online already.

I hate ai for coding, AI cannot work for me. I would never trust it to do anything

PabloSexcrowbar@piefed.social · edit-2 4 hours ago

I don’t think you understand the words you’re using…

Someone said “this is how I managed to make this work,” provided detailed explanations of it, and you’re dismissing it as propaganda rather than testing it for yourself. That is an unbelievably stupid stance.

MangoCats@feddit.it · 46 minutes ago

I don’t think a lot of people have a feel for the velocity of change… this time last year I evaluated the tools and they still felt like a waste of time for me. I looked again in August 2025 and things were… different. Not great, but you could see the potential, and the velocity of change. When Claude 4.6 dropped - whoa… not just code, it has been helping me draft plans for a new building (personal use) - I need to submit some paperwork to the county, they just hit me with a requirement for architectural elevation drawings, Claude is chewing on that problem for me right now, working from basic information about the roofline and a 2D floorplan. Oop - and it’s done, first pass took maybe 20 minutes, aaand… it’s not too bad, side elevations are quite good, I just need to remind it about the 6" roof overhangs. Front and rear are a little more funky looking, I’m guessing these will be ready after another couple of rounds of prompts, maybe 1 hour in total, as opposed to hiring an architect for the permit application… (now, will the county push back because I didn’t hire an architect? I sincerely hope not, they said photos or drawings - how am I supposed to get photos of a building that hasn’t been built yet?)

tracyspcy@lemmy.ml · 4 hours ago

you are escalating this too fast and taking it to a personal level. I feel you are close to bringing moms into this. So relax, let your ai buddy play with your parts. This chat is over.

PabloSexcrowbar@piefed.social · edit-2 2 hours ago

Sorry you can’t handle someone telling you that what you’re saying doesn’t make sense. Hopefully someday you’ll grow up enough to have your words challenged.

Edit: Oh, lemmy.ml. That explains everything.