And I gave you a concrete example of how LLMs both find and exploit these vulnerabilities. It’s quite evident that your disagreement stems from not having actually used these tools to find vulnerabilities.
What I’m saying here is that the way you actually use LLMs is by having them go through the steps of the exploit. It makes a hypothesis and then it tries it, and then you see the result. There’s nothing to be fooled by here because the steps it takes either work or they don’t.
The reason LLMs are much better at finding these vulnerabilities is because a human can’t keep a large codebase in their head all at once. If you look at a project like Lemmy for example, there’s a ton of code in it. You have to be an expert in what that code is doing, how the moving pieces relate to each other, and the domain itself to find the exploit. The LLM can zero in on the problems much easier, and actually take the steps to try the exploit. For example, for the case I mentioned with piefed, the issue was very subtle way the oauth token was being misused. It wasn’t localized in one place where auth was done, but manifested in a different part of the codebase that relied on it. Something like that would take a lot of dedicated work to find manually.
deleted by creator
And I gave you a concrete example of how LLMs both find and exploit these vulnerabilities. It’s quite evident that your disagreement stems from not having actually used these tools to find vulnerabilities.
deleted by creator
Yes, quite extensively in fact. That’s how I found a massive security hole in piefed that I mentioned earlier in fact.
deleted by creator
No, I’m a software developer.
deleted by creator
What I’m saying here is that the way you actually use LLMs is by having them go through the steps of the exploit. It makes a hypothesis and then it tries it, and then you see the result. There’s nothing to be fooled by here because the steps it takes either work or they don’t.
The reason LLMs are much better at finding these vulnerabilities is because a human can’t keep a large codebase in their head all at once. If you look at a project like Lemmy for example, there’s a ton of code in it. You have to be an expert in what that code is doing, how the moving pieces relate to each other, and the domain itself to find the exploit. The LLM can zero in on the problems much easier, and actually take the steps to try the exploit. For example, for the case I mentioned with piefed, the issue was very subtle way the oauth token was being misused. It wasn’t localized in one place where auth was done, but manifested in a different part of the codebase that relied on it. Something like that would take a lot of dedicated work to find manually.
deleted by creator