DDoS hit blog that tried to uncover Archive.today founder’s identity in 2023. […] A Tumblr blog post apparently written by the Archive.today founder seems to generally confirm the emails’ veracity, but says the original version threatened to create “a patokallio.gay dating app,” not “a gyrovague.gay dating app.”
https://www.heise.de/en/news/Archive-today-Operator-uses-users-for-DDoS-attack-11171455.html:
By having Archive.today unknowingly let users access the Finnish blogger’s URL, their IP addresses are transmitted to him. This could be a point of attack for prosecuting copyright infringements.



“As quickly as possible” pulls a lot of weight in my statement. Just as the EU can’t cut our dependence on US payment providers overnight, Wikipedia can’t do this overnight either. The best time to plant a tree was 10 years ago; the next best time is right now.
Cutting ties with archive[.]today takes a long time, but the longer the decision to cut them takes, the longer it will be until the ties are actually cut. It’s all about “make haste slowly”, i.e. do a lot of planning on how to actually cut the ties with minimal impact, so you can do it when forced to (for example, if the FBI were to take the servers one day) or when you decide that independence from archive[.]today is more valuable than the remaining impact of cutting the dependence. This could take half a year, a year, or more.
But indecision will at some point put you in a worse position: you are funneling your traffic to a malicious website that actively participates in DDoS attacks by using users’ traffic (including traffic coming from Wikipedia) to carry out the attack. Indecision can open you up to serious litigation and reputational damage by proximity. Given that archive[.]today crossed the line into malicious activity by misusing their traffic, what’s to stop them from malicious activity by misusing their content? IMO even if you think the integrity of your content and its sources is too valuable (and trust me, I think it’s very valuable), you need to consider this a warning sign and realise that nothing’s stopping archive[.]today from losing the editorial integrity that you rely on.
So my suggestion: brainstorm ideas that would make you independent. Make agreements with IA to improve retention, roll your own archiver, or make a deal with news orgs to show their articles as citations. (This last one I actually like most the more I think about it. A good negotiator can call it advertising for the news org, and at the same time you won’t infringe on copyright like archive[.]today does.) If you wait until the point of no return, the choice has already been made for you whether you like it or not. And the worst part is that you’d scramble to find a solution instead of the best solution.
Editors have been doing this for years.
The IA already lives on a razor’s edge in terms of copyright and is doing everything it thinks it can to push that. Many websites leave the IA be because having free, independent archives can benefit them, but it doesn’t take a lot for a copyright holder to say: “Hey, you’re hosting my IP verbatim, I sent you a takedown request, you didn’t comply, and I’m taking you to court.”
You can’t just “make agreements” for the IA to violate copyright law (more than it arguably already is). They’re already doing the best they can, and pushing them to do more would endanger Wikipedia even more. It’s not an exaggeration to say that the IA dying would be a project-wide apocalypse.
I’d bet it could be done if the IA went down, triggering a project-wide crisis, but among other things, I’m sure the Wikimedia Foundation doesn’t want to paint a target on its back. We’re very cautious when it comes to copyrighted material hosted on Wikimedia projects, and this would be dropping a fork into a blender for us.
I don’t think I understand this one. The Wikimedia project gets to host verbatim third-party news articles? This is creative but completely unrealistic; you’d be asking news organizations to place their work under a copyleft license for citing on Wikipedia (that’s what we host, except for minimal, explicitly labeled fair use material that has robust justification). It’d be a technical nightmare any way you slice it, and logistically it’d be a clusterfuck.
Even if you magically overcame those problems, Wikipedia exists to be neutral and independent, and this “wink wink nudge nudge ;)” quasi-advertising deal would look corrupt as fuck – us showing preferential treatment for certain sources not based on their quality but on their willingness to do us favors.
Here’s the thing: we know. This RfC is full of highly experienced editors deciding if Wikipedia is going to amputate. Option A means immediate, catastrophic, irreversible, mostly unfixable damage to Wikipedia. That is something that needs to be thought through, and your suggestions – which are appreciated for showing you’re giving it real thought – reflect that people who don’t regularly edit can’t really, viscerally understand how completely screwed Wikipedia is by this.
It would be just like the extant https://en.wikipedia.org/wiki/Wikipedia:The_Wikipedia_Library.
In the worst case, we could just run Megalodon on all the archive.today URLs.
I think you have a very severe misunderstanding of the Wikipedia Library, which I have access to and frequently use. The WPL allows active editors in good standing to access paywalled sources.
I can’t emphasize enough how absurd this comparison is. “Solar farms exist; building a Dyson sphere would be basically the same thing. Let’s get to work.” And the thing is: I wish you were right.
Edit: That said, if you ever need copyleft material, we do maintain Wikimedia Commons for media generally and Wikisource which is a transcribed digital library of free sources. Much narrower in scope than this, but I highly recommend them!
I am an active editor lol. I’m saying that the proposal is to establish something similar to TWL for media URLs. It would serve the same purpose for editors as a major complaint in the discussion was over addition of Archive.today links to bypass paywalls. Obviously developing this deal would take a lot of work but it is workable.
That’s not true. Anyone who meets the stats you mentioned may access TWL.
Indeed, that’s what makes it legally sound and prevents us from needing to relicense. We don’t need to license the content to copyleft for the thing to work.
Okay, then you’ll need to explain the annual emails I’ve gotten saying “Your application to the Wikipedia Library has been approved” after I apparently tripped and fell and filled out a manual form applying to the library every year.
It doesn’t seem selective once you meet the four aforementioned criteria, but you do need to manually apply.
The idea you’re talking about, meanwhile, is nonsensical and doesn’t address basically anything about the massive structural problems blacklisting archive.today imposes. I wholly support expanding out the Wikipedia Library, but even this pie-in-the-sky version of it falls too far short of what archive.today provides – and that’s just going forward in an ideal world where you can snap your fingers and make this fantasyland WPL happen as soon as archive.today is blacklisted.
The “backcatalogue”, so to speak, is what’s going to be the most catastrophic part of this by far. I spent years where my main focus was just on bringing dead sources back to life; I don’t know the full extent of how bad this is, but I know for damn sure what you’ve suggested (which won’t ever happen) undoes barely a fraction of the damage.