We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB), grouped by popularity.
This release includes the largest publicly available music metadata database with 256 million tracks and 186 million unique ISRCs.
It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.
damn, annas is fucking based. hope they stay safe…
300TB

Does anyone know if the Spotify metadata has BPM and keys?
Both. Per the SQL schema printed in the article, table track_audio_features has both fields tempo and key along with many other technicals. Worth checking out, it’s near the bottom of the page.
Mashup artist detected
BPM yes, keys I’m not so sure.
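For anyone who wants to poke at those fields, here is a minimal sketch of a tempo/key lookup. Only the table name track_audio_features and the columns tempo and key come from the schema mentioned above. The track_id column, the SQLite storage, the demo rows, and the idea that key is an integer pitch class (0 = C through 11 = B, -1 = not detected, as in Spotify's public audio-features API) are all assumptions that would need checking against the actual dump.

```python
import sqlite3

# Assumption: key is stored as an integer pitch class, 0 = C ... 11 = B,
# with -1 meaning "no key detected" (the convention of Spotify's public API).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key):
    """Translate an integer pitch class into a note name."""
    if key is None or not 0 <= key <= 11:
        return "unknown"
    return PITCH_CLASSES[key]


def tracks_in_tempo_range(conn, low, high):
    """Return (track_id, tempo, key name) for tracks with low <= tempo <= high."""
    rows = conn.execute(
        # "key" is quoted because KEY is also an SQL keyword.
        'SELECT track_id, tempo, "key" FROM track_audio_features '
        "WHERE tempo BETWEEN ? AND ? ORDER BY tempo",
        (low, high),
    ).fetchall()
    return [(track_id, tempo, key_name(k)) for track_id, tempo, k in rows]


def demo_connection():
    """Build a tiny in-memory stand-in for the real table (hypothetical rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE track_audio_features (track_id TEXT, tempo REAL, "key" INTEGER)'
    )
    conn.executemany(
        "INSERT INTO track_audio_features VALUES (?, ?, ?)",
        [("a", 90.0, 0), ("b", 124.0, 9), ("c", 128.0, -1)],
    )
    return conn


if __name__ == "__main__":
    for row in tracks_in_tempo_range(demo_connection(), 120, 130):
        print(row)
```

Handy for the mashup crowd: filter by a tempo window first, then group by key name to find tracks that might blend.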
Does anyone see the torrent links?
It says,
The data will be released in different stages on our Torrents page:
[X] Metadata (Dec 2025)
[ ] Music files (releasing in order of popularity)
[ ] Additional file metadata (torrent paths and checksums)
[ ] Album art
[ ] .zstdpatch files (to reconstruct original files before we added embedded metadata)
Yeah, I saw that and assumed they’d be torrents of the metadata.
Are these the actual music files you can use to play as well?
Would be amazing if it was. I would love to just have Spotify’s music on my nas
I’d wager 70% of what’s on Spotify is not worth preserving since it’s AI slop.
Interestingly enough, with the data they provide, figuring out how much of it is AI slop wouldn’t be that hard I think
Yeah as with most of the internet, it’s only worth downloading anything uploaded before 2023.
So far, LLMs have done so much more harm than help.
I’m not convinced AI slop can compete with the backlog of organic slop, personally.
But yeah a fuckton is probably slop either way
If your nas has 300tb spare, you could.
I read it as 30tb in my head. 300tb is a bit more than I can manage
How much could it cost, $10,000?
A RAID6 of 24 * 20TB drives could contain that with both parity and hotswap, with room to spare. Let’s say $400 per refurb drive, $2500 rackmount SAS enclosure, $2000 SAS RAID card, $14,100 total. Assuming you already have the server and power and SAS cables.
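The arithmetic in that estimate can be checked with a quick sketch. The prices are the ones quoted in the comment above. Reading "hotswap" as one hot spare drive is an assumption, as is ignoring the TB vs TiB difference and filesystem overhead.

```python
def raid6_usable_tb(drives, drive_tb, hot_spares=1):
    """Usable capacity of a RAID 6 array.

    RAID 6 spends two drives' worth of capacity on parity, and hot spares
    hold no data until a failure. hot_spares=1 is an assumption.
    """
    return (drives - hot_spares - 2) * drive_tb


def build_cost(drives, drive_price, enclosure, raid_card):
    """Total hardware cost, excluding server, power, and cables."""
    return drives * drive_price + enclosure + raid_card


usable = raid6_usable_tb(drives=24, drive_tb=20)
total = build_cost(drives=24, drive_price=400, enclosure=2500, raid_card=2000)

print(f"usable: {usable} TB")       # capacity left after parity and spare
print(f"total:  ${total:,}")        # drives + enclosure + RAID card
print(f"headroom over 300 TB: {usable - 300} TB")
```

With these inputs the array has 420 TB usable, which leaves 120 TB of headroom over the ~300TB archive, and the total matches the $14,100 figure.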

Considering the price per TB is 10-11 US dollars, it’s gonna cost $3,500 max
10 US dollars per TB?? 🤣🤣 More like 30-35€ per TB for a well-graded HDD!
Let’s not talk about SSDs or NVMe, which are more like 120€/TB.
I always hear people say that storage comes cheap nowadays… I’m still looking for that cheap HDD on Amazon… It has been 10 years 🤣🤣
$10/TB is a bit low, but not far off. Serverpartdeals has refurbed enterprise/NAS drives at about $15/TB right now, and that’s with the AI pressure driving up prices. I recall seeing 18TB drives around $12/TB a few months back.
The above is a well-loved vendor in the IT space. Much better place to buy from than Amazon, as they actually guarantee their inventory is legitimate.
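To put the per-TB figures from this subthread side by side, here is a rough sketch of what 300 TB of raw disk would cost at each quoted dollar price. It covers raw capacity only, with no parity, spares, enclosure, or controller. The 20 TB drive size used for the drive count is an assumption borrowed from the RAID example upthread.

```python
import math

ARCHIVE_TB = 300  # approximate archive size quoted in the post


def raw_disk_cost(price_per_tb, size_tb=ARCHIVE_TB):
    """Cost of raw capacity only, ignoring redundancy and other hardware."""
    return price_per_tb * size_tb


def drives_needed(drive_tb, size_tb=ARCHIVE_TB):
    """Minimum number of drives to hold size_tb with no redundancy."""
    return math.ceil(size_tb / drive_tb)


# Dollar prices per TB mentioned in the thread.
for price in (10, 11, 12, 15):
    print(f"${price}/TB -> ${raw_disk_cost(price):,}")

print(f"20 TB drives needed: {drives_needed(20)}")
```

So the spread between the $10/TB and $15/TB claims is $3,000 versus $4,500 for bare drives, before any redundancy is added.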
Is it that cheap now?? I would kill for 10tb
It’s nowhere near that cheap.
Here you go mate. They don’t have 10TB in stock, but they do have 20TB refurbs at around $15/TB.
The article claims that they are:
We backed up Spotify (metadata and music files).
Now make it streamable and make a stremio-like music client 🤞
Record labels themselves would march on foot to burn down the archives
I wonder why Spotify and not YouTube Music, Tidal or Apple Music all of which are higher quality.
They said this in the linked blog post:
A while ago, we discovered a way to scrape Spotify at scale.
Seems like reason enough to choose to scrape Spotify to me.
And Spotify’s catalogue is the broadest too.














