This post kinda fits (and kinda does not) both [email protected] and [email protected], so please forgive me if it feels out of place.

I used to host a “drive” for my collegemates (and juniors), which contained all the courses and related material (lecture notes (official, or ones I made), software, code, research articles, assignments, media, video lectures, books, tests, etc.). Earlier I used SharePoint (OneDrive), which our college had made a deal with Microsoft to get (1 TB for everyone). But at some point SharePoint started “faltering” (I am almost certain it was not related to the pirated books and such, though there was a small chance it was), so I switched to Proton (a yearly plan with the student discount, coming to around $2/month for mail and 15 GB of storage). Earlier this year I switched to Posteo as my mail provider (for multiple reasons, but primarily cost ($1 without discounts) and a simple IMAP setup), while keeping the files on Proton Drive. Now my Proton subscription has run out, and I do not plan to renew it.

I am not the only one who does this kind of thing, but I was the first in our department. Other departments (older than ours) already had long-running drives of their own, which essentially become useless once the original maker graduates, so new ones have to be made for each batch. Also, most other drives are hosted on Google Drive or SharePoint. I could consider the college SharePoint again, but Microsoft recently broke the deal, stating that our college was using too much storage (something like 50-60 TiB), and froze access for everyone. It was sorted out after weeks of downtime, with our limit cut to 15 GB (1.5% of the previous limit). I do not want that happening to my drive. I also take some pride in my drive, because it is arguably the best organised and has the most material (I find many books, or if they are not available, get them from the library or purchase them, then scan and OCR them; likewise assignments, which you are not meant to share with others, but I do, mostly because they serve as practice for end sems).

I want something “semi-anonymous” (anyone can see/upload stuff without accounts; only “semi” because tonnes of the material carries my personal identity, such as my name on assignments and tests), somewhat private and “piracy friendly” (so no mainstream options like Google Drive or OneDrive, especially since I currently do not have an account on either to upload with), and non-public (or at least hidden behind a very long, random URL that is unlikely to end up in search engine indices). I am willing to pay too, provided that once I have access to the drive, others can upload/download without accounts (people will likely not sign up for a random service; it is yet another step of friction). SharePoint was good in this respect: you could make a public link that anyone could upload to; they did not have edit access (delete/modify existing files) without logging in, but those operations are relatively rare.

**What I have considered, but cannot be done:**

  • a private git repo with a hosted web UI (something like Forgejo) - the actual amount of material is not that large (5 GiB if I count only coursework-related stuff, 80 GiB if I include other things, like offline Wikipedia/Stack Overflow/other wiki backups), so I could just get a 128 or 256 GiB SSD and connect it to an SBC (which I would have to buy, but that is practically a one-time cost). A rough sketch of this setup follows the list.

  • a private torrent

  • a Google Drive/OneDrive, but with the contents compressed and/or encrypted, so they cannot practically check the content

  • our college’s existing infra (made by the student welfare body), which already hosts tests for some of the courses, plus another website that hosts “some” (barely any) other resources, mostly lists of useful websites or lecture notes
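
To give an idea of what the git-repo option would look like: a minimal sketch, assuming Forgejo in Docker on the SBC with the data directory on the attached SSD (the image tag and paths are just examples, not something I actually have running):

```bash
# Rough sketch: Forgejo on an SBC, repos and config kept on the attached SSD.
# The image tag and the /mnt/ssd path are assumptions - adjust to whatever is current.
docker run -d --name forgejo \
  --restart unless-stopped \
  -p 3000:3000 \
  -p 2222:22 \
  -v /mnt/ssd/forgejo:/data \
  codeberg.org/forgejo/forgejo:11
# the web UI then lives on http://<sbc-ip>:3000, and git-over-ssh on port 2222
```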

**Why not possible?**

  • the intended audience is essentially tech-illiterate - many of them do not really know what files and folders are. They usually do not download anything (so they keep coming back to the drive to view things in browser previews). They cannot download something and then extract it, and most definitely not with the compression I usually use, which is zstd inside squashfs (why this specifically: it effectively decompresses only what it needs, like a single PDF from a folder of PDFs, so it saves storage at the cost of a slight CPU increase, and zstd keeps that overhead low; see the sketch after this list). On Windows you need 7-Zip for it, which most people do not have. They cannot decrypt things. They cannot torrent (95+% have no idea how torrenting even works; to them, it is just a way to acquire “Linux ISOs”). Of the remaining 5%, maybe 4.5% would not be able to work with a private torrent, something they have no experience with.

  • our college’s infra forbids most copyrighted material. They forbid tests that were not allowed to leave the exam premises. (In those cases, I used to memorise the questions and recreate the tests, or, when we went to check our answer sheets, I would try to take pictures of the question paper/answer sheets.) (This was not the case for the majority of courses, but it was a significant minority, mostly professors who teach a course multiple times and do not want to design new questions every year, so they just rehash old material.) They also forbid anything executable. (They have an allowlist of common multimedia formats, but from my minimal testing they just do a naive extension check, not a check of file metadata or headers, so I can fool it by making zips and appending a .txt or .pdf at the end.)
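
For reference, the packing itself is just a couple of commands (paths here are made up); the nice part is that a single file can be pulled out, or the image mounted, without decompressing everything:

```bash
# pack a course folder into a zstd-compressed squashfs image
mksquashfs courses/ courses.sqsh -comp zstd -Xcompression-level 19

# pull one PDF back out without unpacking the rest;
# only the blocks belonging to that file get decompressed
unsquashfs -d extracted/ courses.sqsh sem3/quantum/lecture01.pdf

# or browse the whole image in place (read-only) on linux
sudo mount -t squashfs -o loop courses.sqsh /mnt/courses
```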

Is there some other drive provider I could consider, with good usability and mostly anonymous? There are MEGA, pCloud, icecloud, and many others that provide somewhat private/anonymous viewing. Some of them have quite generous free plans too, so I could use some of them as backups. But most do not have the “QoL” features (for example, afaik MEGA does not do previews; it downloads in a weird way into browser memory and then to your filesystem; for the others there is little information available).

I could maybe consider something like archive.org, if I did the very tedious task of scrubbing my name from most of the material; and while making an archive.org account is easy, that is still a deterrent.

Edits:

PS: For anyone wondering about the legality or ethics of this drive - it is a mixed bag. I definitely have copyrighted works that I did not acquire properly, and I share material that I am not supposed to. Hence I decided to post to the piracy community.

On to ethics - many of our professors know about this. Most are fine with it. Some complain that I am ruining future students by putting almost everything in one easy spot, which leads them to skip classes and such. But I disagree: kids not going to classes depends only weakly on the availability of notes, and strongly on the nature of the course, for example the teacher’s interactivity or teaching style. Some professors even like the drive and use it themselves (as a reference for what was taught earlier, or for which questions they have already used). IMO their tests become better for it (less rote-learning material and more application-based questions, which are the kind you can keep coming up with new questions for).

A funny side story - my drive was used in a “mass cheating” case. It was one of our programming-related courses (we are not studying computer science; the course was effectively “how to use Python for some scientific computation”). Our prof was very chill: he allowed people to use the internet freely, and even allowed using ChatGPT (at that time only a few months old) to write the programs. Essentially, we were expected to know the “science” and formulate a good set of equations, then use Python (or any other language, for that matter), so allowing the internet/GPT just let people who are weak at programming but good at the actual science perform well too. He also knew about my drive and encouraged people to check it. (I had set up the packages for most people, and I had a good relationship with the prof before the course.)

During the midsem, a “eureka”-like moment hit. My drive would normally sync automatically once a day. What if, during the exam, it synced, say, every 10 minutes? Then my solutions (as I wrote them) would be available on the drive during the exam, and updated fairly frequently. But it would look odd for everyone to be checking a single drive, the same file, at the same time. So I also wrote small curl scripts for the Windows and Mac folks, which would pull the required file from the drive every 10 minutes or so (a rough sketch of the script is below). Now it would look as if everyone was just checking some other code, probably old code from class (it was an open book/notes/internet exam).

Was what I did technically rule-breaking? No - people were allowed to use my drive; I just automated much of that process. Was it ethical? Definitely not. I knew what I was doing was wrong, but I did it partly for the giggles. Also, I was not the one who came up with the idea (someone gave a rough proto-idea, which I developed further). What was the result? No one caught the cheating. Teaching assistants (TAs) find it normal for people to check other files, or to have a terminal process running. Not everyone used it - I was not checking anyone else’s work myself, but roughly 80+% of the class was using it. Someone snitched, though (I found out who fairly soon; he had also cheated). Luckily, no disciplinary action was taken against me (I could have been suspended, or even rusticated, since it was mass cheating), but the prof was very chill. He did scold me a lot, and did not talk to me in his usual jovial manner for a few weeks, but it eventually got okay. NOBODY (including me) got any punishment or point deduction.
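
For the curious, the polling scripts were roughly along these lines (this is the Mac/Linux shell variant; the URL and filenames here are placeholders, the real one pointed at the drive’s share link):

```bash
#!/bin/sh
# poll the drive every ~10 minutes and overwrite a local copy,
# so it just looks like a file you already had open
URL="https://drive.example.com/shared/prog-course/midsem.py"   # placeholder
OUT="$HOME/old_class_code.py"

while true; do
    # -s silent, -L follow redirects, -f treat HTTP errors as failures
    curl -sfL -o "$OUT" "$URL"
    sleep 600
done
```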

Edit 2 / PS2: While reading the last part again, I realised what I had done was something close to implementing an adblocker. You are allowed to close the ads manually (like checking the drive manually); it is only frowned upon if a program (like uBlock) closes all the ads for you (in that case by blocking the ads from even being fetched in the background; in this case by using a curl script to fetch the file in the background).

  • sga@lemmings.worldOP · 2 days ago

    I do not think Syncthing would scale well with ~200 people during exam season. Also, it would require everyone to download Syncthing (a practically impossible task) (I think the official Syncthing Android app is deprecated or something, and only forks are alive; no idea about iOS clients). Also, that would actually download everything, all from one server, which would be expensive (fetching one file, or one course’s worth of files, is relatively cheap compared to fetching all the course files). At that point, I might as well implement a private torrent.

    • misterbngo@awful.systems · 2 days ago

      As a note, I believe Syncthing will actually scale up with more nodes, since they will all share with each other if they know about each other. If you’re doing this one-to-many, then of course that is not the case.

      • sga@lemmings.worldOP · 2 days ago

        I did not know Syncthing could have multiple masters. Afaik, Syncthing had a master-slave architecture, where a folder on one device is the master and a folder on another is the slave (both can be true simultaneously; a folder can be used both as a source and as a sync target). If there is another folder, it can be a slave of the prior two, but not a master, because then you could get conflicting results (which master to pick). Do you possibly mean something like a pyramid/tree architecture, where a parent node has two daughter nodes, each daughter has two more, and so on? If so, that is even harder to set up (getting people to ask others to be their parent/daughter cell), and it has the problem that if some node is out of sync (because it is offline or something), its daughters and granddaughters will not sync either. A cyclic linked list is also possible, but again the chain can be broken, and it cannot be a doubly linked list either (two masters). Or is there some other way?

        • misterbngo@awful.systems · 1 day ago (edited)

          The way I’ve been using it for a few years is that most of my machines can see each other, and I have a shared folder and versioning set up. As I add things, they move between the different machines, and once an additional machine has something, it is available to the others, until everything is in sync.

          You can definitely do chain topologies, which are useful for certain things with a single source of truth.