This post kinda fits (and kinda does not fit) both [email protected] and [email protected]. Please forgive me if it feels out of place.
I used to host a “drive” for my collegemates (and juniors), which held all the courses and related material: lecture notes (official, or ones I made), software, code, research articles, assignments, media, video lectures, books, tests, etc. Earlier I used SharePoint (OneDrive), which our college got through a deal with MS (1 TB for everyone). But at some point SharePoint started “faltering” (I am almost certain it was not related to pirated books and such, but there was a small chance that it was), so I switched to Proton (I got a yearly plan with a student discount, coming to around $2/month for mail and 15 GB of storage). Earlier this year I switched to Posteo as my mail provider (for multiple reasons; the primary ones were cost ($1 without discounts) and a simple IMAP setup), but kept the files on Proton Drive. Now my Proton subscription has run out, and I do not plan to renew it.
I am not the only one who does this kind of stuff, but I was the first in our department. Other (older) departments already had long-running drives, which essentially become useless when the original maker graduates, so a new one has to be made for each batch. Also, most other drives are hosted on Google Drive or SharePoint. I could consider the college shared drive again, but recently Microsoft broke the deal, stating our college was using too much storage (something like 50-60 TiB), and froze access for everyone. This was sorted out after weeks of downtime, with our limit decreased to 15 GB (1.5% of the previous limit). I do not want that to happen to my drive. I also take some pride in my drive, because it is arguably the best organised and has the most stuff. I track down many books (if they are not available, I get them from the library or purchase them, then scan and OCR them), and I include assignments, which you are not meant to share with others, but I do, mostly because they serve as practice for end-sem exams.
I want something “semi-anonymous” (anyone can view/upload stuff without an account; “semi” because tonnes of stuff carries my personal identity, such as my name on assignments and tests), somewhat private and “piracy friendly” (so nothing mainstream like Google Drive or OneDrive, especially because I currently do not have an account with either to upload from), and non-public (or at least behind something like a very long and random URL, so it will likely not end up in search engine indices). I am willing to pay too, provided that once I get access to the drive, others can upload/download stuff without accounts (people will likely not sign up for some random service; it is yet another step of friction). SharePoint was good in this respect: you could make a public link to which anyone could upload. They did not have edit access (deleting/modifying existing files) without logging in, but those operations are relatively rare.
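To make the “very long and random URL” idea concrete, here is a minimal sketch in Python. The `/s/` prefix and the function name are made up for illustration; only the `secrets` module from the standard library is real.

```python
import secrets


def make_share_path(n_bytes: int = 32) -> str:
    """Return a hard-to-guess URL path segment.

    The "/s/" prefix is a hypothetical route, not something any
    particular drive provider uses.
    """
    # token_urlsafe() returns URL-safe base64 text built from n_bytes
    # of randomness from the OS's cryptographic random source.
    return "/s/" + secrets.token_urlsafe(n_bytes)
```

A link like this cannot realistically be guessed, but it only stays out of search indices as long as nobody posts it somewhere public.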
*What I have considered, but cannot be done:*
- a private git repo with some hosted web UI (something like Forgejo). The actual size is not that large (5 GiB if I count only coursework, 80 GiB if I include other stuff like offline wiki/Stack Overflow/other wiki backups), so I could just get something like a 128 or 256 GiB SSD and connect it to an SBC (which I would have to buy, but that is practically a one-time cost).
- a private torrent
- Google Drive/OneDrive, but with the stuff compressed and/or encrypted, so they cannot practically check the content
- our college’s existing infra (made by the student welfare body), which already hosts tests for some of the courses, and another website which hosts “some” (barely any) other resources, like useful websites or lecture notes
*Why not possible?*
- The intended audience is essentially tech illiterate: many of them do not know much about what files/folders are. They usually do not download stuff (they keep going back to the drive to view things in browser previews). They cannot download an archive and then extract it. They most definitely cannot do that for the compression I usually use: zstd in SquashFS. (Why this specifically: it effectively decompresses only what it needs, like one PDF from a folder of PDFs, so it saves storage at the cost of a slight CPU increase, and zstd helps lower that cost. On Windows you need 7-Zip, which most do not have.) They cannot decrypt stuff. They cannot torrent (95+% have no idea how torrenting even works; to them it is just a way to acquire “Linux ISOs”). Of the remaining 5%, maybe 4.5% would not be able to work with private torrents, something they have no experience with.
- Our college’s infra forbids most copyrighted material. It forbids tests which were not allowed to leave the exam premises. (In those cases, I used to memorise the questions and recreate the tests, or when we went to check our answer sheets, I would try to take pictures of the question paper/answer sheets. This was not the case for the majority of courses, but for a significant minority. It was done mostly by professors who teach a course multiple times, do not want to design questions every year, and just rehash old stuff.) It forbids anything executable. (There is an allow-list of common multimedia file types, but from my minimal testing they just do a stupid extension check, not a check of file metadata or headers, so I can fool this shit by making zips and then appending a .txt or .pdf at the end.)
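For reference, the difference between an extension check and a header check can be sketched like this. It is a minimal illustration and not how the college’s system works (that is unknown); the function names and the small signature table are my own, though the byte signatures themselves are the well-known ones for these formats.

```python
from pathlib import Path

# Leading "magic" bytes of a few common formats.
MAGIC = {
    ".pdf": [b"%PDF-"],
    ".png": [b"\x89PNG\r\n\x1a\n"],
    ".jpg": [b"\xff\xd8\xff"],
    ".jpeg": [b"\xff\xd8\xff"],
    ".zip": [b"PK\x03\x04", b"PK\x05\x06"],
}


def header_matches_extension(name: str, head: bytes) -> bool:
    """Return True if the first bytes of a file agree with its extension."""
    signatures = MAGIC.get(Path(name).suffix.lower())
    if signatures is None:
        # Unknown extension (e.g. plain text has no signature): reject
        # rather than guess. A real service would need its own policy here.
        return False
    return any(head.startswith(sig) for sig in signatures)


def check_file(path: str) -> bool:
    """Read the first bytes of a file on disk and compare them to its name."""
    with open(path, "rb") as fh:
        return header_matches_extension(path, fh.read(16))
```

With a check like this, a zip renamed to `something.pdf` is rejected, because its first bytes are `PK\x03\x04` and not `%PDF-`.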
Is there some other drive provider I can consider, one with good usability and mostly anonymous access? There are MEGA, pCloud, icecloud, and many others which provide somewhat private/anonymous viewing. Some of them have quite generous free plans too, so I could use some as backups. But most lack “QoL” features (for example, afaik MEGA does not do previews, and it downloads in a weird way, first into browser memory and then to your filesystem; for the others there is little information available).
I could maybe consider something like archive.org, if I do the very tedious task of removing my name from most of the stuff. Making an archive.org account is easy, but it is still a deterrent.
edits -
PS: For anyone wondering about the legality or ethics of this drive: it is a mixed bag. I definitely have copyrighted works which I did not acquire properly, and I share stuff which I am not supposed to. Hence I decided to post on the piracy subreddit.
On to ethics: many of our professors know about this. Most are fine with it. Some complain that I am ruining future students by providing most stuff in one easy spot, which leads them to skip classes. But I disagree with them. Kids not going to classes depends only weakly on the availability of notes, but strongly on the nature of the course, for example the teacher’s interactivity or teaching style. Some professors even like the drive and use it themselves (as a reference for what was taught earlier, or for questions they have already used). IMO, their tests become better (less rote-learning stuff and more application-based stuff, for which you can keep coming up with new questions).
A funny side story: my drive was used in a “mass cheating” case. It was in one of our programming-related courses (we are not studying computer science; this course was effectively “how to use Python to do some scientific computation”). Our prof was very chill and allowed people to use the internet freely, and even allowed using ChatGPT (at that time it was only a few months old) to write the programs. Essentially, we were expected to know the “science”, formulate a good set of equations, and then use Python (or any other language, for that matter), so allowing the internet/GPT was just a way to let people who were not good at programming but good at the actual science perform well too. He also knew about my drive and encouraged people to check it. (I had set up most of the packages for most people, and had a good relationship with the prof prior to the course.) During the midsem, a “eureka” moment hit. My drive would largely sync automatically once a day. What if, during the exam, it synced every 10 minutes, say? Then my solutions (as I wrote them) would be available on the drive during the exam, and updated fairly frequently. But it would look odd for everyone to check a single drive, the same file, at the same time. So I also wrote small curl scripts for the Windows and Mac folks, which would pull the required file from the drive every 10 or so minutes. Now it would look as if everyone was checking some other code (probably old code from class; it was an open book/notes/internet exam). Was what I did technically rule-breaking? No: people were allowed to use my drive, I just automated much of that process. Was it ethical? Definitely not. I knew what I was doing was wrong, but I was doing it partly for the giggles. Also, I was not the one who actually came up with the idea (someone gave a rough proto-idea, which I developed further). What was the result? No one caught the cheating.
Teaching instructors (TAs) find it normal for people to check other files or have a terminal process running. Not everyone used it. I was not copying from others, but roughly 80+% of the class was using it. But someone snitched (I got to know who did it fairly soon; he had also cheated). Luckily, no disciplinary action was taken against me (I could have been suspended, or even rusticated, since it was mass cheating), because the prof was very chill. He did scold me a lot, and did not talk with me in his usual jovial manner for a few weeks, but it eventually got OK. NOBODY (including me) got any punishment or point deduction.
edits 2 PS2: While reading the last part again, I realised that what I had done was something close to implementing an adblocker. You are allowed to close the ads manually (like checking the drive manually). It is only frowned upon if a program (like uBlock) closes all the ads, in this case by blocking them from even being fetched in the background (like using a curl script to fetch the file in the background).
What about something like nextcloud? You could link nextcloud to external drive/mount like an encrypted google drive for example but keep it tech illiterate friendly for upload/download and read and you could control how you enable the access?
Another idea: https://github.com/9001/copyparty
I could try these, but the problem is, I am graduating. I could set it up once, and maybe even give someone else (or myself) remote access to the hosting infra, but I would likely be less available to manage stuff.
Good solutions though. I could possibly try to make the latter solution work (managing Nextcloud is relatively harder imo, and I have no idea how I would mount an encrypted Google Drive to it). It still feels like something only I would have to maintain, but if I can get it to a set-up-and-forget stage (or something like annual maintenance), then I could consider it.
The only problem now is money. I would have to use a VPS for this kind of stuff. The SBC + SSD idea was something I had proposed to a junior, but hosting anything on college premises with college internet would have to “technically comply with copyright rules”. If it were a git-like or torrent-like solution, even the administrators would not be able to access the stuff (they are comparable in tech literacy; I am not talking about the people who manage our internet infra, but those handling copyright matters, like our library department). With a drive-like setup, they would be able to use it too.
Can I set up a password to access the stuff, a simple one common to all? Then only students who have the password would be able to use it. Maybe I can set up HTTP authentication. But then again I would fall back to: it is getting too tough for them to use.
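As a rough idea of what “one simple password common to all” could look like with HTTP Basic authentication, here is a minimal read-only sketch using only Python’s standard library. The user name, password, realm text and port are placeholders I made up. A real setup would more likely put a web server or reverse proxy in front and do the auth there.

```python
import base64
import hmac
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

SHARED_USER = "student"        # placeholder, not from the post
SHARED_PASSWORD = "change-me"  # placeholder, not from the post


def is_authorized(header_value, user=SHARED_USER, password=SHARED_PASSWORD):
    """Return True if an Authorization header carries the shared credentials."""
    if not header_value or not header_value.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header_value[6:], validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    expected = f"{user}:{password}".encode("utf-8")
    # Constant-time comparison, so timing does not leak the password.
    return hmac.compare_digest(decoded, expected)


class AuthHandler(SimpleHTTPRequestHandler):
    """Serve files from a directory, but only to requests with the password."""

    def _deny(self):
        # 401 plus WWW-Authenticate makes the browser show a login prompt.
        self.send_response(401)
        self.send_header("WWW-Authenticate", 'Basic realm="Course drive"')
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        if is_authorized(self.headers.get("Authorization")):
            super().do_GET()
        else:
            self._deny()

    def do_HEAD(self):
        if is_authorized(self.headers.get("Authorization")):
            super().do_HEAD()
        else:
            self._deny()


def serve(directory=".", port=8000):
    """Block forever, serving `directory` on the given port."""
    handler = partial(AuthHandler, directory=directory)
    ThreadingHTTPServer(("0.0.0.0", port), handler).serve_forever()
```

Calling `serve("/path/to/drive", 8000)` would start it. From the user’s side this is just a browser popup asking for a name and password once, which is about as low-friction as a password gets. Note that Basic auth sends the password merely base64-encoded, so it only makes sense behind HTTPS.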
You could still host nextcloud on your college infra but the copyright stuff on third party external storage https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage_configuration_gui.html
Like this your stuff is compliant and you allow folks to access a google drive for example that you or someone else can manage.
Access wise you could create a dummy user for anyone to access or let people manage their account https://docs.nextcloud.com/server/latest/admin_manual/configuration_user/index.html
The good thing is that it gives you flexibility and integration capabilities depending on how you go.
Maintenance-wise, I self-host it on my side with 2 Docker containers, 1 for the app and 1 for the DB, so it is very straightforward. I use it to share with family who are not tech literate.
I would seriously consider this, and try to implement this weekend.
Like this your stuff is compliant and you allow folks to access a google drive for example that you or someone else can manage.
People would not differentiate (in future, I would not be the one uploading stuff, newer batches would), so I would set everything to be uploaded to Google. I would still try to maintain 2 offline backups in case Google gets angry.
I use it to share with family that is not tech literate.
That is definitely reassuring.
I would still try to find out if I can host directly on campus, and not on Google or any 3rd party at all.
Seems things have gotten more complicated in the age of cloud computing. I think these archives have always been a thing. In the good old days sometimes on their infrastructure, buried several layers deep in some windows network share or on some specific computer in the computer lab or maintained by the student body of a faculty… And there was always some secret file stash somewhere.
If you’re concerned with a long-term solution: are there any entities run by the students? Associations or clubs interested in maintaining such a thing long-term? I mean, technology aside, the real issue is that this is done by random individuals and they’re gone after a while. Ideally this is done with some help from an entity that lasts longer than that, and passed down to future generations.
There is another resource run by the dev club, but not many of our department folks are in the dev club. Also, their solution is worse imo. They technically do have an index with search, but not a good one. Also, it is very slooooooooooooow. And this is when their collection is smaller than mine (and it gets contributions from other departments as well).
I understand it would be better done by an entity instead of a person, but the problem is that entities have to abide by institute rules, and then the whole lot of problems from “why not possible” point 2 applies.
I could get some of my juniors to form a small group within the department, but I do not think many of them would do anything unless given a motive, and there is no real motive to maintain a good database other than helping others. Entities can do “good things”, but individuals often do them for selfish reasons. That was partially the case with me. I used to make notes, then got tons of messages from classmates asking me to share them, and got fed up with sending stuff individually. Then I started sending stuff in the group chat, but without a good way to search chat history people would not find it and would ask me again. So I made a drive. A course comes along where people have to install software, but the actual instructions are very hard; I get messages; I make scripts to install it, or compile the end product and just ship it. You might think these are good deeds, but they are still selfish acts. I used to maintain a good directory structure anyway, so I might as well upload it.
The main issue is, you graduate as well and life will move on for you. You might move far away, get a full-time job, maybe have new hobbies or a family and time will come and you’ll stop supporting it as well. I’ve seen that all the time and most privately run things vanish sooner than later.
Of course the entities have to abide by the rules. We also did that… officially… It just happened to be the case that some of the same individuals also did other things after hours, and not in their role as members of the entity… And while mingling you’d find like-minded people and/or successors for the unofficial operations. It’s a bit tricky to get it right. The official entity of course denies any involvement; they can’t take any blame.
And I’d say if you’re the main/sole contributor of content, it’s questionable if this even survives long term. Unless people upload recent exams and material, the content will become obsolete after a few years. Professors will have changed the questions and assignments or the entire course is done by a new professor and the archive will slowly become obsolete. So you kind of need some community anyways. Or skip the hassle and just upload the thing to archive.org or some one click hoster.
Another option would be to talk to the dev club. Maybe they’d like to revamp their solution and take yours, or they have some idea about tech infrastructure.
some day you’ll graduate as well and life will move on for you.
I am graduating. That is why, when I leave, I want to leave stuff in a functional state so they do not have to start afresh. I did mention this in the post, but I wrote a whole lot more than I should have, and I do not expect anyone to read all of it.
you’ll move far away, get a full-time job, maybe have new hobbies or a family and time will come and you’ll stop supporting it as well. I’ve seen that all the time and most privately run things vanish sooner than later.
Absolutely. As I said, individuals work for selfish reasons, and once I leave, I will not have a selfish reason anymore.
And I’d say if you’re the main/sole contributor of content, it’s questionable if this even survives long term. Unless people upload recent exams and material, the content will become obsolete after a few years.
Yes, it does get obsolete. But our department is still relatively new (5th or 6th year since establishment), and hence most courses have not been taught by 2 or more profs. Much of it will stay relevant as long as the professors stay.
My juniors have started bugging me to get the drive working again (the new sem has started).
So you kind of need some community anyways.
I would have to pull some shit to form a subdivision of the department society. Then I could get a budget to either buy some drive subscription, or set something local, but set it behind some proxy, so it would appear not to be hosted in college (a reverse VPN, if you will).
I am graduating […]
Sorry, I read that after I replied and edited my comment, but a bit too late… That changes some things…
I agree. There are roughly two options: either a static archive as your legacy, or some writable file storage which can be kept up to date. And yeah, that needs payment, maintenance…
And from my observation, finding people willing to maintain something, or clean up after someone did some annoying things or filled up storage or whatever, is harder than setting up the technology.
Obviously that option would be preferable, though.
[…] or set something local, but set it behind some proxy
Maybe Cloudflare is your friend. They dominate the market of free reverse proxies / tunnels.
But I’m really unsure if I have any good recommendation that fits your situation. Ideally find a successor, next best thing is a Nextcloud, Google Drive, OneDrive or some of the other ones. And if that can’t be done, split it into manageable chunks by course and dump it to some one click hoster or archive.org. That’s all I can come up with.
And by the way, I did appreciate such archives and made use of them. And there’s a lot of reasons (cheating aside) to share notes, PDFs, try old exams to prepare…
cheating aside
that was just one case. I was stupid. Now I am not.
I did appreciate such archives and made use of them.
to share notes, PDFs, try old exams to prepare…
Exactly why I never said no to anyone. It is the age-old saying: we stand on the shoulders of giants. I remember falling ill, not being able to attend classes, and missing stuff, but then some friend would lend me their notes. The drive imo is just for that.
Every Google account comes with 15GB of free storage.
Make as many Google accounts as you need.
Store files in Google Drive up to the limit for each account, then add links to the other GDrive archives.
Organize the files as needed.
Maybe make a free web page in Wix, Wordpress, or whatever with links to the files easy to search and find.
Pass the user IDs and passwords for the GDrive accounts and web pages to whoever wants to be your successor.
You will be dependent on the generosity of Google, Wix, WordPress or whatever to not clean up the web pages and files due to inactivity.
So keep a backup. Torrents can be messy because they can be broken if there are no seeders.
If the content is static, then I’d recommend some older P2P filesharing like eD2K to keep one big zip/rar file backup shared among peers.
Thank you, but I would prefer not to host stuff on Google Drive. Some other provider could work though (as I said, I am willing to pay a reasonable amount).
So keep a backup. Torrents can be messy because they can be broken if there are no seeders.
If the content is static, then I’d recommend some older P2P filesharing like eD2K to keep one big zip/rar file backup shared among peers.
Yes, the content would be partially static. Most files will not change; mostly new stuff will be added, and some files will be updated.
And something like syncthing
I do not think Syncthing would scale well with ~200 people during exam season. Also, that would require everyone to install Syncthing, a practically impossible task (I think the Syncthing Android app is deprecated or something and only forks are alive; no idea about iOS clients). Also, that would actually download everything, all from one server, which would be expensive (fetching 1 file or 1 course worth of files is relatively cheap compared to fetching all course files). At that point, I might as well implement a private torrent.
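To put rough numbers on “expensive”, here is a back-of-the-envelope sketch. The ~200 people and ~5 GiB of coursework come from the post; the 200 MiB size of a single course is purely an assumption for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def total_transfer_gib(clients: int, bytes_per_client: int) -> float:
    """Total data the single server has to send, in GiB."""
    return clients * bytes_per_client / GIB


CLIENTS = 200              # from the post: ~200 people
FULL_COURSEWORK = 5 * GIB  # from the post: ~5 GiB of coursework
ONE_COURSE = 200 * MIB     # assumption: one course is about 200 MiB

# Everyone syncs everything vs. everyone fetches one course.
full_sync = total_transfer_gib(CLIENTS, FULL_COURSEWORK)
one_course_each = total_transfer_gib(CLIENTS, ONE_COURSE)
```

Under these assumptions, a full sync for everyone is about 1000 GiB of egress from one server, versus roughly 39 GiB if each person only fetches one course.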
As a note, I believe that syncthing will actually scale up with more nodes as they will all share with each other if they know each other. If you’re doing this 1 to many then this is not the case of course.
I did not know Syncthing could have multiple masters. Afaik, Syncthing has a master-slave architecture, where a folder on one device is the master and a folder on another is the slave (both can be true simultaneously: a folder can be used both as source and sink). If there is another folder, it can be a slave of the prior 2, but not a master, because then you can have conflicting results (which master to pick?). Do you possibly mean something like a pyramid/tree architecture, where a parent node has 2 child nodes, each child has 2 more, and so on? If so, that is even harder to set up (getting people to ask others whether they will be their parent/child node). This also has a problem if some node is out of sync (because of being offline or something): its children and grandchildren will not sync either. A circular linked list is also possible, but again the chain can be broken. And this cannot be a doubly linked list either (2 masters). Or is there some other way?
The way I’ve been using it for a few years is that most of my machines can see each other and I have a shared folder and versioning setup. As I add things they move between the different machines and once an additional machine has it it is available to the others until everything is in sync
You can definitely do chain topologies which are useful for certain things with a single source of truth
Resilio Sync has an option for selective sync - it’ll only sync files that a person selects.
It can be set read only (read/write is determined by the key someone is given, plus you can enable an approval mechanism and expiry of a key).
Not sure that really fits your use-case.
Thanks for the suggestion, but syncing is not the problem currently. I can use stuff like rsync, or rclone for drive providers. The problem is where to sync to.
intended audience is essentially tech illiterate
Use cryptomator to create an encrypted volume (a folder), that you can then share through whichever cloud you want (like mega or one of the bigger ones).
It is meant to be edited by others. I do not think many others will be able to use Cryptomator (afaik you would have to download the whole Cryptomator folder and then update it). Still, thanks!