Showing a good Flashback

Lessons from fixing some dumb af bugs in a self-hosted Nextcloud setup
Published on 05 May 2024
# nextcloud # tech # lessons

Nextcloud Ditch

Recently, folks from OSM Delhi suggested that we move to a complete FOSS suite and get away from Google Suite: we are “FOSS” United, yet we were making heavy use of GSuite. So the suggestion was to migrate our spreadsheets, docs, and other minor stuff from Google Suite to Nextcloud, and to self-host it.

Vishal told me about their conversation at DelhiFOSS, and I was pretty excited to self-host Nextcloud again. I had always been thinking about alternatives to all the non-FOSS software we use daily.

Around 20-25 days ago I self-hosted Nextcloud on a fresh Linode with 2 GB RAM and 1 CPU. That is obviously very slow for something like Nextcloud, even if the documentation mentions it as the minimum Nextcloud needs. When multiple people use the same instance, it's important to have higher specs, otherwise there are continuous session breaks and lags. While setting it up I also mistakenly left a security hole and did not notice it: anyone with a “fossunited.org” domain was able to slide into our Nextcloud instance. Vishal asked me to fix it.

So I came home, opened Nextcloud, added Admin Verification, and browsed through the extensions to see if there were any other options. Then I stumbled upon the SSO & SAML Authentication extension, and this is where the fun begins. I googled SAML and SSO. For those of you who don’t know, SSO is Single Sign-On: you log in once with a single set of credentials (for example a username and password), your details are verified in one place, and you’re allowed into the connected services without logging in again. SAML (Security Assertion Markup Language) is an XML standard that provides a secure pathway for authentication by exchanging digitally signed XML documents. Going ahead, I installed the extension :D and logged out to check the security feature I had enabled earlier, and after that I never got back in until I did the fix yesterday (4th April 2024).

I tried reading multiple threads on the Nextcloud forum and around the internet. Everyone had a similar error, like the one in the image below, but everyone had a different scenario causing it. I tried reading the logs, but as a very newbie System Admin (which would be a very wrong word lmao), I wasn’t able to fix the issue at all.

Nextcloud Error

So I reached out to one of my friends, aka my all-time tech support, Kayg, who is pretty experienced with all of these things. But because of MumbaiFOSS our schedules clashed and I wasn’t able to get on a call with him. Meanwhile, I had also asked Sahil for help, and that was pretty slow too. So, from Monday I did continuous calls with these folks to figure out and try solving the issue. For both of them, the issue was pretty new. The first day we were debugging the network, and we noticed that the server had firewall issues and wasn’t replying to pings. Then we tried tcpdump, switched firewalls, and disabled and re-enabled the firewall, but nothing seemed to work. The call with Sahil was mostly about fixing the Caddy service, but the issue was very weird and we thought of moving to nginx instead, as they weren’t very familiar with Caddy.

From the next day, the calls with Kayg were a lot of fun; he gave 2-3 hours every day for the last 2-3 days and I got to learn a lot. First of all, we fixed Caddy by running it inside a container instead of as a system service. After that, it was time for the original Nextcloud issues.

All of the containers were running fine, including Redis, the Nextcloud container, and Apache, which were our major suspects. Kayg suggested I change the Docker network as mentioned in this GitHub issue. I did so, but that did not help either.
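For reference, a check like this can be sketched with a small shell filter. This is only a rough sketch and not the exact commands we ran: `not_up` is a hypothetical helper name, and it assumes the `docker ps -a --format '{{.Names}} {{.Status}}'` output, where a running container's status starts with "Up".

```shell
#!/bin/sh
# Hypothetical helper: reads "name status..." lines on stdin and prints
# the names of containers whose status does not start with "Up".
not_up() {
    awk '$2 != "Up" { print $1 }'
}

# Usage (prefix docker with sudo if your user is not in the docker group):
#   docker ps -a --format '{{.Names}} {{.Status}}' | not_up
```

An empty result means every container reports as running, which was our situation: the containers were fine and the real problem was somewhere else.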

But the fun started when we invested 15 more minutes in this and Kayg remembered me telling him that I had installed the SAML Auth extension for Nextcloud. So we did a docker exec into the Nextcloud container and ran:

sudo docker exec --user www-data -it nextcloud-aio-nextcloud php occ app:disable user_saml 

We restarted the container and the service, and after a reload everything was back to normal. It was a damn funny moment and we both had a good laugh.
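If you end up in the same spot, the occ call above can be wrapped in a small shell helper. This is a minimal sketch: the `occ` function and the `NC_CONTAINER` and `DRY_RUN` variables are names I made up for illustration, and the container name is the one from the command above, so adjust it to your setup.

```shell
#!/bin/sh
# Container name taken from the command above; override it if yours differs.
NC_CONTAINER="${NC_CONTAINER:-nextcloud-aio-nextcloud}"

# Hypothetical helper: run an occ command as www-data inside the container.
# With DRY_RUN=1 it only prints the docker command instead of running it.
occ() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "docker exec --user www-data $NC_CONTAINER php occ $*"
    else
        docker exec --user www-data "$NC_CONTAINER" php occ "$@"
    fi
}

# Usage (prefix docker with sudo if your user is not in the docker group):
#   occ app:list                # see which apps are enabled
#   occ app:disable user_saml   # turn the SAML app off again
```

Running `occ app:list` first shows which apps are enabled, which is a quick way to spot a recently installed extension before you start digging into the network.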

It’s okay to sound like a pain in the ass

One thing I feel is always important when you are asking for technical help: you should be able to give a good Flashback of everything you did, from top to bottom, even if that annoys the other person to death. They might call you dumb or anything, but I guess that’s not very important in that situation, right?

I learned this during my initial days, when I used to break stuff almost every day, try solving it, give up, and reach out to people. Even if I was redirected to Google or other websites, folks used to listen to everything I had done from the first command or first step.

Imagine if I had not told both Sahil and Kayg about the extension I installed: we would’ve ended up forever debugging an issue that did not really exist, or even getting pissed off and reinstalling Nextcloud on a fresh new server.

I’d like to cite this from Julia’s blog:

State what you understand about the subject so far

So, being able to explain everything is a very important skill, and it also needs practice.