failed – nocin.eu

Proxmox

[Proxmox] Upgrade 7.4 to 8.0 – Failed to run lxc.hook.pre-start for container

September 11, 2023 September 11, 2023

After updating my Proxmox Server to PVE8.0, suddenly two lxc containers did not start anymore.

root@pve:~# pct start 192
run_buffer: 322 Script exited with status 2
lxc_init: 844 Failed to run lxc.hook.pre-start for container "192"
__lxc_start: 2027 Failed to initialize container "192"
startup for container '192' failed

I tried to view the error.log but couldn’t find any helpful information.

lxc-start -lDEBUG -o error.log -F -n 192

When googling, I stumbled across this reddit post. Although the issue was a bit different, I tried the recommended steps. The first command, directly led me to the right direction…

root@pve:~# pct mount 192
mounting container failed
directory '/mnt/nfs/data/folder' does not exist

For whatever reason, after restarting proxmox it did not mount the nfs shares properly on the host. And of course, after this hint, I noticed that both containers were trying to mount some of these folders, which were actually nfs shares from my NAS. A simple mount -a on the host fixed it immediately. Besides of this little problem, everything went well with the proxmox upgrade!

CAP

[CAP] Login to CF using a bash script

April 25, 2022 September 6, 2024

Create your script file, make it executeable and add it to your .gitignore as it contains sensitive information:

touch login.sh
chmod +x login.sh
echo login.sh >> .gitignore

Open the file and paste the following:

#! /bin/bash
cf login <<!
myemail@mail.com
mypassword
1 
!

With “1” you select your target space. Save your script and run it using:

./login.sh

After some time, it can happen that the default identity provider of the SAP BTP (SAP ID service) is asking for a password change. I don’t know exactly, but it seems to be every 90 days?!
The login process will fail with the following output:

$ ./scripts/login.sh
API endpoint: https://api.cf.eu10.hana.ondemand.com

Email: myemail@mail.com
Password: 

Authenticating...
{"error":"invalid_grant","error_description":"User authentication failed: PASSWORD_CHANGE_REQUIRED"}

To change your password, just go to https://account.sap.com or https://accounts.sap.com/, and it should directly open the password change screen.

Update 06.09.2024: The login can now also be done by completely using the cf command.

cf login -a https://api.cf.eu10.hana.ondemand.com -o myOrg -s mySpace -u myEmail@mail.com -p myPassword

Nextcloud

[Nextcloud] Docker update 21.0.7 to 21.0.8

February 17, 2022 December 2, 2024

And again… a Nextcloud upgrade failed. After a docker-compose pull and docker-compose up -d the maintenance mode won’t turn off. So I tried to turn it off manually, but I still couldn’t finish the update via WebGUI. I also couldn’t find any errors via the log.

docker exec --user www-data nextcloud-app php /var/www/html/occ maintenance:mode --off

So I triggered the upgrade again from the terminal and finally got an exception.

$ docker exec --user www-data nextcloud-app_1 php /var/www/html/occ upgrade
Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
Setting log level to debug
Turned on maintenance mode
Updating database schema
Updated database
Updating <user_ldap> ...
An unhandled exception has been thrown:
Error: Call to undefined method OC\DB\QueryBuilder\QueryBuilder::executeQuery() in /var/www/html/apps/user_ldap/lib/Migration/GroupMappingMigration.php:56
Stack trace:
#0 /var/www/html/apps/user_ldap/lib/Migration/Version1130Date20220110154717.php(54): OCA\User_LDAP\Migration\GroupMappingMigration->copyGroupMappingData('ldap_group_mapp...', 'ldap_group_mapp...')
#1 /var/www/html/lib/private/DB/MigrationService.php(528): OCA\User_LDAP\Migration\Version1130Date20220110154717->preSchemaChange(Object(OC\Migration\SimpleOutput), Object(Closure), Array)
#2 /var/www/html/lib/private/DB/MigrationService.php(426): OC\DB\MigrationService->executeStep('1130Date2022011...', false)
#3 /var/www/html/lib/private/legacy/OC_App.php(1012): OC\DB\MigrationService->migrate()
#4 /var/www/html/lib/private/Updater.php(347): OC_App::updateApp('user_ldap')
#5 /var/www/html/lib/private/Updater.php(262): OC\Updater->doAppUpgrade()
#6 /var/www/html/lib/private/Updater.php(134): OC\Updater->doUpgrade('21.0.8.3', '21.0.7.0')
#7 /var/www/html/core/Command/Upgrade.php(249): OC\Updater->upgrade()
#8 /var/www/html/3rdparty/symfony/console/Command/Command.php(255): OC\Core\Command\Upgrade->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#9 /var/www/html/3rdparty/symfony/console/Application.php(1009): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#10 /var/www/html/3rdparty/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand(Object(OC\Core\Command\Upgrade), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#11 /var/www/html/3rdparty/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#12 /var/www/html/lib/private/Console/Application.php(215): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#13 /var/www/html/console.php(100): OC\Console\Application->run()
#14 /var/www/html/occ(11): require_once('/var/www/html/c...')

Quick search on google and it seems that the image for 21.0.8 is broken and already withdrawn. Here and here. Great if users can still pull it from the docker repo…

I disabled the user_ldap addon via command line. This site helped me finding the right commands. After another occ upgrade and a few minutes, the instance finally came back online.

docker exec --user www-data nextcloud-app php /var/www/html/occ app:list
docker exec --user www-data nextcloud-app php /var/www/html/occ app:disable user_ldap
docker exec --user www-data nextcloud-app php /var/www/html/occ maintenance:mode --off
docker exec --user www-data nextcloud-app php /var/www/html/occ upgrade

In the end I also stumbled across the official nextcloud blog, where they announce 20.0.9 and that they had problems with 20.0.8…. But if the new docker image is not yet provided, and you can’t downgrade from your broken 20.0.8 back to 20.0.7, this doesn’t help you at all.

Nextcloud

[Nextcloud] Docker upgrade 20.0.10 to 21.0.3

July 21, 2021 December 2, 2024

As always the nextcloud update failed for me…

After a quick search I found this post. Seems like using mariadb:latest is not a good idea anymore. After switching to mariadb:10.5 and manually turning the maintenance mode off I could proceed the update process.

$ docker exec --user www-data nextcloud-app php /var/www/html/occ maintenance:mode --off
Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
Maintenance mode disabled

NAS

[TrueNAS] Replacing failed disk

March 29, 2021 March 31, 2021

Yesterday I got a mail about a failed disk in my TrueNAS system.

So I logged in and checked the Alerts and also zpool status.

Luckily I had an unused disk lying around. So just did a quick look in the TrueNAS Wiki (https://www.truenas.com/docs/hub/tasks/advanced/disk-replace/) and switched the drives…

After 5 minutes everything was done and the resilvering started.

It took about 4 hours for the 3TB disk. A zool clear poolname removes the error message and the pool was back online.

ZFS

[ZFS] Replace failed disk on my Proxmox Host

May 16, 2020 March 17, 2022

Yesterday evening I got an email that on my Proxmox server a disk had failed. In my ZFS Raidz1 I have 4 different drives of two manufactures: 2x HGST and 2x Seagate.
In the last 7 years I also used some Western Digitals. The only faulty hard drives I had in this years were from Seagate. This was the third… So this morning I bought a new hard disk, this time a Western Digital Red, and replaced the failed disk.

SSH into my server and checked the zpool data. Because I already removed the failed disk, it’s marked as unavailable.

failed disk: wwn-0x5000c5009c14365b

Now I had to find the Id of my new disk. With fdisk -l, I found my new disk as /dev/sde, but there was no id listed.

sudo fdisk -l

To be sure I checked again with:

sudo lsblk -f

With disk by-id I now got the Id.

ls /dev/disk/by-id/ -l | grep sde

new disk: ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1CSDLRT
and again the failed disk: wwn-0x5000c5009c14365b

Before replacing the disks, I did a short SMART test.

sudo smartctl -a /dev/sde
sudo smartctl -t short /dev/sde
sudo smartctl -a /dev/sde

The new disk had no errors. And because it is a new disk, I don’t had to wipe any file systems from it.

So first I took the failed disk offline. Not sure if that was necessary, because I already had removed the disk.

sudo zpool offline data 2664887927330352988

Next run the replace command.

sudo zpool replace data /dev/disk/by-id/wwn-0x5000c5009c14365b-part2
/dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7K1CSDLRT

The resilver process for the 3TB disk took about 10 hours.