SQLAlchemyError error processing task CommitTask #132573

Open
MilleniumOfAssassin opened this issue Dec 7, 2024 · 76 comments

@MilleniumOfAssassin

MilleniumOfAssassin commented Dec 7, 2024

The problem

Logger: homeassistant.components.recorder.core
Source: components/recorder/core.py:882
Integration: Recorder (documentation, issues)
First occurred: 19:46:04 (1 occurrences)
Last logged: 19:46:04

SQLAlchemyError error processing task CommitTask()
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 882, in _process_one_task_or_event_or_recover
File "/usr/src/homeassistant/homeassistant/components/recorder/tasks.py", line 295, in run
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1181, in _commit_event_session_or_retry
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1207, in _commit_event_session
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2362, in execute
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2237, in _execute_internal
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2106, in _connection_for_bind
File "", line 2, in _connection_for_bind
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/state_changes.py", line 103, in _go
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 996, in _raise_for_prerequisite_state
sqlalchemy.exc.InvalidRequestError: This session is in 'prepared' state; no further SQL can be emitted within this transaction.

I have no idea what this error means. This is the first of many errors, and all of them involve the recorder. What can I do? This is the third time I have started from scratch, and every time this error comes up.

This is the second error:

Logger: homeassistant.components.recorder.core
Source: components/recorder/core.py:882
Integration: Recorder (documentation, issues)
First occurred: 19:46:04 (1 occurrences)
Last logged: 19:46:04

Unrecoverable sqlite3 database corruption detected: (sqlite3.DatabaseError) database disk image is malformed [SQL: UPDATE states SET last_reported_ts=? WHERE states.state_id = ?] [parameters: [(1733597135.6587698, 8740), (1733597062.7175126, 8592), (1733597110.6194406, 8689), (1733597110.6196215, 8690), (1733597110.9234178, 8691), (1733597110.923583, 8692)]] (Background on this error at: https://sqlalche.me/e/20/4xp6)
Traceback (most recent call last):
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1936, in _exec_single_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/default.py", line 938, in do_executemany
sqlite3.DatabaseError: database disk image is malformed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 882, in _process_one_task_or_event_or_recover
File "/usr/src/homeassistant/homeassistant/components/recorder/tasks.py", line 295, in run
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1181, in _commit_event_session_or_retry
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1207, in _commit_event_session
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2362, in execute
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2247, in _execute_internal
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/bulk_persistence.py", line 1627, in orm_execute_statement
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/bulk_persistence.py", line 357, in _bulk_update
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/persistence.py", line 912, in _emit_update_statements
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1936, in _exec_single_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/default.py", line 938, in do_executemany
sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) database disk image is malformed
[SQL: UPDATE states SET last_reported_ts=? WHERE states.state_id = ?]
[parameters: [(1733597135.6587698, 8740), (1733597062.7175126, 8592), (1733597110.6194406, 8689), (1733597110.6196215, 8690), (1733597110.9234178, 8691), (1733597110.923583, 8692)]]
(Background on this error at: https://sqlalche.me/e/20/4xp6)

and the 3rd one:

Logger: homeassistant.components.recorder.util
Source: components/recorder/util.py:314
Integration: Recorder (documentation, issues)
First occurred: 19:46:04 (1 occurrences)
Last logged: 19:46:04

The system will rename the corrupt database file //config/home-assistant_v2.db to //config/home-assistant_v2.db.corrupt.2024-12-07T18:46:04.908274+00:00 in order to allow startup to proceed

Every time this comes up I need to unplug the power cable. It runs for maybe an hour. After that, Home Assistant gets stuck and says the YAML data was not found, and the errors above come up.

Please help :(

What version of Home Assistant Core has the issue?

2024.12.1

What was the last working version of Home Assistant Core?

/

What type of installation are you running?

Home Assistant OS

Integration causing the issue

I dont know

Link to integration documentation on our website

No response

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

Raspberry Pi 5, 512GB SSD card

Additional information

No response

@home-assistant

home-assistant bot commented Dec 7, 2024

Hey there @home-assistant/core, mind taking a look at this issue as it has been labeled with an integration (recorder) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of recorder can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign recorder Removes the current integration label and assignees on the issue, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


recorder documentation
recorder source
(message by IssueLinks)

@sysdump

sysdump commented Dec 8, 2024

It looks like your SD card might be failing.
The logs point at the recorder database file getting corrupted.
I would check that the power supply is strong enough, and test on a different SD card.
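
To confirm the corruption from the database side, SQLite's built-in `PRAGMA integrity_check` can be run against a copy of the recorder file (for example the renamed `.corrupt` file mentioned in the log). The following is only a rough sketch, not an official Home Assistant procedure; the function name `check_sqlite_integrity` is made up for illustration, and it assumes you have copied the database somewhere Python can read it.

```python
import sqlite3
from pathlib import Path


def check_sqlite_integrity(db_path: str) -> list[str]:
    """Run SQLite's integrity check and return the reported rows.

    A healthy database yields exactly one row: "ok".
    (Hypothetical helper, not part of Home Assistant.)
    """
    # Open read-only through a file: URI so the check cannot modify the file.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as err:
        # Badly damaged files can fail before the check even starts.
        return [f"error: {err}"]
    finally:
        conn.close()
    return [row[0] for row in rows]
```

Anything other than a single `ok` row suggests the file itself is damaged, which points at storage or power rather than at the recorder code.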

@sadyxa

sadyxa commented Dec 10, 2024

I have the same problem. I have a 256GB SSD. The problem occurs at least once a day. Then I can't restart from the menu and have to disconnect the power supply.

@bdraco added the "problem in device" and "hardware" labels and removed the "problem in device" label on Dec 10, 2024
@nikplas

nikplas commented Dec 11, 2024

Same problem here: RPi 5 with a 240GB NVMe SSD.
Latest HAOS version.

It occurs at least once a day. I can't restart it via Developer Tools; it finds no config file, no history data, no add-ons, etc.!

The only thing that helps is powering it off and on again!

@nikplas

nikplas commented Dec 11, 2024

From the Logs:

Logger: homeassistant.components.recorder.core
Source: components/recorder/core.py:882
integration: Recorder (documentation, issues)
First occurred: December 10, 2024 at 23:35:05 (1 occurrences)
Last logged: December 10, 2024 at 23:35:05

SQLAlchemyError error processing task CommitTask()
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 882, in _process_one_task_or_event_or_recover
File "/usr/src/homeassistant/homeassistant/components/recorder/tasks.py", line 295, in run
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1181, in _commit_event_session_or_retry
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1207, in _commit_event_session
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2362, in execute
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2237, in _execute_internal
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2106, in _connection_for_bind
File "", line 2, in _connection_for_bind
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/state_changes.py", line 103, in _go
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 996, in _raise_for_prerequisite_state
sqlalchemy.exc.InvalidRequestError: This session is in 'prepared' state; no further SQL can be emitted within this transaction.

@nikplas

nikplas commented Dec 11, 2024

Next error log:

Logger: homeassistant.components.recorder.core
Source: components/recorder/core.py:874
integration: Recorder (documentation, issues)
First occurred: December 10, 2024 at 23:35:05 (1 occurrences)
Last logged: December 10, 2024 at 23:35:05

Unrecoverable sqlite3 database corruption detected: (sqlite3.DatabaseError) database disk image is malformed [SQL: SELECT states_meta.metadata_id, states_meta.entity_id FROM states_meta WHERE states_meta.entity_id IN (?)] [parameters: ('sensor.voltage_phase_1',)] (Background on this error at: https://sqlalche.me/e/20/4xp6)
Traceback (most recent call last):
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
sqlite3.DatabaseError: database disk image is malformed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 874, in _process_one_task_or_event_or_recover
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1020, in _process_one_event
File "/usr/src/homeassistant/homeassistant/components/recorder/core.py", line 1115, in _process_state_changed_event_into_session
File "/usr/src/homeassistant/homeassistant/components/recorder/table_managers/states_meta.py", line 63, in get
File "/usr/src/homeassistant/homeassistant/components/recorder/table_managers/states_meta.py", line 113, in get_many
File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 195, in execute_stmt_lambda_element
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2362, in execute
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/session.py", line 2247, in _execute_internal
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/orm/context.py", line 305, in orm_execute_statement
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1418, in execute
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/sql/lambdas.py", line 603, in _execute_on_connection
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1640, in _execute_clauseelement
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 2355, in _handle_dbapi_exception
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
File "/usr/local/lib/python3.13/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT states_meta.metadata_id, states_meta.entity_id
FROM states_meta
WHERE states_meta.entity_id IN (?)]
[parameters: ('sensor.voltage_phase_1',)]
(Background on this error at: https://sqlalche.me/e/20/4xp6)

@sadyxa

sadyxa commented Dec 12, 2024

Hi, problem found. The fault is in the NVMe drive. I restored HA onto a microSD card and it has now been running for more than 24 hours without a recorder error. I bought a new 512GB ADATA NVMe drive, and today I'm going to try running HA from it. I'll report back here after the weekend.
HW configuration:
NVMe shield for RPi5: Suptronics X1001 2280 M.2 NVMe
Drive: Patriot P300 128GB M.2 SSD (PCIe 3.0 x4 NVMe). Maybe the problem is here; I have to test it.

@nikplas

nikplas commented Dec 12, 2024

OK, thanks, please keep us posted. It's getting worse here: every few hours I have to power it off and on again!
HA and all integrations are updated to today's versions.

HW configuration:
RPi 5 4GB
Patriot 240GB SSD, model P310 NVMe (like sadyxa's, previous post)
Pimoroni PIM699 NVMe Base for RPi 5

@sadyxa

sadyxa commented Dec 12, 2024

Please try migrating the database to MariaDB. The problem could also be on the software side (the lightweight SQLite database), so you need to rule that out. MariaDB is recommended by many HA users.

@nikplas

nikplas commented Dec 12, 2024

I use an aftermarket power supply; this could also be the issue.

This weekend I will test it with the official 27W power supply.

@sadyxa

sadyxa commented Dec 12, 2024

I use original Raspberry power supply 27W and I have this problem. RPi5 would give information if there was a problem with the power supply.

@nikplas

nikplas commented Dec 16, 2024

An update for those interested in this problem:

  • I ordered the official Raspberry Pi power supply (27W) and tried it for 2 days with the same setup. The problem is the same: HAOS is not accessible and hangs, as mentioned in the earlier post. Powering off and on helps, but only for a few hours.
  • Next, I removed the NVMe drive (Patriot 240GB SSD, model P310 NVMe), leaving the Pimoroni NVMe base connected to the RPi 5.
  • Then I installed the latest HAOS on a 64GB SanDisk Extreme SD card (I used the imager on a Windows PC) and, after booting the RPi from the SD card, restored from the backup taken on the NVMe drive.

24 hours later, no issues. The RPi 5 runs very fast, with low CPU usage and about 25% of its RAM (4GB) in use.

Whether this is an NVMe hardware issue (this was a brand-new drive) or something else is still not clear.

@nikplas

nikplas commented Dec 16, 2024

@sadyxa
do you have any updates after replacing your nvme drive?

@sadyxa

sadyxa commented Dec 16, 2024

Hi, I have an update and final information.

The bug is in the NVMe drive from the manufacturer PATRIOT. I bought an ADATA 512GB, did a system restore from backup and everything is OK.
The system has lower operating temperature, does not report any errors and has fast response. Result: Do not buy PATRIOT brand NVMe SSDs.

@sadyxa

sadyxa commented Dec 16, 2024

I tried system restore on the microSD card as well, and there was no error for 12 hours. Then I migrated to a new ADATA NVMe drive.

@nikplas

nikplas commented Dec 16, 2024

Thanks for the updates.
It looks like something goes wrong with those SSDs from Patriot. Compatibility issues with the RPi 5?
Maybe other users of this SSD model can confirm it too.
I have exactly the same one as yours, so this cannot be a coincidence!
I wonder if Patriot will accept a return.
Keep testing and updating, please.

@nikplas

nikplas commented Dec 20, 2024

Nearly 1 week later, running the latest HAOS on the RPi 5 from the SD card, and everything works fine.

@sadyxa

sadyxa commented Dec 20, 2024

I got the same thing.

@wesleyRaposo

Hello everyone!

My HA is on a Raspberry Pi 4b, which is perfectly suited for my automations. It uses an aluminum heatsink and never exceeds 46°C in temperature, as I created an automation that keeps it below that (it uses an external fan).

I use a generic M.2 SSD (not NVMe) with an aluminum heatsink on the USB 3.0 port, as the recommendation would be not to use an SD card.

The SSD is on an adapter that is also generic (I removed one side of the case to install the heatsink).

The power supply is from a Raspberry Pi 5, which means it has plenty of power.

Despite all the care I take, I suffer from the same problem as you.

At least once a week, an SQLAlchemy error occurs, which causes a series of cascading errors and I am forced to restart the Raspberry Pi.

The hardware analysis does not point to a single problem, but from what I saw in the reports above, did you only stop having problems when you installed an SD card? Is that it?

Could the problem be with the SSD adapter?

Has anyone had this problem with an SSD, switched to a different manufacturer, and solved it?

Below are the accessories for my hardware to make the analysis easier.

RP Heatsink:
https://pt.aliexpress.com/item/1005006864095375.html?spm=a2g0o.order_list.order_list_main.98.6fdccaa4YC6XoF&gatewayAdapt=glo2bra

M.2 SSD Adapter:
https://pt.aliexpress.com/item/1005006711391765.html?spm=a2g0o.order_list.order_list_main.228.6fdccaa4a3CSt1&gatewayAdapt=glo2bra

SSD M.2:
https://pt.aliexpress.com/item/1005006573728977.html?spm=a2g0o.order_list.order_list_main.237.6fdccaa4a3CSt1&gatewayAdapt=glo2bra

SSD Heatsink:
https://pt.aliexpress.com/item/1005005166579960.html?spm=a2g0o.order_list.order_list_main.25.6fdccaa4a3CSt1&gatewayAdapt=glo2bra

Power Supply power:
https://pt.aliexpress.com/item/1005006540949845.html?spm=a2g0o.order_list.order_list_main.158.6fdccaa4a3CSt1&gatewayAdapt=glo2bra

@nikplas

nikplas commented Dec 20, 2024

Yes indeed, going back to the SD card (from the NVMe SSD) has been the solution for me.
My Pi has already been up and running for a week on the SD card with no issues at all.
I am amazed at how well and how fast the RPi 5 performs.

Our hardware setups are a bit different but, as you say, maybe the SSD adapter is the issue and not the drive itself.

When I get another NVMe drive I will test it thoroughly.
Maybe someone else has already tested this.

@florian-mollin

Same problem for me (It happens about once a day). I have the following setup :

  • Raspberry 5 (8GB)
  • Geekworm X1001 PCIe to M.2 NVMe SSD Shield Top for Raspberry Pi 5
  • Official raspberry 5 power supply
  • EMTEC X300 M.2 256 Go PCI Express 3.0 3D NAND NVMe

Has anyone found a solution? :(

@nikplas

nikplas commented Dec 22, 2024

Switching back to an SD card from the M.2 SSD has worked for me: up and running for 10 days already with no issues at all!

@florian-mollin

Thanks for the info, but I wish I'd used my SSD for reliability :(

@sadyxa

sadyxa commented Dec 22, 2024

Only replacing the NVMe drive helped me.
I had the same problem, with the same X1001 shield, and the drive was a PATRIOT. I now have an ADATA and have been running for a week with no problems. The microSD card also ran without problems. Today I migrated to a MariaDB database.

@1visionmaster

Same problem here (RPi 5, 8GB RAM, SK HYNIX 128GB M.2 NVMe). Going through the comments, the NVMe hardware explanation doesn't make sense to me. My setup worked from the beginning (1 year+), and a couple of weeks ago SQLite error logs showed up, resulting in a non-responsive web UI/SSH and the need for an RPi restart (once a day). It looks like a faulty SW update affecting specific hardware configurations... It's unlikely that a number of different NVMe drives failed at approximately the same time (I believe mine is not the only setup that worked flawlessly before the issues occurred)...

I totally agree. The system worked for more than a year; the issues started with 2024.12. Two to three times per day the logger throws an SQLite error, regardless of whether SD or NVMe is used.

Do what user "kfran78" did: change the connection to the USB 2.0 port and see if it improves. I found his test interesting. If we get any further confirmation, the problem may be some update to the USB port driver and not the SSD drives.

I wasn't able to quickly change to USB 2.0, so I decided to revert to HAOS 13.2. The system has been rock solid since then.

@kfran78

kfran78 commented Jan 10, 2025

Today I moved my NVMe drive from the HAT to an external USB 3 adapter: fresh install, restored the backup, and we will see.

@sadyxa

sadyxa commented Jan 13, 2025

Again, there was a problem and the system crashed once a day. I found out that it occurs at the time of backup. I am now testing the backup settings. I have dual backup -> HA native backup and OneDrive addon.

@berkh

berkh commented Jan 14, 2025

I have also seen the same error after upgrading to the latest 14.1 OS. And after reading this thread, I booted from the old version 12.4 and everything is working fine again.

In the terminal, I ran ha os info to check whether the old boot slot was still there (I have slot A, the old version from before upgrading to 14.1, and slot B, the latest), and then ran ha os boot-slot other. That worked for me and, like @wnawr0t, I don't think this is a hardware issue.

@kfran78

kfran78 commented Jan 14, 2025

No SQL crash for me since I moved my NVMe drive to USB 3.

@nikplas

nikplas commented Jan 19, 2025

No issue for me since I went back to the SD card from NVMe, already 6 weeks ago. The system is an RPi 5 with all the latest updates.

@sadyxa

sadyxa commented Jan 19, 2025

In my configuration, even downgrade to HAOS 12.4 did not help. This weekend I successfully migrated the entire Home Assistant system to PROXMOX.

@nikplas

nikplas commented Jan 24, 2025

It doesn't make sense anymore. Is this a software or a hardware issue?

sadyxa migrated to Proxmox after testing all possible combinations on the RPi 5, with different NVMe drives and different HAOS versions, while the issue persisted!

@wesleyRaposo

It doesn't make sense, but unfortunately it happens.
From history, it seems that those who use M.2 NVMe SSDs suffer more.
In my case, my M.2 SSD is a regular one (not NVMe). So, instead of having daily crashes, they occur more randomly, taking a maximum of a week to occur.
It seems to be a software problem, since some users migrated to the SD card and the problem stopped occurring.
Some users downgraded the version and had more stability, although it did not solve the problem 100%.

@nikplas

nikplas commented Jan 24, 2025 via email

@nikplas

nikplas commented Jan 25, 2025

In other words: RPi 5 with HA + NVMe HAT + NVMe SSD, DON'T DO IT.

@wesleyRaposo

Personally, I don't see any advantage in using NVMe.
My machine is a Raspberry Pi 4b with 4GB of RAM. It only uses a maximum of 1.5GB of RAM. I have a lot of automation (really) and the machine makes everything work smoothly. An SSD that is faster than a SATA shouldn't make any difference.

But that doesn't justify this problem that people are having. It's an error that needs to be analyzed and corrected, after all, NVMe SSDs are quite common now.

@eskrasic

I have absolutely the same issue with RP4. I am using a portable SSD. Rolling back to the previous OS version didn't help me. I guess it is not a pure OS regression but a mix of different updates. The only idea i have for now is to use an SD card instead of SSD.

@kfran78

kfran78 commented Jan 27, 2025

I confirm my earlier report: no SQL crash since I moved my NVMe drive to USB 3. No problem with an NVMe adapter on USB 3. With a HAT, don't do it. Performance is the same.

@sadyxa

sadyxa commented Jan 27, 2025

I've reinstalled the RPi 5 with RPi OS, where I run a VPN and WireGuard. A week without a single problem. So it must be a problem with HAOS and HA itself.

@wesleyRaposo

This problem is a real pain! It appears out of the blue!
It happened again today.
The worst part is that I can't even save the log (it doesn't allow me to export it to a file - it says it's not available), and when the system is restarted, the log files are clean.

@wnawr0t

wnawr0t commented Jan 29, 2025

Hi all,
Let me share my investigation results:

  1. The SQL issues are closely related to https://community.home-assistant.io/t/home-assistant-os-14-breaks-nvme-ssd-usage-on-rpi5/817499
  2. Once I/O errors for the NVME drive show up on the HDMI display (see the 1st screenshot from the above link), the DB errors start to appear in the HA log and some time later HA gets unresponsive (RPI5 hard reboot is required)
  3. I first downgraded HAOS from 14.1 to 14.0 and the NVME errors seemed to be gone for the first 24 hours, but they got back.
  4. Finally, I downgraded to HAOS 13.2. After that, all the NVME errors vanished and there are no SQL errors anymore (my RPI is up for the last 2 weeks).

HAOS 14.2 has just been released and I am curious if it fixes the issues of 14.0 and 14.1. Please share your observations.
Best
Wojciech

@wesleyRaposo

Some considerations:

  1. The problem is not exclusive to NVMe SSDs. My SSD is a regular M.2 (not NVMe) and the problem also occurs.
  2. I updated the OS to version 14.2 and the problem just occurred today.

@nikplas

nikplas commented Jan 29, 2025 via email

@wesleyRaposo

wesleyRaposo commented Jan 30, 2025

This will be my last attempt (or not, since I'm persistent) to get around the problem while it's still being solved.

I created an automation to capture the system's error messages.
It stores the assembled message in an "input_text".
Then it checks whether the word "SQLAlchemyError" occurs in the "input_text" string.
If it does, two scripts are executed:

  1. Send an alert message to my phone.
  2. Send a TTS message to my phone.

Then it clears the "input_text" and, finally, restarts the system completely (reboot).

Now I just have to wait for the problem to happen again.

  • The idea is to try to reboot the system before it freezes.

If anyone wants to replicate the experiment, here's the code for reference:

alias: HA - Capturar SQLAlchemyError e reiniciar sistema
description: ""
triggers:
  - trigger: event
    event_type: system_log_event
conditions: []
actions:
  - action: input_text.set_value
    metadata: {}
    data:
      value: >-
        Name = "{{ trigger.event.data.name }}" Level = "{{
        trigger.event.data.level }}" Message = "{{
        trigger.event.data.message[0] }}" Source = "{{
        trigger.event.data.source[0] }}"
  - if:
      - condition: template
        value_template: >-
          {{ 'SQLAlchemyError' in
          states('input_text.ha_ultima_mensagem_de_erro') | string }}
    then:
      - action: script.mensagem_notificacao
        data:
          campoAlarme: true
          campoTitulo: Importante!
          campoMensagem: >-
            Ocorreu o erro crítico "SQLAlchemyError". Reiniciando Home
            Assistant.
          campoDispositivo: S23
      - action: script.mensagem_notificacao_tts
        metadata: {}
        data:
          campoDispositivo: S23
          campoVolume: Alto
          campoMensagem: >-
            Atenção! Ocorreu o erro crítico "SQL Alchemy Error"! Reiniciando
            Home Assistant.
      - action: input_text.set_value
        metadata: {}
        data: {}
        target:
          entity_id: input_text.ha_ultima_mensagem_de_erro
      - delay:
          hours: 0
          minutes: 0
          seconds: 10
      - action: hassio.host_reboot
        metadata: {}
        data: {}
        enabled: true
mode: single

@eskrasic

Nice workaround, thanks for sharing!

@nikplas

nikplas commented Jan 30, 2025

A nice idea indeed, thnks for sharing

@wesleyRaposo

I made important adjustments to the code.
I restricted the string size to 255 characters so as not to overflow the input_text.
There was also a missing "target" when assigning the text to the input_text and this generates a fatal error.

alias: HA - Capturar SQLAlchemyError e reiniciar sistema
description: ""
triggers:
  - trigger: event
    event_type: system_log_event
conditions: []
actions:
  - action: input_text.set_value
    metadata: {}
    data:
      # The parentheses matter: Jinja filters bind tighter than "~", so without
      # them truncate() would apply only to the last operand (source[0]).
      value: >-
        "{{ ('Name = ' ~ trigger.event.data.name ~ '| Level = ' ~
        trigger.event.data.level ~ '| Message = ' ~
        trigger.event.data.message[0] ~ '| Source = ' ~
        trigger.event.data.source[0]) | truncate(254) }}"
    target:
      entity_id: input_text.ha_ultima_mensagem_de_erro
  - if:
      - condition: template
        value_template: >-
          {{ 'SQLAlchemyError' in
          states('input_text.ha_ultima_mensagem_de_erro') | string }}
    then:
      - action: script.mensagem_notificacao
        data:
          campoAlarme: true
          campoTitulo: Importante!
          campoMensagem: >-
            Ocorreu o erro crítico "SQLAlchemyError". Reiniciando Home
            Assistant.
          campoDispositivo: S23
      - action: script.mensagem_notificacao_tts
        metadata: {}
        data:
          campoDispositivo: S23
          campoVolume: Alto
          campoMensagem: >-
            Atenção! Ocorreu o erro crítico "SQL Alchemy Error"! Reiniciando
            Home Assistant.
      # Clearing the input_text is currently disabled (enabled: false).
      - action: input_text.set_value
        metadata: {}
        data:
          value: "- - -"
        target:
          entity_id: input_text.ha_ultima_mensagem_de_erro
        enabled: false
      - delay:
          hours: 0
          minutes: 0
          seconds: 10
      - action: hassio.host_reboot
        metadata: {}
        data: {}
        enabled: true
mode: single

@akira215

akira215 commented Feb 5, 2025

Exactly the same issue on an NVMe M.2 SSD running on a HAT on an RPi 5. I dug a lot without success. I just tried @wnawr0t's solution, which seems to be the most reliable:

ha os update --version 13.2 in the terminal.

I will post an update if there is any news after that.

Thanks a lot for this thread

@wesleyRaposo

Fellow users, I am pleased to inform you that my automatic reboot routine when the damned "SQLAlchemyError" occurs worked!
I was at the market shopping when my smartphone started "screaming" informing me of the occurrence of the error. Minutes later I was already accessing HA remotely!

As I explained previously, my routine does the following:

  1. Captures the error or warning messages from the system and places them in an "input_text".
  2. Reads the content of the "input_text" and checks if there is an occurrence of the expression "SQLAlchemyError" in it.
  3. Sends a notification message (in text) to my phone.
  4. Sends a TTS (voice) notification message to my phone.
  5. Waits 10 seconds (to give time for processing the previous commands).
  6. Reboots the system by executing the "reboot the host system" routine.

This is not a solution to the problem, but it is a palliative that, for me, is acceptable, because at least now I will not be frustrated because some automation stopped happening because the system froze and I did not know about it.

I will only add a small refinement so that the voice message does not play in the middle of the night, because the sound is loud and I do not want to wake up scared, I just want the system to start working again.
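
A rough sketch of how such a refinement could look: the TTS step is wrapped in an `if` with a `time` condition so that it only runs during the day. The 08:00 to 22:00 window is an assumption chosen for illustration; the script name and its fields are taken from the automation posted earlier in this thread.

```yaml
# Sketch only: replaces the plain TTS action inside the "then:" list.
- if:
    - condition: time
      after: "08:00:00"   # assumed start of the "daytime" window
      before: "22:00:00"  # assumed end of the "daytime" window
  then:
    - action: script.mensagem_notificacao_tts
      data:
        campoDispositivo: S23
        campoVolume: Alto
        campoMensagem: >-
          Atenção! Ocorreu o erro crítico "SQL Alchemy Error"! Reiniciando
          Home Assistant.
```

Outside that window the TTS step is skipped, while the text notification, the delay and the reboot still run as before.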

One point to note: my script only sends the messages once, but I received them three times in a row. This makes me believe that the "SQLAlchemyError" occurs when events are being recorded and that, when the message is sent, the attempt to record the new events triggers the error again; this kept happening until the system restarted.
The error is so serious that the commands my automation ran to restart the system were NOT recorded. My logs show the ZigBee devices changing status to "unknown", and entries only appear again once HA changes to "started".

The screenshot shows the notification I received on my phone. (I speak Brazilian Portuguese.)

Image

@akira215

Has anyone tried update 14.2? On my side everything has been running fine for a week on 13.2, but I get errors as soon as I switch to 14.x.

@wesleyRaposo

> Has anyone tried update 14.2? On my side everything has been running fine for a week on 13.2, but I get errors as soon as I switch to 14.x.

I don't remember which version of the OS I started with, but the error already existed. I've been using Raspberry Pi for less than 9 months.

I always keep my system up to date (I back it up beforehand to go back if necessary), so I'm already on 14.2.

Since I did the update the error hasn't occurred, but it hasn't been a week yet.

One thing I did differently was to run a data purge, leaving only 1 day in the database. The impression I had (and it may just be an impression) is that this made the system more stable. If the errors stop (and only time will tell), I'll make this a standard, because I'm not interested in analyzing how many times a light was turned on or off, or anything like that.

I'm currently running a weekly purge, keeping 5 days of data in the database.
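
A purge like this can be scheduled with the `recorder.purge` action. The following is only a sketch: the alias, the Sunday 03:00 schedule and `repack: true` are assumptions for illustration, while `keep_days: 5` matches the retention described above.

```yaml
alias: Weekly recorder purge (sketch)
description: ""
triggers:
  - trigger: time
    at: "03:00:00"        # assumed time of day
conditions:
  - condition: time
    weekday:
      - sun               # assumed day of the week
actions:
  - action: recorder.purge
    data:
      keep_days: 5        # keep only the last 5 days of history
      repack: true        # assumption: also rewrite the database to free disk space
mode: single
```

Changing `keep_days` or adding more weekdays gives the stricter variants mentioned elsewhere in this thread.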

Image

@wesleyRaposo

I haven't received the SQLAlchemyError error for a few days. However, since there have been several updates, both to the core and to the addons, I have restarted the system several times. This may have prevented the error from occurring, but it may also have been my database purge routine. Now it runs twice a week and I only keep two days of data.

However, another problem that really bothers me is when HA has no internet access.

I use this system precisely because it can work offline, but obviously, if there is internet available, I want the system to benefit from it.

But this isn't the first time that the system has CRASHED when it has no internet access.

Today I was without electricity for 20 minutes. My Raspberry Pi where HA runs has a UPS with a huge battery, which ensures that it will work for many hours. The telephone operator's modem/router also has its own UPS, but now I distribute the signal through a more powerful router and this one doesn't have power support yet. Because of this, HA was without internet access for 20 minutes. It was enough to end up crashing.

In addition to HA itself, which starts to display errors when it is disconnected, I also have the following integrations and add-ons that access the internet:

  • LocalTuya
  • Gmail
  • Alexa Media Player
  • My IP

I don't understand this weakness of HA in relation to the absence of internet and it bothers me a lot.
In my opinion, HA should "encapsulate" this internet access. If the signal is unavailable, the integrations and add-ons should be suspended, and only resume their access when HA signals that it is available.

For applications developed by the community, a document would be issued instructing on the updates necessary to meet this operating standard. This would guarantee the stability of the system.

I have no way of knowing if it was HA itself that crashed due to lack of internet or if it was a third-party integration. The log shows warnings and errors from everyone without anyone being flagged as responsible for the problem.

Has anyone else experienced this?

@wesleyRaposo

wesleyRaposo commented Feb 28, 2025

Fellows,
I reduced the database to the maximum by purging it every 3 days, leaving only 2 days of logs.
Even so, after several days, this hellish error crashed the machine again.

Some change they made kept my routine from restarting the machine automatically: every time it tried to notify me before restarting, it generated a new error and entered a loop.

This damned error is directly related to an attempt to write to the database when, for some reason, it becomes unavailable.

To try (again) to solve this, I configured HA to no longer record anything. Yes, there will be no more event history and I don't care. I just want the system to be stable.
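
For readers who cannot see the screenshots, one possible way to get this effect is an `exclude` filter in the recorder configuration. This is a sketch under the assumption that a catch-all `entity_globs` pattern is acceptable; it is not necessarily the configuration used here.

```yaml
# configuration.yaml (sketch, not confirmed as the configuration used in this comment)
recorder:
  exclude:
    entity_globs:
      - "*"   # assumption: match every entity ID so that no states are recorded
```

With a filter like this, history and logbook views will have nothing to show for the excluded entities.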

And if the error occurs again for some inexplicable reason, the restart routine will simply restart the entire system without giving any notification.

This stopped being a mere challenge and became a personal issue.

Inclusions I made in configuration.yaml:

Image

Image

System restart automation:

Image

The "conditions" are parameters I created using helper entities. For you, this condition doesn't even need to exist or it can be different.
The "value_template" is also specific. I recommend keeping only the first part, keeping the "truncate" at the end.
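
As a text alternative to the screenshot, a reboot-only automation could look roughly like the sketch below. It is not a transcription of the image: the alias is made up, the helper-based conditions are left out, and the log message is checked directly on the trigger instead of going through an input_text.

```yaml
alias: Reboot host on SQLAlchemyError (sketch)
description: ""
triggers:
  - trigger: event
    event_type: system_log_event
conditions:
  - condition: template
    # "| string" keeps the check working whether message is a list or plain text.
    value_template: >-
      {{ 'SQLAlchemyError' in (trigger.event.data.message | string) }}
actions:
  # No notification and no helper write, so nothing extra has to be recorded
  # before the reboot.
  - action: hassio.host_reboot
mode: single
```

Note that `system_log_event` is only fired when `fire_event: true` is set under `system_log:` in configuration.yaml.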
