Is anyone else having issues if ESP32 is powered solely by PoE? #715

Open
fredlcore opened this issue Jan 19, 2025 · 15 comments
Labels
hardware issue Problems related to (third-party) hardware issues help wanted

Comments

@fredlcore
Owner

Discussed in #714

Originally posted by parsley January 19, 2025
Hi!

In November 2024 I updated my BSB-LAN setup from an Arduino Due to an Olimex ESP32-PoE-ISO. One of my goals was to power it with PoE only.
But BSB-LAN on the ESP32 never ran as stably as on the Due and frequently rebooted, especially when I browsed the web UI. I was unable to debug this issue, but at some point I realized that it did not occur as long as my PC was connected via USB for debugging.
After that I "fixed" the issue by powering the ESP32-PoE-ISO with a 5V power supply. Today I have an uptime of 42 days on version 4.2.37-20241207010916. But I would still like to get rid of the extra 5V power supply...

PoE is provided by a Unifi USW-24-PoE rated at 44-57V (Pins 1, 2+; 3, 6-).
I'm not 100% sure, but I remember having instant issues when calling BSB-LAN web pages that trigger BSB bus communication and slightly less instant issues when calling pages like /C or the index page that do not trigger BSB communication.

Does anyone have the same issue or any clue why PoE is not sufficient?

Regards!

@fredlcore
Owner Author

fredlcore commented Jan 19, 2025

@parsley: I have created an issue from this, so it's easier for others to find and refer to it in case we find a solution.

Yes, unfortunately, that could be a hardware issue from Olimex which we couldn't completely solve yet because in the end none of my testers followed up with me :(.
See this issue over at Olimex for some background:
OLIMEX/ESP32-EVB#57
as well as here:
OLIMEX/ESP32-POE#49
and Olimex kind of confirmed that it could be the same issue:
OLIMEX/ESP32-EVB#57 (comment)

If you are willing to run some tests and are comfortable with editing BSB-LAN's source code, I'd be willing to guide you through the process. Because of course, if there is anything that I can do on the source code side, then I'm happy to do that. But since I don't have PoE myself, I cannot test it myself...

@fredlcore
Owner Author

fredlcore commented Jan 19, 2025

In order to figure out whether something can be done on the software side, these are the steps that would need to be tested:

  1. Could you replace bsb.cpp in the src/BSB folder with the bsb.cpp that Dan Koloff attached to one of his posts? Can you confirm that the problem of reboots goes away then? If yes, could you still also access the heater and get valid data? Or do his changes simply mean you can access the web-interface, but due to the invalid data coming in, it’s basically impossible to interact with the heater?

  2. Could you do the following tests with the BSB-LAN adapter removed(!) from BSB-LAN and after each code change revert to the original state before running the next test (of course, removing the adapter means you cannot access the heater, only the web-interface):
    2.1. Can you confirm that the problems of reboots are still there when the adapter is removed?
    2.2. Can you change line 25 in bsb.cpp from "serial = &Serial1;" to "serial = &Serial;" and confirm that the problem goes away and accessing the web-interface is possible without any problems?
    2.3. Can you change line 52 in bsb.cpp from "Serial1.begin(4800, SERIAL_8O1, rx_pin, tx_pin);" to "Serial1.begin(4800, SERIAL_8O1, 1, tx_pin);" and confirm that the problem goes away?
    2.4. If the problem goes away, could you test with pins 4, 36, 16, 13, 15, 2 and 14 and see which ones create a problem and which ones would work?

If all of this turns out to be true, the barrel connector / the PoE power-supply circuitry is somehow affecting pin 36 (and maybe other pins, depending on the outcome of 2.4.). If one of the pins from 2.4. work, I could change the RX pin in future hardware revisions of BSB-LAN if Olimex isn’t going to fix this somehow. And existing users could just use a small jumper wire, if they rely on powering the Olimex through the barrel connector or PoE.
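
To keep track of the pin tests from step 2.4, something like the following could be used on a PC. This is only a sketch: the names `serial_begin_line`, `PinResult` and `usable_pins` are hypothetical helpers and not part of BSB-LAN. It only generates the replacement line for bsb.cpp for each candidate pin and filters the pins where no reboot was observed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Candidate RX pins listed in step 2.4 of the test plan.
const std::vector<int> kCandidateRxPins = {4, 36, 16, 13, 15, 2, 14};

// Builds the replacement for the Serial1.begin(...) line in bsb.cpp
// for one RX pin (hypothetical helper, for note-taking only).
std::string serial_begin_line(int rx_pin) {
    assert(rx_pin >= 0);  // GPIO numbers are never negative
    return "Serial1.begin(4800, SERIAL_8O1, " + std::to_string(rx_pin) + ", tx_pin);";
}

// One row of the test log: which pin was tried and whether reboots occurred.
struct PinResult {
    int rx_pin;
    bool reboots_observed;
};

// Returns the pins for which no reboot was observed, in the order tested.
std::vector<int> usable_pins(const std::vector<PinResult>& results) {
    std::vector<int> ok;
    for (const PinResult& r : results) {
        if (!r.reboots_observed) {
            ok.push_back(r.rx_pin);
        }
    }
    return ok;
}
```

Calling `serial_begin_line(pin)` for each entry of `kCandidateRxPins` gives the line to paste into bsb.cpp for that run. As described in step 2, the code should be reverted to its original state before the next pin is tested.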

fredlcore added the "help wanted" and "hardware issue" labels Jan 19, 2025
@parsley

parsley commented Jan 19, 2025

First of all:
Wow! What an insanely fast response!
I'm sorry I won't be that fast with further responses. Mostly due to lack of spare time.

That said:
Thank you for the additional information.
I'm happy to support you and the project. Code editing is no issue and hardware tools are available as well. But my single ESP32-PoE-ISO is taking care of my family home heating right now. 🫣😉
I'll read through the links you posted and will see what I can do.

By the way: One reason why this topic pops up right now is that yesterday Youtube suggested this video: https://www.youtube.com/watch?v=NICRlC5gzBo It's two years old but new for me and sparked my desire to solve this issue...

@fredlcore
Owner Author

Well, it won't have to impact your heating in any way - just set the heating mode to "comfort" and the heater will just heat as usual. Once you have completed testing, simply return to whatever it was before, and no one will notice ;).
The thing is that it should be possible to do most of these tests within half an hour or so. After that, everything is back to normal (hopefully with a better idea of how to solve the reboots, which might jeopardize your family home heating much more ;)...

@uschindler
Contributor

uschindler commented Jan 19, 2025

Hi,
I started to use an OLIMEX ESP32 PoE ISO in October. At the beginning it worked well and it was really nicely integrated into the heater, just as you would wish:

Image

This worked well for about two weeks. Then it suddenly started to disconnect from Ethernet, but it was not rebooting! In the serial log, you could see that it reported "ethernet disconnected" and reconnected within milliseconds. This happened about once per day, mostly in the morning when the inside of the heater got hotter during warm-up after the night. Together with several possibly related disconnections caused by poor MQTT connection stability (which may have to do with lost packets on the LAN wire and a faulty implementation in the pubsubclient, which was unable to detect this status correctly; see issue #682 and PR #683), this looked like something I could live with.

On December 15th it started to get worse. The physical Ethernet disconnects started to happen every half hour, and later the device turned into an on/off flipper. The switch showed "connection on" and "connection off" all the time, and the DHCP server acknowledged the same IP over and over. The device was never rebooting.

The reason for this is well known, OLIMEX did not fix it, and there is only one solution (I did not try it, see below). The problem is that the OLIMEX needs to draw at least 12 mA of current from the switch, otherwise the switch may cut its power (this is called MPS, "Maintain Power Signature", in PoE). Therefore they added a larger 1 W resistor of 4.7 kOhm in newer versions of the board, in parallel to the ~48 V supply (52 V on my switch). If you calculate how much power is converted to heat, you will find that this is in the range of the 12 mA (about 11 mA at 52 V), but costs more than half a watt of power. The resistor gets really, really hot (it's on the underside of the board, a larger SMD one with "R41" printed next to it). The resistor is close to two electrolytic capacitors and the large black voltage transformer. Its heat warms both capacitors and the transformer, and possibly (according to forums) the capacitors fail sooner or later from the roughly 90 °C heat, especially in small enclosures or inside a Brötje heater.
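
As a rough check of the numbers in this comment, here is a minimal sketch that applies Ohm's law to the 4.7 kOhm resistor. It assumes the resistor sits directly across the PoE supply rails, which is how the comment describes it but has not been checked against the schematic here; the function names are illustrative only. At 52 V it gives about 11 mA and about 0.58 W.

```cpp
#include <cassert>
#include <cstdio>

// Current through a resistor placed directly across the supply, in milliamps.
// Assumption: the full PoE voltage is applied across the resistor.
double resistor_current_ma(double volts, double ohms) {
    assert(ohms > 0.0);
    return volts / ohms * 1000.0;
}

// Power dissipated as heat in the resistor, in watts (P = V^2 / R).
double resistor_power_w(double volts, double ohms) {
    assert(ohms > 0.0);
    return volts * volts / ohms;
}

// Prints current and dissipation for the 4.7 kOhm resistor at the
// voltages mentioned in this thread (44-57 V rating, 48 V nominal, 52 V measured).
void print_mps_table() {
    const double r41_ohms = 4700.0;
    const double supply_volts[] = {44.0, 48.0, 52.0, 57.0};
    for (double v : supply_volts) {
        std::printf("%4.0f V: %5.2f mA, %4.2f W\n",
                    v,
                    resistor_current_ma(v, r41_ohms),
                    resistor_power_w(v, r41_ohms));
    }
}
```

Calling `print_mps_table()` from `main` prints one line per voltage. The dissipation grows with the square of the voltage, so a switch that supplies 52 V heats the resistor noticeably more than one that supplies 48 V.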

OLIMEX Forum:

I disabled PoE on my switch and connected a USB device, but the Ethernet connection did not return to normal operation. It looks like in my case the intense heat from the resistor "melted" the transformer or one of the electrolytic capacitors around it; it may also have broken the Ethernet jack. Not fully sure! In my case the Ethernet is now .... completely damaged! As I did not want to buy a new ESP32, I switched over to WiFi with a USB power adapter that I connected with clamps to the 230 V output at the top right of the LMU in the picture above (there are some connectors labeled X2.1, which are also switched on/off together with the heater). My new installation does not look so nice anymore, but it is completely stable. No reboots, no disconnects. The Ethernet cable is still there in case I replace the broken board later. I placed the antenna behind the display unit so it can get a connection inside the Brötje box, which otherwise acts as a Faraday cage. I have no current photo of that; I will take one once it is open again. There you can see the 230 V connection to X2.1 and a Euro plug extension cord ("Eurosteckerverlängerung").

If your Ethernet part is still working without PoE (disabled in the switch, with USB connected), you may want to take a soldering iron and remove resistor R41 from the underside. It is completely useless, as the device together with the BSB-LAN adapter draws more than 12 mA. This saves between 0.5 and 1 watt of wasted power. According to the switch, the device with the resistor draws 2.7 W, so you can save a lot of power (and heat).

@fredlcore
Owner Author

Thanks, this is also mentioned in the video @parsley linked above, but this is a different (albeit similarly disturbing) error on behalf of Olimex.
I'd like to focus here on the initial issue because even if the PoE implementation is broken also otherwise, solving it would also help those users who power their Olimex with a barrel connector.

@uschindler
Contributor

uschindler commented Jan 19, 2025

The behaviour on my side was exactly as described above. Sometimes when connecting to the web page it returned nothing. At some point it was not even possible to upload new firmware because the connection broke.

The original reporter said, that PoE was unstable and the USB connection was stable. So this is the opposite of the issues mentioned above:

But BSB-LAN on the ESP32 never ran as stably as on the Due and frequently rebooted, especially when I browsed the web UI. I was unable to debug this issue, but at some point I realized that it did not occur as long as my PC was connected via USB for debugging.

So in my opinion, he has the well known PoE issue, so we can't solve this in software.

Or did I misunderstand the original issue description?

@fredlcore
Owner Author

Please take the time and read through the linked issues before making any assumptions. USB power supply and barrel connector supply are two completely different things.
It is a hardware issue, but it is not related to the "well known PoE issue". I have even linked to the comment where @DanKoloff explains what the issue is here, and yes, it would be solvable in software, the question being whether it can be done with the way the BSB-LAN adapter is designed.
In any case, it is not related to anything you have described. So please let's focus here on the issue that was raised by @parsley.

@parsley

parsley commented Jan 19, 2025

@uschindler thanks for the warning. Luckily I disabled PoE once I connected the 5V PSU. :-)

But on your second assumption you are wrong, whereas @fredlcore is right. This issue is not about the heat of that MPS resistor. (However, I must admit that the heat caused by that resistor will be the next issue to solve after the one raised in this thread.)

@fredlcore I'm surprised I already found the time to read through all the links you posted, but I actually did! :) The most interesting part for me was the hint towards serial RX. This rang a bell and I searched through my mails. I found a mail sent by me (24.11.2024 13:52) and an answer from you (24.11.2024 17:56), but I have to dig deeper and reinvestigate. So far I do not remember exactly what I found back then...

@parsley

parsley commented Jan 19, 2025

I forgot to explain/respond to this:

Well, it won't have to impact your heating in any way - ...

I'm afraid it's not that easy in my case: I gather room temperature of all rooms with my Timberwolf Server, calculate a mean value and send it via BSB-LAN to the boiler which heats accordingly. This works so well, that I do not use any per-room-valve-regulation anymore and by that could further increase the efficiency of my boiler.
That's why it's not so easy for me during winter to test this stuff. Sorry.

But nevertheless I'll keep investigating. If I should not come up with results during this heating season please feel free to ping me in spring! 😉

@uschindler
Contributor

Hi,

@uschindler thanks for the warning. Luckily I disabled PoE once I connected the 5V PSU. :-)

This is not necessary if you have the PoE ISO version. But when PoE is active and supplies 48 or 52 volts (with or without USB), the heat from the resistor burns the two electrolytic capacitors and the voltage converter (big black block) on the other side of the board.

But on your second assumption you are wrong, whereas @fredlcore is right. This issue is not about the heat of that MPS resistor. (However, I must admit that the heat caused by that resistor will be the next issue to solve after the one raised in this thread.)

Ok, all fine. To me it looked like there had been additional communication. I did not want to hijack this issue, but the story I posted above fits the issue's title EXACTLY: "Is anyone else having issues if ESP32 is powered solely by PoE?". This is what I did, and it worked well initially but then broke after a few weeks, until the Ethernet connection wasn't stable anymore. Regarding the heat issue, others have also reported that some Olimex PoE-ISO boards burned out in other ways, for example the voltage regulator becoming unstable and causing sudden restarts. So the above story describes a real issue with this piece of hardware very well. The overheating caused by the resistor leads to different outcomes, from failing Ethernet to power issues under high load to failure of the voltage regulator in combination with exploding electrolytic capacitors.

I am now connected to a stone-age (black) 5V SAMSUNG smartphone charger (not one of the newer, more powerful white ones) and gave up on using Ethernet. Rock solid: no crashes, no reboots, no network stability issues.

You also said that the issues went away when you connected a 5V adapter. So just for my personal information: Could you describe in which combination of connections the issues started (solely PoE, with USB brick, some other external 5V input?) and if they also started over time? For me on the day of the setup all went well and my solely PoE powered Olimex PoE ISO had no restarts and the BSB communication through the UART was rock stable.

The only difference from my case is this last sentence, and this is what @fredlcore is referring to: "I'm not 100% sure, but I remember having instant issues when calling BSB-LAN web pages that trigger BSB bus communication and slightly less instant issues when calling pages like /C or the index page that do not trigger BSB communication."

This issue did not happen for me with PoE only, with Ethernet without PoE, or with WiFi. So it seems to be only affecting some hardware variants and is caused by too low power on the serial UART. If this issue is meant to focus on that problem, then please change the issue description. Thanks.

Good luck in solving the issue. I would be interested, because I'd like to buy a new Olimex ESP32 PoE ISO and desolder the resistor. If there are related problems with the UART and the main power it should be solved!

@parsley

parsley commented Jan 19, 2025

Hi,

This is not necessary if you have the PoE ISO version. But when the PoE is ...

I sense you do differentiate between the ISO and the non-ISO version?

Ok, all fine. To me it looked like there had been additional communication.

Kind of "yes and no" at the same time, since the past communication was originally about some completely different topic:
Last year I had a look at the schematic of the BSB-LAN adapter because I hand-soldered my BSB-LAN adapter on a prototyping PCB. Back then I suspected that the RX path might have trouble reaching the HIGH/LOW levels specified for the ESP32. Reading through OLIMEX/ESP32-EVB#57, mentioned above by @fredlcore, I was curious what I had found and changed back then, since DanKoloff said he found an issue with random data generated by a partially powered CH340T. Since, besides mentioning the CH340T RX data, they also switch between Serial and Serial1 during debugging in the other issue, I suspected a possible connection. Perhaps the BSB-LAN adapter sometimes also generates weird random RX "data"?
However: I slightly changed the schematic so that it can actually reach a proper LOW level, but I still have the issue.

...the story I posted above fits the issue's title EXACTLY: "Is anyone else having issues if ESP32 is powered solely by PoE?".

Ok, now I understand what you meant. The difference is that my issue (as well as OLIMEX/ESP32-POE#49) appears immediately, even on new boards, as soon as the PoE-ISO board is powered solely by PoE. What is even more interesting is that the same effect appears on the EVB boards (which do not even have PoE) when they are powered solely through their barrel connector (OLIMEX/ESP32-EVB#57). I don't have an EVB, so that info was new to me, and it clearly points to an issue completely unrelated to PoE. (I think that is why @fredlcore asked whether you have read the thread about the EVB issue ;) )

Nevertheless it's good to know that (over time) the heat of the PoE MPS resistor may cause an issue with very similar behavior. In the other thread the first assumption was some power issue, and later on the suspect was the software acting weird on gibberish serial RX data. While my finding on the schematic might support the serial RX theory, your heat-dried electrolytic capacitors might support the power regulator theory.

... Could you describe in which combination of connections the issues started (solely PoE, with USB brick, some other external 5V input?) and if they also started over time?

On the day I switched from the Arduino Due based BSB-LAN setup to the ESP32-PoE-ISO based setup, I always had my MacBook connected via USB to the new board (with Ethernet and PoE already connected at the same time). I did that because I wanted to check the serial monitor output to make sure everything worked like it did with my old setup. And it did! No issue whatsoever as long as my Mac was connected. The trouble started immediately once I disconnected my Mac. But since the issue disappeared as soon as I reconnected my MacBook via USB to the board, I had no chance to investigate the serial monitor output. Soon I found out that the issue also disappeared when I just connected an old iPhone USB charger with a USB-A-to-micro cable. Hence I suspected a power issue with PoE and checked the CAT cable, PoE voltage etc., with no findings.

Since I already mounted my BSB-LAN on DIN rail in a cabinet I wanted to get rid of the phone charger. I took a KeyStone USB module, cut the 5V path on the tiny PCB inside the KeyStone module in between both USB connectors and soldered a pair of wires to GND and 5V on the USB connector facing the ESP32. These wires are connected to a MeanWell HDR-15-5 that now powers the ESP32 via the USB connector. On the other side of the KeyStone USB module I can connect my Mac. Since 5V is cut only GND, D+ and D- get connected to the Mac which is enough for communication.
At that time I also switched off the PoE supply on the port of my network switch to spare the extra heat.
Switching PoE on or off had no effect on my setup. As long as 5V is supplied on the micro USB connector of the ESP32 board, there is no instability whatsoever.
This is the setup that now has an uptime of 42 days with no issue.

Image
Image
Image
Image

For me on the day of the setup all went well and my solely PoE powered Olimex PoE ISO had no restarts and the BSB communication through the UART was rock stable.

By "communication through the UART" you mean the BSB bus UART with no PC connected via USB to the Olimex, right? That is interesting since I never had that. Perhaps @fredlcore might be interested in the exact BSB-LAN software version you used at that point, since it might give a clue whether or how the software is involved in this issue? Do you still know that version?

On a side note:
Today I originally started this topic as a discussion because I wanted to know if others had similar problems and hopefully a solution that could help me without spending too much time for an avoidable investigation. But since this was a familiar issue to @fredlcore he immediately converted the discussion into an issue. Which is great! (Albeit a solution would have been even better. 🤪🤣) However this means that I have to review and verify the observations I remember and assumptions I made. That said I hope everything I wrote today is correct and not altered by wrong memories of mine.

@fredlcore
Owner Author

To me it looked like there had been additional communication. I did not want to hijack this issue, but the story I posted above fits the issue's title EXACTLY: "Is anyone else having issues if ESP32 is powered solely by PoE?".

It may fit the headline, but not the description of the problem in the threads linked in my first post. But never mind.
And no, the only additional communication is the discussion in the threads that I linked to. The mails that @parsley is referring to were about observations of resistors on the Olimex in combination with the ones on my adapter, but these are irrelevant to this case because - again, as written above - it might well be that the problems are still there even if the adapter is removed. The irregular data is generated by a hardware issue with the CH340T, which is not completely unpowered and thus has some kind of "data leakage" on pin 36 that produces data BSB-LAN cannot handle. That's how Dan Koloff could reproduce the problem without even having a BSB-LAN adapter, just by running the software. Again, this is all discussed at great length in the linked issues, so let's not do any further guessing here but focus on the steps that I outlined above. If these are done and my assumptions turn out to be wrong, then we can continue with discussing other potential approaches.

For me on the day of the setup all went well and my solely PoE powered Olimex PoE ISO had no restarts and the BSB communication through the UART was rock stable.

If you mean by "communication through the UART" that you connected the Olimex via USB (which is the only way to access the UART, unless you go directly to the RX0/TX0 pins via a USB-TTL-adapter), then you power the Olimex (also) via USB. And again, as described in the linked discussion above, that's when all the problems go away. That is also exactly the workaround solution that I tell people to do if they encounter the problem, but this means that they cannot use PoE anymore – on a regular POE because PoE and USB-power are not allowed at the same time, and on a POE-ISO because it doesn't make sense, of course, to power the device from both sides, even if it is possible to do so.

So it seems to be only affecting some hardware variants

No, according to Dan Koloff in the linked threads, this affects all PoE-powered Olimex devices because of the hardware design. We cannot verify this anymore on your side since it seems that your Ethernet port has been destroyed. That's why, unfortunately, you won't be able to help us here, because in order to do the above-mentioned tests, the Olimex has to be powered via PoE. Powering it via 5V USB will "solve" the problem immediately.

@fredlcore
Owner Author

I'm afraid it's not that easy in my case: I gather room temperature of all rooms with my Timberwolf Server, calculate a mean value and send it via BSB-LAN to the boiler which heats accordingly. This works so well, that I do not use any per-room-valve-regulation anymore and by that could further increase the efficiency of my boiler.
That's why it's not so easy for me during winter to test this stuff. Sorry.

That's exactly my setup as well. But rest assured, if you set the comfort temperature setpoint to something like 21 degrees and then room influence (parameter 750) to zero, your heater will heat based on the outside temperature and the flow temperature and you won't notice any difference for some time. I frequently have to remove BSB-LAN to do some tests, and with these settings, the overall temperature hardly changes at all during the 30-60 minutes that I usually test - and this is about as long as the tests here will take as well.
Just make sure that you have the original software configuration ready so that you can flash it again once the tests are over. Then the whole test run will be seamless.

@uschindler
Contributor

uschindler commented Jan 20, 2025

Hi,

For me on the day of the setup all went well and my solely PoE powered Olimex PoE ISO had no restarts and the BSB communication through the UART was rock stable.

If you mean by "communication through the UART" that you connected the Olimex via USB (which is the only way to access the UART, unless you go directly to the RX0/TX0 pins via a USB-TTL-adapter), then you power the Olimex (also) via USB.

With UART I was referring to the serial chip that handles the BSB communication in combination with your adapter. As you can see in the picture above, initially my device was an OLIMEX PoE ISO with only an Ethernet connection, powered with 52 volts (according to the switch) and approximately 2.3 watts. No USB connector (except for debugging and the initial software upload). In the first few weeks after installation, there were no reboots and no failures in the BSB communication. The adapter worked perfectly fine. So it looks like there are still cases where the hardware works correctly. Maybe the higher voltage of 52 volts on my Netgear PoE switch, instead of the standard 48 volts, led to exactly 5 volts and not 4.9 volts as discussed in the threads above, so I did not see the issues. This has of course not been proven.

The problems started later (possibly due to the overheating), but there were still no restarts. The only thing that happened for me was random short disconnects from Ethernet, which got more frequent over time until the whole Ethernet socket more or less died / became unusable.

During the whole time, there was no USB power connected and BSB communication worked well. Of course (as I had no serial connection and the telnet connection was unstable) I cannot say with 100% certainty that there were no communication failures due to phantom bits on the BSB connection. But the system did not restart while powered via PoE during the 4 weeks I was on vacation in November; it just lost Ethernet from time to time.

And again, as described in the linked discussion above, that's when all the problems go away. That is also exactly the workaround solution that I tell people to do if they encounter the problem, but this means that they cannot use PoE anymore – on a regular POE because PoE and USB-power are not allowed at the same time, and on a POE-ISO because it doesn't make sense, of course, to power the device from both sides, even if it is possible to do so.

So when I buy a new PoE ISO, I will try to add the sleeps into the code. Unfortunately for me it is too late, you're right.

My observation with the damaged PoE due to overheating should maybe also be added as a warning. If you look on Amazon reviews and my linked forum threads, many people complain that the devices break after a few weeks with PoE only due to overheating.

So it seems to be only affecting some hardware variants

No, according to Dan Koloff in the linked threads, this affects all PoE-powered Olimex devices because of the hardware design. We cannot verify this anymore on your side since it seems that your Ethernet port has been destroyed. That's why, unfortunately, you won't be able to help us here, because in order to do the above-mentioned tests, the Olimex has to be powered via PoE. Powering it via 5V USB will "solve" the problem immediately.

See above. I had no obvious instability issues except disconnects on Ethernet. But as you can't see all errors without a serial connection, I might have been affected by the issue, too. But I had no restarts. The uptime counter on the "settings" page showed several weeks at the beginning. The software version was from around October 20th.

3 participants