Skip to content

Update Source.pm to address alarm bug #1350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: public/9.1
Choose a base branch
from

Conversation

pcosway
Copy link

@pcosway pcosway commented Mar 13, 2025

Comment out line in _returnPlayMode that returns 'stop' without confirming that player is stopped.
If player actually paused, then alarm process will resume player, not start streaming. Start is required for server to notify the remote device. Without notification the backup alarm.mp3 will start playing after 60 seconds.

Comment out line in _returnPlayMode that returns 'stop' without confirming that player is stopped.
@pcosway pcosway changed the title Update Source.pm Update Source.pm to address alarm bug Mar 13, 2025
@michaelherger
Copy link
Member

Can you describe in simple steps how to reproduce the issue you try to fix with this change? That line has been there for 17 years... I'd be surprised if it caused a problem we haven't figured out yet?

@pcosway
Copy link
Author

pcosway commented Mar 14, 2025

Sure -- but first a request for clarification: do you want steps to reproduce the issue with:

  1. _returnPlayMode returning 'stop' when the correct player state is 'pause'?
  2. playcontrolCommand not 'stopping' a player that is in 'pause' mode because _returnPlayMode returns the incorrect mode?
  3. the fallback alarm (alarm.mp3) playing on a Squeezebox Radio because of 1 & 2?

I was trying to make my task (and yours) easier by focusing on 1 & 2 -- I think both are bugs and are very easy to document. Simply reading the code suggests the possibility for incorrect behavior. It is very easy to show that they produce incorrect results 100% of the time. Of course, other parts of the server code may have workarounds to address problems that originate in these bugs -- and it is possible that fixing these bugs could create new problems (though it seems unlikely that 'stop means stop' and 'pause means pause' would cause a problem).

In my testing, fixing these bugs also addresses #3. I also have a fix by modifying the firmware -- AlarmSnoozeApplet.lua, but that is more brute force. That act of figuring out why that fix works led me to finding the stop/pause bug in the server.

If I need to prove that fixing these bugs also addresses #3 before you want to consider making any changes, I can try to do that. As you know, there is a significant history of problems with the alarms and fallback and the problem is intermittent. Reports on the fallback issue go back to 2009. I think the code I propose fixing was first released in 2008 in 7.3.0. I haven't found clear evidence of problems being reported until 7.4.2, but it's not easy to find relevant old posts. Here are two examples that do establish late 2009 as the origin of a problem (selected because Erland is involved in the discussions and he seems to be very reliable!):

This does seem to suggest that there is a problem dating to at least 2009 that has not yet been figured out. I'm not aware of the 2009 alarm problem being "fixed" and then a new alarm problem arising. While that is possible, I've not seen any comments in the code base or the forum that would suggest the intermittent fallback alarm problem has ever been fixed.

In many of the "alarm problem" threads the investigation has focused on the firmware, as it is a reasonable guess that the device is not in contact with the server --- despite the fact that I have never seen anyone document that connection is the issue. In all of my testing of failing alarms, I have never had a connection issue. The server and the device are connected, just not exchanging the correct information for the SBRadio to play the stream from the server instead of alarm.mp3.

Even with a solid connection, the SBR needs to receive a strm message/frame from the server for it to play the stream. I have never experienced the fallback alarm if the server sends a "strm frame" to the player. This is only sent if the server views the player mode as 'stop'. If the player mode is 'pause', then the server will resume the stream rather than play/start the stream. Resume does not send a strm frame -- it's not in the code.

I can easily send you one repeatable example for #3, but it is essentially the same as showing you that #1 and #2 are repeatable. I cannot promise this will solve all the erroneous alarm fallback behavior -- there may be other complicated timing issues.

Hopefully we can proceed based on evidence for #1 and #2. I will also: 1) disable my firmware fix, 2) make my proposed server fix locally, and 3) set up logging to track the server/alarm state to see if I can identify alarm failures that are not solved by my fix.

@michaelherger
Copy link
Member

michaelherger commented Mar 15, 2025

tl;dr

If a single line change needs that much of a description, and you can't tell me how to reproduce the problem in a couple of lines, then I bet the "fix" is too simplistic.

Please give me instructions in ten lines how to reproduce your problem. Then I can try to reproduce, too, and give you change a test.

@pcosway
Copy link
Author

pcosway commented Mar 15, 2025

That makes sense. Since the general problem is intermittent, I don't know that this fix addresses all fallback alarm issues.

In my debugging process I stumbled on a repeatable failure - not sure it represents a meaningful % of the failure modes.

Since the solution worked on this failure mode and seemed to also address a clear bug in the "logic", I submitted it. I totally understand that it is risky to change a 17 year old chunk of code. LMS does appear to be a very intricate system.

I'll continue to run tests and see if this fix seems to work over time, while also trying to document more scenarios that lead to the fallback alarm failure.

@pcosway
Copy link
Author

pcosway commented Mar 15, 2025

This is the simple (perhaps unrepresentative) repeatable scenario:

  1. Using web interface, set an alarm on a SBRadio
  2. Turn off SBRadio, if not already off
  3. When alarm goes off, click pause button on the web interface. Alarm will stop and SBR will return to off state.
  4. Set an alarm for a minute or two in the future.
  5. When alarm is triggered and menu appeaers on SBR, no sound will come from SBR. After ~1 minute the fallback timer on the radio will trigger and sound.

Note: On my radio, sometimes there is only a second or so of alarm sound at 1 minute, then it plays on a loop after an additoinal minute. I haven't tracked down the cause of that -- might be an odd interaction between the firmware and LMS. (see update below, if interested)

My fix solves this problem because clicking pause on the web interface leaves the player in pause mode and the radio powered off. That means that _returnPlayMode returns 'stop' when the actual mode is 'pause'.

This is the only repeatable failure mode I have discovered so far. It is 100% repeatable for me, but I don't know if it will happen with your set up. I also don't know if it solves all of the intermittent issues, only that it has worked for ~1 week for me.

If you have a chance to test this scenario, please let me know what happens for you. I'll continue to monitor my set up.

Update: I did figure out why the radio initially plays the fallback alarm for only ~1 second, before later playing it in a loop. It's fairly complicated, but essentially the SBRadio 'stops' any local playback before starting the fallback alarm. That 'stop' sends a message to the server which eventually sends a strm 'q' message back to the player, stopping the fallback alarm. However, the fallback timer gets restarted in the process and then restarts the fallback alarm, so the end result is ok, just delayed 60 seconds. That's not related to the bug I'm reporting in this pr - just a different bug that I stumbled on while researching.

@michaelherger
Copy link
Member

I'm not able to reproduce your steps:

This is the simple (perhaps unrepresentative) repeatable scenario:

  1. Using web interface, set an alarm on a SBRadio
  2. Turn off SBRadio, if not already off
  3. When alarm goes off, click pause button on the web interface. Alarm will stop and SBR will return to off state.
  4. Set an alarm for a minute or two in the future.
  5. When alarm is triggered and menu appeaers on SBR, no sound will come from SBR. After ~1 minute the fallback timer on the radio will trigger and sound.

When I pause using the web UI, the Radio would go back to off state. And when the alarm fires, it's on again. And as it's on, your suggested change wouldn't do anything, as the condition would not be met.

Even if your patch "fixed" something, it would only address one symptom, but not the real issue: why would a player which supposedly is playing an alarm sound be in "off" mode? Turning a blind eye and assuming "let's ignore its own off state, we know better!" is asking for trouble in other context. You should try to understand why its power state was wrong. That's the real issue here.

I'd recommend you join https://forums.lyrion.org and post your findings there. As there are more people following, and some certainly have done their own share of research, you'll be in better hands than with me here, who doesn't have enough time to look into this in depth.

@pcosway
Copy link
Author

pcosway commented Mar 20, 2025

Thanks for testing! I have two Squeezebox Radios but was testing on only one of them --- this process fails 100% of the time on that Radio.

But knowing of your test I tried the other -- did not fail.

I thought they were identical. Time to rebuild the one that was failing and see what happens.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants