Extract packet bitmap, transform and provide to M&C for grafana, etc #26

Open
gsleap opened this issue Jun 4, 2024 · 2 comments

gsleap commented Jun 4, 2024

@andreww5au @shrydar for your comment/review/suggestions

Intro

We want to pass the packet bitmap info in some form to the M&C system for plotting and historical trend analysis.

How might it work?

MWAX_subfile_processor should, for every subfile it handles:

  • Read the subfile header. See MWAX PSRDADA Header Wiki page for more info.
  • Locate the packet map index (IDX_PACKET_MAP) and length in "block 0" (block 0 starts at the 4097th byte of the file, i.e. byte offset 4096 counting from zero)
  • Read that many bytes, starting at the packet map index, into an array
  • Summarise: by coarse channel (easy - each subfile is for 1 coarse channel anyway) and receiver?
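
The steps above could be sketched roughly like this. It is a minimal sketch, not the mwax_mover implementation. It assumes the header is a PSRDADA-style block of ASCII `KEY VALUE` lines padded with NUL bytes, that the header is 4096 bytes long, and that the packet map offset is relative to the start of block 0. How the index and length are actually encoded in the header is not shown here, so they are passed in as plain integers. The names `parse_psrdada_header`, `read_header` and `read_packet_map` are hypothetical.

```python
HEADER_SIZE = 4096  # assumed header length; block 0 follows immediately after


def parse_psrdada_header(raw: bytes) -> dict:
    """Parse 'KEY VALUE' ASCII lines from a NUL-padded header into a dict."""
    text = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    header = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            header[parts[0]] = parts[1].strip()
    return header


def read_header(path: str) -> dict:
    """Read and parse the header at the start of a subfile."""
    with open(path, "rb") as f:
        return parse_psrdada_header(f.read(HEADER_SIZE))


def read_packet_map(path: str, map_offset: int, map_length: int) -> bytes:
    """Read map_length bytes starting map_offset bytes into block 0."""
    with open(path, "rb") as f:
        f.seek(HEADER_SIZE + map_offset)
        data = f.read(map_length)
    if len(data) != map_length:
        raise ValueError("subfile is too short to contain the packet map")
    return data
```

The caller would look up IDX_PACKET_MAP in the dict returned by `read_header`, work out the offset and length from it, and pass those to `read_packet_map`.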

Perhaps the data can be summarised by rf_input in 1 second averages? e.g. mwax_mover on each MWAX server would produce, per 8 seconds:

  • N rf_inputs x 8 float values, where each float is in the range 0-1: 1 = all packets received in that second, 0 = all packets lost in that second

So it will be trivial to average 625 (or 800) bits per second per rf_input and store the result.
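
As an illustration of that averaging, here is a small sketch for a single rf_input. It assumes the bitmap for one rf_input covers the whole 8 seconds, that a set bit means "packet received", and that bits are ordered most significant bit first within each byte. The bit order in particular is an assumption that would need checking against the real packet map. `received_fraction_per_second` is a hypothetical name.

```python
def received_fraction_per_second(bitmap: bytes, seconds: int = 8) -> list:
    """Return one value in 0..1 per second: the fraction of packets received."""
    total_bits = len(bitmap) * 8
    if total_bits % seconds:
        raise ValueError("bitmap does not divide evenly into seconds")
    bits_per_second = total_bits // seconds  # 625 for a 625-byte bitmap

    # Treat the whole bitmap as one big integer (first bit = most significant)
    value = int.from_bytes(bitmap, "big")
    mask = (1 << bits_per_second) - 1

    fractions = []
    for s in range(seconds):
        # Shift this second's bits down to the bottom and mask the rest off
        shift = total_bits - (s + 1) * bits_per_second
        chunk = (value >> shift) & mask
        fractions.append(bin(chunk).count("1") / bits_per_second)
    return fractions


# One rf_input with no packet loss: every second averages to 1.0
print(received_fraction_per_second(b"\xff" * 625))
```

The same function handles the 800 packets per second case if it is given an 800-byte bitmap. In mwax_mover itself something like `numpy.unpackbits` followed by a reshape and mean would probably be the more natural route.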

This would mean that at 256T:

  • 512 "rows", each representing an rf_input and holding 8 32-bit floats (maybe 16-bit floats would be enough?): 512 * 8 * 4 = 16,384 bytes of data every 8 seconds per MWAX server, or
  • 393,216 bytes per 8 seconds for all 24 coarse channels, i.e. 49,152 bytes per second
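
The arithmetic is easy to re-run for other value sizes. A small sketch, using only the numbers quoted above (`summary_sizes` is a hypothetical helper):

```python
N_RF_INPUTS = 512          # rows at 256T, as stated above
SECONDS_PER_SUBFILE = 8
N_COARSE_CHANNELS = 24


def summary_sizes(bytes_per_value: int) -> dict:
    """Data volume of the per-second summary for a given value width."""
    per_server = N_RF_INPUTS * SECONDS_PER_SUBFILE * bytes_per_value
    all_channels = per_server * N_COARSE_CHANNELS
    return {
        "per_server_per_subfile": per_server,
        "all_channels_per_subfile": all_channels,
        "all_channels_per_second": all_channels // SECONDS_PER_SUBFILE,
    }


print(summary_sizes(4))  # 32-bit floats
print(summary_sizes(2))  # 16-bit values
```

With 16-bit values each figure halves: 8,192 bytes per server per subfile, 196,608 bytes for all 24 coarse channels, and 24,576 bytes per second.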

Maybe this could be fed into InfluxDB and averaged down as it ages?

Questions

  • What format is best for M&C? A dumb, bespoke binary file? numpy or pandas dataframe to disk?
  • Should I dump it onto vulcan like we do with the mwax_stats phase plot data?
  • Would it help if the "data" I give the M&C system contains a dump of the rf_input tile names, pol and receiver within the data file (like a header)? So that way M&C code doesn't need to do interpretation every 8 seconds?
  • Hmm, considering our packet loss is mostly minimal: if we didn't summarise the data at all and just dumped it into a file with lossless compression, it should compress really well and may end up even smaller than the summarised data. e.g. no packet loss for all 8 seconds for 1 rf_input is 625 bits of 1 per second, i.e. 5,000 bits = 625 bytes in a row, each with a uint8 value of 255, which would compress REALLY well! Then again, the M&C end would have to decompress that and then deal with that large amount of data. So maybe we should still average it in mwax_mover!
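
To get a feel for the compression idea in the last question, here is a rough experiment on synthetic data using zlib from the Python standard library. The packet map below is made up (all packets received apart from one short burst), so this only indicates what an almost loss-free map might do. It is not a measurement on real subfiles, and `compressed_size` is a hypothetical helper.

```python
import zlib

N_RF_INPUTS = 512       # rows at 256T, as above
BYTES_PER_INPUT = 625   # 8 seconds x 625 packets, one bit per packet


def compressed_size(packet_map: bytes, level: int = 6) -> int:
    """Size in bytes of the zlib-compressed packet map."""
    return len(zlib.compress(packet_map, level))


# Synthetic map: every packet received, except a short burst lost on one input
packet_map = bytearray(b"\xff" * (N_RF_INPUTS * BYTES_PER_INPUT))
packet_map[1000:1010] = bytes(10)  # 80 consecutive lost packets (made up)

print("raw bytes:       ", len(packet_map))
print("compressed bytes:", compressed_size(bytes(packet_map)))
print("summary bytes:   ", N_RF_INPUTS * 8 * 4)  # 32-bit float summary
```

A map with scattered random loss would compress less well than this tidy example, so it would be worth trying on a few real subfiles before deciding.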
@gsleap gsleap added the enhancement New feature or request label Jun 4, 2024
@gsleap gsleap self-assigned this Jun 4, 2024

shrydar commented Jun 4, 2024

Yes, an average per second sounds good. Or even just a raw count of lost packets: that is at most 625 or 800, which needs only 10 bits, so a 16-bit integer would be just fine.

I'd been considering run-length encoding the lost packet map, but a count would be much simpler to implement and in practice gives about the same amount of information. We could perhaps also include a count of the number of runs of missing packets, though? That's only one extra number per input per second, would max out at 400 runs per input per second, and would give us some idea about whether we're seeing a random scattering of unreliable transfers or just a few sizable blocks of downtime.
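
A rough sketch of what the two numbers per input per second could look like. It assumes one bitmap per rf_input covering 8 seconds, a set bit meaning "packet received", and most significant bit first ordering within each byte, all of which would need checking against the real packet map. `lost_and_runs` is a hypothetical name.

```python
def lost_and_runs(bitmap: bytes, seconds: int = 8) -> list:
    """Return (lost_count, runs_of_lost) for each second of one rf_input."""
    total_bits = len(bitmap) * 8
    if total_bits % seconds:
        raise ValueError("bitmap does not divide evenly into seconds")
    bits_per_second = total_bits // seconds

    # Render the bitmap as a string of '0'/'1', padded to the full bit length
    bits = format(int.from_bytes(bitmap, "big"), "0{}b".format(total_bits))

    result = []
    for s in range(seconds):
        chunk = bits[s * bits_per_second:(s + 1) * bits_per_second]
        lost = chunk.count("0")
        # Splitting on received packets leaves one non-empty piece per run
        runs = sum(1 for piece in chunk.split("1") if piece)
        result.append((lost, runs))
    return result


# Two whole bytes lost at the start: 16 lost packets in 1 run, in second 0
print(lost_and_runs(bytes(2) + b"\xff" * 623))
```

Both values stay well inside a 16-bit unsigned integer. Note that a run of lost packets that straddles a one-second boundary is counted once in each of the two seconds in this sketch.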

I do like the idea of using a format that at least encapsulates the array dimensions, so the reader doesn't have to rummage elsewhere to determine the shape. It'd be nice to include the mapping from channel index to receiver id too.

Slightly leaning towards FITS just to try to avoid further proliferation of container formats? I've not looked at how efficient writers are for that yet, nor do I know what M&C would be happiest with. If we could compress it fast enough we could just throw in the entire map for later investigation, but that's probably more information than we really need for a dashboard.

shrydar commented Jun 5, 2024

(Also note that reporting per second rather than per eight second subobservation would require some careful bit masking when reading the packet map, as each eight seconds' worth of flags is stored as one 625 byte bitmap per input, and the one second breaks don't land on byte boundaries)
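
To make the bit masking concrete: with 625 packets per second, second s starts at bit 625 * s, which is byte 78 * s, bit s (since 625 = 78 * 8 + 1). A minimal sketch of counting received packets over an arbitrary bit range, assuming a set bit means "received" and that bits are numbered most significant bit first within each byte (an assumption to check against the real map). `second_boundary` and `count_received` are hypothetical names.

```python
def second_boundary(second: int, packets_per_second: int = 625) -> tuple:
    """Return (byte_index, bit_in_byte) at which a given second starts."""
    return divmod(second * packets_per_second, 8)


def count_received(bitmap, start_bit: int, n_bits: int) -> int:
    """Count set bits in bitmap[start_bit : start_bit + n_bits]."""
    first_byte, first_off = divmod(start_bit, 8)
    last_byte, last_off = divmod(start_bit + n_bits, 8)

    if first_byte == last_byte:
        # Range lies inside one byte: keep bits first_off..last_off-1
        mask = (0xFF >> first_off) & ~(0xFF >> last_off) & 0xFF
        return bin(bitmap[first_byte] & mask).count("1")

    # Partial first byte: drop the first_off leading bits
    total = bin(bitmap[first_byte] & (0xFF >> first_off)).count("1")
    # Whole bytes in the middle
    total += sum(bin(b).count("1") for b in bitmap[first_byte + 1:last_byte])
    # Partial last byte: keep only the last_off leading bits
    if last_off:
        tail_mask = (0xFF << (8 - last_off)) & 0xFF
        total += bin(bitmap[last_byte] & tail_mask).count("1")
    return total


bitmap = bytes([0xFF]) * 625
for s in range(8):
    print(s, second_boundary(s), count_received(bitmap, s * 625, 625))
```

For a fully received input each second should report 625, and the boundaries step through (0, 0), (78, 1), (156, 2) and so on up to (546, 7).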


2 participants