problems with wmo2msc #1498

I made some changes to wmo2msc today, but now I've noticed some problems and I'm a bit confused about how to proceed. As far as I can tell, the same problems are present in the original v2 version we're currently using. We are *not* using the conversion to MSC in the feed I'm working on; we're only using the rename and tree functionality.

Binary bulletin detection doesn't work properly. Running the code manually shows why: we are comparing byte strings with regular strings, and in Python 3 a `bytes` value never compares equal to a `str`, e.g.:

```
>>> bulletin[1].lstrip()[:4]
b'BUFR'
>>> bulletin[1].lstrip()[:4] in ['BUFR', 'GRIB', '\211PNG']
False

>>> bulletin[0][:11]
b'ISXX14 EGRR'
```

https://github.com/MetPX/sarracenia/blob/e0f8eb578dba85171aad05ac57f14d71f542bd35/sarracenia/flowcb/filter/wmo2msc.py#L245-L259

So everything is detected as a wmo-alphanumeric bulletin.
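
One possible fix is to compare against byte-string literals instead. The following is only a minimal sketch, assuming `bulletin` is a list of `bytes` lines as in the session above; `detect_format` and `BINARY_MAGIC` are hypothetical names for illustration and are not part of wmo2msc:

```python
# Hypothetical stand-alone version of the format check, using bytes literals.
# b'\x89PNG' is the same value as the octal-escaped b'\211PNG'.
BINARY_MAGIC = (b'BUFR', b'GRIB', b'\x89PNG')
UNKNOWN_BINARY_HEADERS = (b'SFUK45 EGRR',)


def detect_format(bulletin):
    """Classify a bulletin given as a list of bytes lines (assumed layout)."""
    # Second line starts with a known binary marker: WMO binary bulletin.
    if len(bulletin) > 1 and bulletin[1].lstrip()[:4] in BINARY_MAGIC:
        return 'wmo-binary'
    # Header of the one bulletin known to use a non-standard encoding.
    if bulletin[0][:11] in UNKNOWN_BINARY_HEADERS:
        return 'unknown-binary'
    return 'wmo-alphanumeric'
```

With this change, the bulletin from the session above (header `b'ISXX14 EGRR'`, second line starting with `b'BUFR'`) would be classified as `wmo-binary` rather than falling through to `wmo-alphanumeric`.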

The second problem is currently masked by the first, but *when a binary bulletin is detected*, we replace `\r` with nothing. Later, however, we just link the new filename to the old file, so the replacement is never used. Either we should do the replacement only when conversion to MSC is enabled, or we should keep doing it even when not converting to MSC format, in which case we have to remove the linking code and always write out the modified file.

    # Determine file format (fmt) and apply transformation.
    if self.bulletin[1].lstrip()[:4] in ['BUFR', 'GRIB', '\211PNG']:
        fmt = 'wmo-binary'
        self.replaceChar('\r', '')
    elif self.bulletin[0][:11] in ['SFUK45 EGRR']:
        # This file is encoded in an indecipherably non-standard format.
        fmt = 'unknown-binary'

        #self.replaceChar('\r','',2) replace only the first 2 carriage returns.
        self.bintxt = self.bintxt.replace(bytearray('\r', 'latin_1'),
                                          bytearray('', 'latin_1'), 2)
    else:
        fmt = 'wmo-alphanumeric'
        if self.o.filter_wmo2msc_convert:
            self.doSpecificProcessing()