Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract engine noise from IGC, add to KML multipath #1054

Merged
merged 22 commits into from
Apr 6, 2023

Conversation

PacketFiend
Copy link
Contributor

@PacketFiend PacketFiend commented Mar 27, 2023

Scans the IGC "I" record to determine which extensions are present in the log, and extracts that data into struct igc_fsdata : public FormatSpecificData.

Currently only engine noise levels (ENL) are extracted, but the skeleton is there for any number of extensions.

If engine noise level data is present when generating a KML output file, we also now output that as part of a multipath ExtendedData section. This way, it can be overlaid onto the Google Earth elevation profile, alongside either altitude or speed.

I'm not sure exactly which string libraries to use (there's an ugly sscanf that should probably be changed) or whether or not I'm following the general culture of the project, so feel free to change this.

Running const auto* fs_igc = reinterpret_cast<igc_fsdata*>(wpt->fs.FsChainFind(kFsIGC)); on every datum of a log file is probably not very efficient. I have some IGC files with over 25,000 entries. Is there a better way to do that?

Thank you Robert for your detailed explanation, this would have not been possible otherwise. I haven't set up tests yet, so I'm hoping all the standard tests will pass.

(Also adds .vscode/, bld/, and build/ to .gitignore. That's a personal preference but I think it's a good idea.)

Copy link
Collaborator

@tsteven4 tsteven4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It can be difficult to detect our preferred standards as most of this code started out in C in different times. I have tried to add comments in accordance with our current practices.

This also needs a test case, see testo.d/igc.test

we prefer short test cases if possible, e.g. not your 25,000 record case if a shorter one is available.

formspec.h Outdated Show resolved Hide resolved
igc.cc Outdated
@@ -44,11 +44,14 @@
#include <QTime> // for operator<, operator==, QTime
#include <Qt> // for UTC, SkipEmptyParts
#include <QtGlobal> // for foreach, qPrintable
#include <QStringRef> // for substring
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QStringRef goes away in Qt6, although it is in Qt5Compat. We have been taking out usages of QStringRef.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with Qt, QStringRef does the job well without creating a new QString.

As you mentioned below, does QByteArray perform the same function (i.e. I want to use a pointer to an existing string, rather than creating a new one)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bigger save is QByteArray doesn't do an 8 bit to UTF-16 conversion (and back).

I would advice against worrying about the copy. Eventually we will use QStringView more, but at the moment we are supporting both Qt5 and Qt6. QByteArray::fromRawData can avoid the copy, but can lead to other issues so we haven't used it.

igc.cc Outdated
Comment on lines 263 to 270
std::string s;
std::string s2;
TaskRecordReader task_record_reader;
int begin = 0;
int end = 0;
char ext_record_type[4];
igc_fs_flags_t igc_fs_flags;
QStringRef ext_data;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we would prefer to minimize the scope of variables when possible by declaring them when we first use them or inside the smallest scope they are used in.

In general we use QStrings, or sometimes QByteArray instead of stdlib strings, e.g.
sscanf(QByteArray(ibuf).mid(i,7).constData(), "%2i%2i%3s", &begin, &end, ext_record_type);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very old habit that I'm trying to fix. Back when I used C/C++ regularly, I always declared all variables at the start of a function, which is not consistent with modern practices, generally. Easy to fix this one.

igc.cc Outdated Show resolved Hide resolved
igc.cc Outdated
Comment on lines 448 to 449
// QString ibuf_q = QString::fromLocal8Bit(ibuf);
// QStringList ext_data = ibuf_q.split(QRegularExpression ("[A-Z]+"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extraneous comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More like unfinished work. I left it there as a "this is probably a better way to do this" comment, but forgot to comment my comments.

igc.cc Outdated
for (unsigned i = 3; i< strlen(ibuf); i+=7){
s = ibuf;
s2 = s.substr(i, 7);
sscanf(s2.c_str(), "%2i%2i%3s", &begin, &end, ext_record_type);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check and warn on sscanf not converting 3 fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, although I'm hoping the sscanf can be eliminated entirely and replaced with some Qt-ness, much the same way as the individual extension records are scanned.

The sscanf here was written before I had any idea how to accomplish this with Qt. I assume that using a Qt function would be preferable to sscanf.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we started over with this module we wouldn't use sscanf, but it is used extensively in this format, and I recently left that alone when I was in here. I'd leave sscanf replacment for another day myself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense - but if I'm adding new code, then why not use Qt?

igc.h Outdated Show resolved Hide resolved
kml.cc Outdated
@@ -1467,9 +1468,11 @@ void KmlFormat::kml_mt_simple_array(const route_head* header,
{
writer->writeStartElement(QStringLiteral("gx:SimpleArrayData"));
writer->writeAttribute(QStringLiteral("name"), name);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete trailing space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops.

kml.cc Outdated
@@ -1833,7 +1850,7 @@ void KmlFormat::write()
writer->writeEndElement(); // Close Folder tag
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete trailing space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

.gitignore Outdated Show resolved Hide resolved
@PacketFiend
Copy link
Contributor Author

I do have shorter IGC (under 1k lines) files with engine noise data, among other extension data, but they're not my flights.

The issue is sharing them. While it's certainly possible to remove anything personally identifiable from them (e.g. pilot name or aircraft registration), this would invalidate the security fields, which would make them an ineffective test case if that functionality is added at a later time. I will ask.

Copy link
Collaborator

@robertlipe robertlipe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very cool!
Can you please add a (short) file that has these extensions and a kml file that can serve as a master test file? Just something that shows a mixture of the extensions being present or missing to give the code below good test coverage. Add them to reference/tracks/ and put an entry in testo.d/kml.test that does a diff on the generated and the reference file.

Once we have that, we can patch this in to our local trees and try to help things along.

I think you definitely have the rhythm here. This is the basic shape of how this should look. The surrounding code (Very c89-ish instead of Qt-oriented) has some bit rot that's lead you astray, but that's the part we can help with.

igc.cc Outdated Show resolved Hide resolved
igc.cc Outdated Show resolved Hide resolved
igc.cc Outdated Show resolved Hide resolved
igc.cc Outdated
igc_fs_flags.enl = true;
igc_fs_flags.enl_start = begin;
igc_fs_flags.enl_end = end;
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we think about ways to drive this from a table where
.tag = "TAS"
.tas = begin or something.
I'll kick this around later. This block is just a little chewy. It's hard to spot the differences in these blocks.

Copy link
Contributor Author

@PacketFiend PacketFiend Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, "chewy" would be the description I would use for this block.

If I did this in Python, I'd use a list of dictionaries, something like:
igc_ext_data [ { "ext": "TAS", "begin": 40, "end": 42 },{ "ext": "ENL", "begin": "43", "end": "45" }]
Which could then be iterated on where individual elements would be accessed as igc_ext_data[i]['begin'] (for example). I'm not sure of the best way to do this either in C++ or with gpsbabel norms.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't visualize that yet, but we have QHash, QList, QMap, and all the comforts of a civilized language. We'll find our common parlance soon!

I think a QHash is what you're looking for. We have examples all over the code, such as in xcsv.cc or

igc.h Outdated
trt(0),
gsp(0),
fxa(0),
enl_start(0),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instread of start and tag points to a 3 byte value, can we just put that 3 byte data right in the fs_flags?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, is the common case to have a lot of these or a few of these? We mght be able to just use an optional in here and either store the value or nothing. Then our has() becomes "is that in the optional" and start and end fold together.

Copy link
Contributor Author

@PacketFiend PacketFiend Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure I follow your comment.

I did consider simply holding the value and not the boolean, however it is quite possible that a recorder will record an engine noise (or air temperature for that matter) of zero. So a zero value does not at all mean "no data". So in this case, the bool tells us whether or not the recorder wrote the data. If it did, I assume the data is valid, even if it's zero.

A lot or a few? I'm not entirely sure. I only started on this recently and I'm still soliciting IGC files from my soaring club and others. They're slow in the coming, as the prime flying season starts in vogue in May here in Canada.

There is also a route object that I haven't examined yet. It would be far better to store these booleans as some kind of igc_exts struct in a route object, rather than in each individual waypoint. If a recorder has temperature data, it has it for the entire log, so there's no point keeping that info in every waypoint. I just haven't got that far yet.

(edit)Oh, I think I get you now. I didn't know optional elements are a thing with C++ these days. Perhaps it's my C heritage rearing its ugly head. That might work but I'll have to look it up.

foreach (const Waypoint* wpt, header->waypoint_list) {

const auto* fs_igc = reinterpret_cast<igc_fsdata*>(wpt->fs.FsChainFind(kFsIGC));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that doing this lookup 15,000 times, once for each point, doesn't seem very awesome. The reality is that MOST points will have only one type of data in the chain and two or three at worst. A sequential rip through three searches is pretty fast. (Waving away what we do to poor cache coherenc and trashing branch prediction...)

kml.cc Show resolved Hide resolved
@PacketFiend
Copy link
Contributor Author

I added two IGC files and the resulting KML output, and put the tests in kml.test.

I have no idea if it will function as intended as I'm not set up to run these tests yet, and these tests will begin to fail when the KML output is changed, which is something I'm actively working on.

@PacketFiend PacketFiend force-pushed the master branch 3 times, most recently from df5cb9d to f2a67b4 Compare March 29, 2023 07:13
@PacketFiend
Copy link
Contributor Author

PacketFiend commented Mar 29, 2023

I added two new classes, "IgcMetaData" and "ExtensionDefinition".
IgcMetaData objects contain a struct of flags (which still need work) noting which extensions are present, and also if any extensions are present at all. Optionally, an IgcMetaData object can contain ExtensionDefinition subobjects, which each contain the data necessary to parse individual B records when extensions are present.

The ExtensionDefinition subobjects are required to parse the IGC file, but can be discarded when we leave the IGC module. They are optional. The IgcMetaData object (which may or may not carry a variable number of ExtensionDefinitions), however, contains data telling us which extensions are present. This way, the IgcMetaData object can be attached to a route definition somehow, without carrying all the information we once needed to parse the IGC file.

I addressed a few complaints regarding variable scope, cleaned up a bit of trailing whitespace, and fixed up my additions to .gitignore. I also added a dump() method in the IgcMetaData class to help with debugging. Also removed the use of std::string and replaced it with QString.

Much better so far, but this still needs cleanup. The variable/class/struct names are becoming rather confusing. But for now, this compiles and builds KML files as it should.

I won't lie, ChatGPT was of great assistance with this commit. We had a very long conversation.

Copy link
Collaborator

@tsteven4 tsteven4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at all of this, but I am repeating my earlier comments on the igc data structures.

  1. don't define constructors we don't need. The need for a user ctor can often be eliminated by using default member initializers, or by the ctor of the class we are using.
  2. the use of parallel flags and values can be eliminated by using std::optional.

igc.cc Outdated
} else if (ext_record_type.toString() == "VAT") {
fsdata->vat = ext_data.toInt();
fsdata->igc_fs_flags.vat = fsdata->igc_fs_flags.vat || true;
} else if (ext_record_type.toString() == "OAT") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think OAT, Outside air temperature (Celsius), after scaling, can go into Waypoint::temperature, a float. (pres_wpt->set_temperature(ext_data.toInt()/scale). It looks like there is an implied decimal point if the OAT field has more than three characters, see spec A2.4.1. Therefore, we don't need igc_fs_flags.oat.

igc.cc Outdated
} else if (ext_record_type.toString() == "OAT") {
fsdata->oat = ext_data.toInt();
fsdata->igc_fs_flags.oat = fsdata->igc_fs_flags.oat || true;
} else if (ext_record_type.toString() == "TRT") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TRT = True Track. likewise I think this goes into Waypoint::course, a float. It is unclear what the units of TRT are, course has units of degrees.

igc.cc Outdated
} else if (ext_record_type.toString() == "TRT") {
fsdata->trt = ext_data.toInt();
fsdata->igc_fs_flags.trt = fsdata->igc_fs_flags.trt || true;
} else if (ext_record_type.toString() == "GSP") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GSP = groundspeed, a possible fit for Waypoint::speed which is a float with units of m/s.

igc.cc Outdated
} else if (ext_record_type.toString() == "FXA") {
fsdata->fxa = ext_data.toInt();
fsdata->igc_fs_flags.fxa = fsdata->igc_fs_flags.fxa || true;
} else if (ext_record_type.toString() == "GFO") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GFO isn't in the spec.

igc.cc Outdated
} else if (ext_record_type.toString() == "GFO") {
fsdata->gfo = ext_data.toInt();
fsdata->igc_fs_flags.gfo = fsdata->igc_fs_flags.gfo || true;
} else if (ext_record_type.toString() == "SIU") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SIU = sattelites in use, seems like a fit for Waypoint::sat

igc.h Outdated
* and present in individual B records.
*/

struct igc_fs_flags_t {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as my earlier comment, we can skip defining the constructor and just use default member init. Or, as I also commented on and IMO a better solution, use std::optional which combines information on existence (the flag) with the value itself.

igc.h Outdated
bool acz;
};

class ExtensionDefinition {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also as I commented before, could use std::optional<igc_defn_t>, where igc_defn_t contains name, start, end:

struct igc_defn_t
{
Qstring name;
short start;
short end;
}

igc.h Outdated
}
};

struct igc_fsmetadata_t {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again we don't need the constructor at all. can be done by adding a default member initializer (or std::optional, which internalizes the flag).

igc.h Outdated
QMap<QString, ExtensionDefinition*> extensions;
public:
igc_fs_flags_t flags;
IgcMetaData() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this ctor is unnecessary. igc_fs_flags_t will be initailzied by it's ctor (or default initiailizers) anyway.

igc.cc Outdated
*/
// TODO: Return this hash map for later use
QHash<QString, short>::const_iterator i;
printf("Extension types:"); //DELETEME development
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or qualify, where n is some positive small integer. global_opts.debug is set by -D switch on command line, debug_level is zero by default(no debug).
if (global_opts.debug_level >= n) {

@tsteven4
Copy link
Collaborator

This implementation needs work related to the igc data structures/classes, and this is a blocker for me.

  1. why are we building ext_types_hash? Do we ever use the value?
  2. you attempted avoid a copy with QStringView on ext_rec_type. but every one of the ~18 .toString() uses is going to allocate and deep copy. Really, worrying about a few copies for each type of extension is not worth the complexity and bugs it entails. You will not be able to measure any performance improvement. If it was a per waypoint copy it MIGHT be worth optimizing.
  3. IgcMetaData seems more complicated that necessary. Can we create this class and adhere to the rule of zero?

How about if we had
QList ext_types_hash; (which isn't a hash, but I left the name the same.)
When we get to the I record we just add entries to the list. After we get all of the entries we can iterate through the list and add entries to
QList IgcMetaData;

we don't need an ExtensionDefinition::exists, if an entry is in the list it exists.

igc_metadata.flags.has_igc_exts becomes !igc_metadata.isEmpty()

things like
fsdata->igc_fs_flags.enl = fsdata->igc_fs_flags.enl || true;
are equivalent to
fsdata->igc_fs_flags.enl = true
but, if igc_fs_data is a bunch of std::optional then we don't need the flag at all. I'm not convinced T is always int, e.g. OAT seems like it is a float.

the only thing ExtensionData::end is used for is to calculate a length. Can we store the length in ExtensionData instead of end?

igc_fsmetadata_t is defined but not used.

Please ask for clarification if necessary.

@PacketFiend
Copy link
Contributor Author

Most if not all of the "deficiencies" in my suggested additions are a result of me not having written anything in C++ in probably 20 years. It's been a long, long time since I worked with a compiled language.

I'm currently working on eliminating my proposed IGC classes, as they're really only useful while parsing IGC files. Most of the metadata is not relevant after that.

What I would like to avoid is excessive if ladders in igc.cc and kml.cc. The **ExtensionDefinition subobject seems to accomplish that, in that due it its inclusion in IgcMetaData, an unknown number of extensions can be included, but I would agree - it's a "heavy" way to accomplish what I'm trying to do.

ext_types_hash is built while parsing the I record (which defines which extensions are present), and then used while parsing individual B records (haven't pushed that yet). It may be used unnecessarily because I don't completely understand the benefits and limitations of switch statements in place of if ladders, and it's largely a way to use an implied goto when parsing a large IGC file, as a way to save processing time.

Ideally, I would like to define extensions in igc.h, so if and when I get new data that suggests another extension would be useful, I would not need to alter anything in other files - just in the IGC header file. My thinking is that it's more maintainable this way. Much effort has been put into making it easier to quickly add new extensions. Iterating through ext_types_hash is an example of that, and it makes it possible to avoid excessive if statements. I don't know how much time it will save, though.

I am also trying to avoid checking for every individual extension with the inclusion of some kind of has_igc_exts flag, either an actual boolean, or a function that returns a boolean. The IGC extension data is irrelevant to most formats, so why not include a way to skip IGC specific logc?

Yes, we can just store begin and length in the igc metadata, that's enough information to parse the file with. Agreed there.

I'm working on using std::optional instead of those flags.

And I suppose a lot of the difficulty is because the number of extensions is unknown - it varies by flight recorder. I don't want to force us to check for each extension every time we examine a waypoint.

(And keep the criticisms coming)

@tsteven4
Copy link
Collaborator

tsteven4 commented Mar 31, 2023

I am also trying to avoid checking for every individual extension with the inclusion of some kind of has_igc_exts flag, either an actual boolean, or a function that returns a boolean. The IGC extension data is irrelevant to most formats, so why not include a way to skip IGC specific logc?

.FsChainFind(kFsIGC) is usually very fast because the list is empty.

Ideally, I would like to define extensions in igc.h, so if and when I get new data that suggests another extension would be useful, I would not need to alter anything in other files - just in the IGC header file.`

That's a tall order. I'd be happy to put a win on the scoreboard with a less ambitious implementation. It is going to be more difficult because the IGC extension data is sometimes an integer, sometimes an integer that needs to be converted to a float (e.g. OAT), sometimes a float (I think), sometimes a string (e.g. SIT), sometimes encodes multiple things (e.g. WVE). While I applaud the public spec for IGC, it does seem many of these extensions lack clear information on the format of the data.

I did download a number of igc files from last years world championship. these are the I records. I did this to get an idea of how many extensions are typically used and what they are. I imagined competitors at the world championships would tend to have newer equipment that might use more extensions. It's great to have your involvement, we are not pilots and don't really know what the users need/want.
I113638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6468NET6972ACZ7376AOR7779AOP
I123638FXA3941ENL4246TAS4751GSP5254HDT5557TRT5862VAT6366OAT6771NET7275ACZ7679AOR8082AOP
I113638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6468NET6972ACZ7376AOR7779AOP
I113638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6466MOP6770ACZ7174AOR7577AOP
I093638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6466MOP6770ACZ
I083638FXA3941ENL4246TAS4751VAT5255OAT5659ACZ6063AOR6466AOP
I093638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6468NET6972ACZ
I113638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6468NET6972ACZ7376AOR7779AOP
I083638FXA3941ENL4246TAS4751GSP5254TRT5559VAT6063OAT6467ACZ

@PacketFiend
Copy link
Contributor Author

PacketFiend commented Mar 31, 2023

This latest commit should address a lot of Steve's concerns. Likely not all of them, but hopefully most.

I think I've managed to avoid checking in igc.cc for every extension multiple times (which would necessitate a lot of code changes if and when new extensions are added), but I wasn't able to do that for kml.cc yet. Doing so would require some refactoring.

There's some member functions of these new structs that get and set values. This is to make it possible to do so without all those ugly if ladders in igc.cc. There may well be a better way to accomplish this.

I added fld_igc_begin and fld_fld_end to the wp_field struct and made that struct public. I'm not entirely sure this is the best way to go about this. It allows me to use a single if/else for all the IGC extensions in kml_mt_simple_array(), although it may introduce a bug that I'm not yet aware of.

Scaling and conversion issues, particularly with temperature, still need to be addressed. Any data from the B records that are either non-numeric, or that encode multiple values (e.g. WVE) would break this implementation as it stands, however it's simple enough for now to just not support those extensions.

igc.cc Outdated Show resolved Hide resolved
igc.cc Outdated
Comment on lines 513 to 516
short begin = QStringView(i.key().mid(0,2)).toInt() - 1;
short end = QStringView(i.key().mid(2,2)).toInt() - 1;

QStringView ext_record_type = QStringView(i.key().mid(4,3)).toString();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, here we are using the keys. But we still aren't using the hash to find values from keys => we don't need a hash. We could use a QList<std::pair<QString, short>>.

QStringView(i.key().mid(0,2)).toInt() - 1;
unnecessarily constructs a QStringView. just convert the QString that mid constructed.
short begin = i.key().mid(0,2).toInt() - 1;

QStringView ext_record_type = QStringView(i.key().mid(4,3)).toString();
i.key().mid(4,3) is a QString, which we convert to a QStringView, which we convert back to a QString, which we convert back to a QStringView!
instead:
QString ext_record_type = i.key().mid(4,3);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PacketFiend , please see the attached implementation of these ideas
igc.patch

@robertlipe
Copy link
Collaborator

robertlipe commented Apr 1, 2023 via email

@PacketFiend
Copy link
Contributor Author

PacketFiend commented Apr 1, 2023

Oh don't worry about beating me up, I knew what I was getting into. Yes, pretty much all of my C/C++ experience comes out of college and one or two positions I held afterwards. It's very out of date. These days I'm an operations (devops) guy who writes a lot of templates and Python. Basically, a data plumber who needn't worry about memory allocation (until I step into anything written in Java, but that's another story all about garbage collection). It's a happy coincidence that I was looking for a way to sharpen my C skills at the same time I was looking for a way to import IGC files to Earth.

Your comment re: most formats not having extensions and when they do, having very few of them, is pert. IGC stands alone in this regard, I'm pretty sure (correct me if I'm wrong, because I'd love to see the examples). I can process ~300 IGC files in 30 seconds. I full well may be putting too much effort into making the IGC parser efficient, trying to use pointers to strings instead of copying them, etc. (and coming from a lot of Python recently, I really miss proper pointers)

I eliminated QHashes in igc.cc and replaced them with QLists. I was using hashes because I can kinda switch on a hash, and I much prefer switches to lf ladders. If it's possible to switch on strings, the method eludes me, so I've put in a couple of enums.

Then, re: the IGC extensions being per trackpoint, I should elaborate. If an IGC file has extension data defined in an I record, then that data will be present for every waypoint (in the rare case it's invalid, it can be discarded). It would be far, far better, IMO, to attach that metadata to the route, rather than iterating through every waypoint and setting flags for the presence of various extension data, as is currently done in kml_mt_header().

One issue I'm running into is that there is "extra" data passed into kml.cc (heart rate, cadence, etc) that appears to have been attached to each waypoint, where in my opinion, that data should be kept as FormatSpecificData. I'm assuming it comes from FitBits and dive watches and similar things. So I'm sending all this IGC specific stuff to the final output, and running into logic that looks like it was hacked into the Waypoint object. In short, I need to fit my IGC stuff into a subroutine that's been hacked up to high heaven to fit other "extensions".

Perhaps what is missing is a "baseline", whereby all waypoints will contain some common denominator of data - and all other data shall be passed as FormatSpecificData. I dunno, I'm just spitballing here.

re: "Not sure exactly what that means" - I'm adding comments like that here and there where I wrote something, but don't entirely understand it. I'm just hoping for inline comments here on Github or a link that can explain it better if anyone comes across it; and, I think it's good practice to comment my code with "I don't know how this works", when the situation warrants.

And the dump functions - I'm only beginning to realize how good qDebug() is. As much as I hate how heavy Qt is (and the dependencies I need to install on my system), by God it's a fully featured library that I struggle to wrap my head around. Give me back my C strings!

P.S. If anyone could recommend a good book along the lines of "The Evolution Of C++", I'll read it.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 1, 2023

P.S. If anyone could recommend a good book along the lines of "The Evolution Of C++", I'll read it.

These are some good references:

  1. https://isocpp.github.io/CppCoreGuidelines/
  2. https://www.aristeia.com/books.html "Effective Modern C++". We are using C++17, so it may not cover things like structured binding
  3. https://en.cppreference.com/w/cpp

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 1, 2023

@PacketFiend
The test cases are failing. I think you are confusing KmlFormat::wp_field and IgcFormat::igc_ext_type_t in KmlFormat::kml_mt_simple_array. This is one of the problems that scoped enumerations can prevent.

If you are building in bld you can run the kml tests from the top with
./testo -p bld/gpsbabel kml

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 1, 2023

@PacketFiend This is going to take some more spins, things are improving but there is still lots to be done. I'll get back to you with details later, I'm going to be out most of the day. I am willing to continue to provide feedback with the goal of sharpening up your c++ skills and getting some clean maintainable code. But if either one of us gets frustrated with this process we need to communicate and see if we can do something different. Thanks for your effort and sticking with it.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 1, 2023

I'm about out, but here are some comments to contemplate:

  1. In IgcFormat::read, case rec_fix_defn
    a) make ext_types_list local to this case scope.
    b) create a new list, ext_defs_list, scoped to IgcFormat::read, that has all the information needed for IgcFormat::read case rec_fix. This can replace fs_metadata. By construction it will only contain items that we want to process from the I record, so we don't need std::optional here or any flags.
    The goal is to avoid any parsing of the 7 characters from each I record extension for EVERY B record in IgcFormat::read case rec_fix.
  2. In IgcFormat::read case
    a) Only use ext_defs_list.
    b) Instead of
    if (fs_metadata.has_data())
    we can use
    if (!ext_defs_list.isEmpty())
  3. When writing to the console from a QString convert it to 8-bit with qPrintable. This will use local8bit, which is very likely to be utf-8, but may not be. It will match the encoding expected by the console.
  4. please just forget QStringView for now.
    Instead of
    QStringView(ibuf_q).mid(start, len)
    just use QString::mid, i.e.
    ibuf_q.mid(start,len)
    The existing code is unnecessarily creating a QStringView. There is absolutely no advantage to doing this in this case.
    We can do this kind of optimization later. The optimizations available are often dependent on the versions of Qt we support, currently 5.12 and greater including Qt6.

kml.cc Outdated
Comment on lines 1480 to 1481
short value;
value = fs_igc->get_value(member).value();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

join declaration and assignment, hopefully with type double.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

All your Array data disappeared from your *.igc.kml reference files. When we change reference files it is important to do a diff and make sure the changes are correct.

So there is a bug here, perhaps from all the template confusion.

igc.h Outdated
// Will return zero if no match
igc_ext_type_t get_ext_type(const QString& type) const {
IgcFormat::igc_ext_type_t ret;
if (igc_extension_map.contains(type)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the bug that lost all the Arrays in the kml output. You meant to negate contains, or switch the two clauses. Anyway, with my suggestion #1054 (comment) you don't have to check to see if the value is contained at all.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

If all the ext values are saved as doubles then factor should be a double in ext_types_list
QList<std::tuple<QString, igc_ext_type_t, int, int, double>> ext_types_list;
, ext_data should be a double instead of an int
double ext_data = ibuf_q.mid(start,len).toInt() / factor;
set_value should take a double
bool set_value(IgcFormat::igc_ext_type_t type, double value) {
get_value should use doubles

    std::optional<double> get_value(KmlFormat::wp_field defn_type) const {
    std::optional<double> ret;

and after cleaning up the template mess you should see floating point values in your kml:

+              <gx:SimpleArrayData name="Otsd Air Temp">
+                <gx:value>26.1</gx:value>
+                <gx:value>26.1</gx:value>
+                <gx:value>26.1</gx:value>

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

Again, assuming we can settle on double for all extension values
it is better to leave factor as an int
QList<std::tuple<QString, igc_ext_type_t, int, int, int>> ext_types_list;
and cast the arguments of the division to double.
double ext_data = static_cast<double>(ibuf_q.mid(start,len).toInt()) / static_cast<double>(factor);
at least one of the arguments of the divide needs to be a double so we do floating point division. As long as one is the other will be converted implicitly.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

In my previous suggestion #1054 (comment) you don't need to qualify the default value with IgcFormat since this is defined in class. So the return line becomes
return igc_extension_map.value(type, igc_ext_type_t::ext_rec_unknown);

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

Sorry for the template confusion. Please work through all the comments after your last commit. In case they aren't clear, here are the changes I propose.
comment_answers.txt

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

For those following along this image shows a display of the enhanced kml file. This used my comment_answer version, I assume @PacketFiend will commit fixes for those issues. The GE elevation profile is plotting elevation and outside air temperature. Outside air temperature is properly scaled, and has the full resolution supplied in the igc file, 0.1°C. Other extension data from the igc file can be plotted as well such as Engine Noise, True Air Speed, Compensated Variometer Vertical Speed, Ground Speed, Fix Accuracy, Z Acceleration.

Nice enhancement @PacketFiend !

92HV66G1

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

I think VAT should have a scale factor of 10, not 1. See AL8, pg 38, VAT. With a factor of 10 we should display vertical speed in meters per second.

@PacketFiend
Copy link
Contributor Author

PacketFiend commented Apr 5, 2023

It might make sense to have a kml option to enable igc extension data. The only argument against this is the extension data will only exist if we read an igc file, and given that will the user want to ignore it? perhaps.

There's no reason not to read it in when parsing the IGC file. But when writing the KML file, there's two issues I see that including all of it can cause:

  1. The generated KML file size can exceed file attachment limits on various email platforms
  2. The number of SimpleArrays can cause cause those little buttons in the elevation profile to require more space than is available on a 1920x1080 display:
    Google Earth Pro_001

Currently there's 8 supported extensions, and I think there should be enough screen space for that on most desktops. The screenshot includes SIU and TRT which are currently #ifdef'd out, as they're not very useful in the KML output. Once more extensions are added this issue will become quite noticeable with several models of flight recorders/gliders.

There's a few possible solutions:

  1. If there's fewer than, say, 5 or 6 SimpleArrays, include them all
  2. Include a "sane" set of defaults
  3. If more than 5 or 6 SimpleArrays are present, drop extensions in some predefined order of "usefulness". E.g. SIU, TRT, and FXA can all be dropped, but perhaps SIU is more important than FXA, which is more important than TRT, so drop them in that order until a maximum number of SimpleArrays is reached.
  4. Provide a warning when the generated KML exceeds some size with a helpful prompt on how to reduce it.
  5. Include none by default but present options in the GUI and a CLI console message when initially converting to KML
  6. Some combination of the above
  7. Do nothing and wait until Google wraps those buttons on multiple lines, or push to have that feature added to Earth.

Of these, 1,2, and 3 would make it much easier to convert large numbers of files programmatically. 4 and 5 would work better if we're trying to keep KML sizes down so the output can be more easily shared and transmitted.

It's probably better to leave this specific discussion until this can be merged as it is. It's more of a frontend than backend problem, and it's better to keep this PR confined to the beck end logic, I think.

@PacketFiend
Copy link
Contributor Author

PacketFiend commented Apr 5, 2023

Latest commit addresses Steve's latest comments, I think. **int** ext_data = ibuf_q.mid(start,len).toInt() / factor; was a rather glaring error to miss.

I still need to integrate temperature and satellites in use, but I'm leaving that for another commit so it's more readable.

That templating of igc_fsdata was causing some really weird stuff with the <reinterpret_cast> call when pulling it back out. I'd put in data as ENL and it would come out as FXA. A bunch of the metrics did that, seemingly randomly. Glad to have it gone.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 5, 2023

So if we defer limiting of kml data until later are you ready for this commit to be merged?
Are you OK if I push to the astyle changes to PacketFiend:master?

I think this PR may have the record for the number of comments, it's getting hard to check them off.

@PacketFiend
Copy link
Contributor Author

PacketFiend commented Apr 5, 2023

So if we defer limiting of kml data until later are you ready for this commit to be merged?

Yes, I think so. A few metrics, notably OAT and SIU, should still be integrated with the Waypoint objects, however that can be considered a separate problem to just making sure this works. We're at almost 100 comments now, this PR may be getting difficult to follow.

Are you OK if I push to the astyle changes to PacketFiend:master?

No problem there.

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 6, 2023

Thanks @PacketFiend. I tweaked KmlFormat::wp_field. Note that after this merges there will be pre-release (unsupported, unarchived, continuously changing) installers at
https://github.com/GPSBabel/gpsbabel/releases/tag/Continuous-Windows and https://github.com/GPSBabel/gpsbabel/releases/tag/Continuous-macOS. Perhaps some of you club members can use these to provide feedback for the next round.

@tsteven4 tsteven4 merged commit cad16e1 into GPSBabel:master Apr 6, 2023
@GPSBabelDeveloper
Copy link
Collaborator

GPSBabelDeveloper commented Apr 6, 2023 via email

@tsteven4
Copy link
Collaborator

tsteven4 commented Apr 7, 2023

I'm pretty happy as well. I wouldn't let the switch statements bum you out.

We likely will have to deal with kml bloat. If it was just turn off igc arrays that would be easy, but it sounds like it might be more of a selection of a group of igc extension data which is a bit more involved. I guess there is a limit though: I think I records are limited to 99 characters, and there can only be one I record => at most 13 extensions in one file. Of course we could read multiple files and encounter more types between them.

It would be good to get some of the IGC data into Waypoint fields. We've spent a lot of effort getting better resolution of time, down to a millisecond. If IGC files exist with TDS data it would be good to use that data. etc. for SIU, LOD, LAD, OAT. I haven't found a file with TDS, LOD or LAD, but perhaps you can. No that other writers like unicsv and xcsv have a chance at using any data member from Waypoint, but using format specific data requires work like you just did for the kml writer.

Stylistically, I'd write has_igc_tas = fs_igc->tas.has_value() instead of if (thing that might be true) foo = true

That would introduce a bug, has_igc_tas would just be set based only on the last waypoint instead of being true if any waypoint has the data. These statements are in a loop over waypoints. The input can be accumulated from various read operations, so the values might not be the same for every waypoint.

But it is tempting to
has_igc_type |= fs_igc->tas.has_value();

with both sides being of type bool

This is a bitwise inclusive or compound assignment operator, not a logical or.
bitwise inclusive or (8.13) does the usual arithmetic conversions.
usual arithmetic conversion (11.5) calls for integral promotions (7.6), which converts false to integer 0 and true to integer 1.
so it should work, but that's a lot of chasing through the "Standard for Programming Language C++".

@PacketFiend
Copy link
Contributor Author

PacketFiend commented Apr 7, 2023

I wouldn't count on millisecond resolution in any IGC files. I have a few hundred that I've been testing with - not one has resolution finer than one second (but it's not impossible).

I don't like the switches and if ladders either. A lot of effort went into eliminating them, or at least not introducing more of them.

The KML bloat does worry me insofar as it could unnecessarily expand the repository size with every commit I and others make that work on the IGC module. I seem to need to commit a new reference file with every commit.

I have more thoughts, I'll take them to the mailing list though.

@GPSBabelDeveloper
Copy link
Collaborator

GPSBabelDeveloper commented Apr 7, 2023 via email

@PacketFiend
Copy link
Contributor Author

It is possible that subsecond resolution is present in some files. I'm not seeing that because generally, that requires more power to the flight recorder, and gliders don't have a power source - just batteries, and you can't simply install a bigger battery, as it fubars your weight&balance. But batteries are getting better all the time, as are flight recorders. I would not be surprised if this happens at some point in the foreseeable future. I'm just not seeing it now.

I'd also like to experiment with some recorders we're building ourselves at my club, with subsecond resolution, for analyzing training and aerobatic flights.

It's not that it's imperative the reference files don't change - the issue is more that every time they do, the repository grows by around 5MB. That adds up over the years.

tsteven4 added a commit to tsteven4/gpsbabel that referenced this pull request Sep 12, 2023
@PacketFiend
Copy link
Contributor Author

Just wanted say thanks for getting this merged.

A lot of the extensions are "extra", but I was really keen on getting engine noise added in. It's used in competition scoring, because if you're flying a glider competitively, then obviously, using an engine is cheating (many gliders have either self launching or sustainer engines). The engine noise extension is how we measure that.

Exhaust gas temperature serves a similar purpose - some gliders can launch with onboard turbine engines, and EGT is how we measure that. I'll do some work on including it when I get the requisite IGC files.

tsteven4 added a commit that referenced this pull request Sep 14, 2023
* fix SchemaArrayField/SchemaArrayData mismatch

related to IGC additions from #1060 and #1054.

* retire global trait collection

* use class member to store track traits instead of fs data.

* fiddle with kml accumlate hash

* compute number of wp_fields

* correct bug concerning track specific traits.

* simplify multi-track array decisions.

* simplify kml multi-track schema decisions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants