Proposed modification to xsd to clarify use of Count objects #8
base: version2
This PR proposes some changes to simplify the use of the Counts objects. The change is directly to the XSD, so is only meant to make the proposal easily understandable.
We propose the following:
1. Splitting out "SummaryCounts" into "ErrorCounts" and "BallotCounts".
2. "Election" will add a new reference to "BallotCounts", "Contest" will replace its reference to "SummaryCounts" with "ErrorCounts", and "GpUnit" will remove its reference to "SummaryCounts".
3. "GpUnitId" will be made required for the "Counts" object.
There are two items in particular which still need some discussion:
1. "ErrorCounts" should be renamed to something more accurate, as "Undervotes" in most elections won't be considered an error.
2. Should "WriteIn" be removed from "ErrorCounts"? It is already a valid "CountItemType" for the parent "Counts" object.
Hi Mark – I'm going to work today on updating the model according to my understanding of your comments, since I do better these days looking at the model than at the schema. I don't always guarantee my modeling work so Sam will have to have the last word. I agree with you that "error counts" is not the right term; I'll ask around and see if someone can suggest something else that does the job, I can't think of anything to use right off the bat. Thanks again for your comments, John |
I've uploaded a revised UML model and picture which I think implements your changes, but I left writeins in the ballotcounts class for the time being. I am seeing the major difference being that it's no longer possible to associate the error counts with a gpunit such as a precinct nor a device, and that it does remove the possibility of a circular reference to gpunits. Is this what you intended? |
Yes, the removal of any "Counts" object as a child from GpUnit is intended -- as instead we made GpUnitId be a "required" field within the Counts object -- which hopefully will help clear up some of the ambiguity that Justin had pointed out. |
Just FYI the proposal Mark posted does retain that capability. Overvotes, undervotes, and write-ins are in the probably-should-be-renamed-to-something-better ErrorCounts element:
<Contest xsi:type="CandidateContest" objectId="cc1">
  <BallotSelection>...</BallotSelection>
  <BallotSelection>...</BallotSelection>
  <ElectoralDistrictId>state1</ElectoralDistrictId>
  <ErrorCounts>
    <GpUnitId>state1</GpUnitId>
    <Type>errors</Type> <!-- This is a new type. Or can be optional -->
    <Overvotes>4</Overvotes>
    <Undervotes>5</Undervotes>
    <WriteIns>6</WriteIns>
  </ErrorCounts>
  <ErrorCounts>
    <GpUnitId>precinct1</GpUnitId>
    <Type>errors</Type>
    <Overvotes>1</Overvotes>
    <Undervotes>2</Undervotes>
    <WriteIns>3</WriteIns>
  </ErrorCounts>
  <ErrorCounts>
    <GpUnitId>precinct2</GpUnitId>
    <Type>errors</Type>
    <Overvotes>3</Overvotes>
    <Undervotes>3</Undervotes>
    <WriteIns>3</WriteIns>
  </ErrorCounts>
</Contest> |
Justin, apologies, I missed that ErrorCounts included GpUnit.
Before I continue, I see an ambiguity in the model we should correct: Device ought to be renamed to something like DeviceType or DeviceClass, because it serves as a filter by a type of device on a count item. With the current name, I sometimes confuse it with ReportingDevice.
So, let me make sure I understand. In the current model, GpUnit-->SummaryCounts allows one to associate summary ballot counts directly with the geography represented by that GpUnit. GpUnit-->Counts(Type=SummaryCounts) allows one to filter the summary ballot counts for that geography by device type and count item type. And there lies the problem: Counts can reference GpUnit in a circular way and one needs to know NOT to do that. A schematron ruleset would help, if used.
Also in the current model, Contest-->SummaryCounts allows one to report on summary counts for a contest. Contest-->Counts(Type=SummaryCounts) allows one to filter the report by device type and count item type. Contest-->Counts(Type=SummaryCounts)-->GpUnit allows one to filter by device type and count item type and also by geography. So, one can report on ballot summaries per contest, or per contest by geography.
By removing the association from GpUnit to SummaryCounts, it is no longer possible to associate summaries of ballot counts with a geography, unless one does this by reporting on summary counts for all contests for that geography, which would lead to ambiguities. Do I have this right? If so, I know for sure we'd get complaints if we remove this capability.
To retain the current capability and also remove the possibility of the circular reference, one possibility is to make VoteCounts and SummaryCounts standalone classes, and make Counts a subclass that each class can include, renaming it to something like CountFilter, since that is mainly what it does. SummaryCounts does not need to include GpUnit, whereas VoteCounts does.
I could be entirely wrong - I've thought about this so much my brain hurts! |
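For illustration only, the circular-reference concern above could also be checked mechanically outside of schematron. The sketch below is a rough Python check, not part of the proposal: the sample document is made up, and its element names (ElectionReport, GpUnit, Counts, GpUnitId) simply follow the names used in this thread rather than the exact schema. It flags any Counts nested under a GpUnit that itself points at a GpUnit.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element nesting is an assumption for illustration,
# not taken from the actual XSD.
SAMPLE = """
<ElectionReport>
  <GpUnit objectId="precinct1">
    <Counts>
      <GpUnitId>precinct2</GpUnitId>
      <Type>total</Type>
    </Counts>
  </GpUnit>
  <GpUnit objectId="precinct2">
    <Counts>
      <Type>total</Type>
    </Counts>
  </GpUnit>
</ElectionReport>
"""


def find_circular_counts(root):
    """Return (owning GpUnit id, referenced GpUnitId) pairs for every
    Counts element nested under a GpUnit that also carries a GpUnitId."""
    offenders = []
    for gp_unit in root.iter("GpUnit"):
        for counts in gp_unit.iter("Counts"):
            ref = counts.find("GpUnitId")
            if ref is not None:
                offenders.append((gp_unit.get("objectId"), ref.text))
    return offenders


offenders = find_circular_counts(ET.fromstring(SAMPLE))
for owner, referenced in offenders:
    print(f"Counts under GpUnit {owner} references GpUnit {referenced}")
```

A check like this only helps producers who run it, which is the same caveat as for a schematron ruleset.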
I agree with renaming this to something more descriptive. I like "DeviceClass"
Additionally, it allows you to report SummaryCounts (BallotsCast, BallotsRejected, etc.) at both the Contest and the GpUnit level. Even if the producer doesn't end up creating a circular dependency, it's very possible that they provide conflicting values at the Contest level vs. the GpUnit level. Or it's possible that certain producers only create Contest-level SummaryCounts, and others only produce GpUnit-level SummaryCounts. It ends up making it difficult to consume this data.
I don't think I understand why this would lead to ambiguities. In the original proposal, "BallotsCast", "BallotsRejected", and "BallotsOutstanding" would be reported at the Election level as part of an unbounded list of "BallotCount" objects. These can each be associated with a GpUnit via the "GpUnitId" field, so if ReportingUnit-level or ReportingDevice-level metrics are required, you could have one "BallotCount".
Justin and I were also thinking that, by definition, Undervotes and Overvotes occur at the contest level. As Justin pointed out with his example, these can also be associated with a GpUnit via the "GpUnitId" field in "ErrorCounts" (inherited from "Counts"). If you wanted the total number of overvotes for a specific GpUnitId, you could aggregate across the various contests, looking for ErrorCounts with that Id.
As you pointed out, currently it is possible to provide these counts at both the contest and the GpUnit level, which creates an ambiguity from the consumer's standpoint about where to look when trying to report on device-level aggregations. What if the contest-level and the GpUnit-level stats conflict?
We're attempting to come up with a change to the schema that makes it clear to both producers and consumers where to put these per-GpUnit and per-contest metrics. If our proposal doesn't address that point (and the point you raised about being able to associate ballot counts with a geography) then we should continue to iterate until we get it right. Perhaps we need to carve out some more time over a phone call to discuss further? |
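To make the aggregation mentioned above concrete, here is a minimal Python sketch of how a consumer might total overvotes for one GpUnitId across contests. It assumes the ErrorCounts layout from Justin's example; the Election wrapper, the second contest, and the function name total_for_gp_unit are made up for illustration and are not part of the proposed XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical data modeled on the ErrorCounts example earlier in the thread.
SAMPLE = """
<Election>
  <Contest objectId="cc1">
    <ErrorCounts>
      <GpUnitId>precinct1</GpUnitId>
      <Overvotes>1</Overvotes>
      <Undervotes>2</Undervotes>
    </ErrorCounts>
    <ErrorCounts>
      <GpUnitId>precinct2</GpUnitId>
      <Overvotes>3</Overvotes>
      <Undervotes>3</Undervotes>
    </ErrorCounts>
  </Contest>
  <Contest objectId="cc2">
    <ErrorCounts>
      <GpUnitId>precinct1</GpUnitId>
      <Overvotes>4</Overvotes>
      <Undervotes>0</Undervotes>
    </ErrorCounts>
  </Contest>
</Election>
"""


def total_for_gp_unit(root, gp_unit_id, field="Overvotes"):
    """Sum one ErrorCounts field over all contests for a given GpUnitId."""
    total = 0
    for error_counts in root.iter("ErrorCounts"):
        if error_counts.findtext("GpUnitId") == gp_unit_id:
            # A missing field is treated as zero (an assumption of this sketch).
            total += int(error_counts.findtext(field, default="0"))
    return total


root = ET.fromstring(SAMPLE)
print(total_for_gp_unit(root, "precinct1"))                # overvotes: 1 + 4
print(total_for_gp_unit(root, "precinct1", "Undervotes"))  # undervotes: 2 + 0
```

Because GpUnitId would be required on every Counts object under the proposal, a consumer would only need this one lookup path rather than checking both the Contest and the GpUnit.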
This PR proposes some changes to simplify the use of the Counts objects. NOTE: this PR should not directly be submitted. The change is directly to the XSD, so is only meant to make the proposal easily understandable.