Fix word-capitalization XSL error #22

waldoj · 2016-07-06T02:00:05Z

Error on line 68 of decoded.xsl:
  XTTE0790: An empty sequence is not allowed as the first argument of fn:capitalize_phrase()
  at xsl:apply-templates (file:/vol/vacode.org/decoded.xsl#73)
     processing /legislativeDoc/metadata[1]/hierarchy[1]/hierarchyLevel[1]/hierarchyLevel[1]
  at xsl:apply-templates (file:/vol/vacode.org/decoded.xsl#27)
     processing /legislativeDoc/metadata[1]/hierarchy[1]/hierarchyLevel[1]
  at xsl:for-each (file:/vol/vacode.org/decoded.xsl#26)
     processing /legislativeDoc/metadata[1]/hierarchy[1]
  in built-in template rule
While processing 5K0Y-PTF0-004G-K44X-00000-00.xml: Run-time errors were reported

In the file in question, I can't identify any actual problems. At least in reading it (as opposed to parsing it on a per-character level for e.g. a zero-width Unicode character), everything seems fine.

Note that the outcome of this error is severe—it yields a zero-length XML file, as opposed to just a file with a minor error.

waldoj · 2016-07-28T02:29:12Z

Ahhh, I see what the problems are. (There are two.) A few dozen Lexis-Nexis XML files lack a crucial tag:

<hierarchy>
   <hierarchyLevel levelType="title">
      <heading>
         <desig>TITLE 58.1.</desig>
         <title>TAXATION</title>
      </heading>
      <hierarchyLevel levelType="title">
         <heading>
            <desig>GENERAL PROVISIONS OF TITLE 58.1</desig>
         </heading>
         <hierarchyLevel levelType="article">
            <heading>
               <desig>ARTICLE 2.</desig>
               <title>RESPONSIBILITY OF FIDUCIARIES IN TAX MATTERS</title>
            </heading>
         </hierarchyLevel>
      </hierarchyLevel>
   </hierarchyLevel>
</hierarchy>

At every level, we have desig and title tags...except for "General Provisions of Title 58.1." But it gets weirder: that's a title. So we have a title nested inside of a...title? With an article inside of that? I don't have the foggiest idea of what to do here.
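
A quick way to find the affected files might be to list every hierarchyLevel whose heading has no title child. The following is a minimal Python sketch, not part of the repository: the function name `levels_missing_title` is made up for illustration, and it assumes the elements carry no XML namespace, as in the excerpt above.

```python
import xml.etree.ElementTree as ET

# The hierarchy excerpt quoted above, used as sample input.
SAMPLE = """<hierarchy>
   <hierarchyLevel levelType="title">
      <heading>
         <desig>TITLE 58.1.</desig>
         <title>TAXATION</title>
      </heading>
      <hierarchyLevel levelType="title">
         <heading>
            <desig>GENERAL PROVISIONS OF TITLE 58.1</desig>
         </heading>
         <hierarchyLevel levelType="article">
            <heading>
               <desig>ARTICLE 2.</desig>
               <title>RESPONSIBILITY OF FIDUCIARIES IN TAX MATTERS</title>
            </heading>
         </hierarchyLevel>
      </hierarchyLevel>
   </hierarchyLevel>
</hierarchy>"""


def levels_missing_title(xml_text):
    """Return (levelType, desig) for each hierarchyLevel whose heading lacks <title>.

    Hypothetical helper for illustration; assumes un-namespaced elements.
    """
    root = ET.fromstring(xml_text)
    missing = []
    for level in root.iter("hierarchyLevel"):
        heading = level.find("heading")
        if heading is None or heading.find("title") is None:
            desig = heading.findtext("desig", default="") if heading is not None else ""
            missing.append((level.get("levelType"), desig.strip()))
    return missing


print(levels_missing_title(SAMPLE))
```

For the excerpt, this reports only the nested "title" level, which is the one that hands an empty sequence to fn:capitalize_phrase().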

waldoj · 2016-07-28T02:36:04Z

Ohhh, I've got it.

The XML on the official site helped me to figure it out. This law is in Chapter 0. Which is pretty weird, because I've never seen any Chapters 0. Looking at the Title 58.1 page on the official site, I see that the title is divided into Subtitles I–IV. Those are subdivided into Chapters. Subtitle I contains no Chapter 0.

Chapter 0 is a fiction. Chapter 0 is a placeholder. In fact, Chapter 0 means that these laws are children of Title 58.1 (divided into Article 1 and Article 2). The official site tucks these up top.

But Lexis-Nexis' XML is silent on chapter 0. There's no designation at all.

I gotta chew over how to best handle these.

krusynth · 2017-03-06T17:02:45Z

Seems pretty similar to what I ran into with statedecoded/statedecoded#574

krusynth · 2017-03-09T16:56:45Z

Given the source:

<hierarchy>
  <hierarchyLevel levelType="title">
    <heading>
      <desig>TITLE 58.1.</desig>
      <title>TAXATION</title>
    </heading>
    <hierarchyLevel levelType="title">
      <heading>
        <desig>GENERAL PROVISIONS OF TITLE 58.1</desig>
      </heading>
      <hierarchyLevel levelType="article">
        <heading>
          <desig>ARTICLE 1.</desig>
          <title>IN GENERAL</title>
        </heading>
      </hierarchyLevel>
    </hierarchyLevel>
  </hierarchyLevel>
</hierarchy>

For the XSLT, this makes sense to me:

<unit label="title" level="1" identifier="58.1">Taxation</unit>
<unit label="title" level="2">General Provisions Of Title 58.1</unit>
<unit label="article" level="3" identifier="1">In General</unit>

And is achieved thus:

<xsl:choose>
  <xsl:when test="heading/title">
    <xsl:attribute name="identifier">
      <xsl:value-of select="replace(replace(normalize-space(heading/desig), '^(TITLE|SUBTITLE|ARTICLE|CHAPTER|SUBCHAPTER|PART) ', '' ), '\.$', '')"/>
    </xsl:attribute>
    <xsl:value-of select="fn:capitalize_phrase(heading/title)"/>
  </xsl:when>

  <xsl:otherwise>
    <xsl:value-of select="fn:capitalize_phrase(heading/desig)"/>
  </xsl:otherwise>
</xsl:choose>
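
For anyone who wants to check the branching outside an XSLT processor, here is a rough Python rendering of the same logic. It is only a sketch under assumptions: `capitalize_phrase` below is a stand-in for the stylesheet's fn:capitalize_phrase(), whose body is not shown in this thread, and the `units` generator and dictionary layout are invented for illustration. The regular expression escapes the period so that only a literal trailing period is stripped from the designation.

```python
import re
import xml.etree.ElementTree as ET

# The source hierarchy quoted above.
SAMPLE = """<hierarchy>
  <hierarchyLevel levelType="title">
    <heading>
      <desig>TITLE 58.1.</desig>
      <title>TAXATION</title>
    </heading>
    <hierarchyLevel levelType="title">
      <heading>
        <desig>GENERAL PROVISIONS OF TITLE 58.1</desig>
      </heading>
      <hierarchyLevel levelType="article">
        <heading>
          <desig>ARTICLE 1.</desig>
          <title>IN GENERAL</title>
        </heading>
      </hierarchyLevel>
    </hierarchyLevel>
  </hierarchyLevel>
</hierarchy>"""

PREFIX = re.compile(r"^(TITLE|SUBTITLE|ARTICLE|CHAPTER|SUBCHAPTER|PART) ")


def capitalize_phrase(text):
    # Assumed behavior: capitalize each whitespace-separated word.
    # The real fn:capitalize_phrase() may differ.
    return " ".join(word.capitalize() for word in text.split())


def units(level, depth=1):
    """Yield one dict per hierarchyLevel, mirroring the xsl:choose branches."""
    heading = level.find("heading")  # assumes every level has a <heading>
    desig = " ".join(heading.findtext("desig", default="").split())
    title = heading.findtext("title")
    unit = {"label": level.get("levelType"), "level": depth}
    if title is not None:
        # xsl:when: desig supplies the identifier, title supplies the name.
        unit["identifier"] = re.sub(r"\.$", "", PREFIX.sub("", desig))
        unit["name"] = capitalize_phrase(title)
    else:
        # xsl:otherwise: no identifier, desig becomes the name.
        unit["name"] = capitalize_phrase(desig)
    yield unit
    for child in level.findall("hierarchyLevel"):
        yield from units(child, depth + 1)


root = ET.fromstring(SAMPLE)
result = [u for top in root.findall("hierarchyLevel") for u in units(top)]
for u in result:
    print(u)
```

The middle unit comes out with no identifier key at all, which is the case the Parser then has to cope with.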

This leaves it up to the Parser to handle these "administrative" divisions. We'll still need to generate an identifier for these in the short term, since SD requires this. Since I've run into this a lot, I have hacked around it with custom rules – but ideally, we should support structures that do not have real identifiers.
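
As one possible short-term workaround, an identifier could be derived from the unit's name. The sketch below is purely illustrative: `fallback_identifier` is a hypothetical helper, and this is not how the custom rules mentioned above or State Decoded itself generate identifiers.

```python
import re


def fallback_identifier(name):
    """Build a slug-style identifier from a unit name (hypothetical approach).

    Lowercases the name and collapses every run of non-alphanumeric
    characters into a single hyphen.
    """
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


print(fallback_identifier("General Provisions Of Title 58.1"))
```

A slug like this stays stable only as long as the heading text does, which is one reason real support for identifier-less structures is preferable.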

waldoj · 2017-03-10T03:54:17Z

A very sensible solution!

Legal codes are made up entirely of edge cases, I believe we've learned.

krusynth · 2017-03-10T13:44:45Z

Well, moreover – trying to nicely fit legal text into abstract node-based tree models just doesn't work. Whenever this thing gets a rewrite, it needs to be waaaaaay more flexible in how things are actually composed.

This is also a massive cleanup of the structure class, to reduce the magic and sprawling code. In particular:

* Removes magic global
* Prefers passing in arguments to functions over context-specific magic properties
* Removes duplicate code in ancestor handling between id_ancestry and get_current
* Replaces url mangling with a consistent interface to permalinks
* Replaces inconsistent, renamed properties with standard internal naming
* Replaces objects abused as arrays with real arrays

Handle structures without identifiers. Fixes #574 openva/va-decoded#22

krusynth pushed a commit to krusynth/va-decoded that referenced this issue Mar 9, 2017

Adding support for administrative divisions. openva#22

70d6c87

krusynth mentioned this issue Mar 9, 2017

Various XSLT fixes #39

Merged

krusynth mentioned this issue Mar 19, 2017

Handle structures without identifiers. Fixes #574 openva/va-decoded#22 statedecoded/statedecoded#689

Merged

waldoj added a commit to statedecoded/statedecoded that referenced this issue Mar 20, 2017

Merge pull request #689 from statedecoded/issue-574

9707926

Handle structures without identifiers. Fixes #574 openva/va-decoded#22
