Skip to content

allow etree.XMLParser to resolve external entities#17504

Open
jancrichter wants to merge 3 commits intogalaxyproject:devfrom
jancrichter:patch-2
Open

allow etree.XMLParser to resolve external entities#17504
jancrichter wants to merge 3 commits intogalaxyproject:devfrom
jancrichter:patch-2

Conversation

@jancrichter
Copy link
Copy Markdown
Contributor

We use XML external (file) entities to include .rst files for the help section in tool XMLs. U ntil recently this wasn't a problem, but the default in lxml was recently changed and there is no way in planemo linting or testing to set this flag for the XMLParser.

This change mainly makes parse_xml() use the old default by setting resolve_entities=True for etree.XMLParser

Without this change planemo 75.20 on Galaxy 22.05 fails with the following error message. The same error message also appears when running planemo serve (at the very beginining) but does not make it fail. Tools load fine and work normally.

galaxy.util ERROR: Error parsing file <REDACTED>/xml/defaults.xml
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/galaxy/util/__init__.py", line 305, in parse_xml
    tree = etree.parse(str(fname), parser=parser)
  File "src/lxml/etree.pyx", line 3547, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1952, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1978, in lxml.etree._parseDocumentFromURL
  File "src/lxml/parser.pxi", line 1881, in lxml.etree._parseDocFromFile
  File "src/lxml/parser.pxi", line 1200, in lxml.etree._BaseParser._parseDocFromFile
  File "src/lxml/parser.pxi", line 633, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 743, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 672, in lxml.etree._raiseParseError
  File "<REDACTED>/xml/defaults.xml", line 38
lxml.etree.XMLSyntaxError: Entity 'help_text' not defined, line 38, column 18
Error loading tool with path <REDACTED>.xml
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/galaxy/tool_util/loader_directory.py", line 103, in _load_tools_from_path
    tool_element = loader_func(possible_tool_file)
  File "<REDACTED>/xml/defaults.xml", line 38
lxml.etree.XMLSyntaxError: Entity 'help_text' not defined, line 38, column 18

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:

The tools use a doctype declaration and XML macro to include .rst files as help section:

xml/defaults.xml:

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE documentation [
  <!ENTITY help_text SYSTEM "../README.rst">
  <!ENTITY changes_text SYSTEM "../CHANGES.rst">
]>

<macros>
  <xml name="help"> 
    <help>
      &help_text;
      &changes_text;
    </help>
  </xml>
</macros>

tool.xml:

...

  <macros>
    <import>xml/defaults.xml</import>
  </macros>

  <expand macro="help"/>

...

Running planemo lint or test on this causes the above error.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@github-actions github-actions bot added this to the 24.0 milestone Feb 20, 2024
@jdavcs jdavcs removed this from the 24.0 milestone Mar 18, 2024
@mvdbeek mvdbeek self-requested a review June 4, 2025 07:31
@mvdbeek mvdbeek force-pushed the patch-2 branch 2 times, most recently from 1ee533f to 66bc0b5 Compare June 4, 2025 09:07
Copy link
Copy Markdown
Member

@mvdbeek mvdbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've limited the location of external entities to the same directory of the file that is currently being loaded and added test cases for blocked and resolved entities. I think this is surely fine, certainly since this was possible in an unrestricted way before lxml 5.

@mvdbeek mvdbeek force-pushed the patch-2 branch 3 times, most recently from e03473c to 51705f9 Compare June 9, 2025 13:20
jancrichter and others added 3 commits January 30, 2026 10:22
We use XML external (file) entities to include .rst files for the help section in tool XMLs. U ntil recently this wasn't a problem, but the default in lxml was recently changed and there is no way in planemo linting or testing to set this flag for the XMLParser. 

This change mainly makes parse_xml() use the old default.
Thanks Nicola!

Co-authored-by: Nicola Soranzo <nicola.soranzo@gmail.com>
@mvdbeek mvdbeek self-assigned this Jan 30, 2026
@guerler guerler added this to the 26.1 milestone Feb 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants