Releases: jgm/pandoc
pandoc 2.8.0.1
- List `pdf` in `--list-output-formats`.
- EPUB writer: Fix regression with `--css` (#5937). In 2.8, `--css` would not have an effect on EPUB output.
- RST writer: Use grid tables for one-column tables, since simple tables clash with heading syntax in this case (#5936).
- Add unexported module Text.Pandoc.Readers.Metadata (see #5914).
- Use doctemplates 0.7.2, which adds the `nowrap` filter to templates.
- Update default man template using `nowrap` for .TH heading (#5929).
- HTML templates: Add support for `toc-title` variable (#5930, Alexandre Franke).
- Remove `grffile` (LaTeX package) requirement in MANUAL.txt (#5927, Ian Max Andolina).
- Use skylighting 0.8.3.
pandoc 2.8
- Improvements in templates system (from doctemplates):
  - Pandoc templates now support a number of new features that have been added in doctemplates: notably, `elseif`, `it`, partials, filters, and syntax to control nesting and reflowing of text. These changes make pandoc more suitable out of the box for generating plain-text documents from data in YAML metadata. It can create enumerated lists and even tabular structures.
  - We now use templates parameterized on doclayout Doc types. The main impact of this change is better reflowing of content interpolated into templates. Previously, interpolated variables were rendered independently and interpolated as strings, which could lead to overly long lines. Now the templates are interpolated as Doc values, which may include breaking spaces, and reflowing occurs after template interpolation rather than before.
  - Remove code from the LaTeX, Docbook, and JATS writers that looked in the template for strings to determine whether it is a book or an article, or whether csquotes is used. This was always kludgy and unreliable.
  - Change template code to use new API for doctemplates.
- Add `--defaults`/`-d` option. This adds the ability to specify a collection of default values for options in a YAML file. For example, one might define a set of defaults for letters, and then do `pandoc -d letter myletter.md -o myletter.pdf`. See the documentation of this feature in MANUAL.txt.
- Raise error on unsupported extensions (#4338).
- The `--list-extensions[=FORMAT]` option now lists only extensions that affect the given FORMAT.
- Add `-L` option as shortcut for `--lua-filter`.
- Add `--shift-heading-level-by` option and deprecate `--base-heading-level` (#5615). The new option does everything the old one does, but also allows negative shifts. It also promotes the document metadata (if not null) to a level-1 heading with a +1 shift, and demotes an initial level-1 heading to document metadata with a -1 shift. This supports converting documents that use an initial level-1 heading for the document title.
- Allow `--metadata-file` to be used repeatedly to include multiple metadata files (Owen McGrath, #5702). Values in files specified first will be overridden by those in later files.
- `--ascii` now uses numerical hex character references (#5718).
- Allow PDF output to stdout (#5751). PDF output now behaves like other binary formats: it will not be output to the terminal, but can be sent to stdout using either `-o -` or a pipe. The intermediate format will be determined based on the setting of `--pdf-engine`.
- Make some writers sensitive to ‘unlisted’ class on headings (#1762). If this is present on a heading with the ‘unnumbered’ class, the heading won’t appear in the TOC. This class has no effect if ‘unnumbered’ is not also specified. This affects HTML-based writers (including slide shows and EPUB), LaTeX (including beamer), RTF, and PowerPoint. Other writers do not yet support `unlisted`.
- Fix `gfm_auto_identifiers` behavior with emojis (#5813). Note that we also now use emoji names for emojis when `ascii_identifiers` is enabled.
- When `--ipynb-output` is used with the default “best” format, strip ANSI escape codes for non-ipynb output (#5633). These cause problems in many formats, including LaTeX.
- Don’t look for template files remotely for remote input (#5579). Previously pandoc would look for the template at a remote URL when a URL was used for the input file, instead of taking it from the data directory.
- Allow combining `-Vheader-includes` and `--include-in-header` (#5904). Previously `header-includes` set as a variable would be clobbered by material included using `--include-in-header`.
- Change merge behavior for metadata. Previously, if a document contained two YAML metadata blocks that set the same field, the conflict would be resolved in favor of the first. Now it is resolved in favor of the second (due to a change in pandoc-types). This makes the behavior more uniform with other things in pandoc (such as reference links and `--metadata-file`).
- Don’t add a newline to fragment output if there’s already one.
- Change exit codes and document in MANUAL.txt:
  - `PandocAppError` was 1, is now 4
  - `PandocOptionError` was 2, is now 6
  - `PandocMakePDFError` was 65, is now 66
- Switch to new pandoc-types and use Text instead of String [API change]. (Christian Despres, #5884).
- HTML reader:
  - Better handling of `<q>` with cite attribute (#5798, Ole Martin Ruud). If a `<q>` tag has a `cite` attribute, we interpret it as a Quoted element with an inner Span.
  - Add support for HTML `<samp>` element (#5792, Amogh Rathore). The `<samp>` element is parsed as Code with class `sample`.
  - Add support for HTML `<var>` element (#5799, Amogh Rathore). The `<var>` element is parsed as Code with class `variable`.
  - Add support for `<mark>` elements (Florian B, #5797). Parse `<mark>` elements from HTML as Spans with class `mark`.
  - Add support for `<kbd>` elements, parsing them as Span with class `kbd` (Daniele D’Orazio, #5796).
  - Add support for `<dfn>`, parsing this as a Span with class `dfn` (#5882, Florian Beeres).
- Markdown reader:
  - Headers: don’t parse content over newline boundary (#5714).
  - Handle inline code more eagerly within lists (Brian Leung, #5627).
  - Removed some needless lookaheads.
  - Don’t parse footnote body unless extension enabled.
  - Fix small super/subscript issue (#5878). Superscripts and subscripts cannot contain spaces, but newlines were previously allowed (unintentionally). This led to bad interactions in some cases with footnotes. With this change newlines are also not allowed inside super/subscripts.
  - Use `take1WhileP` for `str`, table row. This yields a small but measurable performance improvement.
- LaTeX reader:
  - Fix parsing of optional arguments that contain braced text (#5740).
  - Don’t try to parse includes if `raw_tex` is set (#5673). When the `raw_tex` extension is set, we just carry through `\usepackage`, `\input`, etc. verbatim as raw LaTeX.
  - Properly handle optional arguments for macros (#5682).
  - Fix `\\` in `\parbox` inside a table cell (#5711).
  - Improve `withRaw` so it can handle cases where the token string is modified by a parser (e.g. accent when it only takes part of a Word token) (#5686). This fixes a bug that caused the ends of certain documents to be dropped.
  - Handle `\passthrough` macro used by latex writer (#5659).
  - Support tex `\tt` command (#5654).
  - Search for image with list of extensions like latex does, if an extension is not provided (#4933).
  - Handle `\looseness` command values better (#4439).
  - Add `mbox` and `hbox` handling (Vasily Alferov, #5586). When `+raw_tex` is enabled, these are passed through literally. Otherwise, they are handled in a way that emulates LaTeX’s behavior.
  - Properly handle `\providecommand` and `\provideenvironment` (#5635). They are now ignored if the corresponding command or environment is already defined.
  - Support epigraph command in LaTeX Reader (oquechy, #3523).
  - Ensure that expanded macros in raw LaTeX end with a space if the original did (#4442).
  - Treat `ly` environment from lilypond as verbatim (Urs Liska, #5671).
  - Add `tikzcd` to list of special environments (Eigil Rischel). This allows it to be processed by filters, in the same way that one can do for `tikzpicture`.
- Roff reader:
  - Better support for `while`.
  - More improvements in parsing conditionals.
  - Fix problem parsing comments before macro.
  - Improve handling of groups.
  - Better parsing of groups (#5410). We now allow groups where the closing `\\}` isn’t at the beginning of a line.
- RST reader:
  - Keep `name` property in `imgAttr` (Brian Leung, #5619).
  - Fixed parsing of indented blocks (#5753). We were requiring consistent indentation, but this isn’t required by RST.
  - Use title, not admonition-title, for admonition title. This puts RST reader into alignment with docbook reader.
  - Don’t strip final underscore from absolute URI (#5763).
  - Avoid spurious warning when resolving links to internal anchors ending with `_` (#5763).
- Org reader:
  - Accept `ATTR_LATEX` in block attributes (Albert Krewinkel, #5648). Attributes for LaTeX output are accepted as valid block attributes; however, their values are ignored.
  - Modify handling of example blocks (Brian Leung, #5717).
  - Allow the `-i` switch to ignore leading spaces (Brian Leung).
  - Handle awkwardly-aligned code blocks within lists (Brian Leung). Code blocks in Org lists must have their `#+BEGIN_` aligned in a reasonable way, but their other components can be positioned otherwise.
  - Fix parsing of empty comment lines (#5856, Albert Krewinkel). Comment lines in Org-mode can be completely empty.
- Muse reader (Alexander Krotov):
- DokuWiki reader:
  - Parse markup inside monospace (`''`) (#5916, Alexander Krotov).
- Docx reader:
  - Move style-parsing-specific code to a new unexported module, Text.Pandoc.Readers.Docx.Parse.Styles (Nikolay Yakimov).
  - Move StyleMap to docx writer (Nikolay Yakimov).
  - Only use LTR when it is overriding BiDi setting (#5723, Jesse Rosenthal). The left-to-right direction setting in docx is used in the spec only for overriding an explicit right-to-left setting. We only process ...
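
The metadata merge change in this release (a later YAML block, or a later `--metadata-file`, now wins on conflict) can be pictured with a small Python sketch. This is a toy model only: metadata is treated as flat dictionaries, the function names `merge_later_wins` and `merge_first_wins` are hypothetical, and none of this is pandoc's actual implementation.

```python
from typing import Any, Dict, Iterable

Meta = Dict[str, Any]


def merge_later_wins(blocks: Iterable[Meta]) -> Meta:
    """Toy model of the 2.8 behaviour: on conflict, the later block wins."""
    merged: Meta = {}
    for block in blocks:
        merged.update(block)
    return merged


def merge_first_wins(blocks: Iterable[Meta]) -> Meta:
    """Toy model of the pre-2.8 behaviour for YAML blocks: the first value is kept."""
    merged: Meta = {}
    for block in blocks:
        for key, value in block.items():
            merged.setdefault(key, value)
    return merged


first = {"title": "Draft", "lang": "en"}
second = {"title": "Final"}

print(merge_later_wins([first, second]))  # {'title': 'Final', 'lang': 'en'}
print(merge_first_wins([first, second]))  # {'title': 'Draft', 'lang': 'en'}
```

The sketch only covers top-level fields; how nested values are combined is not stated in these notes and is left out here.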
pandoc 2.7.3
- Add `jira` (Atlassian’s Jira wiki markup) as output format (#2497, Albert Krewinkel).
- Add `tex_math_dollars` to `multimarkdownExtensions` (#5512). This form is now supported in multimarkdown, in addition to `tex_math_double_backslash`.
- Fix `--self-contained` so it works when output format has extensions. Previously if you used `--self-contained` with `html-smart` or `html+smart`, it wouldn’t work.
- Add template variable `curdir` with working directory from which pandoc is run (#5464).
- Markdown reader: don’t create implicit reference for empty header (#5549).
- Muse reader: allow images inside link descriptions (Alexander Krotov).
- HTML reader: epub related fixes.
  - With epub extensions, check for `epub:type` in addition to `type`.
  - Fix problem with noteref parsing which caused block-level content to be eaten with the noteref.
  - Rename `pAnyTag` to `pAny`.
  - Refactor note resolution.
  - Trim definition list terms (Alexander Krotov).
- LaTeX reader:
  - Add braces when resolving `\DeclareMathOperator` (#5441). These seem to be needed for xelatex but not pdflatex.
  - Allow newlines in `\mintinline`.
  - Pass through unknown listings language as class (#5540). Previously if the language was not in the list of languages supported by listings, it would not be added as a class, so highlighting would not be triggered.
  - `rawLaTeXInline`: Include trailing `{}`s in raw latex commands (#5439). This change affects the markdown reader and other readers that allow raw LaTeX. Previously, trailing `{}` would be included for unknown commands, but not for known commands. However, they are sometimes used to avoid a trailing space after the command. The chances that a `{}` after a LaTeX command is not part of the command are very small.
- MediaWiki reader: handle multiple attributes in table row (#5471, chinapedia).
- Docx reader: Add support for `w:rtl` (#5545). Elements with this property are put into Span inlines with `dir="rtl"`.
- DocBook reader: Issue `IgnoredElement` warnings.
- Org reader (Albert Krewinkel):
  - Fix planning elements in headers level 3 and higher (#5494). Planning info is now always placed before the subtree contents. Previously, the planning info was placed after the content if the header’s subtree was converted to a list, which happens with headers of level 3 and higher by default.
  - Omit, but warn about unknown export options. Unknown export options are properly ignored and omitted from the output.
  - Prefer plain symbols over math symbols (#5483). Symbols like `\alpha` are output plain and unemphasized, not as math.
  - Recognize emphasis after TODO/DONE keyword (#5484).
- FB2 reader:
  - Skip unknown elements rather than throwing errors (#5560). Sometimes custom elements are used (e.g. `id` element inside `author`); previously the reader would halt with an error. Now it skips the element and issues an `IgnoredElement` warning.
  - Parse notes (#5493, Alexander Krotov).
  - Internal improvements (Alexander Krotov).
- OpenDocument writer: Roll back automatic figure/table numbering (#5474). This was added in pandoc 2.7.2, but it makes it impossible to use pandoc-crossref. So this has been rolled back for now, until we find a good solution to make this behavior optional (or a creative way to let pandoc-crossref and this feature coexist).
- New module Text.Pandoc.Writers.Jira, exporting `writeJira` [API change] (Albert Krewinkel).
- EPUB writer:
  - Don’t include ‘landmarks’ if there aren’t any. Previously we could get an empty ol element, which caused validation errors with epubcheck.
  - Ensure unique ids for stylesheets in content.opf (#5463).
  - Make stylesheet link compatible with kindlegen (#5466, Eric Schrijver). Pandoc omitted `type="text/css"` from both `<style>` and `<link rel="stylesheet">` elements in all templates, which is valid according to the spec. However, Amazon’s kindlegen software relies on this attribute on `<link>` elements when detecting stylesheets to include.
- HTML writer:
  - Output video and audio elements depending on file extension of the image path (Mauro Bieg).
  - Emit empty alt tag in figures (#5518, Mauro Bieg). The same text is already in the `<figcaption>` and screen-readers would read it twice, see #4737.
  - Don’t add variation selector if it’s already there. This fixes round-trip failures.
  - Prevent gratuitous emojification on iOS (#5469). iOS chooses to render a number of Unicode entities, including ‘↩’, as big colorful emoji. This can be defeated by appending Unicode ‘VARIATION SELECTOR-15’/‘VARIATION SELECTOR-16’. So we now append this character when escaping strings, for both ‘↩’ and ‘↔’. If other characters prove problematic, they can simply be added to `needsVariationSelector`.
  - Add `class="heading"` to level 7+ Headers rendered as `<p>` elements (#5457).
- RST writer: treat Span with no attributes as transparent (#5446). Previously an Emph inside a Span was being treated as nested markup and ignored. With this patch, the Span is just ignored.
- LaTeX writer:
  - Include inline code attributes with `--listings` (#5420).
  - Don’t produce columns environment unless beamer (#5485).
  - Fix footnote in image caption. Regression: the fix for #4683 broke this case.
  - Don’t highlight code in headings (#5574). This causes compilation errors.
  - Use `\mbox` to get proper behavior inside `\sout` (#5529).
- EPUB writer: Fix document section assignments (#5546). For example, introduction should go in bodymatter, not frontmatter, and epigraph, conclusion, and afterword should go in bodymatter, not backmatter. For the full list of assignments, see the manual.
- Markdown writer:
  - Add backslashes to avoid unwanted interpretation of definition list terms as other kinds of block (#554).
  - Ensure the code fence is long enough (#5519). Previously too few backticks were used when the code block contained an indented line of backticks. (Ditto tildes.)
  - Handle labels with integer names (Jesse Rosenthal, #5495). Previously if labels had integer names, it could produce a conflict with auto-labeled reference links. Now we test for a conflict and find the next available integer. This involves adding a new state variable `stPrevRefs` to keep track of refs used in other document parts when using `--reference-location=block|section`.
- Textile writer: fix closing tag for math output (Albert Krewinkel). Opening and closing tag for math output match now.
- Org writer: always indent src blocks content by 2 spaces (#5440, Albert Krewinkel). Emacs always uses two spaces when indenting the content of src blocks, e.g., when exiting a `C-c '` edit-buffer. Pandoc used to indent contents by the space-equivalent of one tab, but now always uses two spaces, too.
- Asciidoc writer:
  - Use `` `+...+` `` form for inline code. The old `` `a__b__c` `` yields emphasis inside code in asciidoc. To get a pure literal code span, use `` `+a__b__c+` ``.
  - Use proper smart quotes with asciidoctor (#5487). Asciidoctor has a different format for smart quotes.
  - Use doubled ## when necessary for spans (#5566).
  - Ensure correct nesting of strong/emph (#5565): strong must be the outer element.
- JATS writer:
  - Wrap elements with p when needed (#5570). The JATS spec restricts what elements can go inside `fn` and `list-item`. So we wrap other elements inside `<p specific-use="wrapper">` when needed.
  - Properly handle footnotes (#5511) according to “best practice.” (Group them at the end in `<fn-group>` and use `<xref>` elements to link them.)
  - Fix citations with PMID so they validate (#5481). This includes an update to data/jats.csl.
  - Ensure validity of `<pub-date>` by parsing the date and extracting year, month, and day, as expected. Also add an iso-8601-date attribute automatically.
  - Don’t use `<break>` element for LineBreak. It is only allowed in a few special contexts, and not in `<p>` elements.
  - Don’t make `<string-name>` a child of `<string>`, which is illegal.
- FB2 writer:
  - Do not wrap note references into `<sup>` and brackets (Alexander Krotov). Existing FB2 readers, such as FBReader, already display links with `type="note"` as a superscript.
  - Use genre metadata field (#5478).
- Muse writer: do not escape empty line after `<br>` (Alexander Krotov).
- Add unicode code point in “Missing character” warning (#5538). If the character isn’t in the console font, the message is pretty useless, so we show the code point for anything non-ASCII.
- Lua: add Version type to simplify comparisons (Albert Krewinkel). Version specifiers like `PANDOC_VERSION` and `PANDOC_API_VERSION` are turned into `Version` objects. The objects simplify version-appropriate comparisons while maintaining backward-compatibility. A function `pandoc.types.Version` is added as part of the newly introduced module `pandoc.types`, allowing users to create version objects in scripts.
- pandoc lua module (Albert Krewinkel):
  - Fix deletion of nonexistent attributes (#5569).
  - Better tests for Attr and AttributeList.
- pandoc.mediabag lua module (Albert Krewinkel):
  - Add function `delete` for deleting a single item.
  - Add function `empty` for removing all entries.
  - Add function `items` for iterating over mediabag.
- Text.Pandoc.Class: Fix handling of `file:` URL scheme in `downloadOrRead` (#5517, Mauro Bieg). Previously `file:/` URLs were handled wrongly and pandoc attempted to make HTTP requests, which failed.
- Text.Pandoc.MIME: add `mediaCategory` [API change] (Mauro Bieg).
...
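
The Markdown writer fix for #5519 in this release (the code fence must be long enough when the block itself contains a line of backticks) can be illustrated with a short Python sketch. The function name `backtick_fence` and the exact rule used here (longest run of backticks at the start of a possibly indented line, plus one, with a minimum of three) are assumptions for illustration, not pandoc's actual code.

```python
import re


def backtick_fence(code: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run that opens a line of `code`."""
    longest = 0
    for line in code.splitlines():
        # Allow leading spaces or tabs, since an indented run of backticks
        # is the case the release note calls out.
        match = re.match(r"[ \t]*(`+)", line)
        if match:
            longest = max(longest, len(match.group(1)))
    return "`" * max(minimum, longest + 1)


# A code block whose first line is an indented run of four backticks.
sample = "    " + "`" * 4 + "\nx = 1"

print(len(backtick_fence("x = 1")))  # 3
print(len(backtick_fence(sample)))   # 5
```

The same idea would apply to tilde fences, which the release note mentions in passing.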
pandoc 2.7.2
- Add XWiki writer (#1800, Derek Chen-Becker). Add `Text.Pandoc.Writers.XWiki`, exporting `writeXWiki` [API change].
- Dokuwiki Reader: parse single curly brace (#5416, Mauro Bieg).
- Vimwiki reader: improve handling of internal links (#5414). We no longer append `.html` to link targets, and we add a title `wikilink`. This mirrors behavior of other wiki readers. Generally the `.html` extension is not wanted. It may be important for output to HTML in certain circumstances, but it can always be added using a filter that matches on links with title `wikilink`.

  If your workflow requires the previous behavior, here is a lua filter that will add the `.html` extension:

  ```lua
  -- Append ".html" to the target of every link the Vimwiki reader
  -- marked with the title "wikilink"; other links pass through unchanged.
  function Link(el)
    if el.title == 'wikilink' then
      el.target = el.target .. ".html"
    end
    return el
  end
  ```
- ipynb reader:
  - Use format `ipynb` for raw cell where no format given.
  - Avoid introducing spurious `.0` on integers in metadata.
- Markdown reader: fenced div takes priority over setext header.
- HTML reader: read `data-foo` attribute into `foo` (#5392). The HTML writer adds the `data-` prefix for HTML5 for nonstandard attributes. But the attributes are represented in the AST without the `data-` prefix, so we should strip this when reading HTML.
- LaTeX reader: Improve autolink detection (#5340).
- PowerPoint writer (Jesse Rosenthal):
  - Expand builtin reference doc to model all layouts. The previous built-in reference doc had only title and content layouts. Add in a section-header slide and a two-content slide, so users can more easily modify it to build their own templates.
  - Always open up in slide view. When editing a template/reference-doc, the user might be in Master view, but when producing a slide show, it is assumed that slide view will be desired.
  - Remove `handoutsMasterList` from template presentation.xml.
  - Fix numerous errors in templating (#5402). Previously, some templates produced by Office 365 (MacOS) would not render with `--reference-doc` correctly. We now apply correct shapes for content, and build shape trees correctly.
  - Make default placeholder type for template lookup.
  - Apply speaker notes to metadata slide if applicable.
  - Test for speaker notes after breaking header.
  - Correctly handle notes after section-title header. Previously, if notes came after a section-title header (i.e., a level-1 header in a slide-level=2 presentation), they would go on the next slide. This keeps them on the slide with the header.
  - Internal improvements.
- ipynb writer:
  - Use format `ipynb` for raw cell where no format given. According to nbformat docs, this is supposed to render in every format. We don’t do that, but we at least preserve it as a raw block in markdown, so you can round-trip.
  - Consolidate adjacent raw blocks. Sometimes pandoc creates two HTML blocks, e.g. one for the open tag and one for a close tag. If these aren’t consolidated, only one will show up in output cell.
  - Fixed carry-over of nbformat from metadata.
  - Preserve `nbformat_minor` if it’s given. This helps with round-tripping.
- LaTeX writer:
  - Avoid inadvertently creating `` ?` `` or `` !` `` ligatures (#5407). These are upside down ? and !, respectively.
  - Fix footnotes in table caption and cells (#5367). This fixes a bug wherein footnotes appeared in the wrong order, and with duplicate numbers, when in table captions and cells. We now use regular `\footnote` commands, even in the table caption and the minipages containing cells. Apparently longtable knows how to handle this.
- HTML writer: Don’t add `data-` prefix to RDFa attributes (#5403).
- JATS writer: Ensure that plain strings go inside `<pub-id>` tag (#5397).
- Markdown writer:
  - Better rendering of numbers (#5398). If the number is integral, we render it as an integer, not a float.
  - Proper rendering of empty map in YAML metadata (#5398). Should be `{}`, not empty string.
  - Properly escape attributes in Markdown writer (#5369).
  - Be sure implicit figures work in list contexts (#5368). Previously they would sometimes not work: e.g., when they occurred in final paragraphs in lists that were originally parsed as Plain and converted later using PlainToPara.
- Docx writer: Use `w:br` without attributes for line breaks (#5377). We previously added the attribute `type="textWrapping"`, but this causes problems on Word Online.
- LaTeX template (Andrew Dunning):
  - Ensure correct heading/table order (#5365). Improve workaround (#1658) for tables following headings. The new solution works whether or not the `indent` variable is enabled.
  - Remove `subparagraph` variable. The default is now to use run-in style for level 4 and 5 headings (`\paragraph` and `\subparagraph`). To get the previous default behavior (where these were formatted as blocks, like `\subsubsection`), set the `block-headings` variable.
  - Add pandoc to PDF metadata (#5388).
  - Group graphics-related code (#5389).
  - Move `\setstretch` after front matter (#5179). Ensures that `\maketitle`, `\tableofcontents`, and so forth are not affected by changes to line spacing.
- Update data/jats.csl to avoid commas between name-part elements (#5397).
- Add support for golang (`go`) with `--listings` (#5427).
- Text.Pandoc.Shared: improve `metaToJSON` behavior with numbers. We now do a better job marshalling numbers from MetaString or MetaInlines into JSON Number.
- Text.Pandoc.Writers.Shared: `metaValueToJSON`: use Number Values for integers. Pandoc’s MetaValue doesn’t have a distinguished number type, so numbers are put in MetaStrings. If the MetaString consists entirely of digits, we convert it to a Number. We should probably consider adding a MetaNumber constructor to MetaValue, for better round-tripping with JSON etc. This change aids round-tripping in ipynb metadata fields, like `toc_depth`.
- Text.Pandoc.Class: `fetchItem`: don’t treat UNC paths as protocol-relative URLs (#5127). These are paths beginning `//?/UNC/...`.
- Text.Pandoc.ImageSize: Improve `pdfSize` so it handles a wider range of PDFs (#4322, with help from Richard Davis).
- Text.Pandoc.Pretty: avoid stack overflow by using strict sum (#5401).
- Fix harmless error in file-scope code (#5422).
- MANUAL.txt:
  - Improve ‘header’ and ‘heading’ usage (#5423, Andrew Dunning). The term ‘header’ was being used where ‘heading’ is more appropriate.
  - Add paragraph on options affecting markdown in ipynb.
- stack.yaml: remove `-Wmissing-home-modules`. This seems to cause problems with stack ghci. Remove RTS options.
- Add ghc-options to cabal.project.
- appveyor.yml: use ghc 8.6.4. Fixes segfault issues on Windows (#5037).
- linux build process: Remove clone of pandoc-citeproc (#5366). It wasn’t being used; cabal.project specifies the version to use.
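
The `metaValueToJSON` rule described in this release (a MetaString made up entirely of digits becomes a JSON Number) can be sketched in Python as follows. This is an illustration under stated assumptions, not the Haskell implementation: the helper name `meta_string_to_json` is hypothetical, "digits" is taken to mean ASCII `0`-`9`, and edge cases such as leading zeros are not considered.

```python
import json
import re
from typing import Union


def meta_string_to_json(value: str) -> Union[int, str]:
    """Return an int when the string consists only of ASCII digits, else the string unchanged."""
    if re.fullmatch(r"[0-9]+", value):
        return int(value)
    return value


# Hypothetical notebook metadata in which every value arrives as a string.
metadata = {"toc_depth": "3", "title": "Notebook 2", "version": "4.2"}
converted = {key: meta_string_to_json(val) for key, val in metadata.items()}

print(json.dumps(converted))
# {"toc_depth": 3, "title": "Notebook 2", "version": "4.2"}
```

Only `toc_depth` changes type; strings that merely contain digits, such as `"4.2"`, are left alone, which is what lets integer fields survive a round trip through JSON.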
pandoc 2.7.1
- Add tectonic as an option for `--pdf-engine` (#5345, Cormac Relf). Runs tectonic on STDIN instead of a temporary .tex file, so that it looks in the working directory for `\include` and `\input` like the rest of the engines. Allows overriding the output directory args with `--pdf-engine-opt=--outdir --pdf-engine-opt="$DIR"`.
- Allow `-o`/`--output` to be used with `--print-default-data-file`, `--print-highlighting-style`, `--print-default-template`. Note that `-o` must occur BEFORE the `--print*` command on the command line (this is documented, #5357).
- LaTeX reader:
  - Support `\underline`, `\ul`, `\uline` (#5359, Paul Tilley). These are parsed as a Span with class `underline`, as with other readers.
  - Ensure that `\Footcite` and `\Footcites` get put in a note.
- ipynb reader:
  - Remove sensitivity to `raw_html`, `raw_tex` extensions. We now include every output format. Pruning is handled by `--ipynb-output`.
  - Better handling of cell metadata. We now include even complex cell metadata in the Div’s attributes (as JSON, in complex cases, or as plain strings in simple cases).
- ipynb writer:
  - Recurse into native divs for output cell data (#5354).
  - Render cell metadata fields from div attributes.
- Docx writer: avoid extra copy of abstractNum and num elements in numbering.xml. This caused pandoc-produced docx files to be uneditable using Word Online (#5358).
- Markdown writer: improve handling of raw blocks/inline. We now emit raw content using `raw_attribute` when no more direct method is available. Use of `raw_attribute` can be forced by disabling `raw_html` and `raw_tex`.
- LaTeX writer: Add classes for frontmatter support (#5353, Andrew Dunning) and remove frontmatter from `scrreprt`.
- LaTeX template:
  - Improve readability (#5363, Andrew Dunning).
  - Robust section numbering removal (#5351, Andrew Dunning). Ensures that section numbering does not reappear with custom section levels. See https://tex.stackexchange.com/questions/473653/.
  - Better handling of front/main/backmatter (#5348). In pandoc 2.7 we assumed that every class with chapters would accept `\frontmatter`, `\mainmatter`, and `\backmatter`. This is not so (e.g. report does not). So pandoc 2.7 breaks on report class by including an unsupported command. Instead of the `book-class` variable, we use two variables, `has-chapters` and `has-frontmatter`, and set these intelligently in the writer.
- Text.Pandoc.Shared: Improve `filterIpynbOutput`. Ensure that images are prioritized over text. `best` should include everything for ipynb.
- Tests.Old: specify `--data-dir=../data` to ensure tests can find data files even if they haven’t been installed. Remove old `pandoc_datadir` environment variable, which hasn’t done anything for a long time.
- MANUAL.txt: Add recommendation to use `raw_attribute` with ipynb (#5354).
- Use cmark-gfm-hs 0.1.8 (note that 0.1.7 is buggy).
- Use latest pandoc-citeproc, texmath.
pandoc 2.7
- Use XDG data directory for user data directory (#3582). Instead of `$HOME/.pandoc`, the default user data directory is now `$XDG_DATA_HOME/pandoc`, where `XDG_DATA_HOME` defaults to `$HOME/.local/share` but can be overridden by setting the environment variable. If this directory is missing, then `$HOME/.pandoc` is searched instead, for backwards compatibility. However, we recommend moving local pandoc data files from `$HOME/.pandoc` to `$HOME/.local/share/pandoc`. On Windows the default user data directory remains the same.
- Slide show formats behavior change: content under headers less than slide level is no longer ignored, but included in the title slide (for HTML slide shows) or in a slide after the title slide (for beamer). This change makes possible 2D reveal.js slideshows with content in the top slide on each stack (#4317, #5237).
- Add command line option `--ipynb-output=all|none|best` (#5339). Output cells in ipynb notebooks often contain several different versions of an output, with different MIME types, e.g. an HTML table and a plain-text fallback. Specifying `--ipynb-output=best` (the default) ensures that the best version for the output format is used. `all` includes all versions, and `none` suppresses them all, leaving output cells empty.
- `asciidoctor` is now an output format separate from `asciidoc`, to accommodate some minor implementation-specific differences (currently just in the treatment of display math).
- Add `latexmk` as an option for `--pdf-engine` (#3195). Note that you can use `--pdf-engine-opt=-outdir=bar` to specify a persistent temp directory.
- Markdown reader:
  - Improve tight/loose list handling (#5285). Previously the algorithm allowed list items with a mix of Para and Plain, which is never wanted.
  - Add newline when parsing blocks in YAML (#5271). Otherwise last block gets parsed as a Plain rather than a Para. This is a regression in pandoc 2.x. This patch restores pandoc 1.19 behavior.
  - Make `yamlToMeta` respect extensions (#5272, Mauro Bieg). This adds a `ReaderOptions` parameter to `yamlToMeta` [API change].
  - Fix bug parsing fenced code blocks (#5304). Previously parsing would break if the code block contained a string of backticks of sufficient length followed by something other than end of line.
- LaTeX reader: don’t let `\egroup` match `{`. `braced` now actually requires nested braces. Otherwise some legitimate command and environment definitions can break.
- Docx reader (Jesse Rosenthal):
  - Rename `getDocumentPath` as `getDocumentXmlPath`.
  - Use field notation for setting `ReaderEnv`.
  - Figure out `document.xml` path once at the beginning of parsing, and add it to the environment, so we can avoid repeated lookups.
  - Dynamically determine main document xml path (#5277). The desktop Word program places the main document file in `word/document.xml`, but the online word places it in `word/document2.xml`. This file path is actually stated in the root `_rels/.rels` file, in the `Relationship` element with an `http://../officedocument` type.
  - Fix paths in archive to prevent Windows failure (#5277). Some paths in archives are absolute (have an opening slash) which, for reasons unknown, produces a failure in the test suite on MS Windows. This fixes that by removing the leading slash if it exists.
  - Add comments to aid code readability.
  - Trim space inside the last inline (#5273).
  - Unwrap sdt elements in footnotes and comments (#5302).
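The Docx reader's path lookup described above can be pictured with a small Python sketch. This is not pandoc's Haskell code: the function name `main_document_path` and the fallback to `word/document.xml` are assumptions made for illustration. It reads the contents of `_rels/.rels`, picks the `Relationship` whose type ends in `officeDocument`, and drops a leading slash from the target.

```python
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship parts such as _rels/.rels.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def main_document_path(rels_xml: str) -> str:
    """Return the archive path of the main document part (illustrative only)."""
    root = ET.fromstring(rels_xml)
    for rel in root.iter(f"{{{RELS_NS}}}Relationship"):
        # Match the relationship whose type ends in ".../officeDocument".
        if rel.get("Type", "").lower().endswith("/officedocument"):
            target = rel.get("Target", "")
            # Some archives use absolute targets; remove one leading slash.
            return target[1:] if target.startswith("/") else target
    # Assumed fallback: the location used by desktop Word.
    return "word/document.xml"
```

Given a `.rels` file whose office-document relationship targets `/word/document2.xml`, the sketch returns `word/document2.xml`.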
- Muse reader (Alexander Krotov):
  - Test that block level markup does not break `<verbatim>`.
  - Add secondary note support.
- ipynb reader: handle images referring to attachments. Previously we didn’t strip off the attachment: prefix, so even though the attachment was available in the mediabag, pandoc couldn’t find it.
- JATS reader:
  - Fix parsing of figures (#5321). This ensures that a figure containing a single image is parsed as a pandoc “implicit figure” (i.e., a Para with a single Image whose title attribute begins with `fig:`). More complex figures will still be parsed as divs.
  - Support `fig-group` block element (#5317).
  - Handle citations with multiple references (#5310). The `rid` attribute can have a space-separated list of ids.
- AsciiDoc Writer: Add `writeAsciiDoctor` [API change, Tarik Graba]. Handle display math appropriately for Asciidoctor.
- JATS writer: wrap figure caption in `<p>` to fix validation (#5290, Mauro Bieg).
- HTML writer:
- ipynb writer:
  - Ensure final newline.
  - Only include metadata under `jupyter` field.
  - Don’t create attachments for images with absolute URIs, including data: URIs (#5303).
  - Keep plain text fallbacks in output even if a richer format is included (#5293). We don’t know what output format will be needed. See the `--ipynb-output` command line option for a way to control what formats are included in the output.
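To make the three `--ipynb-output` modes concrete, here is a minimal Python sketch that operates on the MIME bundle of a single notebook output. It is an illustration, not pandoc's `filterIpynbOutput`: the function name `filter_output_data` and the preference order in `PREFERRED` are assumptions, since which version counts as "best" depends on the target format.

```python
# Hypothetical preference order; pandoc's actual choice depends on the output format.
PREFERRED = ["text/html", "image/png", "text/plain"]


def filter_output_data(data: dict, mode: str = "best", preferred=PREFERRED) -> dict:
    """Filter one output cell's MIME bundle according to the chosen mode."""
    if mode == "all":
        return dict(data)          # keep every version
    if mode == "none":
        return {}                  # leave the output cell empty
    if mode == "best":
        for mime in preferred:     # first preferred type that is present wins
            if mime in data:
                return {mime: data[mime]}
        return {}
    raise ValueError(f"unknown mode: {mode!r}")
```

With a bundle holding both `text/html` and `text/plain`, `best` keeps only the HTML version under this assumed ordering, while `all` keeps both.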
- Markdown writer: use `markdown="1"` when appropriate for Divs: when `native_divs` and `markdown_in_html_blocks` are disabled but `raw_html` and `markdown_attribute` are enabled.
- LaTeX writer:
  - Use right fold for `escapeString`. This is more elegant than the explicit recursive code we were using.
  - Avoid `{}` after control sequences when escaping. `\ldots{}.` doesn’t behave as well as `\ldots.` with the latex ellipsis package. This patch causes pandoc to avoid emitting the `{}` when it is not necessary. Now `\ldots` and other control sequences used in escaping will be followed by either a `{}`, a space, or nothing, depending on context.
  - For beamer, include contents under headers superordinate to slidelevel (#4317). Currently we keep the fancy title slide, and add a new slide with the same title and whatever content was under the header.
- Powerpoint writer (Jesse Rosenthal): support underlines. Use span with single class “underline” as in docx writer.
- Muse writer: escape secondary notes (Alexander Krotov).
- FB2 writer: add section identifiers support (#5229, John KetzerX).
- Make `--fail-if-warnings` work for PDF output (#5343).
- Lua filters (Albert Krewinkel):
  - Load module `pandoc` before calling `init.lua` (#5287). The file `init.lua` in pandoc’s data directory is run as part of pandoc’s Lua initialization process. Previously, the `pandoc` module was loaded in `init.lua`, and the structure for marshaling was set up after. This allowed simple patching of element marshaling, but made using `init.lua` more difficult. Now, all required modules are loaded before calling `init.lua`. The file can be used entirely for user customization. Patching marshaling functions, while discouraged, is still possible via the `debug` module.
  - All Lua modules bundled with pandoc, i.e., `pandoc.List`, `pandoc.mediabag`, `pandoc.utils`, and `text` are re-exported from the `pandoc` module. They are assigned to the fields `List`, `mediabag`, `utils`, and `text`, respectively.
- Text.Pandoc.Lua (Albert Krewinkel):
  - Split `StackInstances` into smaller Marshaling modules.
  - Get `CommonState` from Lua global. This allows more control over the common state from within Lua scripts.
- LaTeX template:
- epub3 template: Add titlepage class to section (#5269).
- HTML5 template: Add ARIA role `doc-toc` for table of contents (#4213).
- Make `--metadata-file` use pandoc-markdown (#5279, #5272, Mauro Bieg).
- Text.Pandoc.Shared:
  - Remove `withTempDir` [API change].
  - Add new exported function `defaultUserDataDirs` [API change].
  - Add `filterIpynbOutput` [API change].
  - `compactify`: Avoid lists with a mix of Plain and Para elements (#5285).
- Text.Pandoc.Translations: reorder alphabetically and remove `Author` (#5334, Mauro Bieg).
- Text.Pandoc.Extensions:
  - More carefully groom ipynb default extensions.
  - Add `all_symbols_escapable` to `githubMarkdownExtensions`.
- Text.Pandoc.PDF:
  - Use system temp directory when possible (#1192). Previously we created temp dirs in the working directory, partly (a) because there were problems using the system temp directory on Windows, when their pathnames included tildes, and partly (b) because programs like `epstopdf.pl` would not be allowed to write to directories outside the working directory in restricted mode. We now (a) use the system temp dir except when the path includes tildes, and (b) set `TEXMFOUTPUT` when creating the PDF, so that subsidiary programs can use the system temp directory. This addresses problems that occurred when pandoc was used in a synced directory (such as Dropbox).
  - Change types of subsidiary functions to PandocIO, to allow warnings to be threaded through (#5343).
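The temp-directory policy in the Text.Pandoc.PDF entry can be sketched in Python as follows. The helper names `choose_temp_parent` and `engine_environment` are hypothetical, and the sketch mirrors only the two rules stated above: fall back to the working directory when the system temp path contains a tilde, and export `TEXMFOUTPUT` for subsidiary programs.

```python
import os
import tempfile


def choose_temp_parent(system_tmp=None, cwd=None):
    """Pick the parent directory for the PDF build's temp dir."""
    system_tmp = tempfile.gettempdir() if system_tmp is None else system_tmp
    cwd = os.getcwd() if cwd is None else cwd
    # Tildes (e.g. Windows short names like LONGUS~1) caused problems,
    # so fall back to the working directory in that case.
    return cwd if "~" in system_tmp else system_tmp


def engine_environment(tmp_dir, base_env=None):
    """Copy the environment and point TEXMFOUTPUT at the temp dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["TEXMFOUTPUT"] = tmp_dir
    return env
```

The resulting mapping could be passed as `env=` to `subprocess.run` when invoking a PDF engine.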
- Text.Pandoc.MIME: add WebP (#5267, Mauro Bieg).
- Tests: avoid calling `findPandoc` multiple times.
- Old tests: remove need for temp files by using `pipeProcess`.
- Added simple ipynb reader/writer tests (#5274).
- Rearrange `--help` output in a more rational way, with common options at the beginning and options grouped by function (#5336).
- trypandoc: Add JATS and other missing formats (Arfon Smith, #5291).
- Add missing copyright notices and remove license boilerplate (#4592, Albert Krewinkel).
- Use latest basement/foundation on 32bi...
pandoc 2.6
- Support ipynb (Jupyter notebook) as input and output format.
  - Add `ipynb` as input and output format (extension `.ipynb`).
  - Added Text.Pandoc.Readers.Ipynb [API change].
  - Added Text.Pandoc.Writers.Ipynb [API change].
  - Add `PandocIpynbDecodingError` constructor to Text.Pandoc.Error.Error [API change].
  - Depend on ipynb library.
  - Note: there is no template for ipynb.
- Add DokuWiki reader (#1792, Alexander Krotov). This adds Text.Pandoc.Readers.DokuWiki [API change], and adds `dokuwiki` as an input format.
- Implement task lists (#3051, Mauro Bieg). Added `task_lists` extension. Task lists are supported from markdown and gfm input. They should work, to some degree, in all output formats, though in most formats you’ll get a bullet list with a unicode character for the box. In HTML, you get checkboxes and in LaTeX/PDF output, a box is used as the list marker. API changes:
  - Added constructor `Ext_task_lists` to `Extension`.
  - Added `taskListItemFromAscii` and `taskListItemToAscii` to Text.Pandoc.Shared.
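The ASCII/Unicode round trip behind the task-list entry can be illustrated in Python. This is only a sketch: the function names are Python stand-ins for `taskListItemFromAscii` and `taskListItemToAscii`, and the box characters U+2610 and U+2612 are assumed here, since the notes say only that a unicode character is used for the box.

```python
UNCHECKED = "\u2610"  # ballot box (assumed character)
CHECKED = "\u2612"    # ballot box with X (assumed character)


def task_item_from_ascii(text: str) -> str:
    """Turn a leading '[ ] ' or '[x] ' marker into a Unicode box."""
    if text.startswith("[ ] "):
        return UNCHECKED + text[3:]
    if text.startswith(("[x] ", "[X] ")):
        return CHECKED + text[3:]
    return text  # not a task item: leave unchanged


def task_item_to_ascii(text: str) -> str:
    """Inverse direction: turn a leading Unicode box back into ASCII."""
    if text.startswith(UNCHECKED + " "):
        return "[ ]" + text[1:]
    if text.startswith(CHECKED + " "):
        return "[x]" + text[1:]
    return text
```

A writer without native checkboxes could emit the boxed text directly as the bullet item's content.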
- Allow some command line options to take URL in addition to FILE. `--include-in-header`, `--include-before-body`, `--include-after-body`.
- HTML reader:
- RST reader:
  - Change treatment of `number-lines` directive (Brian Leung, #5207). Directives of this type without numeric inputs should not have a `startFrom` attribute; with a blank value, the writers can produce extra whitespace.
  - Removed superfluous `sourceCode` class on code blocks (#5047).
  - Handle `sourcecode` directive as synonym for `code` (#5204).
- Markdown reader:
- Org reader:
  - Handle `minlevel` option differently (#5190, Brian Leung). When `minlevel` exceeds the original minimum level observed in the file to be included, every heading should be shifted rightward.
  - Allow for case of `:minlevel == 0` (#5190).
  - Fix treatment of links to images (#5191, Albert Krewinkel). Links with descriptions which are pointing to images are no longer parsed as inline images, but as links.
  - Add support for #+SELECT_TAGS (Brian Leung).
  - Separate filtering logic from conversion function (Brian Leung).
- TWiki reader: Fix performance issue with underscores (#3921).
- MediaWiki reader: use `_` instead of `-` in auto-identifiers (#4731). We may still not be exactly matching mediawiki’s algorithm.
- LaTeX reader:
  - Remove `sourceCode` class for literate Haskell code blocks (#5047). Reverse order of `literate` and `haskell` classes on code blocks when parsing literate Haskell, so `haskell` is first.
  - Support `\DeclareMathOperator` (#5149).
  - Support `\inputminted` (#5103).
  - Support `\endinput` (#5233).
  - Allow includes with dots like `cc_by_4.0`. Previously the `.0` was interpreted as a file extension, leading pandoc not to add `.tex` (and thus not to find the file). The new behavior matches tex more closely.
- Man reader:
  - Use `mapLeft` from Shared instead of defining own.
- Docx reader (Jesse Rosenthal):
  - Handle level overrides (#5134).
- Docx writer:
- ICML writer (Mauro Bieg):
- Texinfo writer: Use header identifier for anchor if present (#4731). Previously we were overwriting an existing identifier with a new one.
- Org writer: Preserve line-numbering for example and code blocks (Brian Leung).
- Man/Ms writers: Don’t escape `-` as `\-`. The `\-` gets rendered in HTML and PDF as a unicode minus sign.
- Ms writer: Ensure we have a newline after .EN in display math (#5251).
- RST writer: Don’t wrap simple table header lines (#5128).
- Asciidoc writer: Shorter delimiters for tables, blockquotes (#4364). This matches asciidoctor reference docs.
- Dokuwiki writer: Remove automatic `:` prefix before internal image links (#5183, Damien Clochard). This prevented users from making relative image links.
- Zimwiki writer: remove automatic colon prefix before internal images (#5183, Damien Clochard).
- MediaWiki writer: fix caption, use ‘thumb’ instead of ‘frame’ (#5105). Captions used to have the word ‘caption’ prepended; this has been removed. Also, ‘thumb’ is used instead of ‘frame’ to allow images to be resized.
- reveal.js writer:
- Markdown writer:
  - Make `plain` RawBlocks pass through in `plain` output.
  - Include needed whitespace after HTML figure (#5121). We use HTML for a figure in markdown dialects that can’t represent it natively.
- Commonmark writer:
- EPUB writer:
  - Ensure that picture transforms are done on metadata too.
  - Small fixes to `nav.xhtml`: Add ‘landmarks’ id attribute to the landmarks nav. Replace old default CSS removing numbers from ol.toc li with new rules that match `nav#toc ol, nav#landmarks ol`. We keep the `toc` class on `ol` for backwards compatibility.
- LaTeX writer:
  - Make raw content marked `beamer` pass through in `beamer` output (pandoc/lua-filters#40).
  - Beamer: avoid duplicated `fragile` property in some cases (#5208).
  - Add `#` special characters for listings (#4939). This character needs special handling in `\lstinline`.
- RTF writer: use `toTableOfContents` from Shared to replace old duplicated code.
- Pptx writer:
  - Support custom properties. Also supports additional core properties: `subject`, `category`, `description` (#5252, Agustín Martín Barbero).
  - Use `toTableOfContents` from Shared to replace old duplicated code.
- ODT writer (Augustín Martín Barbero):
- Custom writers:
- reveal.js template: Add `zoomKey` config (#4249).
- HTML5 template: Remove unnecessary type=“text/css” on style and link for HTML5 (#5146).
- LaTeX template (Andrew Dunning, except where noted):
  - Prevent fontspec from scaling `mainfont` to match the default font, Latin Modern. A main font set to 12pt could previously appear anywhere between 11pt and 13pt depending on its design. To return to the earlier rendering, use `-V mainfontoptions="Scale=MatchLowercase"` (#5212, #5218).
  - Display monospaced fonts without TeX ligatures when using `--pdf-engine=lualatex`. It now matches the behaviour of other engines (#5212, #5218).
  - Remove the deprecated `romanfont` variable. The functionality of `mainfont` is identical (#5218).
  - Render `\subtitle` with the standard document classes. Previously, `subtitle` only appeared when using the KOMA-Script classes or Beamer (#5213, #5244).
  - Use Babel instead of Polyglossia for LuaLaTeX. This avoids several language selection problems, notably with retaining French spacing conventions when switching to a verbatim environment or another language; and in printing Greek text without hyphenation (#5193).
  - Use the `xurl` package if available, improving the appearance of URLs by allowing them to break at additional points (#5193).
  - Use `bookmark` if available to correct heading levels in PDF bookmarks: see the KOMA-Script 3.26 release notes (#5193).
  - Require the `xcolor` package to avoid a possible error when using additional packages alongside footnotes in tables (#5193, closes #4861).
  - Remove obsolete `fixltx2e` package, which has no functionality with TeX Live 2015 or later (#5193).
  - Allow multiple `fontfamilies.options` (#5193, closes #5194).
  - Restrict `institute` variable to Beamer (#5219).
  - Use `footnotehyper` package if available to make footnotes in tables compatible with `hyperref` (#5234).
  - Number parts and chapters in book classes only if the `numbersections` variable is set, for consistency with other output formats. To return to the previous behaviour, use `-V numbersections -V secnumdepth=0` (#5235).
  - Reindent file (#5193).
  - Use built-in parskip handling with KOMA-Script classes (#5143, Enno).
  - Set default listings language for lua, assembler (#5227, John MacFarlane). Otherwise we get an e...
pandoc 2.5
- Text.Pandoc.App: split into several unexported submodules (Albert Krewinkel): Text.Pandoc.App.FormatHeuristics, Text.Pandoc.App.Opt, Text.Pandoc.App.CommandLineOptions, Text.Pandoc.App.OutputSettings. This is motivated partly by the desire to reduce recompilations when something is modified, since App previously depended on virtually every other module.
- Text.Pandoc.Extensions
  - Semantically, `gfm_auto_identifiers` is now a modifier of `auto_identifiers`; for identifiers to be set, `auto_identifiers` must be turned on, and then the type of identifier produced depends on whether `gfm_auto_identifiers` and `ascii_identifiers` are set. Accordingly, `auto_identifiers` is now added to `githubMarkdownExtensions` (#5057).
  - Remove `ascii_identifiers` from `githubMarkdownExtensions`. GitHub doesn’t seem to strip non-ascii characters any more.
- Text.Pandoc.Lua.Module.Utils (Albert Krewinkel)
  - Test AST object equality via Haskell (#5092). Equality of Lua objects representing pandoc AST elements is tested by unmarshalling the objects and comparing the result in Haskell. A new function `equals` which performs this test has been added to the `pandoc.utils` module.
  - Improve stringify. Meta value strings (MetaString) and booleans (MetaBool) are now converted to the literal string and the lowercase boolean name, respectively. Previously, all values of these types were converted to the empty string.
- Text.Pandoc.Parsing: Remove Functor and Applicative constraints where Monad already exists (Alexander Krotov).
- Text.Pandoc.Pretty: Don’t render BreakingSpace at end of line or beginning of line (#5050).
- Text.Pandoc.Readers.Markdown
  - Fix parsing of citations, quotes, and underline emphasis after symbols. Starting with pandoc 2.4, citations, quoted inlines, and underline emphasis were no longer recognized after certain symbols, like parentheses (#5099, #5053).
  - In pandoc 2.4, a soft break after an abbreviation would be relocated before it to allow for insertion of a nonbreaking space after the abbreviation. This behavior is here reverted. A soft break after an abbreviation will remain, and no nonbreaking space will be added. Those who care about this issue should take care not to end lines with an abbreviation, or to insert nonbreaking spaces manually.
- Text.Pandoc.Readers.FB2: Do not throw error for unknown elements in `<body>` (Alexander Krotov). Some libraries include custom elements in their FB2 files.
- Text.Pandoc.Readers.HTML
- Text.Pandoc.Readers.LaTeX
  - Cleaned up handling of dimension arguments. Allow decimal points, preceding space.
  - Don’t allow arguments for verbatim, etc.
  - Allow space before bracketed options.
  - Allow optional arguments after `\\` in tables.
  - Improve parsing of `\tiny`, `\scriptsize`, etc. Parse as raw, but know that these font changing commands take no arguments.
- Text.Pandoc.Readers.Muse
  - Trim whitespace before parsing grid table cells (Alexander Krotov).
  - Add grid tables support (Alexander Krotov).
- Text.Pandoc.Shared
  - For bibliography match Div with id `refs`, not class `references`. This was a mismatch between pandoc’s docx, epub, latex, and markdown writers and the behavior of pandoc-citeproc, which actually looks for a div with id `refs` rather than one with class `references`.
  - Exactly match GitHub’s identifier generating algorithm (#5057).
  - Add parameter for `Extensions` to `uniqueIdent` and `inlineListToIdentifier` (#5057). [API change] This allows these functions to be sensitive to the settings of `Ext_gfm_auto_identifiers` and `Ext_ascii_identifiers`, and allows us to use `uniqueIdent` in the CommonMark reader, replacing custom code. It also means that `gfm_auto_identifiers` can now be used in all formats.
- Text.Pandoc.Writers.AsciiDoc
- Text.Pandoc.Writers.CommonMark
  - Respect `--ascii` (#5043, quasicomputational).
  - Make sure `--ascii` affects quotes, super/subscript.
- Text.Pandoc.Writers.Docx
  - Fix bookmarks to headers with long titles (#5091). Word has a 40 character limit for bookmark names. In addition, bookmarks must begin with a letter. Since pandoc’s auto-generated identifiers may not respect these constraints, some internal links did not work. With this change, pandoc uses a bookmark name based on the SHA1 hash of the identifier when the identifier isn’t a legal bookmark name.
  - Add bookmarks to code blocks (Nikolay Yakimov).
  - Add bookmarks to images (Nikolay Yakimov).
  - Refactor common bookmark creation code into a function (Nikolay Yakimov).
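The bookmark rule in the Docx writer entry (at most 40 characters, must begin with a letter, otherwise derive the name from a SHA1 hash) can be sketched in Python. The exact shape of the hashed name is an assumption made here for illustration: a letter `X` followed by 39 hex digits, so the result is itself a legal 40-character name.

```python
import hashlib

MAX_BOOKMARK_LEN = 40  # Word's limit on bookmark names, per the entry above


def is_legal_bookmark(ident: str) -> bool:
    """A legal name is non-empty, starts with a letter, and fits the limit."""
    return bool(ident) and ident[0].isalpha() and len(ident) <= MAX_BOOKMARK_LEN


def bookmark_name(ident: str) -> str:
    """Keep legal identifiers; otherwise build a name from the SHA1 hash."""
    if is_legal_bookmark(ident):
        return ident
    digest = hashlib.sha1(ident.encode("utf-8")).hexdigest()  # 40 hex digits
    # Assumed format: replace the first hex digit with a letter.
    return "X" + digest[1:]
```

Because the hash is deterministic, every link to the same identifier maps to the same bookmark name.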
- Text.Pandoc.Writers.EPUB: Handle calibre metadata (#5098). Nodes of the form `<meta name="calibre:series" content="Classics on War and Politics"/>` are now included from an epub XML metadata file. You can also include this information in your YAML metadata, like so:

      calibre:
        series: Classics on War and Politics

  In addition, ibooks-specific metadata can now be included via an XML file. (Previously, it could only be included via YAML metadata, see #2693.)
- Text.Pandoc.Writers.HTML: Use plain `"` instead of `&quot;` outside of attributes.
- Text.Pandoc.Writers.ICML: Consolidate adjacent strings, inc. spaces. This avoids splitting up the output unnecessarily into separate elements.
- Text.Pandoc.Writers.LaTeX: Don’t emit `[<+->]` unless beamer output, even if `writerIncremental` is True (#5072).
- Text.Pandoc.Writers.Muse (Alexander Krotov).
  - Output tables as grid tables if they have multi-line cells.
  - Indent simple tables only on the top level.
  - Output tables with one column as grid tables.
  - Add support for `--reference-location`.
  - Internal improvements.
- Text.Pandoc.Writers.OpenDocument: Fix list indentation (Nils Carlson, #5095). This was a regression in pandoc 2.4.
- Text.Pandoc.Writers.RTF: Fix warnings for skipped raw inlines.
- Text.Pandoc.Writers.Texinfo: Add blank line before `@menu` section (#5055).
- Text.Pandoc.XML: in `toHtml5Entities`, prefer shorter entities when there are several choices for a particular character.
- data/abbreviations
  - Add additional abbreviations (Andrew Dunning). Many of these are borrowed from the Chicago Manual of Style 10.42, ‘Scholarly abbreviations’.
- Templates
  - Asciidoc template: add :lang: to title header if lang is set in metadata (#5088).
- pandoc.cabal: Add cabal flag `derive_json_via_th` (Albert Krewinkel). Disabling the flag will cause derivation of ToJSON and FromJSON instances via GHC Generics instead of Template Haskell. The flag is enabled by default, as deriving via Generics can be slow (see #4083).
- trypandoc:
  - Tweaked drop-down lists.
  - Put link to site in footer.
  - Preselect output format.
  - Update on change of in or out format.
  - Add man input format.
- MANUAL.txt:
  - Fix outdated description of latex_macros extension.
  - Clarified placement of bibliography.
  - Added “A note on security.”
  - Fix note on curly brace syntax for locators.
  - Document new explicit syntax for citeproc locators.
  - Remove confusing cross-links for some extensions.
  - Don’t put pandoc in code ticks in heading.
  - Document that `--ascii` works for gfm and commonmark too.
  - Add `man` to `--from` options.
- doc/customizing-pandoc.md: various improvements (Mauro Bieg).
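The Text.Pandoc.XML entry in this release says `toHtml5Entities` prefers shorter entities when a character has several names. The Python sketch below shows one way to do that using the standard library's `html.entities.html5` table. It is an illustration rather than pandoc's implementation: the function names are hypothetical, and the alphabetical tie-break between equally short names is an assumption.

```python
from html.entities import html5


def shortest_entity_names() -> dict:
    """Map each single character to its shortest HTML5 entity name."""
    best = {}
    for name, value in html5.items():
        # Use only the ';'-terminated names that stand for one character.
        if not name.endswith(";") or len(value) != 1:
            continue
        bare = name[:-1]
        current = best.get(value)
        # Shorter wins; ties are broken alphabetically (an assumption).
        if current is None or (len(bare), bare) < (len(current), current):
            best[value] = bare
    return best


SHORTEST = shortest_entity_names()


def to_ascii_html5(text: str) -> str:
    """Replace non-ASCII characters with named or numeric references."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in SHORTEST:
            out.append(f"&{SHORTEST[ch]};")
        else:
            out.append(f"&#{ord(ch)};")  # no named entity available
    return "".join(out)
```

For a no-break space, which has both a short and a long entity name, the short one (`nbsp`) is chosen.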
pandoc 2.4
pandoc (2.4)
[new features]
- New input format `man` (Yan Pashkovsky, John MacFarlane).
[behavior changes]
- `--ascii` is now implemented in the writers, not in Text.Pandoc.App, via the new `writerPreferAscii` field in `WriterOptions`. Now the `write*` functions for Docbook, HTML, ICML, JATS, LaTeX, Ms, Markdown, and OPML are sensitive to `writerPreferAscii`. Previously the to-ascii translation was done in Text.Pandoc.App, and thus not available to those using the writer functions directly.
- `--ascii` now works with Markdown output. HTML5 character reference entities are used.
- `--ascii` now works with LaTeX output. 100% ASCII output can’t be guaranteed, but the writer will use commands like `\"{a}` and `\l` whenever possible, to avoid emitting a non-ASCII character.
- For HTML5 output, `--ascii` now uses HTML5 character reference entities rather than numerical entities.
- Improved detection of format based on extension (in Text.Pandoc.App). We now ensure that if someone tries to convert a file for a format that has a pandoc writer but not a reader, it won’t just default to markdown.
- Add viz. to abbreviations file (#5007, Nick Fleisher).
- AsciiDoc writer: always use single-line section headers, instead of the old underline style (#5038). Previously the single-line style would be used if `--atx-headers` was specified, but now it is always used.
- RST writer: Use simple tables when possible (#4750).
- CommonMark (and gfm) writer: Add plain text fallbacks. (#4528, quasicomputational). Previously, the writer would unconditionally emit HTML output for subscripts, superscripts, strikeouts (if the strikeout extension is disabled) and small caps, even with `raw_html` disabled. Now there are plain-text (and, where possible, fancy Unicode) fallbacks for all of these corresponding (mostly) to the Markdown fallbacks, and the HTML output is only used when `raw_html` is enabled.
- Powerpoint writer: support raw openxml (Jesse Rosenthal, #4976). This allows raw openxml blocks and inlines to be used in the pptx writer. Caveats: (1) It’s up to the user to write well-formed openxml. The chances of corruption, especially with such a brittle format as pptx, are high. (2) Because of the tricky way that blocks map onto shapes, if you are using a raw block, it should be the only block on a slide (otherwise other text might end up overlapping it). (3) The pptx ooxml namespace abbreviations are different from the docx ooxml namespaces. Again, it’s up to the user to get it right. Unzipped document and ooxml specification should be consulted.
- With `--katex` in HTML formats, do not use the autorenderer (#4946). We no longer surround formulas with `\(..\)` or `\[..\]`. Instead, we tell katex to convert the contents of span elements with class “math”. Since math has already been identified, this avoids wasted time parsing for LaTeX delimiters. Note, however, that this may yield unexpected results if you have span elements with class “math” that don’t contain LaTeX math. Also, use latest version of KaTeX by default (0.9.0).
- The man writer now produces ASCII-only output, using groff escapes, for portability.
- ODT writer:
  - Add title, author and date to metadata; any remaining metadata fields are added as `meta:user-defined` tags.
  - Implement table caption numbering (#4949, Nils Carlson). Captioned tables are numbered and labeled with format “Table 1: caption”, where “Table” is replaced by a translation, depending on the value of `lang` in metadata. Uncaptioned tables are not enumerated.
  - OpenDocument writer: Implement figure numbering in captions (#4944, Nils Carlson). Figure captions are now numbered 1, 2, 3, … The format in the caption is “Figure 1: caption” and so on (where “Figure” is replaced by a translation, depending on the value of `lang` in the metadata). Captioned figures are numbered consecutively and uncaptioned figures are not enumerated. This is necessary in order for LibreOffice to generate an Illustration Index (Table of Figures) for included figures.
- RST reader: Pass through fields in unknown directives as div attributes (#4715). Support `class` and `name` attributes for all directives.
- Org reader: Add partial support for `#+EXCLUDE_TAGS` option. (#4284, Brian Leung). Headers with the corresponding tags should not appear in the output.
- Log warnings about missing title attributes now include a suggestion about how to fix the problem (#4909).
- Lua filter changes (Albert Krewinkel):
  - Report traceback when an error occurs. A proper Lua traceback is added if either loading of a file or execution of a filter function fails. This should be of help to authors of Lua filters who need to debug their code.
  - Allow access to pandoc state (#5015). Lua filters and custom writers now have read-only access to most fields of pandoc’s internal state via the global variable `PANDOC_STATE`.
  - Push ListAttributes via constructor (Albert Krewinkel). This ensures that ListAttributes, as present in OrderedList elements, have additional accessors (viz. `start`, `style`, and `delimiter`).
  - Rename ReaderOptions fields, use snake_case. Snake case is used in most variable names; using camelCase for these fields was an oversight. A metatable is added to ensure that the old field names remain functional.
  - Iterate over AST element fields when using `pairs`. This makes it possible to iterate over all field names of an AST element by using a generic `for` loop with `pairs`:

        for field_name, field_content in pairs(element) do
          ...
        end

    Raw table fields of AST elements should be considered an implementation detail and might change in the future. Accessing element properties should always happen through the fields listed in the Lua filter docs.

    Note that the iterator currently excludes the `t`/`tag` field.
  - Ensure that MetaList elements behave like Lists. Methods usable on Lists can also be used on MetaList objects.
  - Fix MetaList constructor (Albert Krewinkel). Passing a MetaList object to the constructor `pandoc.MetaList` now returns the passed list as a MetaList. This is consistent with the constructor behavior when passed an (untagged) list.
- Custom writers: Custom writers have access to the global variable `PANDOC_DOCUMENT` (Albert Krewinkel, #4957). The variable contains a userdata wrapper around the full pandoc AST and exposes two fields, `meta` and `blocks`. The field content is only marshaled on demand, so the performance of scripts not accessing the fields remains unaffected.
[API changes]
- Text.Pandoc.Options: add `writerPreferAscii` to `WriterOptions`.
- Text.Pandoc.Shared:
  - Export `splitSentences`. This was previously duplicated in the Man and Ms writers.
  - Add `ToString` typeclass (Alexander Krotov).
- New exported module Text.Pandoc.Filter (Albert Krewinkel).
- Text.Pandoc.Parsing
  - Generalize `gridTableWith` to any `Char` Stream (Alexander Krotov).
  - Generalize `readWithM` from `[Char]` to any `Char` Stream that is a `ToString` instance (Alexander Krotov).
- New exposed module Text.Pandoc.Filter (Albert Krewinkel).
- Text.Pandoc.XML: add `toHtml5Entities`.
- New exported module Text.Pandoc.Readers.Man (Yan Pashkovsky, John MacFarlane).
- Text.Pandoc.Writers.Shared
  - Add exported functions `toSuperscript` and `toSubscript` (quasicomputational, #4528).
  - Remove exported functions `metaValueToInlines`, `metaValueToString`. Add new exported functions `lookupMetaBool`, `lookupMetaBlocks`, `lookupMetaInlines`, `lookupMetaString`. Use these whenever possible for uniformity in writers (Mauro Bieg, #4907). (Note that removed function `metaValueToInlines` was in previous released versions.)
  - Add `metaValueToString`.
- Text.Pandoc.Lua
  - Expose more useful internals (Albert Krewinkel):
    - `runFilterFile` to run a Lua filter from file;
    - data type `Global` and its constructors; and
    - `setGlobals` to add globals to a Lua environment.

    This module also contains `Pushable` and `Peekable` instances required to get pandoc’s data types to and from Lua. Low-level Lua operations remain hidden in Text.Pandoc.Lua.
  - Rename `runPandocLua` to `runLua` (Albert Krewinkel).
  - Remove `runLuaFilter`, merging this into Text.Pandoc.Filter.Lua’s `apply` (Albert Krewinkel).
[bug fixes and under-the-hood improvements]
- Text.Pandoc.Parsing
  - Make `uri` accept any stream with Char tokens (Alexander Krotov).
  - Rewrite `uri` without `withRaw` (Alexander Krotov).
  - Generalize `parseFromString` and `parseFromString'` to any streams with Char token (Alexander Krotov)
  - Rewrite `nonspaceChar` using `noneOf` (Alexander Krotov)
- Text.Pandoc.Shared: Reimplement `mapLeft` using `Bifunctor.first` (Alexander Krotov).
- Text.Pandoc.Pretty: Simplify `Text.Pandoc.Pretty.offset` (Alexander Krotov).
- Text.Pandoc.App
  - Work around HXT limitation for `--syntax-definition` with windows drive (#4836).
  - Always preserve tabs for man format. We need it for tables.
  - Split command line parsing code into a separate unexported module, Text.Pandoc.App.CommandLineOptions (Albert Krewinkel).
- Text.Pandoc.Readers.Roff: new unexported module for tokenizing roff documents.
- New unexported module Text.Pandoc.RoffChar, providing character escape tables for roff formats.
- Text.Pandoc.Readers.HTML: Fix `htmlTag` and `isInlineTag` to accept processing instructions (#3123, regression since 2.0).
- Text.Pandoc.Readers.JATS: Use `foldl'` instead of `maximum` to ac...
pandoc 2.3.1
- RST reader:
- Markdown reader: distinguish autolinks in the AST. With this change, autolinks are parsed as Links with the `uri` class. (The same is true for bare links, if the `autolink_bare_uris` extension is enabled.) Email autolinks are parsed as Links with the `email` class. This allows the distinction to be represented in the AST.
- Org reader:
  - Force inline code blocks to honor export options (Brian Leung).
  - Parse empty argument array in inline src blocks (Brian Leung).
- Muse reader (Alexander Krotov):
  - Added additional tests.
  - Do not allow code markup to be followed by digit.
  - Remove heading level limit.
  - Simplify `<literal>` tag parsers
  - Parse Text instead of String. Benchmark shows 7% improvement.
  - Get rid of HTML parser dependency.
  - Various code improvements.
- ConTeXt writer: change `\` to `/` in Windows image paths (#4918). We do this in the LaTeX writer, and it avoids problems. Note that `/` works as a LaTeX path separator on Windows.
- LaTeX writer:
  - Add support for multiprenote and multipostnote arguments with `--biblatex` (Brian Leung, #4930). The multiprenotes occur before the first prefix of a multicite, and the multipostnotes follow the last suffix.
  - Fix a use of `last` that might take empty list. If you ran with `--biblatex` and have an empty document (metadata but no blocks), pandoc would previously raise an error because of the use of `last` on an empty list.
- RTF writer: Fix build failure with ghc-8.6.1 caused by missing MonadFail instance (Jonas Scholl).
- ODT writer: Improve table header row style handling (Nils Carlson). This changes the way styles for cells in the header row and normal rows are handled in ODT tables. Previously a new (but identical) style was generated for every table, specifying the style of the cells within the table. After this change there are two style definitions for table cells: one for the cells in the header row, and one for all other cells. This doesn’t change the actual styles, but it makes post-processing changes to the table styles much simpler: it is no longer necessary to introduce new styles for header rows, and there are now only two styles where there was previously one per table.
- HTML writer:
  - Don’t add `uri` class to presumed autolinks. Formerly the `uri` class was added to autolinks by the HTML writer, but it had to guess what was an autolink and could not distinguish `[http://example.com](http://example.com)` from `<http://example.com>`. It also incorrectly recognized `[pandoc](pandoc)` as an autolink. Now the HTML writer simply passes through the `uri` attribute if it is present, but does not add anything.
  - Avoid adding extra section nestings for revealjs. Previously revealjs title slides at level (slidelevel - 1) were nested under an extra section element, even when the section contained no additional (vertical) content. That caused problems for some transition effects.
  - Omit unknown attributes in EPUB2 output. For example, `epub:type` attributes should not be passed through, or the epub produced will not validate.
- JATS writer: remove ‘role’ attribute on ‘bold’ and ‘sc’ elements (#4937). The JATS spec does not allow these.
- Textile writer: don’t represent `uri` class explicitly for autolinks (#4913).
- Lua filters (Albert Krewinkel):
  - Clean up filter execution code.
  - Better error on test failure.
- HTML, Muse reader tests: reduce time taken by round-trip test.
- Added cabal.project.
- MANUAL: `epub:type` is only useful for epub3 (Maura Bieg).
- Use hslua v1.0.0 (Albert Krewinkel).
- Fix `translations/ru` to use modern Russian orthography (Ivan Trubach).
- Build Windows binary using ghc 8.6.1 and cabal new-build. This fixes issues with segfaults in the 32-bit Windows binaries (#4283).
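
To illustrate what the autolink change in the Markdown reader entry above makes possible, here is a minimal Python sketch that walks a pandoc JSON AST and classifies links by the `uri` and `email` classes. The sample document is hand-written and trimmed to resemble what `pandoc -t json` might emit for `<http://example.com>`, `<me@example.com>`, and `[pandoc](pandoc)`; it is an assumption, not captured output. The helper names `iter_links` and `link_kind` are hypothetical and not part of pandoc.

```python
# Hand-written, trimmed stand-in for a pandoc JSON document (assumption:
# a Link node is {"t": "Link", "c": [[id, classes, key-values], inlines, [url, title]]}).
SAMPLE = {
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Link", "c": [["", ["uri"], []],
                                [{"t": "Str", "c": "http://example.com"}],
                                ["http://example.com", ""]]},
            {"t": "Space"},
            {"t": "Link", "c": [["", ["email"], []],
                                [{"t": "Str", "c": "me@example.com"}],
                                ["mailto:me@example.com", ""]]},
            {"t": "Space"},
            {"t": "Link", "c": [["", [], []],
                                [{"t": "Str", "c": "pandoc"}],
                                ["pandoc", ""]]},
        ]}
    ],
}


def iter_links(node):
    """Yield every Link node found anywhere under `node`, in document order."""
    if isinstance(node, dict):
        if node.get("t") == "Link":
            yield node
        for value in node.values():
            yield from iter_links(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_links(item)


def link_kind(link):
    """Classify a Link node using only the classes stored in its attributes."""
    attr, _inlines, _target = link["c"]
    classes = attr[1]
    if "email" in classes:
        return "email autolink"
    if "uri" in classes:
        return "uri autolink"
    return "ordinary link"


if __name__ == "__main__":
    for link in iter_links(SAMPLE):
        url = link["c"][2][0]
        print(f"{url}: {link_kind(link)}")
```

Because the class is carried in the AST, a consumer like this does not need to compare link text against the target, which is the kind of guess the HTML writer entry describes as unreliable.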