Hacker News | fiddlosopher's comments

Pandoc does know how to expand LaTeX macros. For example, given the LaTeX

  \newcommand{\pair}[2]{\langle #1, #2\rangle}
  $$\pair{a^2}{\frac{\pi}{2}}$$
pandoc will give you the Typst

  $ chevron.l a^2 \, pi / 2 chevron.r $
which is correct. Tylax, on the other hand, seems to have problems with this example, producing

  $ angle.l^()frac(pi,)angle.r  $
which does not compile with typst. Going the other direction, pandoc also understands typst scripting. For example, from

  #let count = 8
  #let nums = range(1, count + 1)
  #let fib(n) = (
    if n <= 2 { 1 }
    else { fib(n - 1) + fib(n - 2) }
  )

  The first #count numbers of the sequence are:

  #align(center, table(
    columns: count,
    ..nums.map(n => $F_#n$),
    ..nums.map(n => str(fib(n))),
  ))
pandoc produces this LaTeX:

  The first 8 numbers of the sequence are:

  {\def\LTcaptype{none} % do not increment counter
  \begin{longtable}[]{@{}llllllll@{}}
  \toprule\noalign{}
  \endhead
  \bottomrule\noalign{}
  \endlastfoot
  \(F_{1}\) & \(F_{2}\) & \(F_{3}\) & \(F_{4}\) & \(F_{5}\) & \(F_{6}\) &
  \(F_{7}\) & \(F_{8}\) \\
  1 & 1 & 2 & 3 & 5 & 8 & 13 & 21 \\
  \end{longtable}
  }
With the same input, Tylax produces:

  The first 8 numbers of the sequence are:

  \begin{center}

  \begin{tabular}{|c|}
  \hline
  \hline
  \end{tabular}\end{center}
which is just an empty table.


You are absolutely right. Thank you for pointing this out! Regarding your questions:

1. The `\pair` issue: This is definitely a bug. My macro expander is based on text replacement and obviously cannot handle nested parameters. I will fix the recursive logic.

2. The `fib` loop: Pandoc seems to use `typst-hs`, which contains a complete Typst evaluator. Tylax is strictly designed as a static AST transformer. We haven't implemented the Typst virtual machine, so loops or recursive functions cannot be executed.

This will be gradually improved later to make it more usable; my claim of "better macro support" was clearly premature. This was a big mistake on my part, and we will strive to achieve this goal in future updates! Thank you very much for your feedback and for pointing out the bug!
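To illustrate why plain text replacement breaks on nested parameters, here is a minimal Python sketch of an expander that reads each argument as a balanced brace group and expands it recursively before substitution. It is not Tylax's or pandoc's implementation; the names `read_group` and `expand` and the `macros` table layout are assumptions made up for this example, and it ignores escaped braces, optional arguments, and macros used inside macro bodies.

```python
import re

MACRO_NAME = re.compile(r"\\([A-Za-z]+)")

def read_group(s, i):
    # Return (content, next_index) for the balanced brace group opening at s[i].
    if i >= len(s) or s[i] != "{":
        raise ValueError("expected '{'")
    depth = 0
    for j in range(i, len(s)):
        if s[j] == "{":
            depth += 1
        elif s[j] == "}":
            depth -= 1
            if depth == 0:
                return s[i + 1:j], j + 1
    raise ValueError("unbalanced braces")

def expand(src, macros):
    # macros maps a name to (arity, body); body uses #1, #2, ... placeholders.
    out = []
    i = 0
    while i < len(src):
        m = MACRO_NAME.match(src, i)
        if m and m.group(1) in macros:
            arity, body = macros[m.group(1)]
            i = m.end()
            args = []
            for _ in range(arity):
                arg, i = read_group(src, i)
                args.append(expand(arg, macros))  # expand nested macro calls first
            # Substitute all placeholders in one pass so argument text is never rescanned.
            out.append(re.sub(r"#(\d)", lambda p: args[int(p.group(1)) - 1], body))
        else:
            out.append(src[i])
            i += 1
    return "".join(out)

macros = {"pair": (2, r"\langle #1, #2\rangle")}
result = expand(r"\pair{a^2}{\frac{\pi}{2}}", macros)
nested = expand(r"\pair{\pair{a}{b}}{c}", macros)
print(result)  # \langle a^2, \frac{\pi}{2}\rangle
```

The second argument `{\frac{\pi}{2}}` contains its own braces, so a scan that stops at the first `}` would cut it short; counting brace depth keeps it whole.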


Pandoc developer here. Two comments: (1) to convert equations pandoc uses the texmath library, which I also wrote (https://github.com/jgm/texmath). Compiling texmath with `-fexecutable` will give you a standalone executable that just converts the equation (and doesn't add the `<p>` element or anything extraneous). Compiling it with `-fserver` will give you a webserver that converts equations. (2) Regarding the bug, note that it's not an empty `<mo></mo>`. There's an invisible U+2061 character ("function application") inside. We don't want to take that out, but it looks like putting the `<mi>` and the `<mo>` together in an `<mrow>` will also solve the problem. I'll fix this.


An even older one is my gitit, started in 2008!

https://hackage.haskell.org/package/gitit-0.15.1.2

It doesn't limit itself to markdown, nor to git (you can use darcs, hg, or even sqlite). A bit long in the tooth, though -- I stopped working on it once spam started to make self-hosted public wikis untenable.


This was my first computer, too. I have fond memories of programming it in 6502 machine language and saving programs on cassette tape. I still have it, along with the power supply my grandfather built from the specs they provided in the user manual. And it still works!


If you don't want pandoc's fenced divs, just choose a markdown dialect that doesn't support them: e.g.,

    pandoc -f html -t gfm https://www.fsf.org
Or, to get rid of the divs and spans altogether, disable raw HTML as well:

    pandoc -f html -t gfm-raw_html https://www.fsf.org


Those even have raw divs, then.

    ...
    </div>
    <div id="sitemap-2" class="yui3-u-1-2 first">
    ...


As I explained, if you don't want raw HTML, use the second form:

  pandoc -f html -t gfm-raw_html https://www.fsf.org | grep div
gives no output. (If you get different results, it is possible that you are using an earlier pandoc version and something changed in this regard.)


In fact, both claims are misleading. There are two implementations, and they implement distinct flavors with syntactic differences: https://consolelog.gitee.io/docs-asciidoctor/asciidoc-asciid... Markdown has a smaller flavors/implementations ratio!


In fact, this claim is also wrong, because there are three :D

1. https://asciidoctor.org/

2. https://github.com/asciidoc-py/asciidoc-py

3. https://asciidoc3.org/

1 and 2 seem to hate 3 (see issue trackers / web sites of all three) and meanwhile probably also vice versa. The discussion was quickly dragged into the legal realm by 1 in particular, which very obviously dampened number 3's initial enthusiasm. Additionally, 2 describes itself somewhat prominently as a "legacy processor" for Python (technically correct in the current version, but legacy's meaning here is the relationship to the new Asciidoctor-specific constructs). At the same time it promises further development but nothing usable has come out of it so far.

As a Python programmer, I would simply like to see a pure Python3 toolchain. The situation is quite absurd at the moment. For example, my blog (Pelican-based) supports Markdown and ReST natively. For Asciidoc, my preferred language, Pelican has a plugin that supports different Asciidoc processors, but only at first glance. It also has to support KaTeX. This, on the other hand, is no problem for Pelican's natively supported languages (simply another plugin), but the Asciidoc plugin is too high-level. It can only use Asciidoctor in this case, requiring Ruby's KaTeX gem as an extra dependency. This gem seems to be abandoned and has compatibility issues with newer Asciidoctor versions ...

2 is not an option, for its installation hell alone (Asciidoc3 is pure Python, simply a pip install). I don't know if it is able to interact with KaTeX.

From what I can see, 3 would technically offer the best initial platform for further development as a package. Could be wrong, of course.


You just convinced me never to install any of these.


Oh, but there is!

    pandoc --shift-heading-level-by=-1 input.md -o output.docx
This will promote level-2 headings to level-1, and promote a level-1 heading at the top of the document to the document's title.
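As a rough model of that behavior (a sketch of the rule as described here and in the pandoc manual, not pandoc's actual code), the following Python function works on a simplified document made of tuples. The tuple layout and the name `shift_headings` are assumptions invented for this illustration.

```python
def shift_headings(blocks, shift):
    # blocks: list of ("heading", level, text) or ("para", text) tuples.
    # Returns (title, new_blocks); title is None if no heading was promoted.
    title = None
    out = []
    for index, block in enumerate(blocks):
        if block[0] != "heading":
            out.append(block)
            continue
        _, level, text = block
        new_level = level + shift
        if new_level >= 1:
            out.append(("heading", new_level, text))
        elif index == 0 and new_level == 0:
            # A heading at the very top that would land on level 0 becomes the title.
            title = text
        else:
            # Headings cannot go below level 1, so they turn into paragraphs.
            out.append(("para", text))
    return title, out

doc = [
    ("heading", 1, "My Report"),
    ("heading", 2, "Introduction"),
    ("para", "Some text."),
    ("heading", 3, "Details"),
]
title, body = shift_headings(doc, -1)
print(title)
print(body)
```

With a shift of -1, "My Report" becomes the title, "Introduction" moves to level 1, and "Details" moves to level 2.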


Oh really? I've tried --shift-heading in the past and it worked to move headings up a level, but not to the title. I'll have to read the docs more carefully and give it another go. Thank you.


The motivation for this choice is not "convenient parsing" but deeper considerations of language design. As explained at https://github.com/jgm/djot#rationale , this choice follows from two desiderata: (1) "The syntax should compose uniformly, in the following sense: if a sequence of lines has a certain meaning outside a list item or block quote, it should have the same meaning inside it." (2) "The syntax should be friendly to hard-wrapping: hard-wrapping a paragraph should not lead to different interpretations, e.g. when a number followed by a period ends up at the beginning of a line." The document explains the compromise we made in commonmark to avoid the need for blank lines. Djot tries to be more principled.
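The hard-wrapping problem in (2) can be shown with a toy Python check. This is only a sketch, not the CommonMark or Djot parser: the function `starts_list` and its `may_interrupt_paragraph` flag are hypothetical, and the flag stands for whether a list marker is allowed to begin a list without a blank line before it.

```python
import re

ORDERED_MARKER = re.compile(r"\d+[.)]\s")

def starts_list(lines, i, may_interrupt_paragraph):
    # True if lines[i] would open an ordered list under the given policy.
    if not ORDERED_MARKER.match(lines[i]):
        return False
    previous_is_blank = i == 0 or lines[i - 1].strip() == ""
    return may_interrupt_paragraph or previous_is_blank

# One paragraph, hard-wrapped so that "1." lands at the start of a line.
wrapped = [
    "The answer we were looking for turned out to be",
    "1. That was a surprise.",
]
# An intentional list, separated from the paragraph by a blank line.
explicit = ["Shopping:", "", "1. Milk"]

print(starts_list(wrapped, 1, may_interrupt_paragraph=True))
print(starts_list(wrapped, 1, may_interrupt_paragraph=False))
print(starts_list(explicit, 2, may_interrupt_paragraph=False))
```

When lists may interrupt paragraphs, the wrapped line is misread as a list item. When a blank line is required, the paragraph keeps its meaning and the intentional list is still recognized.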


> Markup language which completely falls over this is Markdown. There’s no way to express generic tree structure, conversion to HTML with specific browser tags is hard-coded.

This isn't really a fair criticism. True, the original Markdown.pl did not produce a generic tree structure, but that's a fact about the program, not the syntax it parses. Many Markdown and Commonmark implementations do support creation of an abstract syntax tree. Pandoc has done this for the last 17 years. It also provides nestable, generic containers as a syntax extension.

> It feels like there’s a smaller, simpler language somewhere

Here's my attempt: <https://djot.net>.


Wait a minute, I recognize that GitHub handle! Are you John MacFarlane? What a mind-blowing day to come across the creator of pandoc. You've saved many a student from pain. Thanks for everything.


>It also provides nestable, generic containers as a syntax extension.

That’s what I meant: Markdown syntax doesn’t allow for this, which requires every extension to be a syntax extension. Djot’s span and div do address this problem.


See also markdoc.dev for instances where some commonality with Markdown is desired or required.

Markdoc -> AST -> renderable tree -> HTML or React

> we are seriously considering the possibility of drafting a specification for the JSON representation of Markdoc's Abstract Syntax Tree (AST) in order to facilitate interoperability between Markdoc tools


Note also that TeX math can contain \text{..} which can itself contain $-delimited TeX math, e.g. $x = \text{my $y$}$. This currently breaks the GitHub implementation.
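A scanner that simply looks for the next `$` will stop too early on such input. The Python sketch below shows one way to find the real closing delimiter by stepping over `\text{...}` groups and any math nested inside them. It is not GitHub's or pandoc's implementation; `find_math_end` and `skip_text_group` are hypothetical names, and only the literal `\text{` command is handled.

```python
def find_math_end(s, start):
    # Return the index of the `$` closing a math span whose content begins at start.
    i = start
    while i < len(s):
        if s.startswith("\\text{", i):
            i = skip_text_group(s, i + len("\\text{"))
        elif s[i] == "\\":
            i += 2  # skip a backslash together with the character it escapes
        elif s[i] == "$":
            return i
        else:
            i += 1
    raise ValueError("unterminated math")

def skip_text_group(s, start):
    # Return the index just past the `}` closing a text group whose content begins at start.
    depth = 1
    i = start
    while i < len(s):
        c = s[i]
        if c == "\\":
            i += 2
        elif c == "$":
            i = find_math_end(s, i + 1) + 1  # nested math inside the text group
        elif c == "{":
            depth += 1
            i += 1
        elif c == "}":
            depth -= 1
            i += 1
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated text group")

source = r"$x = \text{my $y$}$ and more"
naive_end = source.index("$", 1)
end = find_math_end(source, 1)
print(source[1:naive_end])
print(source[1:end])
```

The naive search returns the `$` before `y`, which truncates the formula inside the `\text` group, while the nesting-aware scan returns the final delimiter.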

