Markdown's image syntax is:

![alt](src "title")

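When exporting Markdown to docx, pandoc 3.x uses the image's alt text as the caption by default.

See the issue.

However, current note-taking apps such as SiYuan (思源笔记) and Yuque (语雀) use the title from the Markdown image syntax as the image caption.

So I need to change pandoc's default behavior so that image captions in the docx match the preview shown in the note-taking app.

I improved on the code provided in this issue: Lua filter on image captions does not work. · Issue #8974 · jgm/pandoc

The improved code: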

-- image-title-to-caption.lua
-- A Pandoc Lua filter to set image title as caption
-- Format: ![alt](src "title")

-- Function to create a new caption from text
function create_caption(text)
  -- For simple captions, we can just use a Str element
  if text:find("^%s*$") then
    -- Empty caption
    return {}
  else
    -- Non-empty caption
    return {pandoc.Str(text)}
  end
end

-- Function to process the image and set title as caption
function process_image(img)
  -- Check if title exists and is not empty
  if img.title and img.title ~= "" then
    -- Set the title as the image caption
    img.caption = create_caption(img.title)

    -- Clear the title to avoid duplication
    img.title = ""

    -- Return the modified image
    return img
  end

  -- If no title, return the image unchanged
  return img
end

-- Handler for standalone images
function Image(img)
  return process_image(img)
end

-- Handler for Figure blocks (which may contain images)
function Figure(fig)
  -- Check if the figure has an image
  for _, block in ipairs(fig.content) do
    if block.t == "Plain" then
      for j, inline in ipairs(block.content) do
        if inline.t == "Image" then
          -- Process the image
          local processed_image = process_image(inline)
          -- Update the image in the figure
          block.content[j] = processed_image

          -- Also update the figure caption
          if processed_image.caption and #processed_image.caption > 0 then
            -- Extract the caption text
            local caption_text = pandoc.utils.stringify(processed_image.caption)
            -- Update the figure caption
            fig.caption = create_caption(caption_text)
          end
        end
      end
    end
  end

  return fig
end

-- Return the filter: the Image pass runs first, then the Figure pass
return {{Image = Image}, {Figure = Figure}}
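
To make the effect of the title-to-caption rule concrete, here is a small sketch that applies it to a plain Lua table standing in for an image element. `fake_str`, `title_to_caption`, and `sample` are illustrative names only, not pandoc objects, so the snippet runs in a plain Lua interpreter; in the real filter, `pandoc.Str` and the `Image` element take their place.

```lua
-- Stand-in for pandoc.Str (assumption: used only so this runs without pandoc).
function fake_str(text)
  return { t = "Str", text = text }
end

-- Same rule as process_image above: a non-empty title replaces the caption
-- and is then cleared so it is not emitted twice.
function title_to_caption(img)
  if img.title and img.title ~= "" then
    img.caption = { fake_str(img.title) }
    img.title = ""
  end
  return img
end

-- Hypothetical element mimicking ![alt text](preview.png "title text").
sample = {
  src = "preview.png",
  title = "title text",
  caption = { fake_str("alt"), fake_str("text") },
}

title_to_caption(sample)
print(sample.caption[1].text)  -- the caption now carries the former title
print(sample.title == "")      -- the title has been cleared
```

The filter itself is applied by passing the file to pandoc with the `--lua-filter` option, for example `--lua-filter=image-title-to-caption.lua`.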

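I also contributed this code to the citation plugin for SiYuan: 导出图片标题使用title而不是alt文本 ("Use the title instead of the alt text for exported image captions") by Achuan-2 · Pull Request #104 · WingDr/siyuan-plugin-citation

Notes

  • Lua filter on image captions does not work. · Issue #8974 · jgm/pandoc provides code that reads the alt text in `![Caption text|width](image.png)` to set the image width and rewrite the caption: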

    -- image-width.lua
    -- A Pandoc Lua filter to adjust image widths based on caption format
    -- Format: ![Caption text|width](image.png)

    -- Function to create a new caption from text
    function create_caption(text)
      -- For simple captions, we can just use a Str element
      if text:find("^%s*$") then
        -- Empty caption
        return {}
      else
        -- Non-empty caption
        return pandoc.Inlines(pandoc.Str(text))
      end
    end

    -- Function to process the image and extract width from caption
    function process_image(img)
      -- Convert the caption to a single string
      local caption_text = pandoc.utils.stringify(img.caption)

      -- Look for the first pipe character in the caption
      local pipe_pos = caption_text:find("|")

      if pipe_pos then
        -- Extract the parts before and after the pipe
        local new_caption = caption_text:sub(1, pipe_pos - 1)
        local width_part = caption_text:sub(pipe_pos + 1)

        -- Extract the width number from the width part
        local width = width_part:match("(%d+)")

        if width then
          -- Update the image caption without the width part
          img.caption = create_caption(new_caption)

          -- Set the width attribute for the image (in pixels)
          img.attributes.width = width .. "px"

          -- Return the modified image
          return img
        end
      end

      -- If no width specification was found, return the image unchanged
      return img
    end

    -- Handler for standalone images
    function Image(img)
      return process_image(img)
    end

    -- Handler for Figure blocks (which may contain images)
    function Figure(fig)
      -- Check if the figure has an image
      for _, block in ipairs(fig.content) do
        if block.t == "Plain" then
          for j, inline in ipairs(block.content) do
            if inline.t == "Image" then
              -- Process the image
              local processed_image = process_image(inline)
              -- Update the image in the figure
              block.content[j] = processed_image

              -- Also update the figure caption if needed
              if processed_image.caption then
                -- Extract the caption text
                local caption_text = pandoc.utils.stringify(processed_image.caption)
                -- Update the figure caption
                fig.caption = create_caption(caption_text)
              end
            end
          end
        end
      end

      return fig
    end

    -- Return the filter
    return {{Image = Image}, {Figure = Figure}}
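  • pandoc 3 introduced the Figure element, so changing `img.caption` alone, as in the following Image-only handler, is not enough to change the image caption: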

    function Image(img)
      -- Print debug information (optional, for debugging)
      -- print("Image found:", img.src, "Title:", img.title, "Caption:", pandoc.utils.stringify(img.caption))

      -- Check that the title attribute exists and is not empty
      if img.title and img.title ~= "" then
        -- Create the new caption, wrapped in pandoc.Inlines
        local new_caption = pandoc.Inlines({pandoc.Str(img.title)})

        -- Replace the existing caption
        img.caption = new_caption

        -- Clear the title to avoid showing it twice
        img.title = ""
      end

      return img
    end
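  • I found that pandoc sometimes generates no image caption because the image is not separated from the surrounding text by a blank line before and after it (pandoc only turns an image into a figure with a caption when the image is the sole content of its paragraph); for example, img1 below gets no caption: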

    Test
    ![img1](preview.png "title1")

    Test
    ![img2](preview.png "title2")
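
One possible workaround for the missing-caption case is to normalize the Markdown before handing it to pandoc, so that every image-only line is surrounded by blank lines. The sketch below is plain Lua string processing, not part of pandoc; `is_image_line` and `pad_image_lines` are hypothetical helper names, and the pattern is deliberately rough: it does not recognize image targets that contain parentheses, it does not skip code blocks, and it collapses runs of blank lines into one.

```lua
-- Hypothetical helpers, not part of pandoc: plain Lua string processing.

-- True if the line holds nothing but one image: ![alt](target "title").
-- Rough pattern: image targets containing ")" are not recognized.
function is_image_line(line)
  return line:match("^%s*!%[[^%]]*%]%([^%)]*%)%s*$") ~= nil
end

-- Surround every image-only line with blank lines.
-- Assumption: the text has no code blocks that contain image-like lines.
function pad_image_lines(markdown)
  local out = {}
  for line in (markdown .. "\n"):gmatch("(.-)\n") do
    if is_image_line(line) then
      -- Add a blank line before the image unless one is already there.
      if #out > 0 and out[#out] ~= "" then
        out[#out + 1] = ""
      end
      out[#out + 1] = line
      -- Always add a blank line after the image.
      out[#out + 1] = ""
    elseif line == "" and out[#out] == "" then
      -- Skip: the previous line is already blank, so do not stack another.
    else
      out[#out + 1] = line
    end
  end
  return table.concat(out, "\n")
end

print(pad_image_lines('Test\n![img1](preview.png "title1")\nMore text'))
```

After this pass the image sits alone in its own paragraph, which is the situation in which pandoc produces a figure with a caption. Text that mixes an image with other words on the same line is left untouched.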