pdf2docx.common.docx module
docx operation methods based on python-docx.
- pdf2docx.common.docx.add_float_image(p, image_path_or_stream, width, pos_x=None, pos_y=None)
Add a floating image behind the text.
- Args:
p (Paragraph): python-docx Paragraph object this picture belongs to.
image_path_or_stream (str, bytes): Image path or stream.
width (float): Display width of the picture, in Pt.
pos_x (float): X-position, in English Metric Units (EMU), relative to the top-left point of the page's valid region.
pos_y (float): Y-position, in English Metric Units (EMU), relative to the top-left point of the page's valid region.
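Since width is given in Pt while pos_x and pos_y are given in EMU, a caller usually needs to convert between the two. The following is a minimal sketch of that conversion, based on the standard definitions of 914400 EMU per inch and 72 pt per inch. The helper names pt_to_emu and emu_to_pt are hypothetical and are not part of pdf2docx.

```python
# Hypothetical helpers, not part of pdf2docx: convert between points and
# English Metric Units (EMU).
EMU_PER_INCH = 914400
PT_PER_INCH = 72
EMU_PER_PT = EMU_PER_INCH // PT_PER_INCH  # 12700 EMU in one point


def pt_to_emu(value_pt: float) -> int:
    """Convert a length in points to whole EMU."""
    return int(round(value_pt * EMU_PER_PT))


def emu_to_pt(value_emu: int) -> float:
    """Convert a length in EMU back to points."""
    return value_emu / EMU_PER_PT


if __name__ == "__main__":
    # A position 36 pt (half an inch) from the left edge of the valid region.
    print(pt_to_emu(36))
```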
- pdf2docx.common.docx.add_hyperlink(paragraph, url, text)
Create a hyperlink within a paragraph object.
- Args:
paragraph (Paragraph): python-docx paragraph to add the hyperlink to.
url (str): The required url.
text (str): The text displayed for the url.
- Returns:
Run: A Run object containing the hyperlink.
- pdf2docx.common.docx.add_image(p, image_path_or_stream, width, height)
Add image to paragraph.
- Args:
p (Paragraph): python-docx paragraph instance.
image_path_or_stream (str, bytes): Image path or stream.
width (float): Image width in Pt.
height (float): Image height in Pt.
- pdf2docx.common.docx.delete_paragraph(paragraph)
Delete a paragraph.
- pdf2docx.common.docx.indent_table(table, indent: float)
Indent a table.
- Args:
table (Table): python-docx Table object.
indent (float): Indent value, in units of 1/20 pt.
- pdf2docx.common.docx.reset_paragraph_format(p, line_spacing: float = 1.05)
Reset paragraph format, especially line spacing.
Two kinds of line spacing are supported, corresponding to the settings in MS Office Word:
line_spacing=1.05: single or multiple
line_spacing=Pt(1): exactly
- Args:
p (Paragraph): python-docx paragraph instance.
line_spacing (float, optional): Line spacing. Defaults to 1.05.
- Returns:
paragraph_format: Paragraph format.
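The two line spacing kinds map to different units in WordprocessingML: a multiple is stored in 240ths of a line with the line rule "auto", while an exact height is stored in twentieths of a point with the line rule "exact". The sketch below illustrates that mapping using only the standard library. ExactPt is a hypothetical stand-in for a fixed height in points (the role Pt(...) plays in the description above), and spacing_attrs is an illustrative helper, not the function's actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class ExactPt:
    """Hypothetical stand-in for a fixed line height given in points."""
    value: float


def spacing_attrs(line_spacing: Union[float, ExactPt]) -> Dict[str, str]:
    """Return illustrative w:spacing attribute values for a line spacing."""
    if isinstance(line_spacing, ExactPt):
        # Exact spacing: the value is stored in twentieths of a point.
        return {"line": str(round(line_spacing.value * 20)), "lineRule": "exact"}
    # Single or multiple spacing: the value is stored in 240ths of a line.
    return {"line": str(round(line_spacing * 240)), "lineRule": "auto"}


if __name__ == "__main__":
    print(spacing_attrs(1.05))
    print(spacing_attrs(ExactPt(12)))
```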
- pdf2docx.common.docx.set_cell_border(cell: _Cell, **kwargs)
Set a cell's borders.
- Args:
cell (_Cell): python-docx Cell instance you want to modify.
kwargs (dict): Dict with keys: top, bottom, start, end.
Usage:
set_cell_border(
    cell,
    top={"sz": 12, "val": "single", "color": "#FF0000", "space": "0"},
    bottom={"sz": 12, "color": "#00FF00", "val": "single"},
    start={"sz": 24, "val": "dashed", "shadow": "true"},
    end={"sz": 12, "val": "dashed"},
)
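To make the shape of the kwargs more concrete, the sketch below builds the kind of `<w:tcBorders>` markup that per-edge dictionaries like these describe, using only the standard library. It assumes each dictionary key becomes a `w:`-prefixed attribute on the matching edge element. build_tc_borders and the w helper are hypothetical names for illustration, not the library's implementation.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(name: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag or attribute."""
    return f"{{{W_NS}}}{name}"


def build_tc_borders(**edges) -> ET.Element:
    """Build a <w:tcBorders> element from per-edge attribute dicts (illustrative)."""
    borders = ET.Element(w("tcBorders"))
    for edge in ("top", "start", "bottom", "end"):
        attrs = edges.get(edge)
        if not attrs:
            continue  # edges that were not passed are left untouched
        child = ET.SubElement(borders, w(edge))
        for key, value in attrs.items():
            child.set(w(key), str(value))
    return borders


borders = build_tc_borders(
    top={"sz": 12, "val": "single", "color": "FF0000"},
    end={"sz": 12, "val": "dashed"},
)

if __name__ == "__main__":
    print(ET.tostring(borders, encoding="unicode"))
```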
- pdf2docx.common.docx.set_cell_margins(cell: _Cell, **kwargs)
Set cell margins. Provided values are in twentieths of a point (1/1440 of an inch).
- Args:
cell (_Cell): python-docx Cell instance you want to modify.
kwargs (dict): Dict with keys: top, bottom, start, end.
Usage:
set_cell_margins(cell, top=50, start=50, bottom=50, end=50)
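Because the margin values are in twentieths of a point, the 50 used above corresponds to 2.5 pt. A small sketch of the unit arithmetic follows; the helper names are hypothetical and are not provided by pdf2docx.

```python
# Hypothetical unit helpers for twentieths of a point.
TWENTIETHS_PER_PT = 20
TWENTIETHS_PER_INCH = 1440  # 72 pt per inch * 20


def pt_to_twentieths(value_pt: float) -> int:
    """Convert points to whole twentieths of a point."""
    return round(value_pt * TWENTIETHS_PER_PT)


def twentieths_to_pt(value: int) -> float:
    """Convert twentieths of a point to points."""
    return value / TWENTIETHS_PER_PT


def twentieths_to_inch(value: int) -> float:
    """Convert twentieths of a point to inches."""
    return value / TWENTIETHS_PER_INCH


if __name__ == "__main__":
    # The margin value from the usage line above, expressed in points.
    print(twentieths_to_pt(50))
```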
- pdf2docx.common.docx.set_cell_shading(cell: _Cell, srgb: int)
Set cell background color.
- Reference:
https://stackoverflow.com/questions/26752856/python-docx-set-table-cell-background-and-text-color
- Args:
cell (_Cell): python-docx Cell instance you want to modify.
srgb (int): RGB color value.
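The srgb argument is a single integer rather than a color string. The sketch below assumes it is a packed 0xRRGGBB value (an assumption, since the documentation only says "RGB color value") and shows how such an integer relates to separate channels and to a six-digit hex string. Both helper names are hypothetical.

```python
def rgb_to_int(r: int, g: int, b: int) -> int:
    """Pack three 0-255 channels into one 0xRRGGBB integer."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel out of range: {channel}")
    return (r << 16) | (g << 8) | b


def int_to_hex(srgb: int) -> str:
    """Format a packed 0xRRGGBB integer as a six-digit uppercase hex string."""
    return f"{srgb & 0xFFFFFF:06X}"


if __name__ == "__main__":
    red = rgb_to_int(255, 0, 0)
    print(red, int_to_hex(red))
```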
- pdf2docx.common.docx.set_char_scaling(p_run, scale: float = 1.0)
Set character spacing: scaling.
Manual operation in MS Word: Font | Advanced | Character Spacing | Scaling.
- Args:
p_run (docx.text.run.Run): Proxy object wrapping <w:r> element.
scale (float, optional): Scaling factor. Defaults to 1.0.
- pdf2docx.common.docx.set_char_shading(p_run, srgb: int)
Set character shading color, in case the color is outside the highlight color range.
- Args:
p_run (docx.text.run.Run): Proxy object wrapping <w:r> element.
srgb (int): Color value.
- pdf2docx.common.docx.set_char_spacing(p_run, space: float = 0.0)
Set character spacing.
Manual operation in MS Word: Font | Advanced | Character Spacing | Spacing.
- Args:
p_run (docx.text.run.Run): Proxy object wrapping <w:r> element.
space (float, optional): Spacing value in Pt. Expands the text if positive, condenses it if negative. Defaults to 0.0.
- pdf2docx.common.docx.set_char_underline(p_run, srgb: int)
Set underline and color.
- Args:
p_run (docx.text.run.Run): Proxy object wrapping <w:r> element.
srgb (int): Color value.
- pdf2docx.common.docx.set_columns(section, width_list: list, space=0)
Set section column count and space.
- Args:
section: python-docx Section instance.
width_list (list|tuple): Width of each column.
space (int, optional): Space between adjacent columns. Unit: Pt. Defaults to 0.
Scheme:
<w:cols w:num="2" w:space="0" w:equalWidth="0">
    <w:col w:w="2600" w:space="0"/>
    <w:col w:w="7632"/>
</w:cols>
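The scheme above can be reproduced with a short standard-library sketch. It assumes that the widths and the space are supplied in Pt and written in twentieths of a point (the unit of `w:w` and `w:space`), and that every column except the last carries a `w:space` attribute, mirroring the scheme. build_cols is a hypothetical helper and not the library's implementation.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(name: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag or attribute."""
    return f"{{{W_NS}}}{name}"


def pt_to_twentieths(value_pt: float) -> str:
    """Convert points to twentieths of a point, as an attribute string."""
    return str(round(value_pt * 20))


def build_cols(width_list, space=0) -> ET.Element:
    """Build a <w:cols> element with one <w:col> per width (assumed to be in Pt)."""
    cols = ET.Element(w("cols"), {
        w("num"): str(len(width_list)),
        w("space"): pt_to_twentieths(space),
        w("equalWidth"): "0",  # columns carry individual widths
    })
    last = len(width_list) - 1
    for index, width in enumerate(width_list):
        col = ET.SubElement(cols, w("col"), {w("w"): pt_to_twentieths(width)})
        if index != last:
            col.set(w("space"), pt_to_twentieths(space))
    return cols


# 130 pt and 381.6 pt correspond to the 2600 and 7632 in the scheme.
cols = build_cols([130, 381.6], space=0)

if __name__ == "__main__":
    print(ET.tostring(cols, encoding="unicode"))
```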
- pdf2docx.common.docx.set_equal_columns(section, num=2, space=0)
Set section column count and space. All the columns have the same width.
- Args:
section: python-docx Section instance.
num (int): Column count. Defaults to 2.
space (int, optional): Space between adjacent columns. Unit: Pt. Defaults to 0.
Hide paragraph. This method only sets the paragraph property; the added text must be hidden explicitly:
r = p.add_run()
r.text = "Hidden"
r.font.hidden = True
- Args:
p (Paragraph): python-docx created paragraph.
- pdf2docx.common.docx.set_vertical_cell_direction(cell: _Cell, direction: str = 'btLr')
Set vertical text direction for cell.
- Reference:
https://stackoverflow.com/questions/47738013/how-to-rotate-text-in-table-cells
- Args:
direction (str): Either "tbRl" (top to bottom) or "btLr" (bottom to top).