pdf2docx.common.Block module

Base class for text/image/table blocks.

class pdf2docx.common.Block.Block(raw: Optional[dict] = None, parent=None)

Bases: Element

Base class for text/image/table blocks.

Attributes:

raw (dict): Initialize object from raw properties.

parent (optional): Parent object that this block belongs to.

property is_float_image_block

Whether this is a float image block.

property is_image_block

Whether this is an image block (inline or float).

property is_inline_image_block

Whether this is an inline image block.

property is_lattice_table_block

Whether this is a lattice table block (a table with explicit borders).

property is_stream_table_block

Whether this is a stream table block (a table implied by its content).

property is_table_block

Whether this is a table block (lattice or stream).

property is_text_block

Whether this is a text block.

property is_text_image_block

Whether this is a text block or an inline image block.
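
The composite properties above can be read as unions of the five basic block types. The following is a minimal stand-alone sketch of that relationship, not pdf2docx's implementation: `Kind` and `SketchBlock` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Kind(Enum):
    """Hypothetical block kinds mirroring the five basic types listed above."""
    TEXT = "text"
    INLINE_IMAGE = "inline_image"
    FLOAT_IMAGE = "float_image"
    LATTICE_TABLE = "lattice_table"
    STREAM_TABLE = "stream_table"


class SketchBlock:
    """Stand-in for Block that models only the type predicates."""

    def __init__(self, kind: Kind):
        self.kind = kind

    @property
    def is_text_block(self) -> bool:
        return self.kind is Kind.TEXT

    @property
    def is_image_block(self) -> bool:
        # Inline OR float image.
        return self.kind in (Kind.INLINE_IMAGE, Kind.FLOAT_IMAGE)

    @property
    def is_table_block(self) -> bool:
        # Lattice OR stream table.
        return self.kind in (Kind.LATTICE_TABLE, Kind.STREAM_TABLE)

    @property
    def is_text_image_block(self) -> bool:
        # Text OR inline image; a float image does not count.
        return self.kind in (Kind.TEXT, Kind.INLINE_IMAGE)


inline = SketchBlock(Kind.INLINE_IMAGE)
floating = SketchBlock(Kind.FLOAT_IMAGE)
print(inline.is_image_block, inline.is_text_image_block)      # True True
print(floating.is_image_block, floating.is_text_image_block)  # True False
```

Note that a float image satisfies is_image_block but not is_text_image_block, which is the distinction the two properties exist to express.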

make_docx(*args, **kwargs)

Create associated docx element.

Raises:

NotImplementedError
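
Because the base method raises NotImplementedError, concrete block types are expected to supply their own make_docx. A minimal sketch of that pattern, using hypothetical stand-in classes (`BaseBlock`, `ParagraphBlock`) and a plain list in place of a real docx container:

```python
class BaseBlock:
    """Hypothetical stand-in for Block."""

    def make_docx(self, *args, **kwargs):
        # The base class has no docx representation of its own.
        raise NotImplementedError


class ParagraphBlock(BaseBlock):
    """Hypothetical subclass that supplies a concrete implementation."""

    def __init__(self, text: str):
        self.text = text

    def make_docx(self, container: list) -> None:
        # A plain list stands in for a docx container in this sketch.
        container.append(self.text)


doc = []
ParagraphBlock("hello").make_docx(doc)

try:
    BaseBlock().make_docx(doc)
    base_raised = False
except NotImplementedError:
    base_raised = True

print(doc, base_raised)
```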

parse_horizontal_spacing(bbox, *args)

Set the alignment to left and calculate the left space.

Overridden by pdf2docx.text.TextBlock.

Args:

bbox (fitz.Rect): bounding box of this block.
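
One plausible reading of "left space" is the horizontal gap between a reference box's left edge and the block's own left edge. The sketch below assumes exactly that, for horizontal text only; `left_space` is a hypothetical helper, and plain `(x0, y0, x1, y1)` tuples stand in for fitz.Rect objects. It is not the library's implementation.

```python
from typing import Tuple

# (x0, y0, x1, y1), the same coordinate order a fitz.Rect uses.
Box = Tuple[float, float, float, float]


def left_space(block_bbox: Box, reference_bbox: Box) -> float:
    """Gap between the reference box's left edge and the block's left edge."""
    return block_bbox[0] - reference_bbox[0]


# Assumed example values: a column starting at x=72 and a paragraph at x=90.
column = (72.0, 72.0, 540.0, 720.0)
paragraph = (90.0, 100.0, 540.0, 130.0)

space = left_space(paragraph, column)
print(space)  # 18.0
```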

set_float_image_block()

Set the block type to float image.

set_inline_image_block()

Set the block type to inline image.

set_lattice_table_block()

Set the block type to lattice table.

set_stream_table_block()

Set the block type to stream table.

set_text_block()

Set the block type to text.

store()

Store attributes in JSON format.
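
A common way to pair a `raw` dict constructor with a store method is a round trip: store emits only JSON-serializable values, and the result can be fed back in as `raw`. The sketch below illustrates that idea under stated assumptions; `SketchBlock` and the key names `"bbox"` and `"type"` are hypothetical and not taken from pdf2docx.

```python
import json


class SketchBlock:
    """Hypothetical stand-in; attribute and key names are illustrative."""

    def __init__(self, raw=None):
        raw = raw or {}
        self.bbox = tuple(raw.get("bbox", (0.0, 0.0, 0.0, 0.0)))
        self.kind = raw.get("type", "undefined")

    def store(self) -> dict:
        # Only plain, JSON-serializable values go into the result.
        return {"bbox": list(self.bbox), "type": self.kind}


block = SketchBlock({"bbox": (72.0, 100.0, 300.0, 120.0), "type": "text"})

# Serialize to a JSON string, then rebuild an equivalent block from it.
text = json.dumps(block.store())
restored = SketchBlock(json.loads(text))

print(restored.store() == block.store())  # True
```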