Skip to content

MAINT: Use consistent method keywords #1187

@mtd91429

Description

@mtd91429

Explanation

As PyPDF2 has evolved, some of the internal naming conventions have not been consistent. One recently addressed issue was the outline/bookmark component; I think that was the largest example. As I have been exploring the code base, I have noticed others. I propose that these keywords be systematically implemented across the library and that the deprecation process utilizes an internally consistent process.

Below are the list of keywords that I can see which have inconsistent implementation (only including non-deprecated methods):

✔️ pagenum, page_number, pageNumber (see #1365)

  • pagenum is used in:
    • PdfWriter.addBookmark
    • PdfWriter.add_outline_item
    • PdfWriter.add_named_destination
    • PdfWriter.add_uri
    • PdfWriter.add_link
  • pageNumber is used in:
    • PdfWriter.get_page
  • page_number is used in:
    • PdfWriter.add_annotation
    • PdfWriter.get_page

✔️ pagedest and dest (see #1467)

  • pagedest is used in:
    • PdfWriter.add_link
  • dest is used in:
    • PdfWriter.add_outline_item_destination
    • PdfWriter.add_named_destination_object

✔️ indirect_ref and ido (see #1467 and #1484)

  • indirect_ref is used in:
    • PdfReader._get_page_number_by_indirect
    • PdfReader._flatten
  • indirect_reference is used in:
    • PdfReader._get_object_from_stream
  • ido is used in:
    • PdfWriter.get_object

✔️ pwd and password (see #1483)

  • password is used in:
    • PdfReader.decrypt
  • pwd is used in:
    • PdfWriter.encrypt (as user_pwd, owner_pwd)

Extrinsically inconsistent methods

There are some deprecated methods with keywords that have been changed in the updated function. For example, PdfWriter.addAttachment uses fname and PdfWriter.add_attachment uses filename or PdfWriter.removeImages uses ignoreByteStringObject and PdfWriter.remove_images uses ignore_byte_string_object. These changes are silent.

Intrinsically inconsistent methods

There are some methods which use keywords that are both snake_case and "smooshed" case. For example, PdfReader._read_xref_tables_and_trailers has both startxref and xref_issue_nr,

Inconsistent deprecation implementation

Some of the keywords have been deprecated within the methods. For example, PdfWriter.get_page. Whereas other keywords have been deprecated via decorator; for example, PdfWriter.add_outline_item_dict.

Concluding remarks (for this long winded post)

I don't think all of these examples necessarily need to be changed; I've included them to be as thorough as possible. However, in my opinion, snake_case is best and explicit is better than implied via abbreviation (usually). Please list additional examples that you see and any proposed approaches.

Metadata

Metadata

Assignees

Labels

is-maintenanceAnything that is just internal: Simplifying code, syntax changes, updating docs, speed improvements

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions