Commit 4507d49

gh-145980: Add support for alternative alphabets in the binascii module (GH-145981)
* Add the alphabet parameter in functions b2a_base64(), a2b_base64(), b2a_base85(), and a2b_base85().
* And a number of "*_ALPHABET" constants.
* Remove b2a_z85() and a2b_z85().
1 parent d357a7d commit 4507d49

11 files changed: +471, -350 lines changed

Doc/library/binascii.rst

Lines changed: 64 additions & 36 deletions
@@ -48,12 +48,15 @@ The :mod:`!binascii` module defines the following functions:
       Added the *backtick* parameter.
 
 
-.. function:: a2b_base64(string, /, *, strict_mode=False)
-              a2b_base64(string, /, *, strict_mode=True, ignorechars)
+.. function:: a2b_base64(string, /, *, alphabet=BASE64_ALPHABET, strict_mode=False)
+              a2b_base64(string, /, *, ignorechars, alphabet=BASE64_ALPHABET, strict_mode=True)
 
    Convert a block of base64 data back to binary and return the binary data. More
    than one line may be passed at a time.
 
+   Optional *alphabet* must be a :class:`bytes` object of length 64 which
+   specifies an alternative alphabet.
+
    If *ignorechars* is specified, it should be a :term:`bytes-like object`
    containing characters to ignore from the input when *strict_mode* is true.
    If *ignorechars* contains the pad character ``'='``, the pad characters
@@ -76,10 +79,10 @@ The :mod:`!binascii` module defines the following functions:
       Added the *strict_mode* parameter.
 
    .. versionchanged:: 3.15
-      Added the *ignorechars* parameter.
+      Added the *alphabet* and *ignorechars* parameters.
 
 
-.. function:: b2a_base64(data, *, wrapcol=0, newline=True)
+.. function:: b2a_base64(data, *, alphabet=BASE64_ALPHABET, wrapcol=0, newline=True)
 
    Convert binary data to a line(s) of ASCII characters in base64 coding,
    as specified in :rfc:`4648`.
@@ -95,7 +98,7 @@ The :mod:`!binascii` module defines the following functions:
       Added the *newline* parameter.
 
    .. versionchanged:: 3.15
-      Added the *wrapcol* parameter.
+      Added the *alphabet* and *wrapcol* parameters.
 
 
 .. function:: a2b_ascii85(string, /, *, foldspaces=False, adobe=False, ignorechars=b"")
@@ -148,7 +151,7 @@ The :mod:`!binascii` module defines the following functions:
    .. versionadded:: 3.15
 
 
-.. function:: a2b_base85(string, /)
+.. function:: a2b_base85(string, /, *, alphabet=BASE85_ALPHABET)
 
    Convert Base85 data back to binary and return the binary data.
    More than one line may be passed at a time.
@@ -158,49 +161,25 @@ The :mod:`!binascii` module defines the following functions:
    characters). Each group encodes 32 bits of binary data in the range from
    ``0`` to ``2 ** 32 - 1``, inclusive.
 
+   Optional *alphabet* must be a :class:`bytes` object of length 85 which
+   specifies an alternative alphabet.
+
    Invalid Base85 data will raise :exc:`binascii.Error`.
 
    .. versionadded:: 3.15
 
 
-.. function:: b2a_base85(data, /, *, pad=False)
+.. function:: b2a_base85(data, /, *, alphabet=BASE85_ALPHABET, pad=False)
 
    Convert binary data to a line of ASCII characters in Base85 coding.
    The return value is the converted line.
 
-   If *pad* is true, the input is padded with ``b'\0'`` so its length is a
-   multiple of 4 bytes before encoding.
-
-   .. versionadded:: 3.15
-
-
-.. function:: a2b_z85(string, /)
-
-   Convert Z85 data back to binary and return the binary data.
-   More than one line may be passed at a time.
-
-   Valid Z85 data contains characters from the Z85 alphabet in groups
-   of five (except for the final group, which may have from two to five
-   characters). Each group encodes 32 bits of binary data in the range from
-   ``0`` to ``2 ** 32 - 1``, inclusive.
-
-   See `Z85 specification <https://rfc.zeromq.org/spec/32/>`_ for more information.
-
-   Invalid Z85 data will raise :exc:`binascii.Error`.
-
-   .. versionadded:: 3.15
-
-
-.. function:: b2a_z85(data, /, *, pad=False)
-
-   Convert binary data to a line of ASCII characters in Z85 coding.
-   The return value is the converted line.
+   Optional *alphabet* must be a :term:`bytes-like object` of length 85 which
+   specifies an alternative alphabet.
 
    If *pad* is true, the input is padded with ``b'\0'`` so its length is a
    multiple of 4 bytes before encoding.
 
-   See `Z85 specification <https://rfc.zeromq.org/spec/32/>`_ for more information.
-
    .. versionadded:: 3.15
 
 
@@ -300,6 +279,55 @@ The :mod:`!binascii` module defines the following functions:
    but may be handled by reading a little more data and trying again.
 
 
+.. data:: BASE64_ALPHABET
+
+   The Base 64 alphabet according to :rfc:`4648`.
+
+   .. versionadded:: next
+
+.. data:: URLSAFE_BASE64_ALPHABET
+
+   The "URL and filename safe" Base 64 alphabet according to :rfc:`4648`.
+
+   .. versionadded:: next
+
+.. data:: UU_ALPHABET
+
+   The uuencoding alphabet.
+
+   .. versionadded:: next
+
+.. data:: CRYPT_ALPHABET
+
+   The Base 64 alphabet used in the :manpage:`crypt(3)` routine and in the GEDCOM format.
+
+   .. versionadded:: next
+
+.. data:: BINHEX_ALPHABET
+
+   The Base 64 alphabet used in BinHex 4 (HQX) within the classic Mac OS.
+
+   .. versionadded:: next
+
+.. data:: BASE85_ALPHABET
+
+   The Base85 alphabet.
+
+   .. versionadded:: next
+
+.. data:: ASCII85_ALPHABET
+
+   The Ascii85 alphabet.
+
+   .. versionadded:: next
+
+.. data:: Z85_ALPHABET
+
+   The `Z85 <https://rfc.zeromq.org/spec/32/>`_ alphabet.
+
+   .. versionadded:: next
+
+
 .. seealso::
 
    Module :mod:`base64`
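
With an *alphabet* parameter on the Base85 functions, Z85 becomes Base85 with a different symbol table, which is presumably why the dedicated b2a_z85() and a2b_z85() functions could be removed. The sketch below illustrates this on interpreters that predate the change. The two alphabets are written out by hand (an assumption: they are transcribed from the base64 module's b85 table and the ZeroMQ Z85 specification, not read from the new constants), and the output of the existing base64.b85encode() is remapped symbol by symbol. The final guarded call uses the API from this diff and is skipped when binascii.Z85_ALPHABET is absent.

```python
import base64
import binascii

# Hand-written alphabets (assumptions: transcribed from base64.py's b85 table
# and the ZeroMQ Z85 spec, not read from the new binascii constants).
B85 = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
       b"!#$%&()*+-;<=>?@^_`{|}~")
Z85 = (b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
       b".-:+=^!/*?&<>()[]{}@%$#")
assert len(B85) == len(Z85) == 85

# Test vector from the Z85 specification: these 8 bytes encode to "HelloWorld".
data = bytes.fromhex("864FD26FB559F75B")

# Same base-85 digits, different symbols: remap each b85 character to Z85.
z85_text = base64.b85encode(data).translate(bytes.maketrans(B85, Z85))
print(z85_text)  # b'HelloWorld'

# With this commit applied, the alphabet can be passed directly instead.
if hasattr(binascii, "Z85_ALPHABET"):
    assert binascii.b2a_base85(data, alphabet=binascii.Z85_ALPHABET) == z85_text
```

The remapping works because both encodings split the input into the same 32-bit groups and the same five base-85 digits; only the character chosen for each digit differs.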

Doc/whatsnew/3.15.rst

Lines changed: 4 additions & 1 deletion
@@ -649,13 +649,16 @@ binascii
 
   - :func:`~binascii.b2a_ascii85` and :func:`~binascii.a2b_ascii85`
   - :func:`~binascii.b2a_base85` and :func:`~binascii.a2b_base85`
-  - :func:`~binascii.b2a_z85` and :func:`~binascii.a2b_z85`
 
   (Contributed by James Seo and Serhiy Storchaka in :gh:`101178`.)
 
 * Added the *wrapcol* parameter in :func:`~binascii.b2a_base64`.
   (Contributed by Serhiy Storchaka in :gh:`143214`.)
 
+* Added the *alphabet* parameter in :func:`~binascii.b2a_base64` and
+  :func:`~binascii.a2b_base64`.
+  (Contributed by Serhiy Storchaka in :gh:`145980`.)
+
 * Added the *ignorechars* parameter in :func:`~binascii.a2b_base64`.
   (Contributed by Serhiy Storchaka in :gh:`144001`.)
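
To make the *alphabet* entry concrete: an alternative Base64 alphabet is a different 64-byte symbol table for the same 6-bit values. The sketch below emulates that on an interpreter without this change by remapping symbols around the existing binascii calls. `STANDARD`, `URLSAFE`, `encode_with` and `decode_with` are illustrative names, not part of the module. With the change applied, passing `alphabet=` to b2a_base64() and a2b_base64() should make the remapping step unnecessary.

```python
import binascii
import string

# RFC 4648 alphabets written out by hand (the diff exposes equivalents as
# binascii.BASE64_ALPHABET and binascii.URLSAFE_BASE64_ALPHABET).
STANDARD = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/").encode("ascii")
URLSAFE = STANDARD[:-2] + b"-_"


def encode_with(data, alphabet):
    """Hypothetical helper: Base64-encode, then swap in the given alphabet."""
    if len(alphabet) != 64:
        raise ValueError("alphabet must be 64 bytes long")
    encoded = binascii.b2a_base64(data, newline=False)
    return encoded.translate(bytes.maketrans(STANDARD, alphabet))


def decode_with(text, alphabet):
    """Hypothetical helper: map symbols back to the standard table, then decode."""
    if len(alphabet) != 64:
        raise ValueError("alphabet must be 64 bytes long")
    return binascii.a2b_base64(text.translate(bytes.maketrans(alphabet, STANDARD)))


raw = b"\xfb\xff\xfe"
token = encode_with(raw, URLSAFE)
print(token)  # b'-__-'
assert decode_with(token, URLSAFE) == raw
```

One limitation of this emulation is that `decode_with` still accepts the standard `+` and `/` symbols in its input, because they map to themselves; it is a sketch of the idea rather than a strict decoder.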

Include/internal/pycore_global_objects_fini_generated.h

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.

Include/internal/pycore_global_strings.h

Lines changed: 1 addition & 0 deletions
@@ -307,6 +307,7 @@ struct _Py_global_strings {
         STRUCT_FOR_ID(all)
         STRUCT_FOR_ID(all_threads)
         STRUCT_FOR_ID(allow_code)
+        STRUCT_FOR_ID(alphabet)
         STRUCT_FOR_ID(any)
         STRUCT_FOR_ID(append)
         STRUCT_FOR_ID(arg)

Include/internal/pycore_runtime_init_generated.h

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.

Include/internal/pycore_unicodeobject_generated.h

Lines changed: 4 additions & 0 deletions
Some generated files are not rendered by default.

Lib/base64.py

Lines changed: 14 additions & 17 deletions
@@ -56,11 +56,13 @@ def b64encode(s, altchars=None, *, wrapcol=0):
     If wrapcol is non-zero, insert a newline (b'\\n') character after at most
     every wrapcol characters.
     """
-    encoded = binascii.b2a_base64(s, wrapcol=wrapcol, newline=False)
     if altchars is not None:
-        assert len(altchars) == 2, repr(altchars)
-        return encoded.translate(bytes.maketrans(b'+/', altchars))
-    return encoded
+        if len(altchars) != 2:
+            raise ValueError(f'invalid altchars: {altchars!r}')
+        alphabet = binascii.BASE64_ALPHABET[:-2] + altchars
+        return binascii.b2a_base64(s, wrapcol=wrapcol, newline=False,
+                                   alphabet=alphabet)
+    return binascii.b2a_base64(s, wrapcol=wrapcol, newline=False)
 
 
 def b64decode(s, altchars=None, validate=_NOT_SPECIFIED, *, ignorechars=_NOT_SPECIFIED):
@@ -100,15 +102,10 @@ def b64decode(s, altchars=None, validate=_NOT_SPECIFIED, *, ignorechars=_NOT_SPE
                     break
             s = s.translate(bytes.maketrans(altchars, b'+/'))
         else:
-            trans_in = set(b'+/') - set(altchars)
-            if len(trans_in) == 2:
-                # we can't use the reqult of unordered sets here
-                trans = bytes.maketrans(altchars + b'+/', b'+/' + altchars)
-            else:
-                trans = bytes.maketrans(altchars + bytes(trans_in),
-                                        b'+/' + bytes(set(altchars) - set(b'+/')))
-            s = s.translate(trans)
-            ignorechars = ignorechars.translate(trans)
+            alphabet = binascii.BASE64_ALPHABET[:-2] + altchars
+            return binascii.a2b_base64(s, strict_mode=validate,
+                                       alphabet=alphabet,
+                                       ignorechars=ignorechars)
     if ignorechars is _NOT_SPECIFIED:
         ignorechars = b''
     result = binascii.a2b_base64(s, strict_mode=validate,
@@ -146,7 +143,6 @@ def standard_b64decode(s):
     return b64decode(s)
 
 
-_urlsafe_encode_translation = bytes.maketrans(b'+/', b'-_')
 _urlsafe_decode_translation = bytes.maketrans(b'-_', b'+/')
 
 def urlsafe_b64encode(s):
@@ -156,7 +152,8 @@ def urlsafe_b64encode(s):
     bytes object. The alphabet uses '-' instead of '+' and '_' instead of
     '/'.
     """
-    return b64encode(s).translate(_urlsafe_encode_translation)
+    return binascii.b2a_base64(s, newline=False,
+                               alphabet=binascii.URLSAFE_BASE64_ALPHABET)
 
 def urlsafe_b64decode(s):
     """Decode bytes using the URL- and filesystem-safe Base64 alphabet.
@@ -399,14 +396,14 @@ def b85decode(b):
 
 def z85encode(s, pad=False):
     """Encode bytes-like object b in z85 format and return a bytes object."""
-    return binascii.b2a_z85(s, pad=pad)
+    return binascii.b2a_base85(s, pad=pad, alphabet=binascii.Z85_ALPHABET)
 
 def z85decode(s):
     """Decode the z85-encoded bytes-like object or ASCII string b
 
     The result is returned as a bytes object.
     """
-    return binascii.a2b_z85(s)
+    return binascii.a2b_base85(s, alphabet=binascii.Z85_ALPHABET)
 
 # Legacy interface. This code could be cleaned up since I don't believe
 # binascii has any line length limitations. It just doesn't seem worth it
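
The b64encode() change in this file replaces a post-hoc translate() with an alphabet built up front: the first 62 symbols stay fixed and *altchars* supplies the last two. A small sketch of that construction follows, checked against the behaviour of the existing base64 functions. `alphabet_from_altchars` is a hypothetical helper, and `STANDARD` stands in for binascii.BASE64_ALPHABET (assumed here to be the RFC 4648 table), which interpreters without this commit lack.

```python
import base64
import string

# Stand-in for binascii.BASE64_ALPHABET (assumed to be the RFC 4648 table).
STANDARD = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/").encode("ascii")


def alphabet_from_altchars(altchars):
    """Hypothetical helper mirroring the construction in the patched b64encode()."""
    if len(altchars) != 2:
        # The patch raises ValueError here; the old code used an assert.
        raise ValueError(f"invalid altchars: {altchars!r}")
    return STANDARD[:-2] + altchars


alphabet = alphabet_from_altchars(b"-_")
data = bytes(range(256))

# Remapping standard output through the derived alphabet matches altchars.
remapped = base64.b64encode(data).translate(bytes.maketrans(STANDARD, alphabet))
assert remapped == base64.b64encode(data, altchars=b"-_")
assert remapped == base64.urlsafe_b64encode(data)
```

Raising ValueError instead of asserting means the length check survives `python -O`, where assert statements are stripped.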
