Implement PEP 784 - Adding Zstandard to the Python standard library #132983
Comments
It's something I should have asked on discourse, but is |
Yes, the decision was to use |
…dule (GH-133018)
* Introduces `compression` package for https://peps.python.org/pep-0784/
  This commit introduces the `compression` package, specified in PEP 784 to re-export the `lzma`, `bz2`, `gzip`, and `zlib` modules. Introduction of `compression.zstd` will be completed in a future commit once the `_zstd` module is merged. This commit also moves the `_compression` private module to `compression._common._streams`.
* Re-exports existing module docstrings.
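For readers who want to see what the re-export means in practice, here is a minimal sketch. It assumes only what the commit message states: that `compression.<name>` exposes the same API as the existing top-level module. The helper name `load_compression_module` is hypothetical and not part of CPython; it falls back to the legacy module name on interpreters that do not ship the `compression` package.

```python
import importlib


def load_compression_module(name):
    """Return compression.<name> if the package exists, else the legacy module.

    Hypothetical helper for illustration; not part of the standard library.
    """
    try:
        return importlib.import_module(f"compression.{name}")
    except ImportError:
        # Interpreters without the PEP 784 package only have the old names.
        return importlib.import_module(name)


payload = b"PEP 784 " * 64
# bz2 and lzma are re-exported in the same way and could be added here.
for module_name in ("zlib", "gzip"):
    codec = load_compression_module(module_name)
    # The re-exported module is expected to behave like the original one.
    assert codec.decompress(codec.compress(payload)) == payload
```

Either import path should round-trip the data, since the package only re-exports the existing modules.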
Include compression package contents as part of installs.
Add an __init__ file. Fix test_tools.test_freeze.
Add an "__init__.py" file. Fix test_tools.test_freeze.
* Add _zstd module for https://peps.python.org/pep-0784/
  This commit introduces the `_zstd` module, with bindings to libzstd from the pyzstd project. It also includes the unix build system configuration. Windows build system support will be integrated independently as it depends on integration with cpython-source-deps.
* Add _zstd to modules
* Fix path for compression.zstd module
* Ignore _zstd module like _io
* Expand module state macros to improve code quality
  Also removes module state references from the classes in the _zstd module and instead uses PyType_GetModuleState()
* Remove backticks suggested in review
  Co-authored-by: Stan Ulbrych <[email protected]>
* Use critical sections to lock object state
  This should avoid races and deadlocks.
* Remove compress/decompress and mark module as not reliant on the GIL
  The `compress`/`decompress` functions will be moved to Python code for simplicity. C implementations can always be re-added in the future. Also, mark _zstd as not requiring the GIL.
* Lift critical section to avoid clang warning
* Respond to comments by picnixz
* Call out pyzstd explicitly in license description
  Co-authored-by: Adam Turner <[email protected]>
* Use a much more robust implementation... ... for `get_zstd_state_from_type`
  Co-authored-by: Bénédikt Tran <[email protected]>
* Use PyList_GetItemRef for thread safety purposes
* Use a macro for the minimum supported version
* remove const from primivite types
* Use PyMem_New in another spot
* Simplify error handling in _get_frame_size
* Another simplification of error handling in get_frame_info
* Rename _module_state to mod_state
* Rewrite comment explaining the context of the code
* Add link to pyzstd
* Add TODO about refactoring dict training code
* Use PyModule_AddObjectRef over PyModule_AddObject
  PyModule_AddObject is soft-deprecated, so we should use PyModule_AddObjectRef
* Check result of OutputBufferGrow
* Simplify return logic in `add_constant_to_type`
  Co-authored-by: Bénédikt Tran <[email protected]>
* Ignore return value of _zstd_clear()
  Co-authored-by: Bénédikt Tran <[email protected]>
* Remove redundant comments
* Remove __reduce__ from ZstdDict
  We should instead document that to pickle a dictionary a user should use the `.dict_content` attribute.
* Use PyUnicode_FromFormat instead of a buffer
* Don't use C constants/types in error messages
* Make error messages easier to understand for Python users
* Lower minimum required version 1.4.0
* Use casts and make slot function signatures correct
* Be consistent with CPython on const usage
* Make else clauses in line with PEP 7
* Fix over-indented blocks in argument clinic
* Add critical section around ZSTD_DCtx_setParameter
* Add a TODO about refactoring critical sections
* Use Py_UNREACHABLE
* Move bytes operations out of Py_BEGIN_ALLOW_THREADS
* Add TODO about ensuring a lock is held
* Remove asserts that may not be correct
* Add TODO to make ZstdDict and others GC objects
* Make objects GC tracked
* Remove unused include
* Fix some memory issues
* Fix refleaks on module and in ZstdDict
* Update configure to check for ZDICT_finalizeDictionary
* Properly check version in configure
* exit(1) if check fails
* Use AC_RUN_IFELSE
* Use a define() to re-use version check
* Actually properly set _zstd module status based on version

---------
Co-authored-by: Stan Ulbrych <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Bénédikt Tran <[email protected]>
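The commit message says that, with `__reduce__` removed, pickling a dictionary should go through the `.dict_content` attribute. A rough sketch of that pattern follows. The helper names `dump_zstd_dict` and `load_zstd_dict` are hypothetical, and the `ZstdDict(content, is_raw=True)` call is an assumption about the constructor that the commit message does not confirm. The demo is guarded so it only runs where `compression.zstd` can be imported.

```python
import pickle

try:
    from compression import zstd  # only present where PEP 784 is implemented
except ImportError:
    zstd = None


def dump_zstd_dict(zstd_dict):
    """Pickle only the dictionary's bytes, not the object itself (hypothetical helper)."""
    return pickle.dumps(bytes(zstd_dict.dict_content))


def load_zstd_dict(payload, factory):
    """Rebuild a dictionary by handing the unpickled bytes to `factory` (hypothetical helper)."""
    return factory(pickle.loads(payload))


if zstd is not None:
    # Assumption: ZstdDict accepts raw content via is_raw=True.
    original = zstd.ZstdDict(b"sample dictionary content " * 4, is_raw=True)
    restored = load_zstd_dict(
        dump_zstd_dict(original),
        lambda raw: zstd.ZstdDict(raw, is_raw=True),
    )
    assert restored.dict_content == original.dict_content
```

Only the bytes travel through pickle in this sketch, so any constructor options (such as whether the content is raw) would have to be tracked separately by the caller.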
I've had a go at adding support for building on Windows in #133366, let me know thoughts. A |
* origin/main: (111 commits) pythongh-91048: Add filename and line number to external inspection routines (pythonGH-133385) pythongh-131178: Add tests for `ast` command-line interface (python#133329) Regenerate pcbuild.sln in Visual Studio 2022 (python#133394) pythongh-133042: disable HACL* HMAC on Emscripten (python#133064) pythongh-133351: Fix remote PDB's multi-line block tab completion (python#133387) pythongh-109700: Improve stress tests for interpreter creation (pythonGH-109946) pythongh-81793: Skip tests for os.link() to symlink on Android (pythonGH-133388) pythongh-126835: Rename `ast_opt.c` to `ast_preprocess.c` and related stuff after moving const folding to the peephole optimizier (python#131830) pythongh-91048: Relax test_async_global_awaited_by to fix flakyness (python#133368) pythongh-132457: make staticmethod and classmethod generic (python#132460) pythongh-132805: annotationlib: Fix handling of non-constant values in FORWARDREF (python#132812) pythongh-132426: Add get_annotate_from_class_namespace replacing get_annotate_function (python#132490) pythongh-81793: Always call linkat() from os.link(), if available (pythonGH-132517) pythongh-122559: Synchronize C and Python implementation of the io module about pickling (pythonGH-122628) pythongh-69605: Add PyREPL import autocomplete feature to 'What's New' (python#133358) bpo-44172: Keep reference to original window in curses subwindow objects (pythonGH-26226) pythonGH-133231: Changes to executor management to support proposed `sys._jit` module (pythonGH-133287) pythongh-133363: Fix Cmd completion for lines beginning with `! ` (python#133364) pythongh-132983: Introduce `_zstd` bindings module (pythonGH-133027) pythonGH-91048: Add utils for printing the call stack for asyncio tasks (python#133284) ...
It looks like there are a couple of problems with the |
Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Gregory P. Smith <[email protected]> Co-authored-by: Tomas R. <[email protected]> Co-authored-by: Rogdham <[email protected]>
Windows build support, the autoconf fix, and the Python wrapper are now all merged. I think that should be everything required for b1, but we could use this issue to track the outstanding items from #133365. Thanks! A |
I went ahead and added checkboxes for all of the TODOs left after review of both #133365 and #133027, as well as any TODO comments not discussed there. If I missed anything please leave a comment or edit the OP. |
Could it be the following?
Details
From below, it seems that
|
…-133791) (cherry picked from commit 1978904) Co-authored-by: Adam Turner <[email protected]>
(cherry picked from commit 50b5370) Co-authored-by: Rogdham <[email protected]>
(cherry picked from commit 1a548c0) Co-authored-by: Adam Turner <[email protected]>
…133854) gh-132983: Reduce the size of ``_zstdmodule.h`` (GH-133793) (cherry picked from commit 1a548c0) Co-authored-by: Adam Turner <[email protected]>
(cherry picked from commit 1a87b6e) Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Peter Bierma <[email protected]>
…nGH-133856) (cherry picked from commit 878e0fb) Co-authored-by: Rogdham <[email protected]> Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Peter Bierma <[email protected]>
…33856) (#133859) gh-132983: Remove leftovers from EndlessZstdDecompressor (GH-133856) (cherry picked from commit 878e0fb) Co-authored-by: Rogdham <[email protected]> Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Peter Bierma <[email protected]>
Please, see #133885 |
Feature or enhancement
This is a tracking issue for implementing PEP 784. See the PEP text for more details.
Since the diff is significant (~10k lines), I wanted to split up the PRs a bit.
Implementation Plan:
* `compression` module just re-exporting existing compression modules. Move the `_compression` module.
* `_zstd` native module with Unix build config
* `_zstd` (blocked on adding `libzstd` to cpython-source-deps) and SBOM config.
* `zstd` Python module with tests
* `zstd`
* `_train_dict` and `_finalize_dict` to share common code
* `lock_held` functions
* `options` value can't be converted to int

Linked PRs
* `compression` package and move `_compression` module #133018
* `_zstd` bindings module #133027
* `compression._common` a full-fledged package #133076
* `compression.zstd` and Python tests #133365
* `_zstd` on Windows #133366
* `compression.zstd` #133547
* `_zstd` #133670
* `_zstd` (GH-133674) #133695
* `_zstd` (GH-133670) #133756
* `_zstd_exec()` #133775
* `_zstd_exec()` (GH-133775) #133786
* `_zstdmodule.h` #133793
* `_zstdmodule.h` (GH-133793) #133854