-
Notifications
You must be signed in to change notification settings - Fork 451
Description
Feature Request / Improvement
Description
This issue proposes implementing a metadata-only replace API in PyIceberg, enabling orchestrators to submit a set of DataFiles to delete and a set of DataFiles to append in a single atomic transaction.
This functionality is critical for maintenance operations such as data compaction (the "small files" problem), ensuring the logical state of the table remains unaltered while physical data layout is optimized.
Background
In a current PR (#3124, part of #1092), PyIceberg's replace semantics are tightly coupled with PyArrow dataframes (def replace(self, df: pa.Table)). This approach introduces several architectural flaws:
- Coupling Physical Serialization with Metadata: It forces a
.parquetwrite serialization hook directly into the snapshot commit transaction, increasing the risk of schema degradation and blocking network topologies. - Missing
Operation.REPLACE: The current system uses primitives that log asAPPENDorOVERWRITE, muddying the table history and complicating snapshot expiry/maintenance. - Java Inconsistency: This severely drifts from Java Iceberg's native
org.apache.iceberg.RewriteFilesspecification, which strictly isolates the builder into accepting purelyDataFilepointers.
Proposed Solution
To fix this and achieve logical equivalence, we must implement an exact port of Java's RewriteFiles builder API into PyIceberg's native _SnapshotProducer engine.
-
Introduce
_RewriteFilesSnapshot Producer:
Add a new_RewriteFilesclass that specifically targets replacing existing files. This class will implement:_deleted_entries(): To find the existing target files and re-emit them asDELETEDentries, defensively keeping their ancestralsequence_numbers completely intact for time travel compatibility._existing_manifests(): To scavenge unchanged manifests natively, skipping deep rewrites and only mutating manifests impacted by the deleted files.
-
Builder Hook Implementation:
ImplementUpdateSnapshot().replace()which configures the transaction withOperation.REPLACE. -
Expose Shorthands on Table & Transaction:
AddreplaceAPIs on bothTableandTransactiontakingIterable[DataFile]arguments to elegantly wrap the snapshot mutation:def replace( self, files_to_delete: Iterable[DataFile], files_to_add: Iterable[DataFile], snapshot_properties: dict[str, str] = EMPTY_DICT, branch: str | None = MAIN_BRANCH, ) -> None: ...
Acceptance Criteria
-
replace()API implemented on bothTableandTransactionusingIterable[DataFile]. - PyArrow
.parquetwrite logic decoupled from the metadata transaction. -
_RewriteFilescorrectly copies ancestralsequence_numberpointers forDELETEDandEXISTINGmanifest entries. - Snapshots committed via the
replace()hook possess a Summary containingoperation=Operation.REPLACE. - Unit tests pass simulating data file swaps and summary verifications.
Related Java API
Inspired heavily by Java's builder interface: https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/RewriteFiles.java