
feat: add Store and StorageFormat with options support for result storage #57

Closed
james-willis wants to merge 1 commit into wherobots:main from james-willis:feat/store-options


@james-willis
Contributor

Summary

  • Add Store dataclass and StorageFormat enum in a new result_store.py module for configuring how query results are written to cloud storage (S3)
  • Support format-specific Spark DataFrameWriter options (e.g. {"ignoreNullFields": "false"} for GeoJSON, {"header": "false"} for CSV) that override server defaults
  • Thread store parameter through connect() → Connection → Cursor.execute() so it can be set at connection level or per-query
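
To make the "override server defaults" behavior concrete, here is a minimal sketch of how per-format writer options might be layered on top of defaults. The defaults dictionary and the `effective_options` helper are hypothetical; the real defaults live on the server side (sql-session#190) and are not listed in this PR.

```python
from typing import Dict, Optional

# Hypothetical server-side defaults for the CSV writer (assumption, not from the PR).
ASSUMED_CSV_DEFAULTS: Dict[str, str] = {"header": "true", "delimiter": ","}


def effective_options(
    defaults: Dict[str, str], overrides: Optional[Dict[str, str]]
) -> Dict[str, str]:
    """Return defaults with caller-supplied options layered on top."""
    merged = dict(defaults)          # copy so the defaults are never mutated
    merged.update(overrides or {})   # caller-supplied keys win on conflict
    return merged


if __name__ == "__main__":
    print(effective_options(ASSUMED_CSV_DEFAULTS, {"header": "false"}))
```

Keys the caller does not mention keep their default value, which matches the description of options as overrides rather than a full replacement.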

Details

This brings the Python DB-API driver to feature parity with the JDBC driver (wherobots-jdbc-driver#19) and the server-side support (sql-session#190).

New module: wherobots/db/result_store.py

  • StorageFormat enum: PARQUET, CSV, GEOJSON
  • Store frozen dataclass with fields: format, single, generate_presigned_url, options
  • Store.for_download() factory for the common single-file + presigned URL pattern
  • Store.to_dict() for WebSocket protocol serialization
  • Validation: generate_presigned_url requires single=True
  • Defensive copying: options dict is copied on construction and on to_dict()
  • Empty options normalized to None (omitted from serialized dict)
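
The behavior listed above can be sketched roughly as follows. This is an illustrative reconstruction from the bullet points, not the code in the PR: the enum string values, the PARQUET default, and the serialized key names are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class StorageFormat(Enum):
    # String values are assumed; the PR only names the members.
    PARQUET = "parquet"
    CSV = "csv"
    GEOJSON = "geojson"


@dataclass(frozen=True)
class Store:
    format: StorageFormat = StorageFormat.PARQUET  # default is an assumption
    single: bool = False
    generate_presigned_url: bool = False
    options: Optional[Dict[str, str]] = None

    def __post_init__(self) -> None:
        if self.generate_presigned_url and not self.single:
            raise ValueError("generate_presigned_url requires single=True")
        # Defensive copy; an empty dict is normalized to None.
        copied = dict(self.options) if self.options else None
        # The dataclass is frozen, so bypass its __setattr__ to store the copy.
        object.__setattr__(self, "options", copied)

    @classmethod
    def for_download(
        cls,
        format: StorageFormat = StorageFormat.PARQUET,
        options: Optional[Dict[str, str]] = None,
    ) -> "Store":
        # Common pattern: one output file plus a presigned URL to fetch it.
        return cls(format=format, single=True,
                   generate_presigned_url=True, options=options)

    def to_dict(self) -> Dict[str, Any]:
        # Key names are assumed; the real wire format is defined by the server.
        payload: Dict[str, Any] = {
            "format": self.format.value,
            "single": self.single,
            "generate_presigned_url": self.generate_presigned_url,
        }
        if self.options:
            payload["options"] = dict(self.options)  # copy again on the way out
        return payload
```

Copying on both construction and serialization means neither the caller's original dict nor the dict returned by to_dict() can be used to mutate a Store after it is built.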

Modified files

  • connection.py: Accepts store parameter, passes to Cursor, includes store.to_dict() in execute_sql WebSocket request
  • cursor.py: Accepts default_store from connection; execute() accepts per-query store override
  • driver.py: connect() and connect_direct() accept and forward store parameter
  • __init__.py: Exports Store and StorageFormat
  • README.md: Documents store usage, store options, and connection-level defaults
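
As a rough illustration of the default-versus-override logic described for cursor.py and connection.py, the sketch below uses a hypothetical `SketchCursor` class and plain dicts in place of `store.to_dict()`. The request field names (`kind`, `statement`, `store`) are assumptions, not the confirmed WebSocket protocol.

```python
from typing import Any, Dict, Optional


class SketchCursor:
    """Hypothetical stand-in for the driver's Cursor; not the real class."""

    def __init__(self, default_store: Optional[Dict[str, Any]] = None) -> None:
        # Connection-level default, handed down when the cursor is created.
        self._default_store = default_store

    def build_request(
        self, sql: str, store: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        # Field names below are assumed for illustration.
        request: Dict[str, Any] = {"kind": "execute_sql", "statement": sql}
        # A per-query store takes precedence over the connection default.
        effective = store if store is not None else self._default_store
        if effective is not None:
            request["store"] = effective
        return request
```

With this shape, a query issued without any store configured produces a request with no store field at all, so existing callers are unaffected.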

Tests: tests/test_result_store.py (21 tests)

  • StorageFormat enum values and default
  • Store construction, options handling (empty/None normalization, defensive copy, immutability)
  • Validation (presigned URL requires single)
  • for_download() factory method
  • to_dict() serialization with/without options
  • JSON round-trip and full execute_sql request shape

Usage

from wherobots.db import connect, Store, StorageFormat
from wherobots.db.region import Region
from wherobots.db.runtime import Runtime

with connect(api_key="...", runtime=Runtime.TINY, region=Region.AWS_US_WEST_2) as conn:
    curr = conn.cursor()

    # Store as GeoJSON with custom options
    store = Store.for_download(
        format=StorageFormat.GEOJSON,
        options={"ignoreNullFields": "false"},
    )
    curr.execute("SELECT * FROM my_table", store=store)
    results = curr.fetchall()

Dependencies

  • sql-session#190 must be deployed first (server must understand the options field)
  • Compatible with wherobots-jdbc-driver#19 (same protocol)

@james-willis james-willis requested a review from a team as a code owner March 2, 2026 21:34
@james-willis james-willis requested review from peterfoldes and removed request for a team March 2, 2026 21:34
@james-willis
Contributor Author

Closing this PR: main already has the full Store/StorageFormat/StoreResult infrastructure from PRs #55 and #56. Will open a new, smaller PR that only adds the options field support on top of the current main.
