May 29, 2014

DataStax Python Driver 2.0 Released

We are excited to announce the public release of version 2.0 of the DataStax Python driver for Apache Cassandra and DataStax Enterprise. The driver has several new features in this version:

  • Full Apache Cassandra 2.0 and DataStax Enterprise 4.0 support
  • Enhanced stability and many minor improvements
  • Python 3 support

The Upgrade Guide contains details about new features and other changes. I'll give some examples of how to use the new features here.

Automatic pagination of results

If a query yields a very large number of results, only an initial number of rows will be fetched (according to the page size). The remaining rows are fetched on demand as you iterate through the result set.

from cassandra.cluster import Cluster

cluster = Cluster()
session = cluster.connect("mykeyspace")

# set a page size of 1000 rows
session.default_fetch_size = 1000
user_rows = session.execute("SELECT * FROM users")
for user_row in user_rows:
    # when the initial page is exhausted, the next page will
    # be transparently fetched
    process_user(user_row.id, user_row.name, user_row.email)

For more details, see the documentation for query paging.
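
To make the on-demand behavior concrete, here is a minimal pure-Python model of page-at-a-time iteration. It is only an illustration, not the driver's implementation: `iter_paged` and `fetch_page` are hypothetical names, an in-memory list stands in for the server, and offsets are used for simplicity where the real protocol uses an opaque paging state.

```python
def iter_paged(fetch_page, fetch_size):
    """Yield rows one at a time, requesting a new page only when needed.

    fetch_page(offset, limit) is a hypothetical callable standing in for
    a round trip to the server; it returns a list of at most `limit` rows.
    """
    offset = 0
    while True:
        page = fetch_page(offset, fetch_size)
        for row in page:
            yield row
        if len(page) < fetch_size:
            return  # a short page means there are no more rows
        offset += len(page)


# In-memory stand-in for a table, with a log of the page requests made.
rows = list(range(2500))
requests = []


def fetch_page(offset, limit):
    requests.append(offset)
    return rows[offset:offset + limit]


fetched = list(iter_paged(fetch_page, 1000))
```

With a page size of 1000 and 2500 rows, the consumer sees every row, but only three page requests are made, and each one happens only when iteration reaches the end of the previous page.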

Lightweight Transactions

Using Cassandra 2.0's lightweight transactions is simple:

from cassandra import ConsistencyLevel

create_user_statement = session.prepare(
    "INSERT INTO users (username, email) VALUES (?, ?) IF NOT EXISTS")
create_user_statement.serial_consistency_level = ConsistencyLevel.SERIAL

session.execute(create_user_statement, [new_username, new_email])
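
A conditional insert may or may not take effect, so callers usually want to know the outcome. Cassandra reports it in a boolean `[applied]` column, returned as the first column of the first result row. The following is a minimal sketch, assuming the returned rows support positional indexing (as named-tuple rows do); `insert_if_absent` is a hypothetical helper, not part of the driver.

```python
def insert_if_absent(session, statement, params):
    """Run a conditional (IF NOT EXISTS) statement and report whether it applied.

    Assumes the first column of the first returned row is Cassandra's
    boolean [applied] column.
    """
    rows = list(session.execute(statement, params))
    return bool(rows[0][0])
```

Calling `insert_if_absent(session, create_user_statement, [new_username, new_email])` would then return False when the username was already taken, which lets the application report the conflict instead of silently continuing.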

Batching Statements

Although it has always been possible to execute statements in a BATCH, there was no way to do this with multiple prepared statements. With version 2.0 of the driver, you can execute multiple prepared (or unprepared) statements atomically across multiple tables within a single batch.

from datetime import datetime

from cassandra.query import BatchStatement

# prepare the statements involved in a profile update
profile_statement = session.prepare(
    "UPDATE user_profiles SET email=? WHERE key=?")
user_track_statement = session.prepare(
    "INSERT INTO user_track (key, text, date) VALUES (?, ?, ?)")

# add the prepared statements to a batch
batch = BatchStatement()
batch.add(profile_statement, [emailAddress, "hendrix"])
batch.add(user_track_statement,
          ["hendrix", "email changed", datetime.utcnow()])

# execute the batch
session.execute(batch)

Note that while version 2.0 supports the new Cassandra 2.0 features, it also works with Apache Cassandra 1.2 and 2.0 and with DataStax Enterprise 4.0, 3.2, and 3.1. When working with Cassandra 1.2 or DSE 3.x, you should explicitly set the protocol version to 1:

cluster = Cluster(["127.0.0.1"], protocol_version=1)

When using protocol version 1, lightweight transactions, automatic paging, and protocol-level batches are not available.
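
If the same code has to run against both older and newer clusters, the version rule can be captured in a small helper. This is only a sketch: `protocol_version_for` is a hypothetical function, not a driver API, and it assumes you already know the server's Cassandra release as a string such as "1.2.15". It deliberately covers only the two release lines discussed in this post.

```python
# Cassandra release line -> native protocol version, as described in this post.
PROTOCOL_BY_RELEASE = {
    (1, 2): 1,
    (2, 0): 2,
}


def protocol_version_for(release_version):
    """Return the protocol version to request for a Cassandra release string."""
    major, minor = (int(part) for part in release_version.split(".")[:2])
    try:
        return PROTOCOL_BY_RELEASE[(major, minor)]
    except KeyError:
        raise ValueError("unsupported Cassandra release: %s" % release_version)
```

The result can be passed straight to the constructor, for example `Cluster(["127.0.0.1"], protocol_version=protocol_version_for("1.2.15"))`. Raising on unknown releases is a design choice here: it avoids silently guessing a protocol version for a server this post does not cover.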

Version 2.0 of the driver is available on PyPI, and of course, you can always find the source on GitHub. Give it a test run and let us know what you think!
