DataStax Python Driver 2.0 Released
We are excited to announce the public release of version 2.0 of the DataStax Python driver for Apache Cassandra and DataStax Enterprise. The driver has several new features in this version:
- Full Apache Cassandra 2.0 and DataStax Enterprise 4.0 support
- Automatic paging of large result sets
- Protocol-level statement batching
- Lightweight transactions
- SASL-based authentication
- Enhanced stability and many minor improvements
- Python 3 support
The Upgrade Guide contains details about new features and other changes. I'll give some examples of how to use the new features here.
Automatic Pagination of Results
If a query yields a very large number of results, only an initial number of rows will be fetched (according to the page size). The remaining rows are fetched on demand as you iterate through the result set.
    from cassandra.cluster import Cluster

    cluster = Cluster()
    session = cluster.connect("mykeyspace")

    # set a page size of 1000 rows
    session.default_fetch_size = 1000

    user_rows = session.execute("SELECT * FROM users")
    for user_row in user_rows:
        # when the current page is exhausted, the next page will
        # be transparently fetched
        # (process_user stands in for your own row-handling function)
        process_user(user_row.id, user_row.name, user_row.email)
For more details, see the documentation for query paging.
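To make the on-demand behavior concrete, here is a minimal sketch of the idea in plain Python. It does not use the driver at all: `iterate_paged`, `fake_fetch_page`, and the integer paging token are hypothetical names invented for illustration, and the driver's real implementation will differ. The point is only that the next page is requested when the current one runs out, not before.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a list of rows plus a token for the next page (None when done).
Page = Tuple[List[dict], Optional[int]]


def iterate_paged(fetch_page: Callable[[Optional[int]], Page]) -> Iterator[dict]:
    """Yield rows one at a time, requesting the next page only when needed."""
    token = None
    while True:
        rows, token = fetch_page(token)
        for row in rows:
            yield row
        if token is None:
            return


# Hypothetical in-memory "server": 5 rows, served 2 per page.
ALL_ROWS = [{"id": i} for i in range(5)]
PAGE_SIZE = 2
fetch_calls = []  # records the start offset of every page request


def fake_fetch_page(token: Optional[int]) -> Page:
    start = token or 0
    fetch_calls.append(start)
    end = start + PAGE_SIZE
    next_token = end if end < len(ALL_ROWS) else None
    return ALL_ROWS[start:end], next_token


rows = iterate_paged(fake_fetch_page)
first = next(rows)  # only the first page has been requested at this point
```

Consuming the rest of `rows` triggers the remaining page requests one at a time, which is the same pattern you get when looping over a paged result set.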
Lightweight Transactions
Using Cassandra 2.0's lightweight transactions is simple:
    from cassandra import ConsistencyLevel

    create_user_statement = session.prepare(
        "INSERT INTO users (username, email) VALUES (?, ?) IF NOT EXISTS")
    create_user_statement.serial_consistency_level = ConsistencyLevel.SERIAL

    session.execute(create_user_statement, [new_username, new_email])
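A conditional write can be rejected, so you will usually want to check the outcome. Cassandra reports it in an `[applied]` column of the result. The sketch below assumes the driver exposes that column as an `applied` attribute on the first returned row; treat that as an assumption and check the driver documentation for your version. `create_user_if_absent`, `FakeSession`, and `AppliedRow` are hypothetical names, and the fake session exists only so the example runs without a cluster.

```python
from collections import namedtuple


def create_user_if_absent(session, statement, username, email):
    """Run a conditional INSERT and report whether it was applied.

    Assumption: the first result row exposes the server's `[applied]`
    column as an attribute named `applied`.
    """
    rows = session.execute(statement, [username, email])
    first_row = list(rows)[0]
    return bool(first_row.applied)


# Hypothetical stand-in for a real session, mimicking IF NOT EXISTS.
AppliedRow = namedtuple("AppliedRow", ["applied"])


class FakeSession:
    def __init__(self):
        self.users = {}

    def execute(self, statement, params):
        username, email = params
        if username in self.users:
            return [AppliedRow(applied=False)]
        self.users[username] = email
        return [AppliedRow(applied=True)]


session = FakeSession()
statement = object()  # placeholder for a prepared statement
created_first = create_user_if_absent(
    session, statement, "hendrix", "jimi@example.com")
created_again = create_user_if_absent(
    session, statement, "hendrix", "other@example.com")
```

With a real session you would pass in the prepared `IF NOT EXISTS` statement and act on the returned boolean, for example by telling the user that the username is already taken.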
Batching Statements
Although it has always been possible to execute statements in a BATCH, there was no way to do this with multiple prepared statements. With version 2.0 of the driver, you can execute multiple prepared (or unprepared) statements atomically across multiple tables within a single batch.
    from datetime import datetime

    from cassandra.query import BatchStatement

    # prepare the statements involved in a profile update
    profile_statement = session.prepare(
        "UPDATE user_profiles SET email=? WHERE key=?")
    user_track_statement = session.prepare(
        "INSERT INTO user_track (key, text, date) VALUES (?, ?, ?)")

    # add the prepared statements to a batch
    batch = BatchStatement()
    batch.add(profile_statement, [email_address, "hendrix"])
    batch.add(user_track_statement,
              ["hendrix", "email changed", datetime.utcnow()])

    # execute the batch
    session.execute(batch)
Note that while version 2.0 supports the new Cassandra 2.0 features, it also works with Apache Cassandra 1.2 and 2.0 and with DataStax Enterprise 4.0, 3.2, and 3.1. When working with Cassandra 1.2 or DSE 3.x, you should explicitly set the protocol version to 1:
    cluster = Cluster(["127.0.0.1"], protocol_version=1)
When using protocol version 1, lightweight transactions, automatic paging, and protocol-level batches are not available.
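If your code has to run against more than one kind of cluster, the rule above can be captured in a small lookup. This is only a sketch: `PROTOCOL_VERSIONS` and `protocol_version_for` are hypothetical names, not part of the driver, and the table covers just the releases mentioned in this post (native protocol version 2 being the one introduced with Cassandra 2.0).

```python
# Protocol version to request for each server release covered in this post.
PROTOCOL_VERSIONS = {
    ("cassandra", "1.2"): 1,
    ("cassandra", "2.0"): 2,
    ("dse", "3.1"): 1,
    ("dse", "3.2"): 1,
    ("dse", "4.0"): 2,
}


def protocol_version_for(product: str, version: str) -> int:
    """Map a product name and version string to a protocol version.

    Only the major.minor part of the version is considered, so
    "1.2.15" is treated as "1.2".
    """
    major_minor = ".".join(version.split(".")[:2])
    key = (product.lower(), major_minor)
    if key not in PROTOCOL_VERSIONS:
        raise ValueError("no known protocol version for %s %s" % (product, version))
    return PROTOCOL_VERSIONS[key]


# Example: the value could then be passed on to the driver, e.g.
#   Cluster(["127.0.0.1"], protocol_version=version_for_old_cluster)
version_for_old_cluster = protocol_version_for("cassandra", "1.2.15")
version_for_new_cluster = protocol_version_for("DSE", "4.0")
```

Raising an error for unknown releases is a deliberate choice here: guessing a protocol version silently would hide exactly the kind of mismatch this setting is meant to avoid.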
Version 2.0 of the driver is available on PyPI, and of course, you can always find the source on GitHub. Give it a test run and let us know what you think!