Create a Stream Reactor sink connector from Apache Kafka® to Apache Cassandra®
The Apache Cassandra® Stream Reactor sink connector enables you to move data from an Aiven for Apache Kafka® cluster to an Apache Cassandra® database. It uses KCQL transformations to filter and map topic data before sending it to Cassandra.
See the full set of available parameters and configuration options in the connector's documentation.
Version compatibility
Stream Reactor version 9.0.2 includes class and package name changes introduced in version 6.0.0 by Lenses. These changes standardize connector class names and converter package names.
Version 9.x is not compatible with version 4.2.0. To continue using version 4.2.0, set the connector version before upgrading.
If you are upgrading from version 4.2.0, you must recreate the connector using the updated class name. For example:

```json
"connector.class": "io.lenses.streamreactor.connect.<connector_path>.<ConnectorClassName>"
```
For details about the changes, see the Stream Reactor release notes.
Prerequisites
- An Aiven for Apache Kafka service with Apache Kafka Connect enabled, or a dedicated Aiven for Apache Kafka Connect cluster.

Gather the following information for the target Cassandra database:

- `CASSANDRA_HOSTNAME`: The Cassandra hostname.
- `CASSANDRA_PORT`: The Cassandra port.
- `CASSANDRA_USERNAME`: The Cassandra username.
- `CASSANDRA_PASSWORD`: The Cassandra password.
- `CASSANDRA_SSL`: Set to `true`, `false`, or `default`, depending on your SSL setup.
- `CASSANDRA_KEYSTORE`: The path to the keystore containing the CA certificate, used for SSL connections.
- `CASSANDRA_KEYSTORE_PASSWORD`: The password for the keystore.

**Note:** If you are using Aiven for Apache Cassandra, use the following values:

- `CASSANDRA_TRUSTSTORE`: `/run/aiven/keys/public.truststore.jks`
- `CASSANDRA_TRUSTSTORE_PASSWORD`: `password`
- `CASSANDRA_KEYSPACE`: The Cassandra keyspace to sink the data into.
- `TOPIC_LIST`: A comma-separated list of Kafka topics to sink.
- `KCQL_TRANSFORMATION`: A KCQL statement to map topic fields to table columns. Use the following format:

  ```sql
  INSERT INTO CASSANDRA_TABLE
  SELECT LIST_OF_FIELDS
  FROM APACHE_KAFKA_TOPIC
  ```

  **Warning:** Create the Cassandra keyspace and destination table (`CASSANDRA_TABLE`) before starting the connector. The connector fails to start if they do not exist.

- `APACHE_KAFKA_HOST`: The Apache Kafka host. Required only when using Avro as the data format.
- `SCHEMA_REGISTRY_PORT`: The schema registry port. Required only when using Avro.
- `SCHEMA_REGISTRY_USER`: The schema registry username. Required only when using Avro.
- `SCHEMA_REGISTRY_PASSWORD`: The schema registry password. Required only when using Avro.

**Note:** If you are using Aiven for Apache Cassandra and Aiven for Apache Kafka, get all required connection details, including schema registry information, from the **Connection information** section on the service's **Overview** page.

As of version 3.0, Aiven for Apache Kafka uses Karapace as the schema registry and no longer supports the Confluent Schema Registry.
For a complete list of supported parameters and configuration options, see the connector's documentation.
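The keyspace and destination table must exist before the connector starts. As a minimal sketch, the `students_keyspace` keyspace and `students_tbl` table used in the example later on this page could be created with CQL such as the following (the replication settings and column types are illustrative assumptions; choose values appropriate for your cluster):

```sql
-- Dev-only replication settings; use NetworkTopologyStrategy with a
-- per-datacenter replication factor in production.
CREATE KEYSPACE IF NOT EXISTS students_keyspace
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Columns matching a KCQL statement such as:
-- INSERT INTO students_tbl SELECT id, name, age FROM students
CREATE TABLE IF NOT EXISTS students_keyspace.students_tbl (
  id int PRIMARY KEY,
  name text,
  age int
);
```

Run the statements with `cqlsh` (or any CQL client) against the target cluster before creating the connector.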
Create a connector configuration file

Create a file named `cassandra_sink.json` and add the following configuration:
```json
{
    "name": "CONNECTOR_NAME",
    "connector.class": "io.lenses.streamreactor.connect.cassandra.sink.CassandraSinkConnector",
    "topics": "TOPIC_LIST",
    "connect.cassandra.host": "CASSANDRA_HOSTNAME",
    "connect.cassandra.port": "CASSANDRA_PORT",
    "connect.cassandra.username": "CASSANDRA_USERNAME",
    "connect.cassandra.password": "CASSANDRA_PASSWORD",
    "connect.cassandra.ssl.enabled": "CASSANDRA_SSL",
    "connect.cassandra.trust.store.path": "CASSANDRA_TRUSTSTORE",
    "connect.cassandra.trust.store.password": "CASSANDRA_TRUSTSTORE_PASSWORD",
    "connect.cassandra.key.space": "CASSANDRA_KEYSPACE",
    "connect.cassandra.kcql": "KCQL_TRANSFORMATION",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "https://APACHE_KAFKA_HOST:SCHEMA_REGISTRY_PORT",
    "key.converter.basic.auth.credentials.source": "USER_INFO",
    "key.converter.schema.registry.basic.auth.user.info": "SCHEMA_REGISTRY_USER:SCHEMA_REGISTRY_PASSWORD",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "https://APACHE_KAFKA_HOST:SCHEMA_REGISTRY_PORT",
    "value.converter.basic.auth.credentials.source": "USER_INFO",
    "value.converter.schema.registry.basic.auth.user.info": "SCHEMA_REGISTRY_USER:SCHEMA_REGISTRY_PASSWORD"
}
```
Parameters:

- `name`: The connector name. Replace `CONNECTOR_NAME` with your desired name.
- `connect.cassandra.*`: Cassandra connection parameters collected in the prerequisite step.
- `key.converter` and `value.converter`: Define the message data format in the Kafka topic. This example uses `io.confluent.connect.avro.AvroConverter` to translate messages in Avro format. The schema is retrieved from Aiven's Karapace schema registry using the `schema.registry.url` and related credentials.

The `key.converter` and `value.converter` fields define how Kafka messages are parsed and must be included in the configuration.

When using Avro as the source format, set the following:

- `value.converter.schema.registry.url`: The Aiven for Apache Kafka schema registry URL, in the format `https://APACHE_KAFKA_HOST:SCHEMA_REGISTRY_PORT`.
- `value.converter.basic.auth.credentials.source`: Set to `USER_INFO`, which means authentication uses a username and password.
- `value.converter.schema.registry.basic.auth.user.info`: The schema registry credentials, in the format `SCHEMA_REGISTRY_USER:SCHEMA_REGISTRY_PASSWORD`.

You can retrieve these values from the prerequisite step.
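Before submitting the file, you can sanity-check it for the keys this guide always sets. A minimal sketch in Python (the required-key list below is an assumption derived from the configuration above, not an official validation schema for the connector):

```python
import json

# Keys this guide's configuration always includes. Illustrative assumption,
# not an official schema for the connector.
REQUIRED_KEYS = {
    "name",
    "connector.class",
    "topics",
    "connect.cassandra.host",
    "connect.cassandra.port",
    "connect.cassandra.key.space",
    "connect.cassandra.kcql",
}

def missing_keys(config: dict) -> set:
    """Return the required keys absent from a connector configuration."""
    return REQUIRED_KEYS - config.keys()

# Example: a configuration that forgot the KCQL mapping.
config = json.loads("""
{
  "name": "my-cassandra-sink",
  "connector.class": "io.lenses.streamreactor.connect.cassandra.sink.CassandraSinkConnector",
  "topics": "students",
  "connect.cassandra.host": "cassandra.example.com",
  "connect.cassandra.port": "9042",
  "connect.cassandra.key.space": "students_keyspace"
}
""")

print(missing_keys(config))  # {'connect.cassandra.kcql'}
```

A check like this catches typos in key names before the connector fails at creation time with a less obvious error.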
Create the connector
Console:

- Access the Aiven Console.
- Select your Aiven for Apache Kafka or Aiven for Apache Kafka Connect service.
- Click Connectors.
- Click Create connector if Apache Kafka Connect is enabled on the service. If not, click Enable connector on this service.

  Alternatively, to enable connectors:
  - Click Service settings in the sidebar.
  - In the Service management section, click Actions > Enable Kafka connect.

- In the sink connectors list, select the Stream Reactor Cassandra Sink connector, and click Get started.
- On the Stream Reactor Cassandra Sink page, go to the Common tab.
- Locate the Connector configuration text box and click Edit.
- Paste the configuration from your `cassandra_sink.json` file into the text box.
- Click Create connector.
- Verify the connector status on the Connectors page.
- Confirm that data appears in the Cassandra target table.
CLI:

To create the connector using the Aiven CLI, run:

```bash
avn service connector create SERVICE_NAME @cassandra_sink.json
```

Replace:

- `SERVICE_NAME`: Your Kafka or Kafka Connect service name.
- `@cassandra_sink.json`: Path to your configuration file.
Sink topic data to Cassandra
The following example shows how to sink data from a Kafka topic to a Cassandra table. If your Kafka topic `students` contains the following data:

```json
{"id":1, "name":"carlo", "age": 77}
{"id":2, "name":"lucy", "age": 55}
{"id":3, "name":"carlo", "age": 33}
{"id":2, "name":"lucy", "age": 21}
```

To write this data to the `students_tbl` table in the `students_keyspace` keyspace, use the following connector configuration:
```json
{
    "name": "my-cassandra-sink",
    "connector.class": "io.lenses.streamreactor.connect.cassandra.sink.CassandraSinkConnector",
    "topics": "students",
    "connect.cassandra.host": "CASSANDRA_HOSTNAME",
    "connect.cassandra.port": "CASSANDRA_PORT",
    "connect.cassandra.username": "CASSANDRA_USERNAME",
    "connect.cassandra.password": "CASSANDRA_PASSWORD",
    "connect.cassandra.ssl.enabled": "CASSANDRA_SSL",
    "connect.cassandra.trust.store.path": "CASSANDRA_TRUSTSTORE",
    "connect.cassandra.trust.store.password": "CASSANDRA_TRUSTSTORE_PASSWORD",
    "connect.cassandra.key.space": "students_keyspace",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "connect.cassandra.kcql": "INSERT INTO students_tbl SELECT id, name, age FROM students"
}
```
Replace all placeholder values (such as `CASSANDRA_HOSTNAME`, `CASSANDRA_PORT`, and `CASSANDRA_USERNAME`) with your actual Cassandra connection details.

This configuration does the following:

- `"topics": "students"`: Specifies the Kafka topic to sink.
- Connection settings (`connect.cassandra.*`): Provide the Cassandra host, port, credentials, SSL settings, and truststore paths.
- `"value.converter"` and `"value.converter.schemas.enable"`: Set the message format. The topic uses raw JSON without a schema.
- `"connect.cassandra.kcql"`: Defines the insert logic. Each Kafka message is written as a new row in the `students_tbl` Cassandra table.
After creating the connector, check the Cassandra database to verify that the data has been written.
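Note that Cassandra treats inserts as upserts on the primary key, so messages sharing a key overwrite earlier rows. A minimal sketch of the rows you should expect from the sample data above, assuming `id` is the primary key of `students_tbl` (an assumption; the table schema is not shown in this guide):

```python
import json

# Sample topic data from the example above, one JSON message per line.
messages = """
{"id":1, "name":"carlo", "age": 77}
{"id":2, "name":"lucy", "age": 55}
{"id":3, "name":"carlo", "age": 33}
{"id":2, "name":"lucy", "age": 21}
""".strip().splitlines()

# Mimic the KCQL mapping (SELECT id, name, age FROM students) followed by
# Cassandra's upsert-by-primary-key behavior: later writes for the same id win.
table = {}
for line in messages:
    record = json.loads(line)
    table[record["id"]] = (record["name"], record["age"])

for student_id, (name, age) in sorted(table.items()):
    print(student_id, name, age)
# 1 carlo 77
# 2 lucy 21
# 3 carlo 33
```

So although four messages were produced, the table ends up with three rows; the second record for `id` 2 replaces the first.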