How can I configure Spark to connect to Cassandra in HDP 2.4?
I successfully installed Cassandra in HDP 2.4, and now I would like to use PySpark to transform my data and save it back to Cassandra, as shown here:
I was able to create the keyspace, create the tables, and insert the data.
Now I need the Spark context with the connection settings for Cassandra. Does anyone have an idea how I can do this in the Sandbox (that is, how to configure Spark so that it can connect to Cassandra)?
I'm sorry to refer you to StackOverflow, but it appears that this is how you do it:
JDBC connection string syntax:
Or directly via Java code:
String serverIP = "127.0.0.1";
String keyspace = "system";
Cluster cluster = Cluster.builder()
    .addContactPoints(serverIP)
    .build();
Session session = cluster.connect(keyspace);
I am new to JDBC connectors, so where do I put the .jar file? What configuration steps do I need to perform to get the JDBC connector running?
(Maven? In the Cassandra POM or in Spark?)
The instructions at https://docs.datastax.com/en/datastax_enterprise/4.6/datastax_enterprise/spark/sparkPySpark.html are for dse pyspark. You can either install that version or try https://github.com/TargetHolding/pyspark-cassandra or https://github.com/Parsely/pyspark-cassandra
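To make the "where does the jar go and what settings do I need" part more concrete, here is a minimal sketch that only builds the command line for launching the PySpark shell with Cassandra settings. It assumes the open-source DataStax Spark Cassandra Connector rather than the projects linked above, and the package coordinate is an assumption based on HDP 2.4 shipping Spark 1.6 with Scala 2.10, so check it against your installation. The helper names cassandra_conf and pyspark_command are made up for illustration. The connector is either fetched by Spark with --packages or passed as a local file with --jars, so nothing has to be added to a Cassandra POM.

```python
import shlex

# Assumption: connector build for Spark 1.6 / Scala 2.10 (what HDP 2.4 ships).
# Verify the coordinate against your Spark and Scala versions before using it.
CONNECTOR_PACKAGE = "com.datastax.spark:spark-cassandra-connector_2.10:1.6.0"


def cassandra_conf(host="127.0.0.1", port=9042):
    """Return the Spark properties the connector reads to locate Cassandra."""
    return {
        "spark.cassandra.connection.host": host,
        "spark.cassandra.connection.port": str(port),
    }


def pyspark_command(conf, package=CONNECTOR_PACKAGE, jar=None):
    """Build a 'pyspark ...' command line (hypothetical helper).

    If a local jar path is given it is passed with --jars; otherwise Spark
    is asked to download the connector with --packages.
    """
    args = ["pyspark"]
    if jar is not None:
        args += ["--jars", jar]
    else:
        args += ["--packages", package]
    # Sort the keys so the generated command is deterministic.
    for key, value in sorted(conf.items()):
        args += ["--conf", "{}={}".format(key, value)]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(pyspark_command(cassandra_conf()))
```

Running the printed command on the Sandbox should start a PySpark shell whose Spark context already carries the Cassandra host and port. From there, the connector's DataFrame source (format "org.apache.spark.sql.cassandra" with keyspace and table options) is the usual way to read and write tables, but I have not tested this on HDP 2.4 myself.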
@Lukas Ott Hello, sir.
I am new to Cassandra and HDP.
I want to install Cassandra with HDP 2.6.4 using this link.
Can you help me out, please?