Member since: 09-20-2017
Posts: 12
Kudos Received: 0
Solutions: 0
04-07-2018
10:52 AM
[root@sandbox-hdp ~]# yum list available | grep ^druid
druid.noarch                  0.10.1.2.6.4.0-91    HDP-2.6-repo-1
druid_2_6_4_0_91.noarch       0.10.1.2.6.4.0-91    HDP-2.6-repo-1
[root@sandbox-hdp yum.repos.d]# ls -la /etc/yum.repos.d
total 64
drwxr-xr-x 1 root root 4096 Feb  1 10:49 .
drwxr-xr-x 1 root root 4096 Apr  5 07:55 ..
-rw-r--r-- 1 root root  308 Apr  7 08:18 ambari-hdp-1.repo
-rw-r--r-- 1 root root  306 Jan  8 08:33 ambari.repo
-rw-r--r-- 1 root root 1991 Feb  1 09:51 CentOS-Base.repo
-rw-r--r-- 1 root root  647 Feb  1 09:51 CentOS-Debuginfo.repo
-rw-r--r-- 1 root root  289 Feb  1 09:51 CentOS-fasttrack.repo
-rw-r--r-- 1 root root  630 Feb  1 09:51 CentOS-Media.repo
-rw-r--r-- 1 root root  892 Jun 16  2016 CentOS-SCLo-scl-rh.repo
-rw-r--r-- 1 root root 7989 Feb  1 09:51 CentOS-Vault.repo
-rw-r--r-- 1 root root  957 Nov  5  2012 epel.repo
-rw-r--r-- 1 root root 1056 Nov  5  2012 epel-testing.repo
-rw-r--r-- 1 root root  142 Feb  1 10:34 hue.repo
-rw-r--r-- 1 root root 1033 Feb  1 09:52 mysql.repo
-rw-r--r-- 1 root root 1550 Sep  9  2016 puppetlabs.repo
04-06-2018
02:18 PM
I want to try Druid on the HDP 2.6.4 sandbox for VirtualBox, but none of the Druid components can be installed successfully: the install hangs after reaching 35% progress. Checking the error message shows the following; what's wrong?

2018-04-06 13:48:34,964 - The repository with version 2.6.4.0-91 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2018-04-06 13:48:35,328 - Command repositories: HDP-2.6-repo-1, HDP-2.6-GPL-repo-1, HDP-UTILS-1.1.0.22-repo-1
2018-04-06 13:48:35,329 - Applicable repositories: HDP-2.6-repo-1, HDP-2.6-GPL-repo-1, HDP-UTILS-1.1.0.22-repo-1
2018-04-06 13:48:35,336 - Looking for matching packages in the following repositories: HDP-2.6-repo-1, HDP-2.6-GPL-repo-1, HDP-UTILS-1.1.0.22-repo-1
2018-04-06 13:48:38,955 - Package['druid_2_6_4_0_91'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2018-04-06 13:48:39,133 - Installing package druid_2_6_4_0_91 ('/usr/bin/yum -d 0 -e 0 -y install druid_2_6_4_0_91')
2018-04-06 13:49:38,102 - Execution of '/usr/bin/yum -d 0 -e 0 -y install druid_2_6_4_0_91' returned 1. Error Downloading Packages:
druid_2_6_4_0_91-0.10.1.2.6.4.0-91.noarch: failure: druid/druid_2_6_4_0_91-0.10.1.2.6.4.0-91.noarch.rpm from HDP-2.6-repo-1: [Errno 256] No more mirrors to try.
2018-04-06 13:49:38,103 - Failed to install package druid_2_6_4_0_91. Executing '/usr/bin/yum clean metadata'
2018-04-06 13:49:38,411 - Retrying to install package druid_2_6_4_0_91 after 30 seconds
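Not a confirmed fix, just a hedged troubleshooting sketch: `[Errno 256] No more mirrors to try` usually means yum could not download the RPM from the repo's baseurl (network/proxy trouble or stale cached metadata). The repo file name `ambari-hdp-1.repo` is taken from the `ls` output above; the rest is a generic yum workflow, assumed rather than taken from this thread.

```shell
# Check which baseurl the Ambari-managed HDP repo points at
grep -h '^baseurl' /etc/yum.repos.d/ambari-hdp-1.repo

# Clear possibly stale metadata and retry just this package with verbose output,
# to see the actual download error instead of Ambari's retry loop
yum clean all
yum -v -y install druid_2_6_4_0_91
```

If the baseurl is unreachable from inside the sandbox VM (e.g. a proxy is required), fixing connectivity first should let the Ambari install step past 35%.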
Labels:
- Hortonworks Data Platform (HDP)
01-16-2018
04:12 AM
I guess Tu Nguyen wants to load an external Hive table into Spark, right? If so, consider the following code:

import org.apache.spark.sql.SparkSession

object SparkHiveJdbc extends App {
  val spark = SparkSession.builder.master("local[2]").appName("SparkHiveJob").getOrCreate
  val sc = spark.sparkContext
  val sqlContext = spark.sqlContext
  // Load the Hive JDBC driver before reading over jdbc:hive2
  val driverName = "org.apache.hive.jdbc.HiveDriver"
  Class.forName(driverName)
  val df = spark.read
    .format("jdbc")
    .option("url", "jdbc:hive2://localhost:10000/default")
    .option("dbtable", "clicks_json")
    .load()
  df.printSchema()
  println(df.count())
  df.show()
}

I ran the above code and got the error reported below:

root
 |-- clicks_json.ip: string (nullable = true)
|-- clicks_json.timestamp: long (nullable = true)
|-- clicks_json.url: string (nullable = true)
|-- clicks_json.referrer: string (nullable = true)
|-- clicks_json.useragent: string (nullable = true)
|-- clicks_json.sessionid: integer (nullable = true)
|-- clicks_json.tenantid: string (nullable = true)
|-- clicks_json.datestamp: string (nullable = true)
998
Caused by: java.lang.NumberFormatException: For input string: "clicks_json.timestamp"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Long.parseLong(Long.java:589)
        at java.lang.Long.parseLong(Long.java:631)
        at org.apache.hive.jdbc.HiveBaseResultSet.getLong(HiveBaseResultSet.java:368)
        ... 23 more

I think the reason for the error is that Spark loads the header (column-name) row as the first data row when converting the ResultSet into internal Row objects. Anything wrong here?
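A sketch of a commonly reported workaround, offered as an assumption rather than a confirmed diagnosis for this exact setup: Spark's default JDBC dialect quotes column names with double quotes, but HiveQL treats double-quoted text as string literals, so the SELECT that Spark generates can return the column names themselves as row values; calling `getLong` on such a value then throws exactly this NumberFormatException. Registering a custom `JdbcDialect` that quotes identifiers with backticks, before calling `spark.read`, avoids that:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Quote Hive identifiers with backticks instead of Spark's default double quotes,
// so the generated SELECT references columns rather than string literals.
object HiveJdbcDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:hive2")
  override def quoteIdentifier(colName: String): String = s"`$colName`"
}

// Register once, before spark.read.format("jdbc")...load()
JdbcDialects.registerDialect(HiveJdbcDialect)
```

With the dialect registered, the rest of the original code can stay unchanged; `canHandle` ensures it only applies to `jdbc:hive2` URLs and leaves other JDBC sources alone.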
11-11-2017
04:52 AM
I would appreciate it if you could provide the NiFi template for your tutorial, thanks!
10-13-2017
04:53 PM
Hi @Andrew Lim, No, I didn't use the Distributed Map Cache Client/Server. Yes, I use MySQL 5.6. I also wonder how to replicate MySQL data to other, non-MySQL DBs, like Hive or HBase; can we still use the PutDatabaseRecord processor? How should the 'Delete' action be handled? Thanks!
10-11-2017
09:05 AM
Why are the database name and table name NULL in the delete/update/create output of my capture processor, but normal in the begin/commit event types? Is anything wrongly configured?

{"type":"insert","timestamp":1507709276000,"binlog_filename":"mysql-bin.000658","binlog_position":93481655,"database":null,"table_name":null,"table_id":null,"columns":[{"id":1,"value":10},{"id":2,"value":"mrs"},{"id":3,"value":"erika"},{"id":4,"value":"king"},{"id":5,"value":"1171 depaul dr"},{"id":6,"value":"addison"},{"id":7,"value":"wisconsin"},{"id":8,"value":"50082"},{"id":9,"value":"F"},{"id":10,"value":"erika.king55@example.com"},{"id":11,"value":"goldenbutterfly498"},{"id":12,"value":"chill"},{"id":13,"value":"(635)-117-5424"},{"id":14,"value":"(662)-110-8448"},{"id":15,"value":"122-71-7145"},{"id":16,"value":null},{"id":17,"value":null},{"id":18,"value":"http://api.randomuser.me/portraits/women/52.jpg"},{"id":19,"value":"http://api.randomuser.me/portraits/med/women/52.jpg"},{"id":20,"value":"http://api.randomuser.me/portraits/thumb/women/52.jpg"},{"id":21,"value":"0.6"},{"id":22,"value":"US"}]}

{"type":"commit","timestamp":1507689471000,"binlog_filename":"mysql-bin.000657","binlog_position":21750290,"database":"mercury_dev"}
10-09-2017
02:04 PM
And how can one monitor the performance of Apache NiFi?
10-09-2017
02:03 PM
Can anyone provide a sample Apache NiFi template? Thanks!
09-20-2017
09:51 AM
I found examples like "Change Data Capture (CDC) with Apache NiFi", but they don't provide a generic way: the "JsonPathReader" controller service has to parse data table by table. I am seeking a generic way to leverage the schema registry to parse data. Btw, how do I use Hive streaming with PutHDFS?
09-20-2017
08:09 AM
I want to set up an ODS (operational data store) in Hive to sync data from our MySQL DB. I noticed that Apache NiFi can help set up a visual data pipeline. So how can I use Apache NiFi to set up a generic pipeline that streams real-time MySQL changes from the binlog to Apache Hive / HDFS, to be queried by Hive? Do I need to use Hive streaming? Thanks!
Labels:
- Apache Hive
- Apache NiFi