Member since: 01-09-2017
Posts: 49
Kudos Received: 7
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
| 1116 | 01-19-2017 06:37 AM
| 3370 | 12-23-2016 05:54 AM
03-22-2017 04:30 AM
Hi @pbarna, thanks for the response. date1 is a dynamic partition column, hence I have not declared it in the table definition; I am trying to use the t_date column in the DF as the date1 value. Correct me if I am doing this wrong. Thanks, Rishit Shah
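A minimal sketch of that idea, assuming insertInto resolves the partition column by name (the thread suggests this alone may not have been sufficient, so treat it as an assumption rather than a confirmed fix):

// Hypothetical sketch: rename the DF column to match the Hive
// partition column, then let the table's own partitioning apply.
val withPart = tranl1.withColumnRenamed("t_date", "date1")
withPart.write.mode("overwrite").insertInto("tran_spark_part")

Note that partitionBy is dropped here: when inserting into an already-partitioned Hive table, the table definition supplies the partitioning.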
03-21-2017 12:13 PM
Yes, I tried with the same column name as well, but I get the same error. Thanks, Rishit Shah
03-21-2017 05:10 AM
I need help with an error I received. I am trying to insert into a Hive table from Spark using the following syntax:

tranl1.write.mode("overwrite").partitionBy("t_date").insertInto("tran_spark_part")

Note: tranl1 is the DF I created for loading data from Oracle:

val tranl1 = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:oracle:thin:userid/pwd@localhost:portid",
  "dbtable" -> "(select a.*, to_char(to_date(trunc(txn_date,'DD'),'dd-MM-yy')) as t_date from table a WHERE TXN_DATE >= TRUNC(sysdate,'DD'))"))

My table in Hive:

create table tran_spark_part(
  id string, amount string, creditaccount string, creditbankname string,
  creditvpa string, customerid string, debitaccount string, debitbankname string,
  debitvpa string, irc string, refid string, remarks string, rrn string,
  status string, txn_date string, txnid string, type string, expiry_date string,
  approvalnum string, msgid string, seqno string, upirc string, reversal string,
  trantype string)
partitioned by (date1 string);

However, when I run the insert above, it gives the error:

java.util.NoSuchElementException: key not found: date1

Please help me see what I am missing or doing wrong. Thanks. Rishit Shah.
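For context, a hedged sketch of the switches usually required before a dynamic-partition insert from Spark; these are standard Hive properties, but whether they resolve this particular error is an assumption on my part:

// Standard Hive dynamic-partitioning settings, applied to the SQLContext.
sqlContext.setConf("hive.exec.dynamic.partition", "true")
sqlContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
// The DF must also expose a column named exactly like the table's
// partition column (date1); see the rename sketch further up this page.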
Labels:
- Apache Spark
01-10-2017 10:54 AM
1 Kudo
Hi, I am trying to import XML data into Hive. Below is an example:

<ns2:reqValAdd xmlns:ns2="http://www.ss.ss"></ns2:reqValAdd>

CREATE TABLE xml_test4 (ns2 STRING)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.ns2"="/ns2:ReqValAdd/@ns2"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
TBLPROPERTIES (
"xmlinput.start"="<ns2:reqValAdd xmlns:ns2",
"xmlinput.end"="</ns2:reqValAdd>"
);

The output comes back as NULL while I am expecting "http://www.ss.ss". Can you please suggest what is wrong and how to rectify it? Thanks, Rishit Shah
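Two mismatches stand out to me (a guess, not a verified fix): the XPath says ReqValAdd while the element is reqValAdd, and @ns2 is not an attribute of the element; the namespace declaration is the attribute xmlns:ns2. A minimal sketch of a corrected DDL follows, issued through sqlContext.sql only to keep the examples on this page in one language; whether the SerDe exposes namespace declarations as ordinary attributes is itself an assumption, and xml_test5 is a made-up table name.

// Hypothetical sketch: case-correct element name in the XPath and a
// plain start tag for the input format.
sqlContext.sql("""
  CREATE TABLE xml_test5 (ns2 STRING)
  ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
  WITH SERDEPROPERTIES (
    "column.xpath.ns2" = "/ns2:reqValAdd/@xmlns:ns2"
  )
  STORED AS
    INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
  TBLPROPERTIES (
    "xmlinput.start" = "<ns2:reqValAdd",
    "xmlinput.end" = "</ns2:reqValAdd>"
  )
""")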
Labels:
- Apache Hive
12-23-2016 05:54 AM
Hi @Artem Ervits, they are already installed. The flume-env.sh setup was not right; I added the lib paths and that solved it. But now I get an error that the memory channel capacity is full. I have tried increasing the channel capacity and the transaction capacity as well, but I get the same error. Any advice? Thanks. Rishit Shah
12-20-2016 10:02 AM
Hi, it seems that all the jars need to be present in the flume/lib directory. It works for me this way, but this does not seem to be the ideal approach. Any suggestions? Thanks, Rishit Shah
12-17-2016 07:18 AM
Adding the conf file for readability: flume-hive-hortonworks.txt. Please reply, as I am stuck.
12-17-2016 07:15 AM
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.streaming.RecordWriter

My flume-conf file is as below:

agent1.channels = ch1
agent1.sinks = HIVE
agent1.sources = sql-source
#
agent1.sources.sql-source.type = org.keedio.flume.source.SQLSource
agent1.sources.sql-source.channels = ch1
#
#
# URL to connect to database (currently only mysql is supported)
#agent1.sources.sql-source.connection.url = jdbc:oracle:thin:@<ip>:<SID>
agent1.sources.sql-source.hibernate.connection.url = jdbc:oracle:thin:@<IP>:<SID>
#
# Database connection properties
agent1.sources.sql-source.hibernate.connection.user = mis_user
agent1.sources.sql-source.hibernate.connection.password = abcd_1234
agent1.sources.sql-source.table = ods_user.bb_rm_data
#agent1.sources.sql-source.database = ods_user
#
agent1.sources.sql-source.columns.to.select = *
agent1.sources.sql-source.max.rows = 100
# Increment column properties
# agent1.sources.sql-source.incremental.column.name = id
# Increment value is where you want to start taking data from the table (0 will import the entire table)
# agent1.sources.sql-source.incremental.value = 0
#agent1.sources.sql-source.hibernate.dialect = org.keedio.flume.source.SQLServerCustomDialect
# Query delay: the query will be sent every configured number of milliseconds
agent1.sources.sql-source.run.query.delay=5000000
#
# Status file is used to save the last row read
agent1.sources.sql-source.status.file.path = /home/dev_Rht
agent1.sources.sql-source.status.file.name = sql-source.status
#
#channel properties
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 2000000
#
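# NOTE (per the 12-20 and 12-23 replies above): the Hive sink below needs the
# hive-hcatalog-streaming jar and its Hive dependencies on Flume's classpath;
# the ClassNotFoundException above points at a missing jar (resolved by putting
# the jars in flume/lib or referencing them from flume-env.sh), not at these
# sink settings.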
agent1.sinks.HIVE.type = hive
agent1.sinks.HIVE.channel = ch1
agent1.sinks.HIVE.hive.metastore = thrift://localhost:9083
agent1.sinks.HIVE.hive.database = default
agent1.sinks.HIVE.hive.table = fi_bb6
agent1.sinks.HIVE.useLocalTimeStamp = false
agent1.sinks.HIVE.round = true
agent1.sinks.HIVE.roundValue = 10
agent1.sinks.HIVE.roundUnit = minute
agent1.sinks.HIVE.serializer = DELIMITED
agent1.sinks.HIVE.serializer.delimiter = ","
agent1.sinks.HIVE.serializer.serdeSeparator = ','
agent1.sinks.HIVE.serializer.fieldnames = col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25,col26,col27,col28,col29,col30
Labels:
- Apache Flume