Member since: 06-21-2016
Posts: 6
Kudos Received: 1
Solutions: 0
11-08-2017
10:34 AM
I am also facing the same error message, but only when running Sqoop through Oozie. If I run the Sqoop command on its own, it works fine. I have tried adding hive-home as well, but I still get the same error. Is there any other way we can solve this problem?
12-14-2016
03:17 AM
In my Hive DB I have two tables: table1 is the source and table2 is the destination. We created table2 as an ORC table with bucketing and transactions enabled so that we can perform UPDATE and DELETE, while table1 is a managed table stored as text. While creating table1, I compute a hash_diff over all the columns and store it as a separate column in the table, to check for differences between the two tables. Hive does not support subqueries in UPDATE. When the hash_diff values differ, I need to update the entire row in the other table, but I am not able to do this. Can you guide me on how to achieve this?
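A minimal sketch of one way this could be done, assuming Hive 2.2 or later where MERGE is available for ACID tables; the column names here (id as the join key, col1/col2 as payload columns, and hash_diff kept in both tables) are placeholders for illustration, not the actual schema from this post:

-- hypothetical columns: id (join key), col1, col2, hash_diff (row hash stored in both tables)
MERGE INTO table2 AS t
USING table1 AS s
ON t.id = s.id
WHEN MATCHED AND t.hash_diff <> s.hash_diff THEN
  UPDATE SET col1 = s.col1, col2 = s.col2, hash_diff = s.hash_diff
WHEN NOT MATCHED THEN
  INSERT VALUES (s.id, s.col1, s.col2, s.hash_diff);

On versions without MERGE, the usual workaround is to join table1 and table2 in a SELECT and rewrite the changed rows with INSERT OVERWRITE (or stage the changed rows in a temporary table first), since UPDATE itself cannot reference a second table.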
09-19-2016
04:59 AM
Hi, I am facing an issue with an Oozie workflow. I created a table that links Hive to MongoDB using the syntax below:
CREATE TABLE IF NOT EXISTS test.emp_test ( id INT, name STRING )
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/test.emp_test');
add jar hdfs:///data/jars/mongo-hadoop-core-1.3.2.jar;
add jar hdfs:///data/jars/mongo-hadoop-hive-1.3.2.jar;
add jar hdfs:///data/jars/mongodb-driver-3.0.2.jar;
insert into test.emp_test values(1,"Raghava");
If I remove the INSERT statement, everything works fine; as soon as I add the INSERT statement, I get the error below:
ERROR org.apache.hadoop.hive.ql.exec.Utilities - Error caching map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for class: com.mongodb.hadoop.hive.input.HiveMongoInputFormat
Can someone help me understand what the reason could be?
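For reference, a minimal sketch of the same script with the ADD JAR statements moved ahead of every statement that touches the storage-handler table; the assumption (not confirmed in this post) is that the MongoStorageHandler classes need to be on the session classpath before Hive serializes the query plan, which is what the Kryo error complains about. The jar paths are the ones already used above:

-- assumption: register the connector jars first so the com.mongodb.hadoop.hive classes resolve
ADD JAR hdfs:///data/jars/mongo-hadoop-core-1.3.2.jar;
ADD JAR hdfs:///data/jars/mongo-hadoop-hive-1.3.2.jar;
ADD JAR hdfs:///data/jars/mongodb-driver-3.0.2.jar;
CREATE TABLE IF NOT EXISTS test.emp_test (id INT, name STRING)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/test.emp_test');
INSERT INTO test.emp_test VALUES (1, 'Raghava');

When the script runs through Oozie, another common approach is to list the same jars in hive.aux.jars.path in hive-site.xml (or attach them to the workflow action), since the Oozie launcher does not necessarily share the classpath of an interactive Hive session.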
06-21-2016
11:11 PM
Currently I am using the spooldir source to copy files from the local file system to HDFS, but I want to copy files from a remote Windows system. Can someone suggest which source option I can use to copy files from a remote Windows system to HDFS with Flume, where I can specify the username and password?
06-21-2016
01:59 PM
1 Kudo
Hi, I am using Flume to copy files from a spooling directory to HDFS, using a file channel.
#Component names
a1.sources = src
a1.channels = c1
a1.sinks = k1
#Source details
a1.sources.src.type = spooldir
a1.sources.src.channels = c1
a1.sources.src.spoolDir = /home/cloudera/onetrail
a1.sources.src.fileHeader = false
a1.sources.src.basenameHeader = true
# a1.sources.src.basenameHeaderKey = basename
a1.sources.src.fileSuffix = .COMPLETED
a1.sources.src.threads = 4
a1.sources.src.interceptors = newint
a1.sources.src.interceptors.newint.type = timestamp
#Sink details
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs:///data/contentProviders/cnet/%Y%m%d/
# a1.sinks.k1.hdfs.round = false
# a1.sinks.k1.hdfs.roundValue = 1
# a1.sinks.k1.hdfs.roundUnit = second
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.fileType = DataStream
#a1.sinks.k1.hdfs.file.Type = DataStream
a1.sinks.k1.hdfs.filePrefix = %{basename}
# a1.sinks.k1.hdfs.fileSuffix = .xml
a1.sinks.k1.threadsPoolSize = 4
# use a single file at a time
a1.sinks.k1.hdfs.maxOpenFiles = 1
# roll settings (count, interval and size all set to 0 disables rolling)
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
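# note: with rolling disabled, the file currently being written stays open with the
# default .tmp in-use suffix until the sink rolls or closes it; hdfs.idleTimeout
# (default 0, i.e. disabled) is one way to have idle files closed automatically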
a1.sinks.k1.hdfs.batchSize = 12
# Channel details
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /tmp/flume/checkpoint/
a1.channels.c1.dataDirs = /tmp/flume/data/
# Bind the source and sink to the channel
a1.sources.src.channels = c1
a1.sinks.k1.channels = c1
With the above configuration Flume is able to copy the files to HDFS, but the problem I am facing is that one file keeps staying as .tmp and its complete content is never written.
Can someone help me with what the problem could be?
06-21-2016
07:09 AM
Hi, I am using Flume to copy files from a spooling directory to HDFS, using a file channel.
#Component names
a1.sources = src
a1.channels = c1
a1.sinks = k1
#Source details
a1.sources.src.type = spooldir
a1.sources.src.channels = c1
a1.sources.src.spoolDir = /home/cloudera/onetrail
a1.sources.src.fileHeader = false
a1.sources.src.basenameHeader = true
# a1.sources.src.basenameHeaderKey = basename
a1.sources.src.fileSuffix = .COMPLETED
a1.sources.src.threads = 4
a1.sources.src.interceptors = newint
a1.sources.src.interceptors.newint.type = timestamp
#Sink details
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs:///data/contentProviders/cnet/%Y%m%d/
# a1.sinks.k1.hdfs.round = false
# a1.sinks.k1.hdfs.roundValue = 1
# a1.sinks.k1.hdfs.roundUnit = second
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.fileType = DataStream
#a1.sinks.k1.hdfs.file.Type = DataStream
a1.sinks.k1.hdfs.filePrefix = %{basename}
# a1.sinks.k1.hdfs.fileSuffix = .xml
a1.sinks.k1.threadsPoolSize = 4
# use a single file at a time
a1.sinks.k1.hdfs.maxOpenFiles = 1
# roll settings (count, interval and size all set to 0 disables rolling)
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.batchSize = 12
# Channel details
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /tmp/flume/checkpoint/
a1.channels.c1.dataDirs = /tmp/flume/data/
# Bind the source and sink to the channel
a1.sources.src.channels = c1
a1.sinks.k1.channels = c1
With the above configuration Flume is able to copy the files to HDFS, but the problem I am facing is that one file keeps staying as .tmp and its complete content is never written.
Can someone help me with what the problem could be?