Member since: 06-26-2018
Posts: 10
Kudos Received: 0
Solutions: 0
10-25-2018
08:50 PM
I am trying to run an Oozie job for a partitioned table using the --hcatalog options so I can populate partitions. I keep getting the following error: "Append mode for imports is not compatible with HCatalog." The same Sqoop job works fine if I run it from the CLI.
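For reference, a minimal sketch of the shape of import I am after (the database, table, and partition values below are placeholders); as far as I can tell, the --hcatalog-* options cannot be combined with --append or --incremental append, so those have to come out of the Oozie action's command:
sqoop import \
  --connect "jdbc:sqlserver://<host>;databaseName=<db>" \
  --username XXXX \
  --password XXXX \
  --table TABLE1 \
  --hcatalog-database default \
  --hcatalog-table table1 \
  --hcatalog-partition-keys reportdate \
  --hcatalog-partition-values 2018-10-25 \
  --num-mappers 4
# note: no --append / --incremental flags here; Sqoop rejects append mode together with HCatalog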
10-25-2018
03:40 PM
Thanks @nkumar for your response. I have reverted my changes because there was no point in turning it on while In-Memory Cache per Daemon was 0. I have a 5-node heterogeneous cluster, so I allocated 2 nodes (the NameNodes) to be used by LLAP; I do not know how to configure the number of LLAP daemons. HiveServer2 Heap Size = 16 GB and can be increased to 250 GB; Metastore Heap Size = 2048 MB, which can likewise be increased to 250 GB; Client Heap Size = 4096 MB, which can also be increased to 250 GB. The maximum memory for a YARN container is 250 GB; currently 190 GB is allocated on Node 1, 100 GB on Node 2, and 60 GB on each of the others. I created an additional queue following the LLAP sizing and setup document, but it did not help. I even set up separate Node 1 and Node 2 configs in Hive and YARN. But the additional horsepower is not being picked up: even when I assign the new queue to the Node 1 configuration (which has 250 GB), it still shows Memory per Daemon - 53248 and In-Memory Cache per Daemon - 0. Any help would be appreciated.
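For reference, these are the properties I have been looking at for those numbers (names from hive-interactive-site and the Ambari Hive Interactive config; I am not certain these are the only ones involved):
num_llap_nodes=2                         # Ambari: number of nodes running LLAP daemons
hive.llap.daemon.yarn.container.mb=...   # total size of each LLAP daemon container ("Memory per Daemon")
hive.llap.daemon.num.executors=...       # executors per daemon
hive.llap.io.enabled=true                # turn the LLAP IO layer on
hive.llap.io.memory.size=...             # in-memory cache per daemon; this is the value stuck at 0 for me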
10-24-2018
04:41 PM
I am trying to enable LLAP, but I am getting very poor memory allocations: Memory per Daemon - 53248, In-Memory Cache per Daemon - 0, Number of executors per LLAP Daemon - 13. I even created a new queue, but it had no effect. I would appreciate any help.
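From what I have read, the daemon size Ambari offers is capped by the YARN container limits on the LLAP nodes, so these are the yarn-site values I have been double-checking:
yarn.nodemanager.resource.memory-mb    # total memory YARN can allocate on each LLAP node
yarn.scheduler.maximum-allocation-mb   # largest single container, and hence the ceiling for one LLAP daemon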
10-01-2018
01:22 PM
I am experiencing extremely poor performance. The job is not getting killed, but the log keeps showing this info incrementally and very slowly. Any help would be appreciated.
09-24-2018
03:50 PM
@Vinicius Higa Murakami, any help bud?
09-14-2018
08:04 PM
Hi @Vinicius Higa Murakami, I am enclosing the explain plan for the long-running query. We have turned CBO on; however, the explain plan is not using CBO and I am not sure why. Attached: explain-plan.txt
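For completeness, this is roughly how I have been enabling CBO and refreshing stats before taking the explain plan (the table name is a placeholder; our partitioned tables may also need a PARTITION clause in ANALYZE):
beeline -u "jdbc:hive2://<hiveserver2-host>:10000/default" -e "
  SET hive.cbo.enable=true;
  SET hive.compute.query.using.stats=true;
  ANALYZE TABLE table1 COMPUTE STATISTICS;
  ANALYZE TABLE table1 COMPUTE STATISTICS FOR COLUMNS;
"
The attached plan was then taken by running EXPLAIN on the long-running query.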
09-14-2018
05:21 PM
Sqoop import via the CLI works, but when I try to orchestrate it through Oozie I run into all kinds of issues. I am trying to import data from SQL Server and append to a Parquet Hive table. Since I was running into jar file issues, I created a lib folder inside the sqoop folder and uploaded all the necessary jar files (this is probably not right either). I am still running into issues and get the following error: ERROR org.apache.sqoop.tool.ImportTool - Imported Failed: Missing Hive MetaStore connection URI. Attached: oozie-lib-jar-files.jpg
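For reference, this is the workflow application layout I am using (the jar name is just an example of what I uploaded; from what I have read, the Sqoop action also needs hive-site.xml on its classpath so that hive.metastore.uris can be resolved, which seems to be what the "Missing Hive MetaStore connection URI" error points at):
oozie/apps/sqoop-import/   (path is an example)
  workflow.xml
  lib/
    sqljdbc41.jar          # SQL Server JDBC driver (jar name is an assumption)
    hive-site.xml          # so the Sqoop action can resolve hive.metastore.uris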
09-14-2018
03:08 PM
Hi @Vinicius Higa Murakami,
- We are running Tez but not LLAP, as we have large data sets to crunch and we observed that the queries would just hang with LLAP.
- We are definitely using map join; however, I am not sure about "Vertex Enable". If we look at the DAG details for Tez queries, it does create multiple vertices.
- We have turned on CBO and we do run stats on our tables.
- We have a single partition on all our tables.
- All our Hive tables are stored in the Parquet file format.
(Additionally, if you look at our configuration, our edge node has only 4 cores and 47 GB RAM and also hosts the YARN client services. Could this be a possible bottleneck? Please advise.)
The settings in question, as I understand them, are listed below. Thanks for responding _/\_
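These are the knobs I have been checking (standard hive-site property names; the values are partly what I described above and partly assumptions on my side, not recommendations):
hive.execution.engine=tez                  # Tez, no LLAP
hive.auto.convert.join=true                # map join is on
hive.cbo.enable=true                       # CBO is on, and we run stats on our tables
hive.stats.autogather=true                 # keep basic stats up to date (assumption)
hive.vectorized.execution.enabled=true     # vectorized execution -- possibly what "Vertex Enable" referred to? not sure of our current value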
09-12-2018
11:28 PM
We have a 4-data-node cluster with the following RAM configuration:
Master node - 32 cores - 251.7 GB
Data node - 32 cores - 125.71 GB
Data node - 32 cores - 62.71 GB
Data node - 32 cores - 62.71 GB
Edge node - 4 cores - 47.01 GB
The memory/CPU/load is marginal, as you can see in the images. However, YARN seems to be running at 95+% while the Hive jobs take a long time. Any suggestions on how to improve the performance?
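If it matters, my understanding is that the memory YARN can hand out on each node comes from yarn.nodemanager.resource.memory-mb (set per host config group in a heterogeneous cluster like ours), capped per container by the scheduler settings; the values below are only illustrative, roughly for the 125 GB node:
yarn.nodemanager.resource.memory-mb=100352    # memory YARN may hand out on this node (illustrative)
yarn.scheduler.minimum-allocation-mb=4096     # smallest container YARN will grant (illustrative)
yarn.scheduler.maximum-allocation-mb=100352   # largest single container (illustrative)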
06-26-2018
02:02 PM
I am running a Sqoop job that uses a --query statement, which mandates a --target-dir clause. However, when I provide a path such as /user/xyz/sqoop_import/database/table1, the path is not honored; instead the data is stored as a Hive table in the warehouse folder hdfs://namenode:8020/apps/hive/warehouse/play. Here is the Sqoop job:
sqoop import \
--connect "jdbc:sqlserver://xx.aa.dd.aa;databaseName=XYZ" \
--connection-manager org.apache.sqoop.manager.SQLServerManager \
--username XXXX \
--password XXXX \
--num-mappers 20 \
--query "select ID,name,x,y,z from TABLE1 where DT between '2018/01/01' and '2018/01/31' AND \$CONDITIONS" \
--split-by id \
--relaxed-isolation \
--target-dir /user/XXXX/sqoop_import/XYZ/2018/TABLE1 \
--fetch-size=100000 \
--hive-import \
--hive-table TABLE1 \
--hive-partition-key Reportdate \
--hive-partition-value Reportdate \
--as-parquetfile \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec;
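From what I have read, when --hive-import is used the final data lands under the Hive warehouse directory and --target-dir only serves as a staging location (with Parquet it may be bypassed altogether), which would explain what I am seeing. A sketch of the alternative I am considering if I want the files to stay under my own path (the external table and its column types are assumptions on my part, not something I have working):
# 1) import to the custom path only, without --hive-import
sqoop import \
  --connect "jdbc:sqlserver://xx.aa.dd.aa;databaseName=XYZ" \
  --username XXXX \
  --password XXXX \
  --query "select ID,name,x,y,z from TABLE1 where DT between '2018/01/01' and '2018/01/31' AND \$CONDITIONS" \
  --split-by id \
  --target-dir /user/XXXX/sqoop_import/XYZ/2018/TABLE1 \
  --as-parquetfile

# 2) point an external Hive table at that location (column types are guesses)
beeline -u "jdbc:hive2://<hiveserver2-host>:10000/XYZ" -e "
  CREATE EXTERNAL TABLE IF NOT EXISTS table1_ext (id INT, name STRING, x STRING, y STRING, z STRING)
  STORED AS PARQUET
  LOCATION '/user/XXXX/sqoop_import/XYZ/2018/TABLE1';
"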