Member since
01-11-2019
01-11-2019
09:42 AM
1 Kudo
After increasing the heap size in hive-env.sh to 4 GB, it works perfectly without OOM errors: export HADOOP_HEAPSIZE=4096
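For anyone applying the same fix, here is a minimal sketch of the change. It assumes hive-env.sh lives in the usual $HIVE_HOME/conf directory (the post does not state the location) and that HADOOP_HEAPSIZE is read in megabytes, as in Hadoop 2.x. A likely reason it helps, though this is an interpretation rather than something confirmed in the thread: with spark.master set to local[*], Spark probably runs inside the HiveServer2 JVM, so that JVM's heap would bound the job rather than spark.driver.memory or spark.executor.memory.

```shell
# Sketch of the relevant line in hive-env.sh
# (assumed location: $HIVE_HOME/conf/hive-env.sh).
# The value is interpreted in megabytes, so 4096 corresponds to 4 GB.
export HADOOP_HEAPSIZE=4096

# Optional sanity check before restarting HiveServer2.
echo "HADOOP_HEAPSIZE=${HADOOP_HEAPSIZE} MB"
```

HiveServer2 has to be restarted after the edit so the new heap size takes effect.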
01-11-2019
06:18 AM
Hello all,

I'm trying to configure HiveServer2 to use Spark as its execution engine. It works perfectly with small files, but with a large file (~1.5 GB) it crashes with "GC overhead limit exceeded".

My flow is simple:

1. Load data from a text file (~1.5 GB) into table_text.
   SQL: load data local inpath 'home/abc.txt' into table table_text;
2. Select the data from table_text and insert it into table_orc (the crash happens in this step).
   SQL: insert into table table_orc select id,time,data,path,size from table_text;

I guess Spark has to load all the data from table_text and hold it in memory before inserting it into table_orc. From what I've read, Spark can be configured so that partitions that don't fit in memory are stored on disk and read back from there when they're needed (RDD persistence).

My environment:
- Ubuntu 16.04
- Hive version: 2.3.0
- Free memory when launching the SQL: 4 GB

My config in hive-site.xml:
<property>
<name>hive.execution.engine</name>
<value>spark</value>
</property>
<property>
<name>spark.master</name>
<value>local[*]</value>
</property>
<property>
<name>spark.eventLog.enabled</name>
<value>true</value>
</property>
<property>
<name>spark.driver.memory</name>
<value>12G</value>
</property>
<property>
<name>spark.executor.memory</name>
<value>12G</value>
</property>
<property>
<name>spark.serializer</name>
<value>org.apache.spark.serializer.KryoSerializer</value>
</property>
<property>
<name>spark.yarn.jars</name>
<value>/home/cpu60020-local/Documents/Setup/Java/server/spark/jars/*</value>
</property>
<property>
<name>spark.eventLog.enabled</name>
<value>false</value>
</property>
<property>
<name>spark.eventLog.dir</name>
<value>/home/cpu60020-local/Documents/Setup/Hive/apache-hive-2.3.0-bin/log/</value>
</property>

Please let me know if you have any suggestions. Thanks, all!
Labels:
- Apache Hive
- Apache Spark