Member since: 04-24-2017
Posts: 106
Kudos Received: 13
Solutions: 7
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 1422 | 11-25-2019 12:49 AM |
| | 2509 | 11-14-2018 10:45 AM |
| | 2259 | 10-15-2018 03:44 PM |
| | 2127 | 09-25-2018 01:54 PM |
| | 1948 | 08-03-2018 09:47 AM |
08-07-2018
03:31 PM
1 Kudo
Have a look here: https://community.hortonworks.com/questions/88526/how-to-salt-row-key-in-hbase-table.html. Basically, it says that your prefix should be defined in a way that you can also calculate it at query time. In your (perhaps simplified) example, even numbers could get the prefix 000 and odd numbers the prefix 001.
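For illustration, a minimal sketch of that idea in Scala, assuming a hypothetical table "mytable" with column family "cf": the salt is a pure function of the key, so a reader can recompute the full row key at query time.

import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}

// The salt is derived from the key itself: even ids get 000, odd ids 001.
def saltedKey(id: Long): Array[Byte] = {
  val prefix = if (id % 2 == 0) "000" else "001"
  Bytes.toBytes(s"$prefix-$id")
}

val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
val table = conn.getTable(TableName.valueOf("mytable"))

// Write with the salted key ...
table.put(new Put(saltedKey(42L))
  .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v")))

// ... and read it back by recomputing the same prefix at query time.
val result = table.get(new Get(saltedKey(42L)))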
08-06-2018
03:42 PM
Does your data actually span all of the regions you created split points for? Or, when this finishes generating the HFile, does the client end up having to split the HFiles (and not just load them)? The only thing I can guess is that the HBaseStorageHandler isn't doing something right. Generating only one HFile when you have 10 regions is definitely suboptimal.
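As a quick check (a sketch; the table name is an assumption), you can print the table's region start keys and compare them against the key range your job actually generates:

import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}

val conn    = ConnectionFactory.createConnection(HBaseConfiguration.create())
val locator = conn.getRegionLocator(TableName.valueOf("mytable"))

// If all generated keys fall between two adjacent start keys, the job
// produces a single HFile even though the table has 10 regions.
locator.getStartKeys.foreach(k => println(Bytes.toStringBinary(k)))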
08-01-2018
08:12 PM
Good question. I'm just running an INSERT INTO ... SELECT * FROM ... statement, e.g. via Beeline or the Ambari Hive View (JDBC). Does this run in single-insert or batch mode?
07-19-2018
08:59 AM
@Daniel Müller
When merging Hive ORC files, the files are merged at the ORC stripe level rather than matched against the block size. The property that controls this is hive.merge.orcfile.stripe.level: when it is set to true, the merge happens at stripe level; when set to false, the files are merged at file level. The parameters that affect a file-level merge are:

hive.merge.tezfiles=true
hive.merge.mapfiles=true
hive.merge.size.per.task=256000000
hive.merge.smallfiles.avgsize=16000000

For more details, refer to the link. Also, there are some known limitations related to concatenation. Observe the behaviour and the file count when the concatenate is run in, say, 5 iterations.
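As a sketch of such a merge run in HiveQL (the table name is hypothetical; the values are the defaults quoted above):

SET hive.merge.orcfile.stripe.level=true;
SET hive.merge.tezfiles=true;
SET hive.merge.mapfiles=true;
SET hive.merge.size.per.task=256000000;
SET hive.merge.smallfiles.avgsize=16000000;

-- Concatenate the small ORC files of the table, then re-check the file count.
ALTER TABLE mytable CONCATENATE;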
07-10-2018
01:54 PM
Yes, I'm familiar with Spark. What I wondered about was the caching behavior. It really seems to know which HiveQL statement belongs to the cached data, and re-uses it automatically when the same query comes in:

// Cache the table for the first time => takes some time!
val df1_1 = sqlContext.sql("SELECT a, b FROM db.table limit 1000000")
val df1_2 = df1_1.cache()
df1_2.count()

// This re-uses the cached object, as the request is the same as before => very fast!
val df2_1 = sqlContext.sql("SELECT a, b FROM db.table limit 1000000")
val df2_2 = df2_1.cache()
df2_2.count()

// This caches the data, because the request is different (another limit clause) => takes some time!
val df3_1 = sqlContext.sql("SELECT a, b FROM db.table limit 10")
val df3_2 = df3_1.cache()
df3_2.count()
Thanks for your help @Felix Albani
01-12-2018
10:54 AM
1 Kudo
Yes, that was the solution! Thank you very much! Everything works fine with the following statement now:

SELECT SUM(menge) menge FROM mytable
08-27-2018
06:14 PM
@Daniel Muller, can you grep for "Safe mode is" in the HDFS NameNode log? That will tell you directly why the NameNode does not exit safemode.
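For example (the log path is an assumption and varies by installation):

grep "Safe mode is" /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log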
07-06-2017
06:41 AM
1 Kudo
Removing the "hdp-1.novalocal" from the hosts list and using the hostname script for setting the public / private hostname did it for me! Thank you so much, I think you saved my whole week!
07-07-2017
08:34 AM
OK, easy solution here: I had used the hive-jdbc-<version>.jar file as the dependency, but I have to use hive-jdbc-<version>-standalone.jar. So changing /usr/hdp/current/hive-client/lib/hive-jdbc-1.2.1000.2.6.1.0-129.jar to /usr/hdp/2.6.1.0-129/hive2/jdbc/hive-jdbc-2.1.0.2.6.1.0-129-standalone.jar did it for me! You can find the hive-jdbc-standalone.jar with:

find / -name "hive-jdbc*standalone.jar"
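To verify the fix, a minimal sketch in Scala (host, port, and credentials are assumptions) that only succeeds once the standalone jar is on the classpath:

import java.sql.DriverManager

// Resolves only if the standalone jar (which bundles all dependencies) is on the classpath.
Class.forName("org.apache.hive.jdbc.HiveDriver")

val conn = DriverManager.getConnection(
  "jdbc:hive2://localhost:10000/default", "hive", "")
val rs = conn.createStatement().executeQuery("SELECT 1")
while (rs.next()) println(rs.getInt(1))
conn.close()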
04-24-2018
10:02 AM
Reading the Value with the XPath //@Type works fine.
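For reference, a minimal sketch in Scala (the XML document is made up) showing that //@Type selects the Type attribute anywhere in the document:

import java.io.StringReader
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource

val xml   = """<root><item Type="sensor">42</item></root>"""
val xpath = XPathFactory.newInstance().newXPath()

// evaluate() returns the string value of the first matching node.
val value = xpath.evaluate("//@Type", new InputSource(new StringReader(xml)))
println(value)  // sensor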