Member since: 07-01-2015
Posts: 460
Kudos Received: 78
Solutions: 43
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1346 | 11-26-2019 11:47 PM |
|  | 1304 | 11-25-2019 11:44 AM |
|  | 9481 | 08-07-2019 12:48 AM |
|  | 2183 | 04-17-2019 03:09 AM |
|  | 3497 | 02-18-2019 12:23 AM |
08-26-2018
09:46 PM
What OS are you using?
08-26-2018
09:33 PM
Hi, you don't have to UNION 60 times; you can unpivot with a cross join instead:

select t.rowid, t.orderdate, t.shipmode, t.customername, t.state, m.metric,
       case m.metric when 'sales' then t.sales when 'quantity' then t.quantity end as value
from mytable t
cross join ( select 'sales' metric union all select 'quantity' metric ) m
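The same cross-join unpivot pattern works on any SQL engine; here is a minimal sketch using SQLite from Python, with a hypothetical two-row stand-in for the table (only the sales and quantity columns are kept):

```python
import sqlite3

# In-memory stand-in for "mytable" (hypothetical sample data).
con = sqlite3.connect(":memory:")
con.execute("create table mytable (id int, sales real, quantity int)")
con.executemany("insert into mytable values (?, ?, ?)", [(1, 9.5, 3), (2, 4.0, 7)])

# Cross-joining against a two-row "metric" derived table turns the two
# columns into two rows per source row.
rows = con.execute("""
    select t.id, m.metric,
           case m.metric when 'sales' then t.sales
                         when 'quantity' then t.quantity end as value
    from mytable t
    cross join (select 'sales' metric union all select 'quantity' metric) m
    order by t.id, m.metric
""").fetchall()
print(rows)
# -> [(1, 'quantity', 3), (1, 'sales', 9.5), (2, 'quantity', 7), (2, 'sales', 4.0)]
```

With 60 metrics, the derived table simply grows to 60 rows; the base table is still scanned once.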
08-24-2018
01:28 AM
Hi, you can inspect Avro files with the avro-tools utility.

create table work.test_avro ( i int, s string ) stored as avro;
insert into work.test_avro select 1, "abc";
set hive.exec.compress.output = true;
set hive.exec.compress.intermediate = true;
set avro.output.codec = snappy;
insert into work.test_avro select 2, "abcdefgb";

This table now contains two files, one compressed with Snappy and one uncompressed; you can check them with the getmeta command:

$ avro-tools getmeta 000000_0
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
$ avro-tools getmeta 000000_0_copy_1
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
avro.codec snappy
08-24-2018
12:53 AM
Hi, if you are using a Cloudera Manager deployed cluster with parcels, add the new host to the list of hosts and then deploy the YARN and Spark GATEWAY roles on this node. This will make CM distribute the parcels to this edge node and "activate" it. After that you should have the following commands on PATH: spark-submit, spark-shell (or spark2-submit, spark2-shell if you deployed SPARK2_ON_YARN). If you are using Kerberos, make sure you have the client libraries and a valid krb5.conf file, and make sure you have a valid ticket in your cache. Then, to submit a Spark job to YARN:

spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] <app jar> [app options]

or

spark-submit --class path.to.your.Class --master yarn --deploy-mode client [options] <app jar> [app options]
08-24-2018
12:37 AM
1 Kudo
As @GeKas said, enable ACLs on HDFS (if you don't have them already: dfs.namenode.acls.enabled should be checked). Then you need to set the default group access for the parent directory, so every new subdirectory will also be accessible by the mapred user (assuming mapred is in the hadoop group):

hdfs dfs -setfacl -R -m default:group:hadoop:r-x /user/history
hdfs dfs -setfacl -R -m group:hadoop:r-x /user/history

And try it again.
08-23-2018
06:43 AM
1 Kudo
You are probably hitting an OOM, or maybe an overloaded system. Do you have any warnings about memory overcommitment (i.e., how much memory the node has available for the OS, YARN, Impala, etc.)?
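As a rough sanity check you can compare the sum of per-service memory reservations on a node against its physical RAM, which is essentially what the CM overcommitment warning does. A sketch, where the service names and figures are entirely hypothetical; substitute your actual role settings (e.g. yarn.nodemanager.resource.memory-mb, the Impala daemon mem_limit):

```python
# Hypothetical per-node memory reservations in GiB.
reservations = {
    "os": 4,
    "yarn_nodemanager": 24,
    "impala_daemon": 16,
    "hdfs_datanode": 4,
}
physical_ram_gib = 32

total = sum(reservations.values())
overcommitted = total > physical_ram_gib
print(f"reserved {total} GiB of {physical_ram_gib} GiB -> overcommitted: {overcommitted}")
# -> reserved 48 GiB of 32 GiB -> overcommitted: True
```

If the total exceeds physical RAM, the kernel OOM killer can take out processes under load even though each service individually looks healthy.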
08-23-2018
04:47 AM
Hi, I think it is related to snapshots or hidden directories. Maybe distcp was preparing a snapshot and, as it failed, it left these temporary objects in HDFS.
08-23-2018
02:41 AM
1 Kudo
I understand. But how often do you create and drop a table with 180k partitions? It is a matter of a simple script. But maybe you are right, the metastore should handle bigger timeouts.
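The "simple script" could, for example, drop the partitions in batches before dropping the table itself, so no single metastore call has to touch all 180k partitions at once. A sketch that only generates the DDL; the table name and partition column (dt) are hypothetical:

```python
def drop_partition_batches(partitions, table="mytable", batch_size=1000):
    """Yield ALTER TABLE statements, each dropping at most batch_size partitions."""
    for i in range(0, len(partitions), batch_size):
        batch = partitions[i:i + batch_size]
        specs = ", ".join(f"partition (dt='{p}')" for p in batch)
        yield f"alter table {table} drop if exists {specs};"

# Example: 2500 hypothetical partition values -> 3 statements.
parts = [f"p{i:04d}" for i in range(2500)]
stmts = list(drop_partition_batches(parts))
print(len(stmts))
# -> 3
```

Each statement can then be fed to beeline or the Hive CLI; because every call is small, it stays well under the metastore client timeout.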
08-23-2018
02:31 AM
1 Kudo
I assume the type mismatch is because you have not defined the else branch of the if expression; without it, the result type is inferred as Any ("Type mismatch, expected: Seq[String], actual: Array[Any]"). Also note that the field's dataType should be compared against StringType, not String:

val regExpr = yearDF.schema.fields.map(x => if (x.dataType == StringType) { your_regex(x) } else { some_expression_returning_string })
yearDF.selectExpr(regExpr: _*)
08-23-2018
02:06 AM
Hi, try to log into the metastore database and manually remove the table from the metadata tables. I think the table containing tables is TBLS. You should also remove the records from its child tables, such as the ones holding columns and locations. Then restart the metastore and it should be OK. As this is an external table, this action will not remove the data.