Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3173 | 12-25-2018 10:42 PM |
| | 14192 | 10-09-2018 03:52 AM |
| | 4763 | 02-23-2018 11:46 PM |
| | 2481 | 09-02-2017 01:49 AM |
| | 2913 | 06-21-2017 12:06 AM |
05-05-2016
03:42 AM
4 Kudos
That's the ratio of HFiles, among those belonging to regions served by an RS, that have a replica stored on the local DataNode in HDFS. The RS can read local files directly from the local disk if short-circuit reads are enabled. If, for example, you run an RS on a machine without a DN, its locality will be zero. You can find more details here. On the HBase Web UI page you can also find locality per table.
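If you want to check the value outside the UI, one hedged option is to read the RegionServer's JMX metrics; the sketch below assumes a placeholder hostname and the default RS info port (16030 on HBase 1.x, 60030 on older releases):

```bash
# Sketch: read the locality metric (percentFilesLocal) from a RegionServer's JMX endpoint.
# "regionserver-host" is a placeholder; the port depends on your HBase version.
curl -s http://regionserver-host:16030/jmx | grep -i percentFilesLocal
```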
05-04-2016
02:53 PM
Ambari and the ps command can show you that the ZK service and ZK process are running on the respective nodes, but only when "zkServer.sh status" shows that one node is the leader and the others are followers can you be absolutely certain that ZK is running and fully functional. You can run it using pdsh, targeting only the ZK nodes.
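A minimal sketch of that check with pdsh, assuming three ZK hosts and a typical HDP install path (adjust both for your cluster):

```bash
# Run zkServer.sh status on each ZooKeeper node; exactly one should report "leader",
# the rest "follower". Host names and the install path are placeholders.
pdsh -w zk1,zk2,zk3 '/usr/hdp/current/zookeeper-server/bin/zkServer.sh status'
```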
05-04-2016
09:21 AM
2 Kudos
As far as I know, Hive can read only the current (most recent) data version in HBase. Only when using the HBase API or the hbase shell can you read all versions, or only those in a specific time interval. From the Hive HBase integration document: there is currently no way to access the HBase timestamp attribute, and queries always access data with the latest timestamp.
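For comparison, here is a hedged hbase shell sketch for reading several versions of a cell; the table, row, and column names are placeholders, and the column family must have been created with VERSIONS > 1:

```bash
# Read up to 3 versions of a single cell via the hbase shell (names are placeholders).
echo "get 'mytable', 'row1', {COLUMN => 'cf:q1', VERSIONS => 3}" | hbase shell
```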
05-04-2016
09:18 AM
1 Kudo
xmlns stands for XML namespace; you can find a general introduction here. In Oozie workflows there are two xmlns declarations. The one on top, <workflow-app name="once-a-day" xmlns="uri:oozie:workflow:0.1">, defines the XML tags for Oozie workflow files in general. The other one, <sqoop xmlns="uri:oozie:sqoop-action:0.2">, defines the XML tags specific to the Sqoop action; you can find its definition here. In your case xmlns is not the problem. If it were, Oozie would reject your workflow XML file as incorrect, for example because it used non-existent tags, or existing ones in a wrong way.
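To show where the two declarations sit relative to each other, here is a minimal, hedged workflow skeleton (the action name and Sqoop command are placeholders, not taken from your workflow):

```bash
# Sketch only: a minimal workflow.xml with both namespaces, written via a heredoc.
cat > workflow.xml <<'EOF'
<workflow-app name="once-a-day" xmlns="uri:oozie:workflow:0.1">
  <start to="sqoop-node"/>
  <action name="sqoop-node">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <command>import --connect jdbc:mysql://db-host/db --table some_table</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Sqoop action failed</message></kill>
  <end name="end"/>
</workflow-app>
EOF
```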
05-04-2016
08:59 AM
Hi Artem, as we discussed, min.user.id (min_user_id is used only in Ambari) and container-executor.cfg are referenced only by the LinuxContainerExecutor (LCE). By default, the DefaultContainerExecutor is used. More details here (the doc is about Hadoop 2.7.2, but this part applies to 2.7.1 as well). Besides secure clusters, LCE can also be used in non-secure ones, for example to enable CGroups.
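A quick, hedged way to see which executor your NodeManagers are configured with (the config path assumes a typical /etc/hadoop/conf layout):

```bash
# If this prints nothing, or shows DefaultContainerExecutor, LCE is not in use
# and min.user.id / container-executor.cfg are not consulted.
grep -A1 'yarn.nodemanager.container-executor.class' /etc/hadoop/conf/yarn-site.xml
```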
05-04-2016
06:48 AM
Hi @R Wys, I just tried, and for me your regexp works as-is from the Hive command line! Try testing only that part:

SELECT daily_date, regexp_replace(substr(from_unixtime(unix_timestamp(trim(daily_date),'dd/MM/yyyy')),0,7),"-","") as month FROM staging_table;

Was the error with the command you posted, or with another version? Your posted version looks good; just replace 0 with 1 as the substring starting index, since per the substr() spec indexing starts at 1 (it happens to work with 0, but it's better to follow the spec). You can also try escaping "-" as "\-". Here is my result; note that my date-time field is in a different format:

hive> select s, regexp_replace(substr(from_unixtime(unix_timestamp(trim(s), 'yyyy-MM-dd')),0,7),"-","") as month from yr;
OK
s	month
1975-01-01 00:00:00	197501
1999-10-12	199910
2016-03-22 10:20:30.155	201603
2001-07-07 11:00:05.0	200107
Time taken: 0.563 seconds, Fetched: 4 row(s)
05-04-2016
05:18 AM
Check which hosts are registered with Ambari; from the Ambari server run:

curl -u admin:admin http://localhost:8080/api/v1/clusters/CLUSTER_NAME/hosts

My guess is that your Machine1 is there, registered as "machine1" (all lower-case letters). Then try adding the new host as "machine1". By the way, it's best to avoid capital letters in host names.
05-04-2016
01:39 AM
Check the jobTracker setting in your job.properties; you had "jobTracker=hdp1.cidev:8050" in your question. Also make sure you pass "-config job.properties" when you submit your job.
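For reference, a hedged submission sketch (the Oozie URL and port are placeholders for your environment):

```bash
# Submit and run the workflow with an explicit properties file so jobTracker is picked up.
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
```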
05-04-2016
01:21 AM
To be on the safe side you can restart all YARN-related components. Also note that min.user.id and container-executor.cfg are used only with secure YARN containers, meaning, on Linux, when yarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor (LCE); on Windows the equivalent is WSCE.
05-04-2016
01:06 AM
1 Kudo
Check whether you have enough YARN memory and what your yarn.scheduler.minimum-allocation-mb is. Even with driver/executor memory set to 512m, another 384m is needed for the overhead, meaning 896m for the driver and for each executor. Also try using only one executor: "--num-executors 1".
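A hedged spark-submit sketch with those small settings (the class and jar names are placeholders; with 512m plus 384m overhead, each container requests roughly 896m, rounded up to yarn.scheduler.minimum-allocation-mb):

```bash
# Minimal sketch: small driver/executor memory and a single executor on YARN.
spark-submit --master yarn --deploy-mode cluster \
  --driver-memory 512m --executor-memory 512m --num-executors 1 \
  --class com.example.MyApp myapp.jar
```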