Member since: 07-31-2013
Posts: 1924
Kudos Received: 461
Solutions: 311
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1269 | 07-09-2019 12:53 AM
 | 6582 | 06-23-2019 08:37 PM
 | 7163 | 06-18-2019 11:28 PM
 | 7471 | 05-23-2019 08:46 PM
 | 2735 | 05-20-2019 01:14 AM
08-14-2016
03:43 PM
The error "Temporary failure in name resolution" comes out of the DNS lookup sub-system on your OS, and likely indicates a fault of some sort when accessing one or more of your nameservers (defined in /etc/resolv.conf). If this is a repeating yet intermittent problem, I'd recommend contacting the DNS maintainers to find out if there are maintenance events or other downtime related issues ongoing with their servers. You can also check your /var/log/messages or "dmesg" contents for more clues about this lower-env trouble. The RM and other alerts you see coming out as a result of this failure is an avalanche effect. The agent polls metrics and states from the roles it runs, by contacting their webserver end-points. Since that's failing to resolve (its really a local address, shouldn't have to go through DNS if your /etc/nsswitch.conf is setup right) the alert gets flagged too. Its worth also running a local nameservice caching daemon (Such as nscd, etc.) to help cushion such effects to a certain degree and also to prevent overloading the DNS with too many queries which could also cause this potentially.
08-14-2016
03:28 PM
1 Kudo
"No such sqoop tool: sqoop. See 'sqoop help'."

Your Sqoop action's command should begin with just "import", and not include "sqoop" as its first argument, i.e. it should look like this:

<command>import --connect …</command>

And not like this, which is how you've currently specified it:

<command>sqoop import --connect …</command>

The Sqoop action of Oozie is documented with an example at http://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SqoopActionExtension.html
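Put together, a minimal Sqoop action element would look roughly like the below (the JDBC URL, table name and target directory are placeholders, not values from your workflow):

```xml
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Note: the command begins with "import", not "sqoop import" -->
        <command>import --connect jdbc:mysql://db.example.com/mydb --table MY_TABLE --target-dir /user/foo/mytable -m 1</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```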
08-14-2016
11:47 AM
1 Kudo
I'd recommend not using versions for anything but data that naturally ages out. If the data you're looking to store via versions does not naturally age out (such as via TTL or via version limits), it's better to store it as defined columns instead. Since your reads are going to be specific, going wider per row with a growing number of columns shouldn't be a problem. Of course the key design could also be thought about, but that depends on your primary read case. Scans let you grab specific time-range slices easily, so a timestamp-carrying key may be a good option too, but you'll then need to think separately about serving the profile information.
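As a rough sketch of the wide-row idea in the HBase shell (the table, family and qualifier names are made up for illustration, not from your schema):

```
# One qualifier per event, suffixed with its epoch time, rather than
# stacking many versions onto a single cell
put 'user_events', 'user123', 'e:login_1471190400', 'v1'
put 'user_events', 'user123', 'e:login_1471276800', 'v2'

# Reads stay specific: fetch just the row, or only the qualifiers you need
get 'user_events', 'user123', {COLUMN => 'e'}
```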
08-14-2016
11:35 AM
1 Kudo
Parquet support across the stack is documented at http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_parquet.html Impala's support for Parquet is ahead of Hive's at this moment, while https://issues.apache.org/jira/browse/HIVE-8950 will help it catch up in future. In Hive you will still need to specify the columns manually, but you may alternatively create the table in Impala and then use it in Hive. Parquet's loader in Pig supports reading the schema off the file [1] [2], as does Spark's Parquet support [3]. None of the ecosystem approaches use an external schema file, as was the case with Avro storage.

[1] - https://github.com/Parquet/parquet-mr/blob/master/parquet-pig/src/main/java/parquet/pig/ParquetLoader.java#L90-L95
[2] - https://github.com/Parquet/parquet-mr/blob/master/parquet-pig/src/test/java/parquet/pig/TestParquetLoader.java#L94-L97
[3] - http://spark.apache.org/docs/latest/sql-programming-guide.html#parquet-files
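For instance, from a CDH 5-era spark-shell the schema comes straight out of the file footers (the path below is a placeholder):

```scala
// Parquet carries its schema in the file footers, so no external
// schema file is needed to read it back
val df = sqlContext.read.parquet("/user/foo/parquet_dir")
df.printSchema()
df.registerTempTable("events")
sqlContext.sql("SELECT COUNT(*) FROM events").show()
```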
08-14-2016
03:24 AM
Impala lets you create a Parquet table from an example data file, but there's no separate schema-file concept in the Parquet storage implementation today. The CREATE TABLE ... LIKE PARQUET feature is described further at https://www.cloudera.com/documentation/enterprise/latest/topics/impala_parquet.html#parquet_ddl, after which, if you want to evolve the schema, you can read on at https://www.cloudera.com/documentation/enterprise/latest/topics/impala_parquet.html#parquet_schema_evolution
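A minimal sketch of that DDL (the paths and names below are placeholders):

```sql
-- Derive the column definitions from an existing Parquet data file
CREATE TABLE my_parquet_table
  LIKE PARQUET '/user/foo/sample/data_file.parq'
  STORED AS PARQUET;

-- Subsequent schema evolution is then handled with ALTER TABLE, e.g.
ALTER TABLE my_parquet_table ADD COLUMNS (new_col STRING);
```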
08-14-2016
01:03 AM
1 Kudo
Have you placed all the JDBC jars required to connect to Informix under /var/lib/sqoop/ on the host you are invoking this command from?
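For example (the jar name below is an assumption about what the Informix driver ships as; use whatever file your vendor provides):

```sh
# Copy the vendor JDBC driver into Sqoop's lib directory on the client host
sudo cp /path/to/ifxjdbc.jar /var/lib/sqoop/
sudo chmod 644 /var/lib/sqoop/ifxjdbc.jar
```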
08-13-2016
05:43 AM
1 Kudo
The RHBase libraries are very dated and advise the use of Thrift 0.8: https://github.com/RevolutionAnalytics/RHadoop/wiki/Installing-RHadoop-on-RHEL#installing-rhbase The installation proceeds smoothly for me, as per those instructions, if I use Thrift 0.8. Note that Thrift major version upgrades are not guaranteed to be compatible with existing clients, so since rhbase was written against the older version you will most likely need to stick to it, to ensure everything it calls is available in the installed library and headers. I installed it and tested a simple table creation and existence check, which worked OK against the CDH 5.7 HBase Thrift service; I didn't check it extensively beyond that.
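The check was roughly along these lines (a minimal sketch using the rhbase calls described on the RHadoop wiki; the Thrift host/port and table name are assumptions):

```r
library(rhbase)

# Connect to the HBase Thrift service (host/port are assumptions)
hb.init(host = "localhost", port = 9090)

# Create a small table and confirm it shows up in the table listing
hb.new.table("rhbase_smoke_test", "cf")
hb.list.tables()

# Clean up afterwards
hb.delete.table("rhbase_smoke_test")
```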
08-13-2016
04:40 AM
1 Kudo
ExportSnapshot is an MR job, and as a result it will run across your NodeManager hosts. Providing its destination as a local filesystem URI, such as your file:///local_linux_fs_dir, will only work if that path is visible, with the same consistent content, across all your cluster hosts. You could achieve this by mounting the same NFS export on all hosts and then using controlled ExportSnapshot parallelism so you don't overload it (limit the number of maps to something low enough). If that's not desirable, you can instead run the MR job in local mode, which would still be parallel but to a limited degree, by passing -Dmapreduce.framework.name=local to ExportSnapshot before any other option.
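For illustration, the two variants look roughly like the below (the snapshot name and destination paths are placeholders):

```sh
# Cluster MR job with a low mapper count, writing to an NFS path that is
# mounted identically on every NodeManager host
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot my_table_snapshot \
  -copy-to file:///mnt/shared_nfs/hbase_backups \
  -mappers 4

# Or the same job in limited-parallelism local mode, so only the
# submitting host needs to see the destination directory
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -Dmapreduce.framework.name=local \
  -snapshot my_table_snapshot \
  -copy-to file:///local_linux_fs_dir
```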
08-12-2016
03:22 PM
2 Kudos
@jh070784 - Please read my previous response. Symlink support right now is disabled because the design/implementation is incomplete, not because it is unavailable. The API was added before it was later disabled, and calling it will yield an error, as the source shows: https://github.com/cloudera/hadoop-common/blob/cdh5.5.1-release/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileContext.java#L1404-L1406

@michaelthoward - I am not aware of any major pick-up of the work to redo symlinks, but if you're interested in reading the history and contributing changes, please start at https://issues.apache.org/jira/browse/HADOOP-10019, which is the parent JIRA listing all the problems faced with its design and current implementation.
08-12-2016
12:03 AM
You do not need to set these env-vars manually; they should be set automatically for you in CDH. The env-var, if you do choose to set it manually for some reason, must point to a single directory of configs, not to multiple directories in classpath style as it appears in your output. Can you retry with HADOOP_CONF_DIR unset, since it's handled automatically and your current value is preventing the automatic location from being used? Does /etc/hadoop/conf/ exist now with usable files under it?
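For example, from the same shell session (a quick check; nothing beyond the default CDH client-config path is assumed):

```sh
# Drop the manually-set value and let the CDH wrappers pick the default
unset HADOOP_CONF_DIR

# Confirm the auto-deployed client configs exist
ls -l /etc/hadoop/conf/

# Then retry the original command from this same shell
```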