Member since
09-14-2015
79
Posts
91
Kudos Received
22
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2313 | 01-25-2017 04:43 PM |
 | 1786 | 11-23-2016 05:56 PM |
 | 5735 | 11-11-2016 02:44 AM |
 | 1540 | 10-26-2016 01:50 AM |
 | 9438 | 10-19-2016 10:22 PM |
02-14-2017
05:22 PM
Thanks! I wanted to confirm before I took such a drastic step 🙂 Worked perfectly.
02-14-2017
05:08 PM
I am aware that there is no way to delete Atlas tags via the UI or REST endpoints. However, I am wondering if there is a simple way to truncate the underlying database or wipe it so we start with what is essentially a fresh Atlas installation?
Labels: Apache Atlas
01-25-2017
04:43 PM
Hi @Devpriyo Bhattacharya, You cannot install HDP natively on Windows 10. You can, however, run HDP on your Windows 10 laptop via Docker or virtual machines, but I expect the behavior to be very unpredictable. You are going to be *very* resource constrained and you will likely experience occasional component failures and/or slow response times.

That said, I would typically recommend the dockerized sandbox for this situation. However, if you wish to go through the process of a customized installation, you can take the whole thing for a test drive using Docker for Windows. To do this you will need to do the following:

1. Download and install Docker for Windows.
2. Launch a CentOS 7 container with the appropriate ports opened, most importantly 8080 for Ambari and 10000 for HiveServer2. You may find later that you need others open for various UIs and connectivity (e.g., 50070 for the HDFS UI).
3. Connect to the CentOS 7 container and run through the standard Ambari installation process to install your custom single-node HDP installation. I recommend installing the very minimum number of components due to your resource limitations.

This will get you a single-node HDP installation running on your laptop that you can use for basic functionality testing. It will be similar to the Sandbox, except that you have hand-selected the components you wish to install.
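To make the container steps a little more concrete, here is a rough Python sketch that only builds the `docker run` and `docker exec` command lines; it does not execute them. The container name `hdp-single-node`, the `centos:7` image tag, and the choice of `/bin/bash` as the container command are my assumptions for illustration, and any extra container settings Ambari may need are not covered here.

```python
import shlex

# Ports named in the post: Ambari, HiveServer2, and the HDFS UI.
PORTS = (8080, 10000, 50070)


def docker_run_command(name="hdp-single-node", image="centos:7", ports=PORTS):
    """Build (but do not run) a `docker run` command that publishes the ports.

    The name and image defaults are hypothetical placeholders.
    """
    cmd = ["docker", "run", "-d", "-i", "-t", "--name", name]
    for port in ports:
        # Publish each container port on the same host port.
        cmd += ["-p", f"{port}:{port}"]
    cmd += [image, "/bin/bash"]
    return cmd


def docker_exec_command(name="hdp-single-node"):
    """Build the command used to open a shell inside the running container."""
    return ["docker", "exec", "-i", "-t", name, "/bin/bash"]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    print(shlex.join(docker_run_command()))
    print(shlex.join(docker_exec_command()))
```

Add further ports to `PORTS` as you discover which UIs you need; the Ambari installation itself still happens inside the container as described above.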
11-23-2016
07:23 PM
@Dagmawi Mengistu Happy to help. If you don't need any more detail then feel free to accept the answer so we can close out the issue. Thanks!
11-23-2016
05:56 PM
3 Kudos
Hi @Dagmawi Mengistu, We do not currently support start/stop of clusters created via HDC. The likely reason that you are seeing the above error is that local instance storage was chosen to support HDFS at cluster creation. This storage is short-lived and does not persist through start/stop of instances in EC2. In general, HDC clusters are intended for ephemeral workloads. If you want to start and stop the compute resources to control costs, then I recommend creating a persistent metastore backed by RDS and providing that when you create new clusters. This way you can spin up and destroy clusters as you see fit and the data can be persisted via the shared metastore and S3. I hope this helps.
11-11-2016
02:44 AM
2 Kudos
Hi @Saminathan A, One thing you can do is drop the SplitLine processor and go straight to the ExtractText processor, where you can use a regex to pull out the first 5 lines. Then you can use the capture groups within that regex to work on the individual lines in the UpdateAttribute processor. This regex should work for you:
^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n.*
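If you want to sanity-check the capture groups outside NiFi, here is a small Python sketch that applies the same pattern to a made-up sample. The sample text and the helper name `first_five_lines` are hypothetical, and Python's `re` module is only standing in for the regex engine ExtractText uses, so treat this as an illustration of the groups rather than of the processor itself.

```python
import re

# Same pattern as in the post: five captured lines, then the start of a sixth.
FIRST_FIVE_LINES = re.compile(r"^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n.*")


def first_five_lines(text):
    """Return the first five lines as a list, or None if the text is too short.

    The pattern needs five newline characters, so input with fewer than
    five newline-terminated lines does not match.
    """
    match = FIRST_FIVE_LINES.match(text)
    return list(match.groups()) if match else None


# Hypothetical sample content, for illustration only.
sample = "id,name\n1,alpha\n2,beta\n3,gamma\n4,delta\n5,epsilon\n6,zeta\n"
lines = first_five_lines(sample)
print(lines)
```

Each element of `lines` corresponds to one capture group, which is what you would then reference individually in UpdateAttribute.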
11-02-2016
06:46 PM
There is also a good amount of detail covering all of the knobs and dials related to configuring the Capacity Scheduler here. Note that in the latest versions of Ambari there is a Capacity Scheduler View where you can graphically configure the queues instead of getting into the weeds of the XML.
10-26-2016
01:50 AM
1 Kudo
Hi @Houssam Manik, The big benefit that you get by utilizing snapshots with DistCp is that you can do incremental backups when copying the snapshotted directory in the future, by leveraging the differential between the snapshots. Jing provides some context around this in the second answer here. The work to complete this is discussed in HDFS-7535 and some more context is provided there. This was first pulled into Hadoop 2.7.0.
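As a rough sketch of what the incremental flow looks like on the command line, the Python helpers below only assemble the relevant commands (`hdfs dfsadmin -allowSnapshot`, `hdfs dfs -createSnapshot`, and `hadoop distcp -update -diff`); they do not run anything. The paths, cluster URIs, and snapshot names are hypothetical. DistCp's documentation describes requirements on the target directory when `-diff` is used, so check those before relying on this.

```python
def allow_snapshot_cmd(path):
    """Command that makes a directory snapshottable."""
    return ["hdfs", "dfsadmin", "-allowSnapshot", path]


def create_snapshot_cmd(path, name):
    """Command that creates a named snapshot of a snapshottable directory."""
    return ["hdfs", "dfs", "-createSnapshot", path, name]


def incremental_distcp_cmd(src, dst, old_snapshot, new_snapshot):
    """Command that copies only the changes between two source snapshots.

    `-diff` is used together with `-update`.
    """
    return ["hadoop", "distcp", "-update", "-diff",
            old_snapshot, new_snapshot, src, dst]


if __name__ == "__main__":
    # Hypothetical locations and snapshot names, for illustration only.
    src = "hdfs://prod-nn:8020/data/events"
    dst = "hdfs://backup-nn:8020/data/events"
    for cmd in (
        allow_snapshot_cmd("/data/events"),
        create_snapshot_cmd("/data/events", "s2"),
        incremental_distcp_cmd(src, dst, "s1", "s2"),
    ):
        print(" ".join(cmd))
```

Here `s1` stands for the snapshot taken at the previous backup and `s2` for the one taken now, so each run only transfers what changed in between.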
10-19-2016
10:22 PM
5 Kudos
Hi @Santhosh B Gowda, Assuming that this is happening on a single JournalNode, you can try the following:
1. As a precaution, stop HDFS. This will shut down all JournalNodes as well.
2. On the node in question, move the fsimage edits directory (/hadoop/hdfs/journal/stanleyhotel/current) to an alternate location.
3. Copy the fsimage edits directory (/hadoop/hdfs/journal/stanleyhotel/current) from a functioning JournalNode to this node.
4. Start HDFS.
This should get this JournalNode back in line with the others and get you back to a properly functioning HA state.
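The move-and-copy part of those steps (2 and 3) can be sketched in Python as below. This assumes HDFS is already stopped and that the healthy `current` directory has already been staged somewhere on the local node (for example, copied over from the functioning JournalNode by whatever means you prefer). The function name, the backup naming scheme, and the staging location are all hypothetical.

```python
import shutil
import time
from pathlib import Path


def replace_journal_dir(current_dir, healthy_copy, backup_root):
    """Move the suspect `current` directory aside, then copy in a healthy one.

    Returns the path where the original directory was moved.
    """
    current_dir = Path(current_dir)
    healthy_copy = Path(healthy_copy)
    backup_root = Path(backup_root)

    backup_root.mkdir(parents=True, exist_ok=True)
    # Timestamped name so earlier backups are not overwritten.
    backup = backup_root / f"current.bak.{int(time.time())}"

    # Step 2: move the existing directory to an alternate location.
    shutil.move(str(current_dir), str(backup))

    # Step 3: copy the healthy directory into place. copytree copies
    # permission bits and timestamps but not file ownership, so ownership
    # may need to be fixed afterwards.
    shutil.copytree(healthy_copy, current_dir)
    return backup
```

A call might look like `replace_journal_dir("/hadoop/hdfs/journal/stanleyhotel/current", "/tmp/healthy_current", "/tmp/journal_backup")`, where the last two paths are placeholders. Keeping the original directory in the backup location means you can put it back if the copy does not resolve the issue.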
10-10-2016
08:05 PM
1 Kudo
Hi @Simran Kaur, You can still achieve this using out-of-the-box functions in Hive as you mentioned. You just missed getting the string in the right format. For clarity, the basic steps are:
1. Replace the 'T' in the string with a space so the date is in the format expected by Hive.
2. Convert the string to a unix timestamp using the unix_timestamp function.
3. Convert the timestamp to your preferred date format using the from_unixtime function.
Here is a quick example you can run in Hive to see the result for the string you provided:
select from_unixtime(unix_timestamp(regexp_replace('2016-09-13T06:03:51Z', 'T',' ')), 'dd-MM-yyyy HH-mm-ss');
Notice that the only additional step is the replace operation.
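If it helps to see the three steps spelled out away from Hive, here is a Python mirror of the same transformation. It is not Hive code: the function name is hypothetical, and the sketch treats the value as UTC throughout, which is an assumption and is not meant to reproduce Hive's own time zone handling.

```python
from datetime import datetime, timezone


def reformat_timestamp(raw):
    """Mirror the three steps: replace 'T', parse to epoch seconds, reformat."""
    # Step 1: replace the 'T' separator with a space.
    spaced = raw.replace("T", " ")

    # Step 2: parse the string and convert it to seconds since the epoch.
    # The trailing 'Z' is matched literally; UTC is assumed here.
    parsed = datetime.strptime(spaced, "%Y-%m-%d %H:%M:%SZ")
    epoch_seconds = int(parsed.replace(tzinfo=timezone.utc).timestamp())

    # Step 3: format the epoch seconds as dd-MM-yyyy HH-mm-ss.
    as_utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return as_utc.strftime("%d-%m-%Y %H-%M-%S")


print(reformat_timestamp("2016-09-13T06:03:51Z"))  # prints 13-09-2016 06-03-51
```

The format string in the last step is the part you would change to get a different output layout, just like the second argument to from_unixtime.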