Member since: 07-28-2016
Posts: 44
Kudos Received: 1
Solutions: 1

My Accepted Solutions

Title | Views | Posted |
---|---|---|
 | 1078 | 06-15-2017 08:53 AM |
12-06-2017
08:59 AM
Hi Nitish... Thanks for the update and new info. I will try it out and let you know how things work. One question...you wrote that this will only work with Oracle 12g...but there is no Oracle 12g...is there? There is 11g and 12c. We have 12c at our place. thanks..d
12-05-2017
01:30 PM
So...just for everyone interested....this still doesn't work. We set the parameter on the sqoop side, ran the eval command, and could see the encrypted notes. However...while running the sqoop command, we did a packet capture and all of the data/traffic was in plain text; it was not encrypted at all. Hopefully, there is another way?
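For anyone who wants to repeat the check: a minimal sketch of the kind of packet capture we ran to verify whether the Sqoop traffic was actually encrypted. The host/port are the ones from our connect string; substitute your own listener endpoint.

```
# Dump payloads as ASCII for traffic to/from the Oracle listener.
# If the session is really encrypted, the payload should be unreadable;
# recognizable table data here means SSL is not in effect.
tcpdump -i any -A 'host 170.173.150.162 and port 1522'
```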
12-01-2017
07:52 AM
Thanks for the reply and information. We do have Resource Pools set up now...but more by function. However...from what you are saying (and from the few things I have seen), the only way for us to get what we want is to create resource pools per project/customer. I forgot about the Impala queries not being included. Thanks for mentioning that too. thank you!
11-30-2017
03:05 PM
Hello -
We need to be able to report back on cluster usage. I can use the existing reports in CM to figure out the HDFS (or storage) usage for our projects and customers. However, I also need a way to calculate and report on the compute resources used per project or customer/tenant.
Does anyone know how to do this? I have heard it is possible, but I have searched and haven't found anything.
Thanks for any help or suggestions
Labels:
- Cloudera Manager
- HDFS
10-20-2017
05:03 PM
Just wondering....has anyone successfully used Sqoop with SSL to connect to Oracle and import data into a Cloudera cluster / HDFS?
10-19-2017
01:00 PM
Thanks so much for the help. That worked. I was able to get the backup of the fsimage.
10-19-2017
11:19 AM
Hello - I am trying to run the hdfs dfsadmin -fetchImage command to get a backup of our Namenode. It is failing similar to this thread -> http://community.cloudera.com/t5/CDH-Manual-Installation/Backing-up-Namenode-FSImage/m-p/27268#M727

In that post, there is the suggestion below about adding users to the dfs.cluster.administrators property (see snippet below). We are on CM 5.11 and I am not finding this property. Does anyone know how to set it? Or do I have to just update some config ini/xml file? Thanks...d

=== from link ===
In clusters managed by Cloudera Manager 5.x, the property is set to "hdfs" by default. So if this command is run as the hdfs user on any node on the cluster, the command will succeed. You can try setting dfs.cluster.administrators to the list of users and groups who are allowed to perform this operation and then try again. It will require a restart of the Namenode(s) to take effect.

```
<property>
  <name>dfs.cluster.administrators</name>
  <value>user1,user2,user3 group1,group2,group3</value>
</property>
```
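For anyone searching later: when a property has no named field in CM, it can usually still be added through a safety valve. A sketch, assuming the usual CM 5.x layout (the user/group values below are placeholders):

```
<!-- Paste into HDFS service > Configuration >
     "HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml",
     then restart the NameNode(s) as the quoted post notes.
     Format is comma-separated users, a space, then comma-separated groups. -->
<property>
  <name>dfs.cluster.administrators</name>
  <value>hdfs,douglas hadoop-admins</value>
</property>
```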
Labels:
- HDFS
10-19-2017
10:35 AM
Thanks for the quick reply. I did find that article, but thought that the SSL options/keywords might be the same. I'll review the article and let you know if this works. thank you! ...d
10-19-2017
10:01 AM
Hello - We are trying to import data from Oracle (12.1.0.2) using Sqoop with SSL enabled. I have tested without encryption: the sqoop command works and we can import data. However, I am having trouble figuring out the correct syntax to add the SSL options to the Sqoop command. From what I have read online, it requires (at least) these: useSSL=true and requireSSL=true. I have tried many variations of adding the options to the sqoop command and none work. I get an error from Sqoop indicating "invalid connection string format".

Here is the connection string that works:

```
sqoop import -Dmapred.job.queue.name=hi.adhoc --connect jdbc:oracle:thin:@170.173.150.162:1522/phdssts2 --username p378428 -P --table P378428.BAR --target-dir /user/p378428/insights2/ --fields-terminated-by '\t' --delete-target-dir --verbose
```

Below are four of the variants that don't work. If anyone knows how to add the SSL options to a sqoop command for Oracle...that would be great. Thanks...

```
(a) sqoop import -Dmapred.job.queue.name=hi.adhoc --connect jdbc:oracle:thin:@170.173.150.162:1522/phdssts2 --useSSL=true --requireSSL=true --username p378428 -P --table DSS_STAGE.KHS_ZC_ARRIV_MEANS --target-dir /user/p378428/insights-zc-khs_SSL_1/ --fields-terminated-by '\t' --delete-target-dir --verbose

(b) sqoop import -Dmapred.job.queue.name=hi.adhoc --connect jdbc:oracle:thin:@170.173.150.162:1522/phdssts2;useSSL=true;requireSSL=true --username p378428 -P --table DSS_STAGE.KHS_ZC_ARRIV_MEANS --target-dir /user/p378428/insights-zc-khs_SSL_1/ --fields-terminated-by '\t' --delete-target-dir --verbose

(c) sqoop import -Dmapred.job.queue.name=hi.adhoc --connect jdbc:oracle:thin:@170.173.150.162:1522/phdssts2&useSSL=true&requireSSL=true --username p378428 -P --table DSS_STAGE.KHS_ZC_ARRIV_MEANS --target-dir /user/p378428/insights-zc-khs_SSL_1/ --fields-terminated-by '\t' --delete-target-dir --verbose

(d) sqoop import -Dmapred.job.queue.name=hi.adhoc --connect "jdbc:oracle:thin:@170.173.150.162:1522/phdssts2;useSSL=true;requireSSL=true" --username p378428 -P --table DSS_STAGE.KHS_ZC_ARRIV_MEANS --target-dir /user/p378428/insights-zc-khs_SSL_1/ --fields-terminated-by '\t' --delete-target-dir --verbose
```
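For anyone who lands here from a search, a hedged pointer: useSSL=true and requireSSL=true are MySQL Connector/J properties, not Oracle thin driver options, which would explain the "invalid connection string format" error. With the Oracle thin driver, SSL is normally requested through a full connect descriptor using PROTOCOL=TCPS, plus a JVM truststore holding the database certificate. A sketch along those lines; the TCPS port (2484) and the truststore path are assumptions, and your listener must actually expose a TCPS endpoint:

```
# The client-side JVM picks these properties up via the hadoop launcher;
# the map-task JVMs may need the same values passed via mapreduce.map.java.opts.
export HADOOP_OPTS="-Djavax.net.ssl.trustStore=/path/to/truststore.jks -Djavax.net.ssl.trustStorePassword=changeit"

sqoop import -Dmapred.job.queue.name=hi.adhoc \
  --connect "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(HOST=170.173.150.162)(PORT=2484))(CONNECT_DATA=(SERVICE_NAME=phdssts2)))" \
  --username p378428 -P --table P378428.BAR \
  --target-dir /user/p378428/insights2/ \
  --fields-terminated-by '\t' --delete-target-dir --verbose
```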
Labels:
- Apache Sqoop
08-22-2017
07:18 PM
Hello - Per the two links below, it seems that to execute Impala jobs in Oozie, the user needs to provide his keytab.
a) https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/How-to-Schedule-Impala-Jobs-with-Oozie/ta-p/31277
b) https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-schedule-with-oozie-tutorial/td-p/23906
Where can users find (or generate) their keytabs? thanks w
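In case a sketch helps frame what I am asking: with MIT Kerberos tools, a user who knows their own password can usually write their own keytab with ktutil. The principal, realm, and encryption type below are placeholders, and an AD KDC may require different enctypes:

```
$ ktutil
ktutil:  addent -password -p douglas@EXAMPLE.COM -k 1 -e aes256-cts
Password for douglas@EXAMPLE.COM:
ktutil:  wkt /home/douglas/douglas.keytab
ktutil:  quit

# sanity check: authenticate using only the new keytab
$ kinit -kt /home/douglas/douglas.keytab douglas@EXAMPLE.COM
```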
Labels:
- Apache Oozie
- Kerberos
07-20-2017
07:57 AM
Hello I just need to remove the Data Node role from a node that doesn't need to be a data node. This is in a testing environment. Replication factor = 3 thanks...douglas
07-19-2017
02:56 PM
Hello - I realized I set up a couple of hosts with the data node role, but they do not need to have that role. I have three other hosts with data node role. How can I remove the role? I do not see any way to simply remove the role. thanks
Labels:
- Cloudera Manager
- HDFS
06-26-2017
09:32 AM
Hello - We recently installed HAProxy as a load balancer for Impala. We have 5 worker/data nodes that have the Impala daemon running. After four days, when I look at the workload summary, I can only see 4 coordinators/nodes that have any queries. There is one node that doesn't appear to be serving/accepting queries...but the health on the node/role is good. Also, there doesn't seem to be an even distribution of queries on the other nodes. One has run over 50%, one 30%, and then the other two about 10% each. Is there any way for us to identify the issues:
1. Why doesn't the fifth node run/accept queries?
2. Why are the distributions so far off?
thanks..
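For comparison, a minimal sketch of the kind of HAProxy section we would expect for impalad (hostnames are placeholders). Two things worth checking against your own config: a server that is missing from the pool, or failing its health check, never receives queries; and a source-hash or sticky balance algorithm can skew the distribution badly when most traffic comes from a few client IPs:

```
# hypothetical haproxy.cfg fragment for the impala-shell port
# (JDBC/ODBC traffic on 21050 would need a similar section)
listen impala
    bind 0.0.0.0:21000
    mode tcp
    option tcplog
    balance leastconn        # 'balance source' pins each client to one backend
    server impalad1 worker1.example.com:21000 check
    server impalad2 worker2.example.com:21000 check
    server impalad3 worker3.example.com:21000 check
    server impalad4 worker4.example.com:21000 check
    server impalad5 worker5.example.com:21000 check
```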
Labels:
- Apache Impala
06-15-2017
08:53 AM
Hi all ... According to Cloudera support, the date value "...is an epoch timestamp that includes milliseconds, which makes the timestamp three digits longer than the date converter expects. You can either remove the last three characters using a substring function, or insert a period before the millisecond digits. Either of these commands will provide the correct timestamp for 1491020099000:

date --date='@1491020099'
date --date='@1491020099.000'"

So...if anyone else needs to know! ...douglas
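A one-liner along the same lines, in case cut-and-paste helps: divide the millisecond value by 1000 instead of taking a substring.

```
$ date --date=@"$(( 1491020099000 / 1000 ))"
# -> Sat Apr  1 03:34:59 UTC 2017 (when TZ=UTC)
```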
06-14-2017
08:46 AM
Hello - When I view the Historical Disk Usage report in CM for daily or weekly or monthly, the values displayed look like dates: 2017-02 or 2017-05-02. However, when I download the report to Excel or CSV, the date column has values like 1491020099000. Just wondering what that format is and how I can convert back to a date. thanks
Labels:
- Cloudera Manager
06-06-2017
11:15 AM
Hello - We noticed an on-going issue at times, where Impala queries will receive this type of message:

Query Status: Couldn't open transport for <hostname>:22000 (SSL_connect: Connection reset by peer)

We are running CDH 5.7.3 and the Impalad version is 2.5.0. When we see this, I can look at the web UI for the impalad host and I usually see a query that is in the "CREATED" state but is not running...and typically these queries are from days before. I also notice that the Last Event will indicate something like "Ready to start 47 remote fragments". I try to cancel (esp. if the query is 2 or 3 days old) and I cannot cancel it; I get this message:

Error: Query not yet running

It seems the only way to clear the query is to restart the Impalad node. That seems like a bad way to resolve this issue. Has anyone faced this issue before and have any thoughts/suggestions? thanks...
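One avenue worth poking at, hedged because it depends on the Impala build: the impalad debug web UI on port 25000 lists every query the coordinator knows about under /queries, and builds that expose a cancel link there can be driven from the shell (the query ID below is a placeholder):

```
# list in-flight/registered queries on this coordinator
curl -s 'http://<hostname>:25000/queries'

# attempt to cancel a stuck query by ID, if the build exposes the endpoint
curl -s 'http://<hostname>:25000/cancel_query?query_id=6f49e509bfa5b347:207d8ef900000000'
```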
Labels:
- Apache Impala
05-18-2017
03:42 PM
Hi... No, this isn't what I am looking for. I know how to use CREATE and GRANT for Sentry roles in Hive. I am not sure if you are familiar with HUE, but there is a way to add an AD Group (either manually or via LDAP Sync). Once you add the AD Group, you need to assign permissions to features of HUE - Filebrowser access, HBase access, Pig access, Oozie access, etc. I am trying to find a way to set the permissions without having to click hundreds of times when I add AD Groups. thanks...
05-18-2017
01:27 PM
Hello - Currently, when we add a new AD Group (via LDAP Sync) we have to manually add the permissions to the group by click...click...click..click....and then save. When we have lots of groups to add, this can be a painfully slow process. Does anyone know if there is a way to run a statement in Beeline (sort of like a GRANT statement) that we could use to add permissions to an AD Group? thanks..
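For what it's worth, a hypothetical sketch of scripting this against Hue's internal Django models rather than Beeline. The shell path, model names, and field names below are taken from Hue's useradmin app as I understand it; verify all of them against your Hue version before relying on this:

```
# hypothetical -- run through Hue's Django shell (path varies by install):
/opt/cloudera/parcels/CDH/lib/hue/build/env/bin/hue shell <<'EOF'
from django.contrib.auth.models import Group
from useradmin.models import HuePermission, GroupPermission

group = Group.objects.get(name='my-ad-group')   # a synced AD group (placeholder)

# grant "access" on a few Hue apps without any UI clicking
for app in ('filebrowser', 'oozie', 'hbase'):
    perm = HuePermission.objects.get(app=app, action='access')
    GroupPermission.objects.get_or_create(group=group, hue_permission=perm)
EOF
```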
05-02-2017
02:43 PM
Hello - I am trying to find out if the Spark Thrift Server is available in CDH 5.7.2. And, if it is, how can I enable it? thanks
Labels:
- Apache Spark
- Cloudera Manager
04-20-2017
08:27 AM
Hi... Thanks for the reply. Yes, we are using LDAP authentication. But we do have some existing AD groups with spaces in the names. Looks like we either need to create new AD Groups (without a space) or use one of the tools like SSSD or Centrify to sync between LDAP/AD and Linux. thanks.
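If we do end up going the SSSD route, one hedged possibility: newer SSSD releases (1.13+, if I recall correctly) can substitute a replacement character for spaces in user and group names, so the existing AD groups become usable on Linux without renaming them:

```
# /etc/sssd/sssd.conf -- sketch; verify the option against your SSSD version
[sssd]
override_space = _
# AD group "My Company Group One" would then resolve as "My_Company_Group_One"
```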
04-19-2017
02:46 PM
Hello - In our environment, we have some existing AD Groups that have a space in the name, like "My Company Group One". We can import these into HUE via LDAP Sync and we do get all of the users. However, adding this AD group to the Linux OS is proving nearly impossible. I've tried using various escape characters, quoting it, etc....and just can't get it to add to Linux. Just wondering how other folks might be handling this issue? Does Cloudera even support having AD Group names with a space? thanks...
Labels:
- Cloudera Hue
- HDFS
03-23-2017
08:14 AM
@Harsh J - I just checked the other post you listed...and that looks close. Seems that you are stating that there is a way to completely remove Sqoop from the available Oozie workflow options. That would be great if I could do it on a per-user or per-group basis.
03-23-2017
08:12 AM
@Harsh J - thanks again for the responses and suggestions. In our case, we cannot revoke access at the DB for these users. They access the DB with other tools outside of our cluster (part of their jobs), so we cannot remove it.
03-23-2017
08:10 AM
@csguna - thanks for the link. I had read that already and it doesn't provide a way to restrict Sqoop. It does allow for restricting other things...HBase, Impala, etc. I checked w/Cloudera support and they stated that there isn't a mechanism now. However, I could create a group and put users in that group to restrict (via read-only access to Oozie). Unfortunately, this might not work for us either. thanks....
03-22-2017
09:29 AM
Also, we will need to restrict the ability for users to run sqoop via Hue (Oozie workflows). Is there a way to do that?
03-20-2017
07:52 AM
Hi... Thanks for the response. What do you mean by revoking DB access credentials? Do you mean removing it for the users that we want to prevent from using Sqoop? thanks!
03-17-2017
04:19 PM
Hello - We need to restrict access to running sqoop at the command line in HDFS. My thought on this was that I would probably have to manage this at the Linux OS layer (sketch of the commands after the list):
1. Create a group (sqoop-users) in Linux.
2. Add users to that group.
3. Use ACLs (via setfacl) to add the new group (sqoop-users) to /usr/bin/sqoop with r-x permissions.
4. Then change permissions via chmod to remove "other" access completely (so chmod 750).
Just wondering if anyone has thoughts or suggestions...and if that is the way to go. thanks...
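A minimal sketch of steps 1-4, assuming /usr/bin/sqoop is the real binary on your nodes (on parcel installs it may be a symlink to /opt/cloudera/parcels/CDH/bin/sqoop, and the ACL should go on the real target):

```
groupadd sqoop-users
usermod -aG sqoop-users douglas     # repeat for each permitted user

# give the new group read/execute via an ACL...
setfacl -m g:sqoop-users:r-x /usr/bin/sqoop
# ...then drop "other" access entirely (750: owner rwx, group/mask r-x)
chmod 750 /usr/bin/sqoop

getfacl /usr/bin/sqoop              # verify the effective permissions
```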
Labels:
- Apache Sqoop
03-15-2017
12:58 PM
Hi... thanks for the reply and info. Is there any way for me to find job history...for jobs that have already finished...like from the previous week? I was wondering if one of the log files (not sure which one) might capture that info. I won't have the ability to know when the users run the job...so I won't be able to get the URL from the command line. thanks
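A hedged pointer that may cover the history case: Sqoop imports run as ordinary MapReduce applications, so finished runs should still be visible through the YARN CLI and the JobHistory Server for as long as your retention settings keep them. The grep is only a heuristic, since the application name usually comes from the imported table or jar name rather than containing "sqoop":

```
# everything YARN still knows about, including finished applications
yarn application -list -appStates ALL

# Sqoop imports typically show up as MAPREDUCE apps named after the
# imported table (e.g. "KHS_ZC_ARRIV_MEANS.jar") or "QueryResult.jar"
yarn application -list -appStates FINISHED | grep -i -e 'QueryResult' -e '\.jar'
```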
03-15-2017
11:29 AM
Hello - I am trying to identify when a user runs a sqoop job at the command line. I was told I could use the YARN application monitor/page, but that still doesn't seem to show the sqoop jobs. Does anyone know a way I could monitor or audit whether users are running sqoop jobs? thanks...d
Labels:
- Apache Sqoop
- Cloudera Manager
02-21-2017
12:36 PM
Hello... Thanks for the response. We have users and groups that we want to retain....but there are a few groups and their associated users (that we added) that we want to remove. So, I don't want to blow away the DB. Sounds like this is just not possible, so we'll need to create a manual process/workflow...which is too bad. thanks
... View more