Member since 05-02-2019
319 Posts
144 Kudos Received
58 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
| 3582 | 06-03-2019 09:31 PM
| 731 | 05-22-2019 02:38 AM
| 1054 | 05-22-2019 02:21 AM
| 606 | 05-04-2019 08:17 PM
| 775 | 04-14-2019 12:06 AM
04-14-2020
09:44 AM
Is it possible to do something similar, but receiving the Server, Port, and Path where the files are? Because GetSFTP and ListSFTP can't accept any incoming input.
12-09-2019
11:07 PM
Hi, I hope this document will clarify your doubts; it is a tuning guide: https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/ Thanks, AK
05-05-2019
03:06 PM
Hi, we do not have ping access to hortonworks (the cluster is in the SZ network).
05-06-2019
07:33 PM
@Lester Martin With the latest version of NiFi it is no longer necessary to use a Remote Process Group (RPG) to redistribute FlowFiles within the same NiFi cluster. A new FlowFile load-balancing capability has been added to connections between processors. This new capability allows you to redistribute FlowFiles when they land on a connection without needing to send them through an RPG. It also provides multiple strategies for load balancing. Thank you, Matt
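For later readers, a hedged sketch of what configuring this looks like outside the UI, via the NiFi REST API (the host, connection id, and revision version are placeholders; loadBalanceStrategy and loadBalanceCompression are the connection fields as introduced in NiFi 1.8.0, but treat the exact payload as illustrative):

```shell
# Illustrative only: set round-robin load balancing on a connection.
curl -X PUT "http://nifi-host:8080/nifi-api/connections/<connection-id>" \
  -H "Content-Type: application/json" \
  -d '{
        "revision": { "version": 1 },
        "component": {
          "id": "<connection-id>",
          "loadBalanceStrategy": "ROUND_ROBIN",
          "loadBalanceCompression": "DO_NOT_COMPRESS"
        }
      }'
```

The other strategies are PARTITION_BY_ATTRIBUTE and SINGLE_NODE; in the UI, the same settings live in the connection's configuration dialog.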
05-04-2019
08:17 PM
Probably for using HCatalog, which can be extremely useful for Pig programmers even if they don't want to use Hive and just want to leverage it for schema management instead of defining AS clauses in their LOAD commands? Just as likely, this is something hard-coded into Ambari? If you really don't want Hive, I bet you can just delete it after installation. For giggles, I stood up an HDFS-only HDP 3.1.0 cluster for https://community.hortonworks.com/questions/245432/is-it-possible-to-install-only-hdfs-on-linux-machi.html?childToView=245544#answer-245544 and just added Pig (it required YARN, MR, Tez & ZK, but that makes sense!), and it did NOT require Hive to be added, as seen below. Good luck and happy Hadooping!
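To make the schema-management point concrete, here is a minimal Pig sketch (the file path and table name are hypothetical):

```pig
-- Without HCatalog: the schema must be spelled out on every LOAD.
raw = LOAD '/data/salaries.csv' USING PigStorage(',')
      AS (year:int, name:chararray, salary:long);

-- With HCatalog (run pig with -useHCatalog): the schema comes from
-- the Hive metastore, so no AS clause is needed.
managed = LOAD 'default.salaries'
          USING org.apache.hive.hcatalog.pig.HCatLoader();
```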
04-15-2019
08:32 AM
Thanks @Lester Martin, I'll keep the balancer admin command in mind. I solved the issue simply by removing a very huge file created by a data scientist executing a very large query on Hive. The temporary files located at /tmp/hive/[user] seem to not be replicated (I'm not sure of that).
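For anyone wanting to verify that replication hunch, a quick hedged sketch (the [user] path is a placeholder, as above):

```shell
# Sizes under the Hive scratch area:
hdfs dfs -du -h /tmp/hive
# The second column of -ls output is each file's replication factor:
hdfs dfs -ls /tmp/hive/[user]
# Block-level detail, including per-file replication:
hdfs fsck /tmp/hive/[user] -files -blocks
```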
03-06-2019
11:15 PM
Welcome to Phoenix... where the cardinal rule is: if you are going to use Phoenix, then, for that table, don't look at it or use it directly from the HBase API. What you are seeing is pretty normal. I don't see your DDL, but I'll give you an example to compare against. Check out the DDL at https://github.com/apache/phoenix/blob/master/examples/WEB_STAT.sql and focus on the CORE column, which is a BIGINT, and the ACTIVE_VISITOR column, which is an INTEGER. Here's the data that gets loaded into it: https://github.com/apache/phoenix/blob/master/examples/WEB_STAT.csv. Here's what it looks like via Phoenix... Here's what it looks like through the HBase shell (using the API)... Notice the CORE and ACTIVE_VISITOR values looking a lot like your example? Yep, welcome to Phoenix. Remember, use Phoenix only for Phoenix tables and you'll be all right. 🙂 Good luck and happy Hadooping/HBasing!
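In rough, hedged form (values are illustrative; Phoenix flips the sign bit on integral types so the bytes sort correctly), the contrast looks like this:

```
-- Via Phoenix (sqlline.py), the numbers decode cleanly:
SELECT CORE, ACTIVE_VISITOR FROM WEB_STAT LIMIT 1;
--   CORE = 8, ACTIVE_VISITOR = 5000

-- Via the HBase shell, the same cells show Phoenix's serialized bytes:
--   scan 'WEB_STAT', {LIMIT => 1}
--   CORE           => \x80\x00\x00\x00\x00\x00\x00\x08
--   ACTIVE_VISITOR => \x80\x00\x13\x88
```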
02-21-2019
10:38 PM
The idea is to decode 1,600 files which contain 4,000 records each on average, that is 6,400,000 records in total; these must be processed and sent through NiFi to another SFTP server in less than 10 minutes. Thanks for the help 😉
12-30-2018
08:31 PM
Can you provide a very simple, but indicative, example of what you are looking for? Maybe a few rows and details of what you would be looking for. As you already know, the filter language documented at https://hbase.apache.org/book.html#thrift.filter_language will end up scanning everything. And also as you know, creating another table whose rowkey is aligned with the query you want to run fast will take effort to keep in sync as you upsert your data. There's always Phoenix if you want to stay completely in HBase, but maybe a hybrid of HBase and Solr might work. Again, a simplified example could help.
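For reference, the kind of filter-language scan being described looks like this in the HBase shell (the family, qualifier, and value are hypothetical), and it still reads every row because the rowkey doesn't constrain it:

```
scan 'mytable', {FILTER => "SingleColumnValueFilter('cf', 'col', =, 'binary:somevalue')"}
```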
04-27-2018
11:12 AM
Surely NOT the same issue, but along this line of buggy behavior in the HDP Sandbox (2.6.0.3), using Hive and getting messages mentioning the hostnames sandbox and sandbox.hortonworks.com, I got this message a few times:

FAILED: SemanticException Unable to determine if hdfs://sandbox.hortonworks.com:8020/user/root/salarydata is encrypted: java.lang.IllegalArgumentException: Wrong FS: hdfs://sandbox.hortonworks.com:8020/user/root/salarydata, expected: hdfs://sandbox:8020

It seems to go away if I just exit the SSH connection and establish it again.
02-13-2018
10:27 PM
Theoretically... yes, that should work. I'd stop HDFS as you are thinking, then get the contents of /hadoop/hdfs/data into /HDFS (might just leave the original there as a fallback!!), and then update the dfs.datanode.data.dir property to point to /HDFS instead of the /hadoop/hdfs/data default location. Using Ambari, you can find it as identified by the red arrows in the attached screenshot. After the Ambari change is made and pushed to the datanodes, you can start HDFS back up and see if it worked well or not. Again, it theoretically should work, but if this were your production system, I'd do a dry run on another cluster (could do that on a single-node pseudo-cluster) to gain some confidence that all would work well. Good luck and happy Hadooping!
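In shell terms, a hedged sketch of those steps (same paths as the question; as noted above, dry-run this on a test cluster first):

```shell
# 1. Stop HDFS via Ambari, then on each datanode copy the block data,
#    keeping the original directory as a fallback:
cp -rp /hadoop/hdfs/data/* /HDFS/
# 2. In Ambari, change dfs.datanode.data.dir from /hadoop/hdfs/data
#    to /HDFS and push the config out to the datanodes.
# 3. Start HDFS and confirm all blocks reported in:
hdfs fsck / | tail -n 20
```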
03-14-2018
12:44 PM
@gdeleon, I didn't receive any notification about your post. I didn't get as far as those screenshots; after choosing the file to import, the first error message said, in effect, that the file was not valid. Anyway, 3.1 is out now and that imported just fine.
11-23-2017
03:54 PM
Hi Jacob, I am experiencing the same problem. How was it resolved? Thanks!
07-31-2017
12:34 PM
Yep, this could work, but for a big cluster I could imagine this being time-consuming. The initial recursive listing (especially since it will go all the way down to the file level) could be quite large for any file system of any size. The more time-consuming effort would be to run the "hdfs dfs -count" command over and over and over. But... like you said, this should work. Preferably, I'd want the NN to just offer a "show me all quota details" or at least a "show me directories w/quotas" option. Since this function is not present, maybe there is a performance hit for the NN to quickly determine this that I'm not considering, though it seems lightweight to me. Thanks for your suggestion.
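To make the approach concrete, a rough sketch of the loop being described (unset quotas print as "none" in the -count -q output, which is what the final filter keys on; this breaks on paths containing spaces):

```shell
# List every directory once, then query each for quota settings.
hdfs dfs -ls -R / | awk '$1 ~ /^d/ {print $NF}' > all_dirs.txt
while read -r dir; do
  hdfs dfs -count -q "$dir"
done < all_dirs.txt | awk '$1 != "none" || $3 != "none"'
```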
07-13-2017
02:08 PM
I'm using Ambari 2.4.2.0 (and Capacity Scheduler Ambari View 1.0.0), which DOES have the "Save and Refresh Queues" option. That's not the problem. What is concerning is that over on the YARN service page, Ambari wants to restart the RMs, as shown in the attached screenshot. It probably doesn't need to be done, BUT this causes long-running grief for operators who don't want to see all of these warning messages to restart things. Thoughts?
06-23-2017
07:18 PM
Thanks so much @Lester Martin, I appreciate your help. I replaced my statement with yours and it now works: salaries_cl = FOREACH salaries_fl GENERATE (int)year as year:int, $1, $2, $3, (long)salary as salary:long; Weird that the other one didn't work, but thanks so much.
06-07-2017
07:37 PM
@John Cleveland This is what I did; I changed the script to dump all the data:

truck_events = LOAD '/user/satu/test.csv' USING PigStorage(',')
    AS (driverId:int, truckId:int, eventTime:chararray,
        eventType:chararray, longitude:double, latitude:double,
        eventKey:chararray, correlationId:long, driverName:chararray,
        routeId:long, routeName:chararray, eventDate:chararray);
--DESCRIBE truck_events;
DUMP truck_events;
--truck_events_subset = LIMIT truck_events 100;
--DESCRIBE truck_events_subset;
--DUMP truck_events_subset;

The job completed in 62 seconds.
06-07-2017
08:06 PM
Excellent. Truthfully, the case sensitivity is a bit weird in Pig -- kind of like the rules of the English language. Hehe!
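For anyone landing here later, a quick sketch of where Pig is and isn't case-sensitive (the aliases are hypothetical):

```pig
-- Keywords are case-insensitive:
a = load 'data' as (f1:int);   -- same as LOAD ... AS
-- Aliases (and UDF names) are case-sensitive:
b = FOREACH a GENERATE f1;     -- fine
-- DUMP A;                     -- error: alias 'A' is not the same as 'a'
```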
05-31-2017
03:11 PM
welp! That shows what happens when you are too "intelligent" to read through the basic stuff because you think you already know it. 😞 Thank you again @Lester Martin!! 😄
06-04-2017
04:38 AM
It did the trick for me. I sure hope it helps out @Joan Viladrosa, too! Thanks, Sriharsha!
05-03-2017
03:16 PM
1 Kudo
You can look at the timestamp on the files in HDFS as a starting point. I don't believe we store that information in HBase -- I can't think of any immediate reason why the system would need to preserve that.
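A one-line sketch of that starting point (the path below is the usual HDP layout under /apps/hbase/data; adjust for your install, and the table name is hypothetical):

```shell
# HFile modification timestamps for a table named 'mytable':
hdfs dfs -ls -R /apps/hbase/data/data/default/mytable
```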
05-03-2017
03:22 PM
Fair enough on cluster-to-cluster replication. I'm thinking about intra-cluster replication of regions for the purpose of HA Read features. Is this automatically addressed already and if not, are there strategies to take care of this?
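For context, the intra-cluster HA-read feature being asked about is HBase's timeline-consistent region replicas; a hedged hbase shell sketch with hypothetical names:

```
# Secondary read-only copies of each region are hosted on other
# RegionServers in the SAME cluster:
create 't1', 'cf1', {REGION_REPLICATION => 2}
# Clients opt in to possibly-stale reads from the replicas:
get 't1', 'row1', {CONSISTENCY => 'TIMELINE'}
```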
03-30-2017
06:32 AM
In Hive it seems to be easy, but I had to do the same in Pig, so I wrote a UDF. Anyway, thanks 🙂
03-08-2017
03:51 PM
Thanks for all the great responses here and below. Yes, indeed, the worker nodes that I have all of this running on are overloaded, taking into account the distribution of services and the underlying boxes' resource footprint. Thanks again!!
10-12-2017
08:18 PM
Hi, I have a 3-node cluster running kerberized HDP 2.6.2 with Ranger, but without the Ranger Storm plugin. I also see the errors when I try to run the command "storm list". The Storm service check runs fine. I get the following error when I use an underprivileged user account with a valid token. Any clues most appreciated.

2290 [main] WARN o.a.s.s.a.k.ClientCallbackHandler - Could not login: the client is being asked for a password, but the client code does not currently support obtaining a password from the user. Make sure that the client is configured to use a ticket cache (using the JAAS configuration setting 'useTicketCache=true)' and restart the client. If you still get this message after that, the TGT in the ticket cache has expired and must be manually refreshed. To do so, first determine if you are using a password or a keytab. If the former, run kinit in a Unix shell in the environment of the user who is running this client using the command 'kinit <princ>' (where <princ> is the name of the client's Kerberos principal). If the latter, do 'kinit -k -t <keytab> <princ>' (where <princ> is the name of the Kerberos principal, and <keytab> is the location of the keytab file). After manually refreshing your cache, restart this client. If you continue to see this message after manually refreshing your cache, ensure that your KDC host's clock is in sync with this host's clock.
2298 [main] ERROR o.a.s.s.a.k.KerberosSaslTransportPlugin - Server failed to login in principal:javax.security.auth.login.LoginException: No password provided
javax.security.auth.login.LoginException: No password provided
at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:919) ~[?:1.8.0_112]
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760) ~[?:1.8.0_112]
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) ~[?:1.8.0_112]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
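Following the guidance in the warning text itself, the manual checks would look roughly like this (the principal and keytab path are placeholders):

```shell
klist                                  # is there a valid, unexpired TGT?
kinit user@EXAMPLE.COM                 # password-based refresh, or:
kinit -kt /etc/security/keytabs/user.keytab user@EXAMPLE.COM
```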
05-01-2019
03:50 PM
It is unclear what triggers this task to run. Do you have any additional info on that, or know if there is any way to configure it more precisely?
04-27-2018
08:25 PM
Is it removed from the HDP 2.6.4 GA version too? I don't see the Zeppelin view.
01-23-2017
01:36 AM
1 Kudo
Good write-up from @Ambud Sharma; plus, you can visit http://storm.apache.org/releases/1.0.2/Guaranteeing-message-processing.html for info from the source. Additionally, take a peek at the picture below, which I just exported from our http://hortonworks.com/training/class/hdp-developer-storm-and-trident-fundamentals/ course; it might help visualize all of this information. Good luck and happy Storming!
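To make the anchoring/acking contract concrete, here is a minimal hedged sketch of a bolt against the Storm 1.x API (the uppercase transform and the output field name are placeholders):

```java
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class AnchoringBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        // Anchor the new tuple to its input so a downstream failure
        // walks back up the tuple tree and triggers a spout replay.
        collector.emit(input, new Values(input.getString(0).toUpperCase()));
        // Ack only after the work for this tuple is done; fail(input)
        // would force an immediate replay instead.
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
```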