Member since: 09-02-2016
Posts: 523
Kudos Received: 89
Solutions: 42
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2309 | 08-28-2018 02:00 AM |
 | 2160 | 07-31-2018 06:55 AM |
 | 5070 | 07-26-2018 03:02 AM |
 | 2433 | 07-19-2018 02:30 AM |
 | 5863 | 05-21-2018 03:42 AM |
04-16-2018
04:05 AM
@null_pointer For some reason I cannot see the image you uploaded, but I think I got your point, so let me try to answer. We cannot always match/compare the memory usage reported by CM against what Linux reports, for a few reasons:
1. Yes, as you said, CM only counts the memory used by Hadoop components; it will not consider other applications running on the host, because CM is designed to monitor only Hadoop and its dependent services.
2. (I am assuming you are getting the CM report from the Host Monitor.) There are practical difficulties in getting the memory usage of every node in a single report. For example, if you have 100+ nodes and each node has a different memory capacity (100 GB, 200 GB, 250 GB, 300 GB, etc.), it is difficult to generate a single report covering all of them. Still, if the default report in CM does not meet your requirement, you can build a custom chart from CM -> Charts (menu) with your own tsquery (a minimal sketch is below): https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_cluster_util_custom.html
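A minimal tsquery sketch, assuming the host-level metric name physical_memory_used is available in your CM version (check the metric list in the Chart Builder if it is not):

select physical_memory_used where category = HOST

Paste that into CM -> Charts -> Chart Builder and it should plot physical memory usage per host over time.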
04-15-2018
09:19 AM
1 Kudo
@Aedulla here you go:
http://www.bayareabikeshare.com/open-data
https://grouplens.org/datasets/movielens/
https://www.nyse.com/market-data/historical
You can also use the free Hue demo below (login uid: demo, pwd: demo), which has some pre-existing data for Hive, Impala, HBase, etc.
Note: if you get any exception after login, please try again after some time or raise a ticket so that someone from the Hue team can fix the issue.
http://demo.gethue.com
04-11-2018
05:08 AM
@bukangarii as long as you have JDBC connectivity to your legacy system, it is possible to export the Parquet Hive table to it with Sqoop. Please check the Sqoop guide to understand the supported data types; a sketch of the export command is below.
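A minimal sqoop export sketch, assuming the Parquet Hive table is readable through HCatalog; the JDBC URL, credentials, and table names below are placeholders for your own:

# Reading the Hive table via --hcatalog-table lets Sqoop handle the Parquet format and the Hive schema
$ sqoop export \
    --connect jdbc:oracle:thin:@legacy-host:1521/ORCL \
    --username dbuser --password-file /user/dbuser/.sqoop.pwd \
    --table TARGET_TABLE \
    --hcatalog-database default \
    --hcatalog-table parquet_hive_table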
04-10-2018
10:55 PM
1 Kudo
@ludof no need to do it every time. In general, once you run kinit the ticket is valid for 24 hours (you can customize this if you want), so do it once a day manually, or automate it with a cron job in some scenarios, e.g. when you have jobs running round the clock, or more than one user shares the same user/batch id for a project. A sketch is below.
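A minimal cron sketch, assuming the batch user has a keytab; the keytab path, principal, and schedule are placeholders:

# Renew the Kerberos ticket every 12 hours from the batch user's crontab
0 */12 * * * /usr/bin/kinit -kt /home/batchuser/batchuser.keytab batchuser@EXAMPLE.COM

With a keytab, kinit does not prompt for a password, which is what makes it practical to run from cron.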
04-09-2018
11:58 AM
@hedy can you try to run the 2nd pyspark command from a different user id? According to the link below, this seems to be a normal/known warning: https://support.datastax.com/hc/en-us/articles/207356773-FAQ-Warning-message-java-net-BindException-Address-already-in-use-when-launching-Spark-shell
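If you want to confirm which process is already holding the UI port, a quick check (port 4040 assumed; use whatever port appears in your warning):

$ ss -ltnp | grep 4040   # shows the PID currently listening on port 4040

Normally the second session just falls back to the next free port (4041, 4042, ...), so the BindException is logged as a warning rather than a fatal error.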
04-09-2018
11:46 AM
1 Kudo
@ludof all you have to do is run the kinit command and enter your Kerberos password before you start your Spark session, then continue with your steps; that will fix it. A quick sketch is below.
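A minimal sketch of that sequence, with a placeholder principal (replace with your own):

$ kinit your_user@YOUR.REALM   # prompts for the Kerberos password
$ klist                        # optional: confirm a valid ticket was granted
$ spark-shell                  # or pyspark; start the Spark session only after kinit succeeds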
04-09-2018
10:57 AM
1 Kudo
@hedy In general, one port allows one session (one connection) at a time, so your 1st session connects to the default port 4040, and your 2nd session tries to bind the same port, hits the bind issue, and then tries the next port, which is not working for you. There are two things you need to check:
1. Please make sure port 4041 is open.
2. In your second session, when you run pyspark, pass an available port as a parameter. For example, long ago I used spark-shell with a different port as a parameter; please try the similar option for pyspark (see the pyspark sketch below):
session1: $ spark-shell --conf spark.ui.port=4040
session2: $ spark-shell --conf spark.ui.port=4041
If 4041 is not working, you can try up to about 4056; by default Spark retries the UI port up to spark.port.maxRetries (16) times.
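The pyspark launcher accepts the same --conf option as spark-shell, so a sketch for your two sessions would be:

session1: $ pyspark --conf spark.ui.port=4040
session2: $ pyspark --conf spark.ui.port=4041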
04-09-2018
10:20 AM
@RajeshBodolla I am not sure I get your intention in having multiple DataNodes on the same machine. If you want the DataNode to store data in different/multiple directories on the same machine, then you can use CM -> HDFS -> Configuration -> DataNode Data Directory (dfs.datanode.data.dir) and specify your directories; see the sketch below.
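Outside CM, the same setting maps to dfs.datanode.data.dir in hdfs-site.xml as a comma-separated list; the mount points below are placeholders:

dfs.datanode.data.dir = /data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn

Each directory usually sits on its own physical disk, and by default the DataNode round-robins new block writes across the configured directories.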
02-12-2018
08:07 PM
srinivas ?? 🙂 @Cloudera learning Is it stuck with 1 or 2 blocks left over? As mentioned earlier, you can monitor this from CM -> HDFS -> Web UI -> NameNode Web UI -> a new window will open -> 'Datanodes' menu -> scroll down to Decommissioning (keep refreshing this page to follow the progress, or use the dfsadmin sketch below).
If your answer to the above question is yes, then I have run into similar issues a few times and have worked around them as follows:
1. CM -> Hosts -> Abort the decommission process
2. CM -> HDFS -> Instances -> Node -> Stop
3. Try to decommission the same node again for the leftover blocks
Note: Sometimes it may get stuck again; retry a couple of times.
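For a command-line view of the same progress, a sketch (assuming your Hadoop version supports the -decommissioning filter and you run it as an HDFS superuser):

$ hdfs dfsadmin -report -decommissioning   # lists only the DataNodes currently decommissioning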