Member since: 12-21-2017
Posts: 67
Kudos Received: 3
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 532 | 10-15-2018 10:01 AM
 | 2849 | 03-26-2018 08:23 AM
04-26-2019
06:47 AM
I have met a similar problem. I moved the block pool to another place, but the problem still occurs. How can I fix it? Running su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-datanode/../hadoop/sbin/hadoop-daemon.sh start datanode" outputs the following message: Java HotSpot 64-Bit Server VM warning: cannot open file /var/log/hadoop/hdfs/gc.log-2019... due to No such file or directory
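A minimal sketch of a likely fix, assuming the warning simply means the GC log directory is missing or unwritable (the path is taken from the message above; the ownership is an assumption for a typical HDP install):

```bash
# Recreate the log directory the DataNode JVM wants to write gc.log into,
# then try starting the DataNode again. Adjust ownership to your environment.
sudo mkdir -p /var/log/hadoop/hdfs
sudo chown -R hdfs:hadoop /var/log/hadoop/hdfs
su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-datanode/../hadoop/sbin/hadoop-daemon.sh start datanode"
```

Note that this warning alone usually does not stop the DataNode; if startup still fails, the real cause is likely further down in the DataNode log.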
03-12-2019
01:08 AM
Thanks @Jordan Moore. Since 'Kafka is very memory and disk sensitive', would you recommend installing Kafka brokers on a virtual machine, given that I cannot get more dedicated machines for Kafka?
03-11-2019
03:32 AM
I have met a significant performance problem recently. I have about 30 Spark Streaming applications that read data from Kafka and write it to HDFS. Recently, the write progress on some Spark executors has become very slow. The amount of data per Spark task is similar, but the task durations differ greatly; the slowest is about 4 times the fastest. I checked disk usage, and disk busy time on some hosts is about 80% to 90%. So I suspect it is caused by slow HDFS write speed, because my Kafka brokers, HDFS DataNodes, and YARN NodeManagers are located on the same hosts. Could this actually affect performance?
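A quick way to confirm the disk-contention theory (not from the thread, just a standard check with the sysstat tools):

```bash
# Sample extended per-device I/O statistics every 5 seconds on the suspect hosts.
# Sustained %util near 100% on the disks shared by Kafka and the DataNode
# indicates the services are competing for the same spindles.
iostat -dxm 5
```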
Tags:
- Data Processing
- Kafka
Labels:
- Apache Kafka
12-07-2018
01:44 AM
1 Kudo
Hi @Jagadeesan A S, my current save mode is append. My Spark Streaming apps run every 5 minutes, so it is not convenient to delete manually. So I think the better solution is to customize the temp location. Or can I set an offset for the scheduled running time? For example, my current 2 apps run every 5 minutes, i.e. at 0, 5, 10, 15, 20. Can I set a schedule so that one still runs at 0, 5, 10, 15, and the other runs at 2.5, 7.5, 12.5?
12-06-2018
04:32 AM
Thanks @Jagadeesan A S. _temporary is a temp directory under the path given to df.write.parquet(path) on HDFS. However, the default value of spark.local.dir is /tmp, and the documentation describes it as 'Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system.' So it should be a directory on the local file system. I am not sure spark.local.dir refers to the temp directory Spark writes to ...
12-06-2018
02:16 AM
I have two Spark applications writing data to the same directory on HDFS, which causes the app that completes first to delete the _temporary working directory while it still contains temp files belonging to the other app. So can I specify a separate _temporary directory for each Spark application?
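A workaround sketch, not the accepted answer: the _temporary directory is created by the FileOutputCommitter under the output path itself, so one option is to give each application its own staging directory and move the finished files afterwards. All paths below are placeholders.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("staged-writer").getOrCreate()
val df = spark.read.parquet("/data/input") // placeholder source

// Each application writes into its own staging subdirectory, keyed by the
// application id, so the two jobs never share a _temporary directory.
val staging = s"/data/output/_staging/${spark.sparkContext.applicationId}"
df.write.mode("append").parquet(staging)

// Move only the committed part files into the shared output directory.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
fs.listStatus(new Path(staging))
  .filter(_.getPath.getName.startsWith("part-"))
  .foreach(f => fs.rename(f.getPath, new Path("/data/output", f.getPath.getName)))
```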
Labels:
- Apache Spark
- Apache YARN
11-30-2018
02:36 AM
I use Kafka with the default listener value (PLAINTEXT://localhost:9092); however, it broadcasts its hostname instead of its IP address. As a result, if the producer runs on a host that cannot resolve the Kafka host's name, it cannot send messages to Kafka. How can I resolve this?
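A sketch of the usual fix, assuming the broker's routable IP is 10.0.0.12 (a placeholder): set advertised.listeners in the broker's server.properties so clients are handed an address they can actually reach, then restart the broker.

```properties
# server.properties (placeholder address; use the broker's routable IP or a
# DNS name that every producer host can resolve)
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://10.0.0.12:9092
```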
Labels:
- Apache Kafka
11-20-2018
03:11 AM
OK @Jay Kumar SenSharma. So can I upgrade the Ambari Metrics service individually, rather than upgrading the whole of Ambari?
11-20-2018
02:46 AM
For netstat -anlp | grep :6188 | grep CLOSE_WAIT | wc -l I get 0. Maybe restarting the service solved this problem temporarily. Do you mean the CLOSE_WAIT problem is related to Failed to invoke 'unbuffer' method in class class org.apache.hadoop.fs.FSDataInputStream?
11-20-2018
02:27 AM
I am using Ambari 2.6.1.0. The Ambari Metrics Collector log is stored in the production environment and I cannot export it, but I can give you the error log as a hand-typed copy. In embedded mode, the Ambari Metrics log shows: MetaDataProtos$MetaDataService for row \x00\x00METRIC_RECORD .... Caused by java.lang.InterruptedException. The AMS HBase log shows: FSDataInputStreamWrapper: Failed to invoke 'unbuffer' method in class class org.apache.hadoop.fs.FSDataInputStream. So there may be a TCP socket connection left open in CLOSE_WAIT state .... caused by java.lang.UnsupportedOperationException: this stream does not support unbuffering. All AMS settings are at their defaults. There are 6 nodes in total in the cluster. The host running AMS has 64 cores and 256GB of memory; currently it has 53GB of free memory and 233GB of memory in cache.
11-19-2018
06:46 AM
My Ambari Metrics Collector keeps failing. In the log, I found: Failed to get result with timeout, timeout = 300000ms row 'METRIC_AGGREGATE' on table 'SYSTEM.CATALOG' at region=SYSTEM.CATALOG, ***** host name=[hostname], [port] *** seNum=6 .... caused by .... IOException: Failed to get result with timeout, timeout=300000ms, and then it gets stuck at org.apache.hadoop.hbase.client.AsyncProcess: #1 waiting for 28379 actions to finish. It seems related to HBase, but my HBase runs well without any errors. If I restart the Ambari Metrics Collector, it recovers immediately, then becomes unavailable again after several hours. How can I fix it? Thanks.
Labels:
- Apache Ambari
11-13-2018
08:49 AM
Thanks @KB. And another question: when my Spark application writes a massive amount of data to HDFS, it always throws error messages like the following: No lease on /user/xx/sample_2016/_temporary/0/_temporary/attempt_201604141035_0058_m_019029_0/part-r-19029-1b93e1fa-9284-4f2c-821a-c83795ad27c1.gz.parquet: File does not exist. Holder DFSClient_NONMAPREDUCE_1239207978_115 does not have any open files.
How can I solve this problem? I searched online and others said it is related to dfs.datanode.max.xcievers.
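For reference, a hedged configuration sketch (the value is illustrative, not a recommendation from this thread): dfs.datanode.max.xcievers is the legacy spelling of dfs.datanode.max.transfer.threads, which caps the number of concurrent transfer threads per DataNode; it is raised in hdfs-site.xml and requires a DataNode restart.

```xml
<!-- hdfs-site.xml: illustrative value; tune to the workload and restart the DataNodes -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>
```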
11-13-2018
08:33 AM
I increased the IPC max length according to this: https://community.hortonworks.com/questions/101841/issue-requested-data-length-146629817-is-longer-th.html The HDFS service seems to be back to working.
11-13-2018
07:46 AM
Thanks @KB. I have reset dfs.datanode.max.xcievers to 32768; is that still too high? I increased it to avoid the "No lease on file (inode 5425306)" error. So what is a proper value for this property? And if I set it to a proper value, will the missing blocks be recovered automatically?
11-13-2018
02:21 AM
Yesterday I added three more DataNodes to my HDFS cluster running HDP 2.6.4. A few hours later, because of a Spark writing error (No lease on...), I increased dfs.datanode.max.xcievers to 65536, increased the heap size of the NameNode and DataNodes from 5G to 12G, and then restarted HDFS. However, the HDFS restart paused at the NameNode stage. It showed the NameNode was still in safe mode, and this continued for 10 minutes. I forced it to leave safe mode manually, and then HDFS reported that a lot of blocks were missing (more than 90%). I checked the NameNode and DataNode logs, and there are two kinds of errors: 1. In the NameNode: Requested data length ** is longer than maximum configured RPC length ** 2. In the DataNode: End of file exception between local host is "***", destination host is "**:8020" So how can I recover my missing files? And what is the actual cause of this problem?
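For reference, a sketch of the setting behind error 1 (the value is illustrative; the linked HCC answer in the 08:33 AM reply above is what was actually applied): raising ipc.maximum.data.length in core-site.xml on the NameNode lets it accept the larger block reports now sent by the DataNodes.

```xml
<!-- core-site.xml on the NameNode: illustrative value (128 MB); the default is 64 MB -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```

Afterwards, `hdfs dfsadmin -safemode get` and `hdfs fsck / -list-corruptfileblocks` can confirm whether the "missing" blocks reappear once every DataNode has re-registered; blocks are often reported missing only because the block reports never reached the NameNode.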
Labels:
- Apache Hadoop
10-15-2018
10:01 AM
1 Kudo
Solved by using HttpFS. It provides a gateway, so there is no need to access the DataNodes directly.
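For anyone hitting the same redirect issue, a minimal usage sketch (the host, path, and user below are placeholders), assuming HttpFS runs on its default port 14000: HttpFS exposes the same WebHDFS REST API but streams file content itself, so the client never needs a direct connection to a DataNode.

```bash
# Read a file through the HttpFS gateway instead of WebHDFS on the NameNode.
curl -L "http://httpfs-host:14000/webhdfs/v1/data/sample.txt?op=OPEN&user.name=hdfs"
```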
10-11-2018
06:46 AM
I need to read files on an HDFS cluster that sits on another network, which I can only reach via a proxy. I can currently access directory and file status via WebHDFS on a specific port successfully. However, when I try to read a file's content, it redirects me to a DataNode URL, while I only have access to the NameNode. So how can I read the file in a convenient way?
Tags:
- Hadoop Core
- HDFS
Labels:
- Apache Hadoop
10-10-2018
05:57 AM
@Aditya Sirna So by default, up to 1000 lines of results are stored on HDFS for each query? If I increase the limit, will it have negative effects, such as slow HTTP transfers or failures in receiving the results?
10-10-2018
04:49 AM
@Aditya Sirna Thanks Aditya. So what about paging? Since the whole result is saved on HDFS in JSON format, if I need to load only part of it, do I just load the whole JSON file and cut out a slice in memory given a page size and page number? In practice, will Zeppelin run into out-of-memory problems if the result is very large?
10-10-2018
03:52 AM
I am designing an HDFS query system based on Spark that includes a paging function, and Zeppelin seems to be a good example for me. Now I have a question. I see that Spark or Spark SQL query results still exist even after I refresh or reopen the notebook, so the results must be saved somewhere. I am wondering where this result data is stored. If it is saved in a database, what happens when the result data is so huge that it causes database performance problems?
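Separately from how Zeppelin stores its results, here is a sketch of one way to implement paging in a Spark-based query layer (all paths and column names are placeholders): assign a stable row number and select only the requested page, so only one page is ever collected to the driver.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

val spark = SparkSession.builder().appName("paged-query").getOrCreate()
val result = spark.read.parquet("/data/query_result") // placeholder result set

val pageSize = 100
val pageNumber = 3

// Number the rows by a deterministic ordering (assumes an 'id' column),
// then keep only the rows belonging to the requested page.
val w = Window.orderBy(col("id"))
val page = result
  .withColumn("rn", row_number().over(w))
  .where(col("rn") > (pageNumber - 1) * pageSize && col("rn") <= pageNumber * pageSize)
  .drop("rn")

page.show()
```

A global window without partitioning funnels everything through one task, so for very large results a precomputed row index (or filtering on a sortable key range) scales better.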
Labels:
- Apache Spark
- Apache Zeppelin
08-16-2018
07:15 AM
Hi @Jonathan Sneep. Fine, thanks. I have added the user and group info on my NameNode. So the typical way to add a new user or group is to create the user and group on the NameNode and wait for usersync to sync the user info to Ranger? And if I don't care about group policies, does creating an internal user in Ranger and specifying it in the allow conditions also work? At least it seems to work in practice.
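A minimal sketch of that first step, reusing the placeholder names from the original question (test01 / test_group01): create them as OS accounts on the NameNode host, which is what the HDFS group mapping (and, depending on setup, Ranger usersync) actually reads.

```bash
# On the NameNode host: create the group and the user as OS accounts.
sudo groupadd test_group01
sudo useradd -G test_group01 test01

# Refresh HDFS's cached user-to-group mapping so the new membership is visible
# to the Ranger HDFS plugin without waiting for the cache to expire.
sudo -u hdfs hdfs dfsadmin -refreshUserToGroupsMappings
```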
08-16-2018
01:31 AM
Hi @Jonathan Sneep. Not yet. So I need to add the user and the related group on my NameNode host manually?
08-15-2018
09:39 AM
Hi @Jonathan Sneep, thanks for your response. Actually, both the user and the group were created in Ranger; they are internal to Ranger.
08-15-2018
07:55 AM
I have met a problem with Ranger authorization. Here are my steps to reproduce it: 1. I create one account in Ranger with the username test01. 2. I set it to belong to a group test_group01. 3. In the Ranger HDFS policy, I give test_group01 access to the directory /data/. If this worked correctly, the test01 user should have access to /data/ through the privilege inherited from the group test_group01. But in practice, it cannot access the directory /data. However, if I specify test01 directly under 'Select User', it works well. So it seems that specifying the group in the policy does not work, while specifying the permitted user is fine. How can I solve this? Thanks!
Labels:
- Apache Ranger
07-12-2018
08:13 AM
Thanks Jay. I checked the curl and libcurl versions by running "yum list | grep curl"; the versions are: curl.x86_64 7.19.7-46.el6, libcurl.x86_64 7.19.7-46.el6, python-pycurl.x86_64 7.19.0-8.el6, libcurl.i686 7.19.7-46.el6, libcurl-devel.i686 7.19.7-46.el6, libcurl-devel.x86_64 7.19.7-46.el6. curl -V prints the following info: curl 7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7... Protocols: ... Features: GSS-Negotiate ... If I run the alert_spark2_livy_port.py script independently, it runs well. What confuses me is that all three of my hosts have exactly the same curl version, but only one has the above problem.
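One extra check that might explain the single odd host (not from the thread, just a standard troubleshooting step): see whether another curl binary, for example one bundled with Anaconda, shadows the system curl on that host's PATH, and whether the build that actually runs supports GSS-Negotiate.

```bash
# List every curl on the PATH in order of precedence, then check the features
# of the one that actually runs.
which -a curl
curl -V | grep -i negotiate
```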
07-12-2018
02:45 AM
The Spark Livy alert always reports: Connection failed on host ***:8999. In detail, it prints ExecutionFailed: Execution of 'curl -s -o /dev/null -w'%{http_code}' --negotiate -u: -k http://host:8999/session | grep 200' returned 1, curl: option --negotiate: the installed libcurl version doesn't support this, curl: try curl --help... I have 3 hosts in this cluster, but only one host reports this alert. I have checked the curl and libcurl versions on each host, and they are all the same. It may be caused by installing Anaconda and the Python version changing, but I am not sure, as the default Python version is 2.6. How can I fix it? Thanks!
Labels:
- Apache Spark
04-03-2018
03:32 AM
I am trying to read data from Kafka and write it in Parquet format via Spark Streaming. The problem is that the data from Kafka have a variable structure. For example, app one has columns A, B, C and app two has columns B, C, D, so the DataFrame I read from Kafka has all columns A, B, C, D. When I write the DataFrame to Parquet files partitioned by app name, the Parquet file of app one also contains column D, which is empty and actually holds no data. So how can I filter out the empty columns when writing the DataFrame to Parquet? Thanks!
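A sketch of one possible approach (the 'app' column name and all paths are placeholders, and this is not an answer from the thread): split the DataFrame by app, drop the columns that are entirely null for that app, and write each slice separately instead of relying on partitionBy over the merged frame.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count}

val spark = SparkSession.builder().appName("drop-empty-columns").getOrCreate()
val merged = spark.read.parquet("/data/kafka_merged") // placeholder source

// Keep only the columns that contain at least one non-null value.
def dropEmptyColumns(df: DataFrame): DataFrame = {
  val counts = df.select(df.columns.map(c => count(col(c)).alias(c)): _*).first()
  val nonEmpty = df.columns.filter(c => counts.getAs[Long](c) > 0)
  df.select(nonEmpty.map(c => col(c)): _*)
}

// Write each app's slice on its own so its files never contain empty columns.
val apps = merged.select("app").distinct().collect().map(_.getString(0))
apps.foreach { app =>
  dropEmptyColumns(merged.where(col("app") === app))
    .write.mode("overwrite").parquet(s"/data/output/app=$app")
}
```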
Labels:
- Apache Kafka
- Apache Spark
04-01-2018
08:38 PM
I have submitted a Spark Java program via "Spark Submit Jar" and it appears to be running well.
However, when I click the logs link for the application in the Job tab in Hue, it shows "cannot access: /jobbrowser/jobs/application_****/single_logs".
So how can I find the logs of the running Spark application?
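A common fallback while the Hue link is broken (the application id below is a placeholder): pull the aggregated logs straight from YARN once the application has finished, or follow the executor log links in the ResourceManager UI while it is still running.

```bash
# Replace the id with the real one shown in the ResourceManager UI or the Hue job list.
yarn logs -applicationId application_1234567890123_0042
```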
Labels:
- Apache Spark
- Cloudera Hue
03-26-2018
08:23 AM
Fixed it by restoring the Spark home setting.
03-23-2018
02:58 AM
I am testing Spark within Zeppelin, but when running the tutorial paragraph %spark2.spark
spark.version
it throws the following error: java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:391)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:380)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:828)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:483)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Then I disabled the Hive context according to https://stackoverflow.com/questions/43289067/getting-nullpointerexception-when-running-spark-code-in-zeppelin-0-7-1 , but the same exception is still thrown. How can I solve it?
=========================================================
Update 1: I have checked the Spark interpreter log and found the following error: requirement failed: /python/lib/pyspark.zip not found; cannot run pyspark application in YARN mode. How can I locate this file or configure the path?
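Consistent with the fix noted in the reply above (restoring the Spark home setting), a sketch of where that usually lives (the path is a placeholder for a typical HDP install): the interpreter looks for python/lib/pyspark.zip relative to SPARK_HOME, so pointing SPARK_HOME at a real Spark installation in zeppelin-env.sh removes the "requirement failed" message.

```bash
# conf/zeppelin-env.sh (placeholder path; point it at the actual Spark 2 install)
export SPARK_HOME=/usr/hdp/current/spark2-client
```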
Labels:
- Apache Spark
- Apache Zeppelin