Member since: 01-05-2018
Posts: 24
Kudos Received: 0
Solutions: 0
02-19-2018
06:09 AM
Hi, I am trying to connect to a Spark cluster from RStudio on my local machine. When I set up the connection and choose "Cluster" from the "Master:" drop-down menu, I receive the message shown in the attached screenshot (capture.jpg): "Connecting with a remote Spark cluster requires an RStudio Server instance...". I am very new to this. Can anyone shed any light on how to overcome this problem? Thanks!
Labels: Apache Spark
01-07-2018
08:36 AM
thank you. I was following this procedure; however, there was an error in my CSV file. For some reason I needed to re-save it as CSV.
01-07-2018
08:34 AM
I was looking at the wrong directory. This appears to have resolved the issue. Thank you.
01-07-2018
08:33 AM
Hi
I am using getmerge to combine multiple files like this:
hdfs dfs -getmerge /user/maria_dev/Folder3/* /Folder3/output1.csv
How can I exclude the header of each file? When I upload the merged file into a Hive table, each file's header row is repeated.
Alternatively, is there a way in Hive to exclude the header rows? If I merge two files and upload the result into Hive, I get two lines of headers, and so on.
When I created my table, I included the following:
TBLPROPERTIES ("skip.header.line.count"="1");
However, this only skips the first line. How can I exclude the rest of the headers?
Thanks
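One possible approach, sketched below under the assumption that a local working directory (here ~/merged, a hypothetical name) is available: strip the first line of every file while concatenating, so no header rows reach the merged output at all, then push the result back to HDFS.
# Sketch: merge the CSVs under /user/maria_dev/Folder3 while dropping each file's header line.
# The local directory ~/merged is an assumption; adjust paths to your environment.
mkdir -p ~/merged
: > ~/merged/output1.csv
for f in $(hdfs dfs -ls /user/maria_dev/Folder3/ | grep '^-' | awk '{print $NF}'); do
  hdfs dfs -cat "$f" | tail -n +2 >> ~/merged/output1.csv   # tail -n +2 skips line 1 (the header)
done
hdfs dfs -put ~/merged/output1.csv /user/maria_dev/          # optional: copy the merged file back to HDFS
Since no header rows survive the merge, the table would not need skip.header.line.count at all in this case.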
Labels: Apache Hive
01-07-2018
05:32 AM
Hi, I have only just begun to learn Hive. I am trying to upload a CSV file, but it does not look correct in the Ambari view; please see the attached screenshot (pic8.jpg). I cannot set up my columns because the data does not appear to be separating correctly. Have I missed a step or done something wrong? I have the field delimiter set to ",".
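As a quick sanity check (a sketch only; the file path below is hypothetical), inspecting the raw file from the sandbox shell shows whether the values are really comma-separated or whether the file uses another delimiter or quoted fields:
head -n 3 /path/to/your.csv                   # hypothetical path; eyeball the separator between fields
head -n 1 /path/to/your.csv | od -c | head    # reveals hidden characters such as \t, ; or \r\n line endings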
01-07-2018
02:51 AM
@ Jay Kumar SenSharma do you mean from WinSCP? I am not sure. I have included pictures in my previous post; if you could take a look, that would be most appreciated. When I create a Folder1 directory, it does not show up under root - home - maria_dev.
01-07-2018
02:18 AM
@ Jay Kumar SenSharma are you referring to the WinSCP directory? If so, please see the attached screenshot (pic6.jpg). There is no Folder1 here, which is why I am still confused. Pic7 (pic7.jpg) shows the directory when logged in as root (pic6 was as maria_dev); Folder1 does show up there, but I am still unsure how to proceed. Should I be logged into WinSCP as root or as maria_dev?
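For orientation, a small sketch from the PuTTY session: the two logins land in different home directories, which is usually why a folder appears under one WinSCP login but not the other.
ls -l /root            # home directory of root (what WinSCP shows when connected as root)
ls -l /home/maria_dev  # home directory of maria_dev (what WinSCP shows when connected as maria_dev)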
01-07-2018
02:06 AM
@ Jay Kumar SenSharma I ran "ls -l" to check the WRITE permission, as shown in the attached picture (pic5.jpg). I also created test.txt, but where should I see it now? It is not showing in the listing, and it is not showing in my WinSCP directory either.
01-07-2018
01:48 AM
@ Jay Kumar SenSharma thank you for your help with this. I am still not quite there. Folder1 definitely exists, as shown in the attached screenshot (pic4.jpg), with two files. I then tried: chmod 777 -R Folder1 but when I ran the getmerge command again, it produced the same error.
01-07-2018
01:25 AM
@ Jay Kumar SenSharma thank you. The Folder1 folder does indeed exist. This worked for me: hadoop fs -getmerge /user/maria_dev/Folder1/* output.csv However, I cannot seem to use any of the "hdfs dfs" commands above. The command gave me an output file, but it contained only the first file, i.e. it did not append the second file to it.
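One thing that may be worth trying, sketched under the assumption that Folder1 contains only the files to be merged: getmerge also accepts the directory itself as the source, in which case it concatenates every file inside it into the local destination.
hadoop fs -getmerge /user/maria_dev/Folder1 output.csv   # directory as source: all files in it are concatenated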
01-07-2018
12:53 AM
@ Jay Kumar SenSharma thank you. Two questions: 1. Is there a way to merge the files directly on HDFS, or do you need to merge to local file system then put back on HDFS? 2. I followed your instructions but on point no. 4 I used: hdfs dfs -getmerge /user/maria_dev/Folder1/* /Folder1/output.csv I have a folder called Folder1 on HDFS and it is also the same folder on local system, but got the same error: getmerge: Mkdirs failed to create file:/Folder1 (exists=false, cwd=file:/home/maria_dev) Not sure why this occurred. Have I missed a step or typed incorrectly? Thanks
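On the first question, a sketch of one pattern that avoids keeping a local copy: stream the files through the client and write the concatenation straight back to HDFS (hdfs dfs -put reads from stdin when the source is given as "-"). Paths are taken from the post; the data still passes through the machine running the command.
hdfs dfs -cat /user/maria_dev/Folder1/* | hdfs dfs -put - /user/maria_dev/output.csv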
01-07-2018
12:43 AM
@ Jay Kumar SenSharma thank you. Two questions: 1. Is there a way to merge the files directly from HDFS, or do you need to merge them to local file system and then back to HDFS? 2. I was following your instructions, but on point 4 with getmerge, I used this: hdfs dfs -getmerge /user/maria_dev/Folder1/* /maria_dev/Folder1/output.csv I have a folder called Folder1 (it is also on local file system under maria_dev folder as Folder1 but get the same error: getmerge: Mkdirs failed to create file:/maria_dev/Folder1 (exists=false, cwd=file:/home/maria_dev) Have I missed a step or written this incorrectly? Thanks
01-07-2018
12:04 AM
How can I join all the files in one folder into a single CSV file? I have a folder called Folder1 and I want to combine its files into a file called "output.csv". I tried:
hadoop fs -getmerge Folder1 /user/maria_dev/output.csv but I get the error:
getmerge: Mkdirs failed to create file:/user/maria_dev (exists=false, cwd=file:/home/maria_dev) I also tried: hadoop fs -cat Folder1 /output.csv but receive the error: No such file or directory. Thanks
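Reading the error, the second argument to getmerge is a path on the local file system of the machine running the command, and /user/maria_dev does not exist there (the working directory is file:/home/maria_dev). A sketch that writes to the local home directory instead, then copies the result back to HDFS:
hadoop fs -getmerge Folder1 /home/maria_dev/output.csv       # local destination that already exists
hadoop fs -put /home/maria_dev/output.csv /user/maria_dev/   # then push the merged file back to HDFS if needed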
01-07-2018
12:01 AM
@ Jay Kumar SenSharma thank you for letting me know, as I am new to the community. I will mark this as answered and open another question.
01-06-2018
11:27 PM
@ Jay Kumar SenSharma many thanks. I have successfully moved a file (and now subsequently a folder which was my initial aim) to the maria_dev folder. How can I now join all the files in the folder to one single csv file? I tried: hadoop fs -getmerge Folder1 /user/maria_dev/output.csv But I get the error: getmerge: Mkdirs failed to create file:/user/maria_dev (exists=false, cwd=file:/home/maria_dev) I am trying to join all the files in Folder1 folder to a file called "output.csv" in the same folder. Thanks
01-06-2018
10:11 PM
@ Harald Berghoff thank you for your reply. I have installed WinSCP and I can see 2 drop-down folders: 1. / <root> 2. root. Which one is the correct directory? I have moved one file over as an example (called sample.txt) from my PC to the root folder, as shown in the attached screenshot. How can I now move this file over to HDFS (or the Sandbox?) from the command line with PuTTY? I am logged into PuTTY as shown in the attached screenshots (pic1.jpg, pic2.jpg).
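Once sample.txt is sitting on the sandbox's local file system (for example in /root, which is an assumption based on copying it to the root folder), a sketch of the final hop into HDFS from the PuTTY session:
hdfs dfs -put /root/sample.txt /user/maria_dev/   # assumes the file landed in /root via WinSCP
hdfs dfs -ls /user/maria_dev                      # confirm it arrived
# if the write is refused because /user/maria_dev is owned by maria_dev, switch user first: su - maria_dev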
01-06-2018
06:10 AM
Hi, I am completely new to the Sandbox and I am struggling to connect it to my local Windows PC. All I have done is the following:
1. Downloaded and installed Oracle VirtualBox Manager.
2. Imported the Hortonworks Sandbox appliance.
3. Started my VM and logged into Ambari via my browser as maria_dev.
Here is where I have a problem. How can I get the VM or Hortonworks to talk to my Windows PC so that I can transfer files via the command line? I know that I can upload via Ambari, but I wish to transfer multiple files at once. What other steps are required? Do I need to set up another user in the Sandbox? Any help or advice for a newbie would be greatly appreciated. Thanks.
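For bulk transfers from the Windows side, one option (a sketch only; the C:\data path is hypothetical) is pscp, which ships with PuTTY and accepts wildcards, pointed at the forwarded SSH port 2222 used elsewhere in this thread:
pscp -P 2222 C:\data\*.csv root@127.0.0.1:/root/   # run from a Windows command prompt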
01-05-2018
09:48 AM
@ Jay Kumar SenSharma OK, thank you for your help anyway. I appreciate you replying to me even though I have had no success with it.
01-05-2018
09:27 AM
@ Jay Kumar SenSharma thank you for your reply. From PuTTY I used your command above: ssh root@127.0.0.1 -p 2222 and received the following error: ssh: connect to host 127.0.0.1 port 2222: Connection refused. Can you also please explain what you mean by "you will need to SCP your files to Sandbox and then from Sandbox you can put them to HDFS"?
01-05-2018
09:13 AM
@ Jay Kumar SenSharma thank you, however I wish to upload multiple files at once from a directory. In Ambari File View, I can only upload a single file if I understand correctly.
01-05-2018
09:10 AM
@ Jay Kumar SenSharma Could you please explain how the user who is running the hadoop command can be made to belong to the "hadoop" group?
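For reference, a sketch of checking and changing group membership on the sandbox (run as root); the group name hadoop is taken from the question above:
id maria_dev                     # lists the groups the user currently belongs to
usermod -a -G hadoop maria_dev   # -a -G appends the hadoop group without removing existing ones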
01-05-2018
09:05 AM
@ Jay Kumar SenSharma you are correct, I am trying to push the file from my (Windows) desktop to the Sandbox. The file does exist; however, I am unsure whether the correct permissions have been set. The output is: ls: cannot access /Users/Matt/dir3/sample.txt: No such file or directory I understand that I can use the Ambari File View, thank you; however, I want to upload multiple files at once from a directory (the sample file is just a test).
01-05-2018
08:44 AM
@ Jay Kumar SenSharma I have a file on my desktop called sample.txt (in location /Users/Matt/dir3/sample.txt). I have tried this: hadoop fs -copyFromLocal /Users/Matt/dir3/sample.txt /user/maria_dev/ and receive the error: copyFromLocal: `/Users/Matt/dir3/sample.txt': No such file or directory
01-05-2018
08:37 AM
Hi, I am a brand new user. I have installed the Hortonworks Sandbox in Oracle VirtualBox Manager. I have logged into Ambari at 127.0.0.1 using the maria_dev user name. I have installed PuTTY and set up a connection as maria_dev@sandbox-hdp. I cannot copy a file from my local directory to HDFS. Do I need to set permissions, or have I missed a step in the setup process? Any assistance would be greatly appreciated. Thanks