Member since: 02-04-2016
Posts: 189
Kudos Received: 70
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3763 | 07-12-2018 01:58 PM
 | 7884 | 03-08-2018 10:44 AM
 | 3768 | 06-24-2017 11:18 AM
 | 23310 | 02-10-2017 04:54 PM
 | 2296 | 01-19-2017 01:41 PM
06-15-2016
04:41 PM
Thanks, @Jitendra Yadav. The output is too large to paste in full here, so I'm trying out the "upload file" feature you have here: hive-to-s3-output.txt
(Note: I replaced the names of our servers and S3 bucket, but it should still be pretty clear. The folder I tried to use is /HDFS_ToS3_Testing/Hive2/.)
06-15-2016
02:53 PM
1 Kudo
I've been experimenting with the options for copying data from our (bare metal) cluster to S3.
I found that something like this works:
hive> create table aws.my_table
> (
> `column1` string,
> `column2` string,
....
> `columnX` string)
> row format delimited fields terminated by ','
> lines terminated by '\n'
> stored as textfile
> location 's3n://my_bucket/my_folder_path/';
hive> insert into table aws.my_table select * from source_db.source_table;
But this only works when the source data set is fairly small. For a larger data set (tens of GB), it fails with errors like:
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: n must be positive
at
...
I understand that pushing gigabytes (and eventually terabytes) of data to a remote server is going to be somewhat painful, so I'm wondering what kinds of customizations are available. Is there a way to specify compression, upload throttling, etc.? Can anyone give me instructions for getting around these errors?
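For reference, Hive does expose output-compression settings that can be toggled per session. A sketch of what that looks like (these are standard Hive/MapReduce properties; I haven't confirmed they help with this particular failure):

```
hive> set hive.exec.compress.output=true;
hive> set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
hive> insert into table aws.my_table select * from source_db.source_table;
```

With these set, the text files written to the S3-backed table location should come out gzip-compressed, which would at least reduce the bytes pushed over the wire.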
Labels:
- Apache Hive
06-15-2016
09:39 AM
I didn't understand the difference between s3 and s3n. This link helped: http://stackoverflow.com/questions/10569455/difference-between-amazon-s3-and-s3n-in-hadoop Thanks again.
06-14-2016
10:44 PM
I added the appropriate entries to the Hive and HDFS configs in Ambari (as specified here: https://community.hortonworks.com/articles/25578/how-to-access-data-files-stored-in-aws-s3-buckets.html), and gave this a try:

hdfs dfs -put /user/my_user/my_hdfs_file s3://my_bucket/my_folder

I got the error:

-put: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).

I noticed that the instructions mention these settings:

fs.s3n.awsAccessKeyId
fs.s3n.awsSecretAccessKey

but the error message mentions these:

fs.s3.awsAccessKeyId
fs.s3.awsSecretAccessKey

Once I made that change, I was able to make some progress. However, I think I still need a little help. In your example, you show s3://bucket/hivetable as the destination, but our S3 instance doesn't have tables, just folders. When I try to point at a folder, I get an error:

put: /<folder name> doesn't exist

Do I need to use the other syntax ("create external table ... LOCATION 's3n://mysbucket/'") to create a TABLE in S3 and then access it that way? Or is there a similar way to simply transfer a file from HDFS to a FOLDER in an S3 bucket? cc @Jitendra Yadav Thanks!
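As an aside, the credentials can also be passed per-command with Hadoop's generic -D options instead of living in the configs. A sketch, using the same placeholder bucket and paths as above (key names are the real s3n property names, but I haven't verified this exact invocation on our cluster):

```shell
# Credentials supplied per-command via generic -D options (placeholders);
# the trailing slash makes the target read as a directory, not a file name.
hdfs dfs -Dfs.s3n.awsAccessKeyId=MY_ACCESS_KEY \
         -Dfs.s3n.awsSecretAccessKey=MY_SECRET_KEY \
         -put /user/my_user/my_hdfs_file s3n://my_bucket/my_folder/
```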
06-13-2016
04:30 PM
Thanks, @Jitendra Yadav. Is this baked into HDP, or are there Amazon-related binaries that I need in order for this to work?
06-13-2016
03:27 PM
1 Kudo
I want to copy some data from Hive tables on our (bare metal) cluster to S3. I know I can export the data out of HDFS to a CSV file and upload that to S3, but I'm guessing there are better ways to accomplish this. Any ideas?
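One approach that avoids the CSV round-trip is distcp, which copies the table's underlying HDFS files to S3 in parallel map tasks. A sketch (the bucket name is a placeholder, and /apps/hive/warehouse is assumed to be the default HDP warehouse location; check your actual table path first):

```shell
# Copy the table's HDFS files straight to S3 with distcp.
# /apps/hive/warehouse is the assumed default HDP warehouse path;
# my_bucket/my_folder_path are placeholders.
hadoop distcp \
  /apps/hive/warehouse/source_db.db/source_table \
  s3n://my_bucket/my_folder_path/
```

This copies raw table files (whatever format the table is stored in), so it suits a straight backup/transfer better than a format conversion.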
Labels:
- Apache Hive
- HDFS
05-24-2016
02:08 PM
Thanks, @Yogeshprabhu. I changed my directory user/group structure to match yours, and it appears to be working now. I would say this is a bug in the installation.
05-24-2016
01:40 PM
(sorry for the ugly formatting - it looks fine when I submit it)
05-24-2016
01:39 PM
What should the permissions be?
[root@myserver log]# ls -alh /usr/hdp/current/zeppelin-server/lib
total 1.4M
drwxrwxrwx. 7 root root 4.0K May 20 12:25 .
drwxr-xr-x. 3 root root 4.0K May 19 17:41 ..
drwxr-xr-x. 2 root root 4.0K May 19 17:41 bin
lrwxrwxrwx. 1 root root 18 May 19 17:41 conf -> /etc/zeppelin/conf
drwxr-xr-x. 21 root root 4.0K May 19 17:42 interpreter
drwxr-xr-x. 2 root root 4.0K May 19 17:42 lib
-rw-r--r--. 1 root root 14K Apr 20 02:27 LICENSE
drwxr-xr-x. 2 root root 4.0K May 20 12:25 notebook
-rwxrwxrwx. 1 root root 6.7K Apr 20 02:27 README.md
drwxrwxrwx. 3 root root 4.0K May 19 17:58 webapps
-rwxrwxrwx. 1 root root 76K Apr 25 06:22 zeppelin-server-0.6.0.2.4.2.0-258.jar
-rwxrwxrwx. 1 root root 1.3M Apr 25 06:22 zeppelin-web-0.6.0.2.4.2.0-258.war
[root@myserver log]# ls -alh /usr/hdp/current/zeppelin-server/lib/notebook/
total 8.0K
drwxr-xr-x. 2 root root 4.0K May 20 12:25 .
drwxrwxrwx. 7 root root 4.0K May 20 12:25 ..
[root@myserver log]#
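For what it's worth, the notebook directory above is owned by root with no write access for anyone else (drwxr-xr-x), so if Zeppelin runs as its own service user it can't create note folders there. A sketch of the likely fix (assuming the service user is "zeppelin" — verify with ps -ef | grep zeppelin):

```shell
# Give the Zeppelin service user ownership of the notebook directory.
# ("zeppelin:zeppelin" is an assumption; check the actual service user.)
chown -R zeppelin:zeppelin /usr/hdp/current/zeppelin-server/lib/notebook
```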
05-24-2016
01:25 PM
We installed Zeppelin along with our upgrade to 2.4.2. In Ambari, everything looks good, but when I try to actually interact with it, nothing seems to happen. When I look at the log file on the server, it shows the error below. Could this just be a permissions issue?

INFO [2016-05-24 09:15:00,707] ({qtp895821303-5595} NotebookServer.java[onClose]:213) - Closed connection to 10.1.165.24 : 63750. (1001) null
WARN [2016-05-24 09:15:01,104] ({qtp895821303-5592} SecurityRestApi.java[ticket]:79) - {"status":"OK","message":"","body":{"principal":"anonymous","ticket":"anonymous","roles":"[]"}}
INFO [2016-05-24 09:15:01,154] ({qtp895821303-5592} NotebookServer.java[onOpen]:92) - New connection from 10.1.165.24 : 63761
ERROR [2016-05-24 09:15:13,897] ({qtp895821303-5593} NotebookServer.java[onMessage]:207) - Can't handle message
org.apache.commons.vfs2.FileSystemException: Could not create folder "file:///usr/hdp/current/zeppelin-server/lib/notebook/2BNKRY7B6".
at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:999)
at org.apache.zeppelin.notebook.repo.VFSNotebookRepo.save(VFSNotebookRepo.java:218)
at org.apache.zeppelin.notebook.repo.NotebookRepoSync.save(NotebookRepoSync.java:144)
at org.apache.zeppelin.notebook.Note.persist(Note.java:439)
at org.apache.zeppelin.notebook.Notebook.createNote(Notebook.java:159)
at org.apache.zeppelin.notebook.Notebook.createNote(Notebook.java:133)
at org.apache.zeppelin.socket.NotebookServer.createNote(NotebookServer.java:497)
at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:147)
at org.apache.zeppelin.socket.NotebookSocket.onMessage(NotebookSocket.java:56)
at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455$WSFrameHandler.onFrame(WebSocketConnectionRFC6455.java:835)
at org.eclipse.jetty.websocket.WebSocketParserRFC6455.parseNext(WebSocketParserRFC6455.java:349)
at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455.handle(WebSocketConnectionRFC6455.java:225)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.commons.vfs2.FileSystemException: Could not create directory "/usr/hdp/current/zeppelin-server/lib/notebook/2BNKRY7B6".
at org.apache.commons.vfs2.provider.local.LocalFile.doCreateFolder(LocalFile.java:153)
at org.apache.commons.vfs2.provider.AbstractFileObject.createFolder(AbstractFileObject.java:988)
Labels:
- Apache Zeppelin