Member since: 04-08-2016
Posts: 48
Kudos Received: 4
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 6618 | 04-15-2016 11:18 PM |
04-25-2016
02:02 AM
Thanks Pierre. Your info is very useful. I have cloned the project to my instance and I am about to create the .jars. Which classes do you use to build your storm-twitter-0.0.1-SNAPSHOT.jar?
04-24-2016
04:05 PM
Thanks Pierre. Storm was in good shape, with everything green. I changed nimbus.host to my host and it worked. 🙂 Now that I have been able to run my first topology on Storm, I am looking into ways to adapt it to Twitter streams. Thanks for the link to https://github.com/pvillard31/storm-twitter. I will use it for my initial test. I am about to clone your project repo and start working on it from my instance. I also have some questions about the connection to the Twitter API. Would you have documentation explaining these two processes step by step? Merci!
04-23-2016
02:45 AM
Thanks Pierre. I am starting with this tutorial on my 3-node cluster (I am not using the sandbox) to get familiar with Storm: http://hortonworks.com/hadoop-tutorial/ingesting-processing-real-time-events-apache-storm

I followed all the steps up to the point where I have to run:

[root@sandbox ~]# storm jar storm-starter-0.0.1-storm-0.9.0.1.jar storm.starter.WordCountTopology WordCount -c storm.starter.WordCountTopology WordCount -c nimbus.host=sandbox.hortonworks.com

Here I am getting some errors. Should I run this command from a specific folder? Here is a partial description of the error:

at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:271) [storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:157) [storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at storm.starter.WordCountTopology.main(WordCountTopology.java:77) [storm-starter-0.0.1-storm-0.9.0.1.jar:?]
Caused by: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Conexão recusada
at org.apache.thrift7.transport.TSocket.open(TSocket.java:187) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at org.apache.thrift7.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at backtype.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:102) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:48) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
... 8 more
Caused by: java.net.ConnectException: Conexão recusada
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_60]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_60]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_60]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_60]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_60]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_60]
at org.apache.thrift7.transport.TSocket.open(TSocket.java:182) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at org.apache.thrift7.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at backtype.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:102) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
at backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:48) ~[storm-core-0.10.0.2.4.0.0-169.jar:0.10.0.2.4.0.0-169]
... 8 more
Exception in thread "main" java.lang.RuntimeException: Could not find leader nimbus from seed hosts [ip-172-31-34-25.sa-east-1.compute.internal]. Did you specify a valid list of nimbus hosts for config nimbus.seeds
at backtype.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:90)
at backtype.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:225)
at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:271)
at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:157)
at storm.starter.WordCountTopology.main(WordCountTopology.java:77)

Thanks- Wellington
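A "Conexão recusada" (connection refused) error at TSocket.open usually means nothing is listening at the address the client tried. Before resubmitting, a quick reachability check may help narrow it down. The following is only a minimal sketch: the helper name `can_connect` is made up for illustration, and it assumes Nimbus listens on Storm's default Thrift port 6627 (`nimbus.thrift.port`), which may differ on a given cluster.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests that something is
    listening, not that the listener is actually a Storm Nimbus.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures
        return False
```

For example, `can_connect("ip-172-31-34-25.sa-east-1.compute.internal", 6627)` uses the seed host named in the error message. A `False` result would suggest checking that Nimbus is running and that the host in nimbus.seeds (or nimbus.host) matches the cluster's actual Nimbus node.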
04-22-2016
02:42 AM
Thanks for the info, Andrew!
04-22-2016
02:41 AM
Thanks Artem. Point 2 solved it! My JSON is identical to yours.
04-22-2016
01:15 AM
Hey Artem. I was able to get it working by doing:

CREATE TABLE TwitterTest(
  createddate string,
  geolocation string,
  tweetmessage string,
  `user` struct<geoenabled:boolean, id:int, name:string, screenname:string, userlocation:string>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/sample_twitter_data.txt' OVERWRITE INTO TABLE TwitterTest;

However, when I try to run the query you did, I get the following error:

Error while compiling statement: FAILED: ParseException line 1:39 cannot recognize input near 'user' '.' 'name' in selection target [ERROR_STATUS]

When I look at the table structure by doing a simple SELECT * FROM TwitterTest LIMIT 10; I see that all the fields inside the user struct are in the same column (twittertest.user), whereas the other fields (createddate, geolocation, tweetmessage) have their own columns. Is that normal? Thanks-
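The single twittertest.user column is what a struct looks like: one nested value whose fields are reached with a dotted path. The sketch below illustrates that idea outside Hive, using a made-up sample record shaped like the table definition (the values and the `get_field` helper are hypothetical, not from the original data). In Hive itself, the ParseException is probably caused by `user` being a reserved word, so the reference likely needs the same backticks used in the CREATE TABLE statement.

```python
import json

# Hypothetical sample record, shaped like the TwitterTest table definition
sample_line = (
    '{"createddate": "2016-04-21", "geolocation": "", '
    '"tweetmessage": "hello", '
    '"user": {"geoenabled": false, "id": 1, "name": "alice", '
    '"screenname": "alice01", "userlocation": "Sao Paulo"}}'
)


def get_field(record: dict, path: str):
    """Follow a dotted path such as 'user.name' through nested dicts."""
    value = record
    for key in path.split("."):
        value = value[key]
    return value


record = json.loads(sample_line)

# The whole struct is one value, which is why it shows up as one column
print(get_field(record, "user"))
# A single field inside the struct is addressed with a dotted path
print(get_field(record, "user.name"))
```

Top-level keys such as `tweetmessage` map to their own columns, while everything under `user` stays grouped until a specific field is selected.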
04-22-2016
12:49 AM
Thanks Artem! Quick question: what does the folder '/user/root/' refer to? In my example I am specifying the /tmp/ folder, where I store my Twitter sample file.

CREATE TABLE TwitterExample_0(
  createddate string,
  geolocation string,
  tweetmessage string,
  `user` struct<geoenabled:boolean, id:int, name:string, screenname:string, userlocation:string>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/tmp/';

I am getting this error when I try a simple SELECT * FROM TwitterExample_0 LIMIT 10;

trace":"org.apache.ambari.view.hive.client.HiveErrorStatusException: H170 Unable to fetch results. java.io.IOException: org.apache.hadoop.security.AccessControlException: Permission denied: user=admin, access=READ_EXECUTE, inode=\"/tmp/ambari-qa\":ambari-qa:hdfs:drwx------\n\tat....

----------------------------------------------------

[hdfs@ip-172-31-34-25 ~]$ hadoop fs -chmod 777 /tmp/
[hdfs@ip-172-31-34-25 ~]$ hadoop fs -ls /tmp/
drwx------   - ambari-qa hdfs    0 2016-04-16 17:47 /tmp/ambari-qa
drwxr-xr-x   - hdfs      hdfs    0 2016-04-16 17:43 /tmp/entity-file-history
drwx-wx-wx   - hive      hdfs    0 2016-04-16 20:10 /tmp/hive
-rwxr-xr-x   3 hdfs      hdfs 1902 2016-04-16 17:45 /tmp/id1fac3f21_date451616
-rwxr-xr-x   3 ambari-qa hdfs 1902 2016-04-16 17:50 /tmp/idtest.ambari-qa.1460843437.95.in
-rwxr-xr-x   3 ambari-qa hdfs  957 2016-04-16 17:50 /tmp/idtest.ambari-qa.1460843437.95.pig
-rwxrwxrwx   3 hdfs      hdfs 2755 2016-04-21 20:35 /tmp/sample_twitter_data.txt
drwxr-xr-x   - ambari-qa hdfs    0 2016-04-16 17:48 /tmp/tezsmokeinput
drwxr-xr-x   - ambari-qa hdfs    0 2016-04-16 17:48 /tmp/tezsmokeoutput

Any help appreciated. Thanks-
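The error names /tmp/ambari-qa rather than the sample file: with LOCATION '/tmp/' the table covers everything under that directory, and `hadoop fs -chmod 777 /tmp/` without `-R` changes only /tmp itself. The sketch below reads the mode strings from the listing the way the owner/group/other check works in outline. It is a simplified illustration, not HDFS's actual implementation: the function name is made up, superuser and ACL handling are ignored, and it assumes the user admin belongs to no relevant group (the post does not say).

```python
def can_read_execute(mode: str, owner: str, group: str,
                     user: str, user_groups: set) -> bool:
    """Check READ_EXECUTE (r and x) on an ls-style mode string.

    Simplified model: pick the owner, group, or other triad, then
    require both the read and execute bits. No superuser or ACL logic.
    """
    perms = mode[1:]  # drop the leading file-type character ('d' or '-')
    if user == owner:
        triad = perms[0:3]
    elif group in user_groups:
        triad = perms[3:6]
    else:
        triad = perms[6:9]
    return triad[0] == "r" and triad[2] == "x"


# Entries copied from the listing above: (mode, owner, group, path)
entries = [
    ("drwx------", "ambari-qa", "hdfs", "/tmp/ambari-qa"),
    ("drwx-wx-wx", "hive", "hdfs", "/tmp/hive"),
    ("-rwxrwxrwx", "hdfs", "hdfs", "/tmp/sample_twitter_data.txt"),
]

for mode, owner, group, path in entries:
    # Assumption: 'admin' is in no group that matters here
    allowed = can_read_execute(mode, owner, group, "admin", set())
    print(f"{path}: admin READ_EXECUTE allowed = {allowed}")
```

Under these assumptions the sample file is readable but the sibling directories are not, which is one reason to point a table at a dedicated directory containing only its data files.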
04-21-2016
11:19 PM
Thanks for the clarifications, Benjamin. It makes sense. I did as you instructed and I am getting an error related to the salary.employee_number... I have checked the expression and all the naming seems accurate. Have you had this error? Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:227 Invalid table alias or column reference 'salary': (possible column names are:.....
04-21-2016
11:20 AM
1 Kudo
Hello, I am pretty new to Storm and I am getting started by trying to process some tweet streams with it. What would be the basic steps to get started? I am aware that there is a streaming API for it (https://dev.twitter.com/streaming/overview), but how would I integrate it with my Storm components to make it work? Any insights appreciated. Thanks!
Labels: Apache Storm
04-21-2016
11:16 AM
1 Kudo
Hello there, I am creating a table for storing JSON Twitter data. I see different ways of using org.apache.hcatalog.data.JsonSerDe for it, but what would be the simplest way to use it for this purpose? Where should I get org.apache.hcatalog.data.JsonSerDe, and how do I integrate it into my Hive instance?
Thanks!
Labels: Apache Hive