1973 Posts
1225 Kudos Received
124 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1999 | 04-03-2024 06:39 AM |
| | 3168 | 01-12-2024 08:19 AM |
| | 1725 | 12-07-2023 01:49 PM |
| | 2505 | 08-02-2023 07:30 AM |
| | 3517 | 03-29-2023 01:22 PM |
10-18-2016 06:54 PM
HiveServer2 setup guide: https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
See examples here:
https://community.hortonworks.com/articles/4103/hiveserver2-jdbc-connection-url-examples.html
https://community.hortonworks.com/questions/456/i-have-nifi-setup-on-my-cluster-but-it-only-works.html
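Once HiveServer2 is up, a quick connectivity check from the command line might look like this (hypothetical host and user; 10000 is the default binary-transport port):

# Hypothetical host/user -- connect to HiveServer2 with beeline
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default" -n youruser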
10-18-2016 03:05 PM
http://hortonworks.com/apache/knox-gateway/
https://knox.apache.org/
Knox sits on top of Kerberos. Try this out: http://hortonworks.com/hadoop-tutorial/securing-hadoop-infrastructure-apache-knox/
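To sanity-check a Knox install, something like the following lists HDFS through the gateway (hypothetical gateway host; guest/guest-password is only the demo LDAP credential from the tutorials):

# Hypothetical gateway host; guest/guest-password is the demo LDAP credential
curl -iku guest:guest-password "https://knoxhost.example.com:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS"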
10-17-2016 04:51 PM
Can you access Oracle on that port with that driver? Is there a firewall between the Sqoop machine and Oracle? Can you access Hive from that machine? A good first test is just to access Hive, access Oracle, and make sure those aren't issues. You can also try a simple import following the docs:
http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_how_the_standard_oracle_manager_works_for_imports
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_data-access/content/using_sqoop_to_move_data_into_hive.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_dataintegration/content/ch_using-sqoop.html
It could also be a permissions issue, very possibly in HDFS under /user/hive/warehouse/yourusername. Check the output of hdfs dfs -ls /user/hive/warehouse against your currently logged-in user. If you are admin or root you may not have HDFS write permissions, and may need to do:
sudo -u hdfs hdfs dfs -chmod -R 777 /user/hive/warehouse/youruser
sudo -u hdfs hdfs dfs -chown youruser /user/hive/warehouse/youruser
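As a minimal sketch of such an import (hypothetical Oracle host, service name, user, and table; the flags are standard Sqoop 1.4 options):

# Hypothetical host/service/user/table -- a minimal Oracle-to-Hive import
sqoop import \
  --connect jdbc:oracle:thin:@//oraclehost.example.com:1521/ORCL \
  --username SCOTT \
  --password-file /user/youruser/.oracle.password \
  --table EMPLOYEES \
  --hive-import --hive-table default.employees \
  -m 1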
10-14-2016 08:19 PM
Turn on logging. See:
http://hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/
https://hortonworks.com/hadoop-tutorial/manage-security-policy-hive-hbase-knox-ranger/
10-14-2016 07:53 PM
Seems like everything is here:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Metrics_API
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_yarn_resource_mgt/content/ch_yarn_rest_apis.html
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#container
On the command line, yarn application -list gets your list of application IDs, then:
yarn container -list application_1475786720159_0003
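If you want the REST route instead, a quick check against the Cluster Metrics API could look like this (hypothetical ResourceManager host; 8088 is the default RM web port):

# Hypothetical RM host; /ws/v1/cluster/metrics is the Cluster Metrics API endpoint
curl -s http://resourcemanager.example.com:8088/ws/v1/cluster/metrics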
10-14-2016 07:16 PM
2 Kudos
Sometimes you have JMS messages that you would like to easily ingest into HDFS as raw files, or into Phoenix, Hive and other destinations. You can do that pretty easily with Apache NiFi 1.0.0 as part of HDF 2.0. For this simple example, I also added a REST gateway for bulk loading, testing, and to provide another way to easily send JMS messages. ListenHTTP accepts HTTP POSTs on port 8099, which I made the listener port for that processor. It takes what you send and publishes it to a JMS queue. I am using ActiveMQ.

I have a little Python 2.7 script that I found on GitHub that makes fake log records; I modified it to send 1,000 JSON messages via REST to our REST-to-JMS gateway in NiFi for testing. You can easily do this with a shell script and curl, Apache JMeter, Java code, a Go script, and many other open source REST testers and clients.

import requests

url = 'http://server.com:8099/contentListener'
# timestamp, random_ip(), country and status come from the fake-log generator script
r = requests.post(url, json={"rtimestamp": timestamp, "ip": random_ip(), "country": country, "status": status})
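A curl equivalent of that test POST, with made-up sample values, would be something like:

# Hypothetical sample values; same ListenHTTP endpoint as above
curl -X POST -H "Content-Type: application/json" \
  -d '{"rtimestamp": 1476470000, "ip": "10.10.1.5", "country": "US", "status": 200}' \
  http://server.com:8099/contentListener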
I installed an ActiveMQ JMS broker as my example JMS server, which is very simple on CentOS 7. All you need to do is download the gzipped tar and untar it; it's ready to run after a chmod. That download also includes the client JAR that we will need on the HDF 2.0 server for accessing the message queue server. You must also have the broker port open; on ActiveMQ that defaults to 61616. ActiveMQ also includes a nice web console, and you may want to unblock that port too for viewing the status of queues and messages. In my simple example, I am running JMS via:

bin/activemq start > /tmp/smlog 2>&1 &

I recommend changing your HTTP Listening Port so you can run a bunch of these processors as needed. Processors used: ConsumeJMS, MergeContent and PutHDFS. You need to set Destination Name, which is the name of the queue in this case, but could also be the name of a topic. I picked a Destination Type of QUEUE since I am using a queue in Apache ActiveMQ.

It's very easy to add more output processors for sinking data into Apache Phoenix, HBase, Hive, Email, Slack and other NoSQL stores. It's also easy to convert messages into AVRO, ORC and other optimized big data file formats. As you can see, we get a number of jms_ attributes, including priority, message ID and other attributes associated with the JMS message.

Example Message
ActiveMQ Screens

References:
https://community.hortonworks.com/articles/59349/hdf-20-flow-for-ingesting-real-time-tweets-from-st.html
https://community.hortonworks.com/articles/59975/ingesting-edi-into-hdfs-using-hdf-20.html
http://activemq.apache.org/uri-protocols.html
http://activemq.apache.org/initial-configuration.html
http://activemq.apache.org/version-5-getting-started.html
http://www.apache.org/dyn/closer.cgi?filename=/activemq/5.14.1/apache-activemq-5.14.1-bin.tar.gz&action=download
10-14-2016 05:20 PM
Sounds like a coding error. How are you ending your code? How big is the data? It seems it can't process all of it. It could be an issue with your Parquet file; maybe try saving to another format such as ORC, AVRO, JSON or Hive. Can you post the save source code from the data writer around AppOutils.scala:506? Try this processing mode and allocate more CPU and more memory: --master yarn --deploy-mode cluster
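As a sketch (hypothetical application class, jar and resource sizes; tune them to your cluster):

# Hypothetical class/jar/sizes -- run on YARN in cluster mode with more resources
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.YourApp \
  --driver-memory 4g \
  --executor-memory 8g \
  --executor-cores 4 \
  --num-executors 10 \
  your-app.jar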
10-14-2016 03:20 PM
https://community.hortonworks.com/questions/6488/yarn-jmx-access.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hdfs_admin_tools/content/ch07.html
http://www.slideshare.net/Hadoop_Summit/w-525hall1shenv2
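One quick way in, assuming a default setup: Hadoop daemons serve JSON metrics at /jmx on their web UI port (hypothetical host below; the qry filter and exact MBean names vary by version):

# Hypothetical RM host; qry filters the output to one MBean
curl -s "http://resourcemanager.example.com:8088/jmx?qry=Hadoop:service=ResourceManager,name=ClusterMetrics"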