1973 Posts
1225 Kudos Received
124 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1999 | 04-03-2024 06:39 AM |
| | 3168 | 01-12-2024 08:19 AM |
| | 1725 | 12-07-2023 01:49 PM |
| | 2505 | 08-02-2023 07:30 AM |
| | 3517 | 03-29-2023 01:22 PM |
10-18-2016 06:54 PM
HiveServer2 setup guide: https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
See examples here:
https://community.hortonworks.com/articles/4103/hiveserver2-jdbc-connection-url-examples.html
https://community.hortonworks.com/questions/456/i-have-nifi-setup-on-my-cluster-but-it-only-works.html
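Once HiveServer2 is up, a quick connectivity check from the command line might look like this (hypothetical host and user; 10000 is the default binary-transport port):

# Hypothetical host/user -- connect to HiveServer2 with beeline
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default" -n youruser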
10-18-2016 03:05 PM
http://hortonworks.com/apache/knox-gateway/
https://knox.apache.org/
Knox sits on top of Kerberos. Try this out: http://hortonworks.com/hadoop-tutorial/securing-hadoop-infrastructure-apache-knox/
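To sanity-check a Knox install, something like the following lists HDFS through the gateway (hypothetical gateway host; guest/guest-password is only the demo LDAP credential from the tutorials):

# Hypothetical gateway host; guest/guest-password is the demo LDAP credential
curl -iku guest:guest-password "https://knoxhost.example.com:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS"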
10-17-2016 04:51 PM
Can you access Oracle on that port with that driver? Is there a firewall between the Sqoop machine and Oracle? Can you access Hive from that machine? A good first test is just to access Hive, access Oracle, and make sure those aren't issues. You can also try a simple import following the docs:
http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_how_the_standard_oracle_manager_works_for_imports
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_data-access/content/using_sqoop_to_move_data_into_hive.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_dataintegration/content/ch_using-sqoop.html
It could also be a permissions issue, very possibly in HDFS under /user/hive/warehouse/yourusername. Check the output of hdfs dfs -ls /user/hive/warehouse against your currently logged-in user. If you are admin or root you may not have HDFS write permissions, and may need to do:
sudo -u hdfs hdfs dfs -chmod -R 777 /user/hive/warehouse/youruser
sudo -u hdfs hdfs dfs -chown youruser /user/hive/warehouse/youruser
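As a minimal sketch of such an import (hypothetical Oracle host, service name, user, and table; the flags are standard Sqoop 1.4 options):

# Hypothetical host/service/user/table -- a minimal Oracle-to-Hive import
sqoop import \
  --connect jdbc:oracle:thin:@//oraclehost.example.com:1521/ORCL \
  --username SCOTT \
  --password-file /user/youruser/.oracle.password \
  --table EMPLOYEES \
  --hive-import --hive-table default.employees \
  -m 1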
10-14-2016 08:19 PM
Turn on logging. See:
http://hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/
https://hortonworks.com/hadoop-tutorial/manage-security-policy-hive-hbase-knox-ranger/
10-14-2016 07:53 PM
Seems like everything is here:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Metrics_API
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_yarn_resource_mgt/content/ch_yarn_rest_apis.html
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#container
On the command line, yarn application -list gets your list of application IDs, then:
yarn container -list application_1475786720159_0003
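If you want the REST route instead, a quick check against the Cluster Metrics API could look like this (hypothetical ResourceManager host; 8088 is the default RM web port):

# Hypothetical RM host; /ws/v1/cluster/metrics is the Cluster Metrics API endpoint
curl -s http://resourcemanager.example.com:8088/ws/v1/cluster/metrics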
10-14-2016 07:16 PM
2 Kudos
Sometimes you have JMS messages that you would like to easily ingest into HDFS as raw files, or into Phoenix, Hive and other destinations. You can do that pretty easily with Apache NiFi 1.0.0 as part of HDF 2.0. For this simple example, I also added a REST gateway for bulk loading, testing, and to provide another way to easily send JMS messages. ListenHTTP accepts HTTP POSTs on port 8099, which I made the listener port for that processor. It takes what you send and publishes it to a JMS queue. I am using ActiveMQ.

I have a little Python 2.7 script that I found on GitHub that makes fake log records; I modified it to send 1,000 JSON messages via REST to our REST-to-JMS gateway in NiFi for testing. You can easily do this with a shell script and curl, Apache JMeter, Java code, a Go script, and many other open source REST testers and clients.

import requests

url = 'http://server.com:8099/contentListener'
# timestamp, random_ip(), country and status come from the fake-log generator script
r = requests.post(url, json={"rtimestamp": timestamp, "ip": random_ip(), "country": country, "status": status})
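A curl equivalent of that test POST, with made-up sample values, would be something like:

# Hypothetical sample values; same ListenHTTP endpoint as above
curl -X POST -H "Content-Type: application/json" \
  -d '{"rtimestamp": 1476470000, "ip": "10.10.1.5", "country": "US", "status": 200}' \
  http://server.com:8099/contentListener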
I installed an ActiveMQ JMS broker as my example JMS server, which is very simple on CentOS 7. All you need to do is download the gzipped tar and untar it; it's ready to run after a chmod. That download also includes the client JAR that we will need on the HDF 2.0 server for accessing the message queue server. You must also have the broker port open; on ActiveMQ that defaults to 61616. ActiveMQ also includes a nice web console, and you may want to unblock that port too for viewing the status of queues and messages. In my simple example, I am running JMS via:

bin/activemq start > /tmp/smlog 2>&1 &

I recommend changing your HTTP Listening Port so you can run a bunch of these processors as needed. Processors used: ConsumeJMS, MergeContent and PutHDFS. You need to set Destination Name, which is the name of the queue in this case, but could also be the name of a topic. I picked a Destination Type of QUEUE since I am using a queue in Apache ActiveMQ.

It's very easy to add more output processors for sinking data into Apache Phoenix, HBase, Hive, Email, Slack and other NoSQL stores. It's also easy to convert messages into AVRO, ORC and other optimized big data file formats. As you can see, we get a number of jms_ attributes, including priority, message ID and other attributes associated with the JMS message.

Example Message
ActiveMQ Screens

References:
https://community.hortonworks.com/articles/59349/hdf-20-flow-for-ingesting-real-time-tweets-from-st.html
https://community.hortonworks.com/articles/59975/ingesting-edi-into-hdfs-using-hdf-20.html
http://activemq.apache.org/uri-protocols.html
http://activemq.apache.org/initial-configuration.html
http://activemq.apache.org/version-5-getting-started.html
http://www.apache.org/dyn/closer.cgi?filename=/activemq/5.14.1/apache-activemq-5.14.1-bin.tar.gz&action=download
10-14-2016 05:20 PM
Sounds like a coding error. How are you ending your code? How big is the data? It seems it can't process all of it. It could be an issue with your Parquet file; maybe try saving to another format such as ORC, AVRO, JSON or Hive. Can you post the save source code from the data writer around AppOutils.scala:506? Try this processing mode and allocate more CPU and more memory: --master yarn --deploy-mode cluster
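As a sketch (hypothetical application class, jar and resource sizes; tune them to your cluster):

# Hypothetical class/jar/sizes -- run on YARN in cluster mode with more resources
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.YourApp \
  --driver-memory 4g \
  --executor-memory 8g \
  --executor-cores 4 \
  --num-executors 10 \
  your-app.jar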
10-14-2016 03:20 PM
https://community.hortonworks.com/questions/6488/yarn-jmx-access.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_hdfs_admin_tools/content/ch07.html
http://www.slideshare.net/Hadoop_Summit/w-525hall1shenv2
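One quick way in, assuming a default setup: Hadoop daemons serve JSON metrics at /jmx on their web UI port (hypothetical host below; the qry filter and exact MBean names vary by version):

# Hypothetical RM host; qry filters the output to one MBean
curl -s "http://resourcemanager.example.com:8088/jmx?qry=Hadoop:service=ResourceManager,name=ClusterMetrics"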