Member since: 01-12-2016
Posts: 33
Kudos Received: 19
Solutions: 0
08-17-2022
12:48 PM
Your input is a JSON array, which is why you are getting null values. You have two options:
1. Change the expressions to $[0].ARRIVAL_DATE and $[0].BOOK_ID, and it will work.
2. Read the file and split the array using the SplitJson processor (set its JsonPath Expression to $), then connect the SplitJson processor to your EvaluateJsonPath processor, and it should work.
See the example below.
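For example, with an input like the following (a hypothetical file, using the ARRIVAL_DATE and BOOK_ID field names from your flow), $.ARRIVAL_DATE matches nothing because the record is wrapped in an array, while $[0].ARRIVAL_DATE does:

[
  {
    "ARRIVAL_DATE": "2022-08-17",
    "BOOK_ID": 42
  }
]

If you take the SplitJson route instead, each array element becomes its own FlowFile, so the plain $.ARRIVAL_DATE and $.BOOK_ID expressions will work in EvaluateJsonPath.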
06-29-2016
11:46 AM
@nejm hadjmbarek, from the information you provided, it seems the Oozie maximum concurrency has been reached for the coordinator, so you have a number of applications waiting for AM resources. Check your maximum AM resource percentage in the Capacity Scheduler and consider raising it to 0.5 or 0.6, which means the ResourceManager can assign up to 50 or 60 percent of the total resources to AM containers.
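As a rough sketch, assuming the standard Capacity Scheduler property name and a cluster-wide setting, the entry in capacity-scheduler.xml would look like this:

<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.5</value>
</property>

After changing it, refresh the queues (yarn rmadmin -refreshQueues) or restart the ResourceManager so the new limit takes effect.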
05-30-2016
10:25 AM
3 Kudos
@nejm hadjmbarek Set the property below to * in oozie-site.xml to resolve this issue: hadoop.proxyuser.oozie.hosts
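As a sketch, the property block would look like the following (the companion hadoop.proxyuser.oozie.groups setting is commonly set to * alongside it; depending on your distribution these proxyuser properties may live in core-site.xml instead):

<property>
  <name>hadoop.proxyuser.oozie.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.oozie.groups</name>
  <value>*</value>
</property>

Restart the affected services afterwards so the new proxyuser settings are picked up.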
05-25-2016
09:04 AM
I fixed the unhealthy node by reducing the memory on that node; after that I ran the Oozie workflow and it succeeded.
05-19-2016
09:23 AM
@nejm hadjmbarek
You can achieve this in two ways.
1. Create a wrapper shell script and call "pig <pig script path>" inside it. Then create a Unix cron entry to schedule it as per your requirement (see the sketch after the links below).
2. Alternatively, use the Oozie scheduler: either create a Pig action together with a recurring coordinator service (see the links below), or create an Oozie shell action and call the same wrapper shell script from point 1 inside it.
http://rogerhosto.com/apache-oozie-shell-script-example/
https://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html#a3.2.3_Pig_Action
http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/
Thanks
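A minimal sketch of option 1, with hypothetical paths (/path/to/myscript.pig and /path/to/run_pig.sh are placeholders):

#!/bin/bash
# run_pig.sh - wrapper script that launches the Pig script
pig /path/to/myscript.pig

And a crontab entry to run it, for example, every day at 2 AM:

0 2 * * * /path/to/run_pig.sh >> /tmp/run_pig.log 2>&1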
05-24-2016
09:43 AM
1 Kudo
Hi @nejm hadjmbarek, I'm posting my code, which is working fine:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class phoenix_hbase
{
    public static void main(String[] args) throws SQLException
    {
        @SuppressWarnings("unused")
        Statement stmt = null;
        ResultSet rset = null;

        // Load the Phoenix JDBC driver
        try
        {
            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
        }
        catch (ClassNotFoundException e1)
        {
            System.out.println("Exception Loading Driver");
            e1.printStackTrace();
        }

        try
        {
            // 172.31.124.43 is the address of the VM; not needed if you're running the program from the VM itself
            Connection con = DriverManager.getConnection("jdbc:phoenix:172.31.124.43:2181:/hbase-unsecure");
            stmt = con.createStatement();

            // Query the table and print one column from each row
            PreparedStatement statement = con.prepareStatement("select * from javatest");
            rset = statement.executeQuery();
            while (rset.next())
            {
                System.out.println(rset.getString("mycolumn"));
            }
            statement.close();
            con.close();
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}
03-30-2016
03:04 PM
4 Kudos
@nejm hadj First I'll answer your question and then I'll make my recommendation. Answer: The name of the file does not matter. When setting up a Hive external table, just specify the data source as the folder that will contain all the files (regardless of their names); a minimal DDL sketch is shown below. Details on setting up an external table: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_dataintegration/content/moving_data_from_hdfs_to_hive_external_table_method.html Details on reading/parsing JSON files into Hive: http://hortonworks.com/blog/howto-use-hive-to-sqlize-your-own-tweets-part-two-loading-hive-sql-queries/ (Alternatively, you can convert JSON to CSV within NiFi. To do so, follow the NiFi portion of this example: https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html)
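A minimal sketch of such a table (the table name, columns, and HDFS path here are hypothetical placeholders, and it assumes a JSON SerDe such as the Hive HCatalog one is available on your cluster):

CREATE EXTERNAL TABLE tweets_raw (
  id BIGINT,
  created_at STRING,
  text STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/nifi/tweets/';

Hive will then read every file that lands in /user/nifi/tweets/, whatever the individual file names are.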
Recommendation: HDFS prefers large files with many entries as opposed to many files with small entries. The main reason is that for each file landed on HDFS, file information is saved in the NameNode (in memory). If you're putting each Twitter message in a separate file, you will quickly fill up your NameNode's memory and overload the server. I suggest you aggregate multiple messages into one file before writing to HDFS. This can be done with the MergeContent processor in NiFi. Take a look at the screenshots below showing how it would be set up. Also, take a look at the NiFi Twitter_Dashboard.xml example template (https://raw.githubusercontent.com/abajwa-hw/ambari-nifi-service/master/demofiles/Twitter_Dashboard.xml). You can import it into your NiFi by clicking on Templates (third icon from the right), which will launch the 'NiFi Flow templates' popup, and selecting the file.
03-02-2017
06:57 AM
Hi @Andy LoPresto I am still struggling with that. I tried adding the certificate to the truststore as you mentioned in your posts; however, GetHTTP is still not working. It shows me an error about the access token, which works fine if I put it in the browser. I am using the template provided on GitHub. The SSL context service is also enabled. I highly appreciate your support. Thanks.
02-29-2016
11:02 AM
@nejm hadj Adding more information based on your comments: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.flume.ExecuteFlumeSink/additionalDetails.html You should stick with NiFi and use the built-in processors to ingest the data from the various social media sources. Please do read the docs: "In NiFi, the contents of a FlowFile are accessed via a stream, but in Flume it is stored in a byte array. This means the full content will be loaded into memory when a FlowFile is processed by the ExecuteFlumeSink processor. You should consider the typical size of the FlowFiles you'll process and the batch size, if any, your sink is configured with when setting NiFi's heap size."
02-23-2016
11:40 AM
1 Kudo
Thanks, it works.