Member since: 05-16-2018
Posts: 8
Kudos Received: 0
Solutions: 0
07-03-2018
06:04 AM
Thanks for your answer. And if I want to use only HDFS for read/write, is it not recommended with a volume of around 200B records per day?
07-02-2018
08:13 AM
I need to decide how to write my data to Hadoop. I'm using Spark; I receive messages from a Kafka topic, and each message is a JSON record. I have around 200B records per day. The data fields may change (not a lot, but they may change in the future). I need fast writes, fast reads, and a small footprint on disk. What should I choose, Avro or Parquet? If I choose Parquet/Avro, do I need to create the table with all the fields of my JSON? If not, what is the way to create the table in Parquet format and in Avro format? Thanks!!
Labels:
- Apache Hadoop
- Apache Spark
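(Not part of the original post, just an illustrative sketch of the Parquet option.) One way to approach this in Spark is to let Spark infer the schema from the JSON itself, so the table does not have to be declared field by field up front. The input path and the database/table name below are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object JsonToParquetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-to-parquet-sketch")
      .enableHiveSupport()   // only needed if saving as a Hive table
      .getOrCreate()

    // Schema is inferred from the JSON records themselves,
    // so every field does not have to be declared in advance.
    // "/data/raw/events" is a placeholder input path.
    val events = spark.read.json("/data/raw/events/2018-07-02/*.json")

    // Persist as a Parquet-backed table; "mydb.events_parquet" is a placeholder name.
    events.write
      .mode("append")
      .format("parquet")
      .saveAsTable("mydb.events_parquet")

    spark.stop()
  }
}
```

The same write can target Avro by swapping the format: on Spark 2.4+ `.format("avro")` works out of the box, while older versions need the external spark-avro package. At ~200B records per day, partitioning the table (for example by date) before writing is usually worth considering.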
06-04-2018
07:59 AM
What does "Security enabled" mean? Yes, the user has read permission on the .xml configuration files.
05-27-2018
02:18 PM
Hi, I'm using NiFi 1.6.0. I'm trying to write to HDFS and to Hive (Cloudera) with NiFi.

On PutHDFS I configured the "Hadoop Configuration Resources" property with the hdfs-site.xml and core-site.xml files and set the directories, but when I try to start it I get the following error: "Failed to properly initialize Processor. If still scheduled to run, NiFi will attempt to initialize and run the Processor again after the 'Administrative Yield Duration' has elapsed. Failure is due to java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException".

On PutHiveStreaming I configured the "Hive Metastore URI" with thrift://..., the database and the table name, and in "Hadoop Configuration Resources" I put the hive-site.xml location, but when I try to start it I get the following error: "Hive streaming connect/write error, flow file will be penalized and routed to retry. org.apache.nifi.util.hive.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://myserver:9083', database='mydbname', table='mytablename', partitionVals=[]}".

How can I solve these errors? Thanks.
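(Not part of the original post.) One hedged first check for the PutHiveStreaming ConnectFailure is simply whether the metastore endpoint is reachable from the NiFi host at all. A minimal, hypothetical reachability probe in Scala, using the host and port from the error message as placeholders:

```scala
import java.net.{InetSocketAddress, Socket}

// Quick connectivity test against the Hive metastore thrift endpoint
// mentioned in the error. "myserver" and 9083 are the placeholder values
// from the error message, not confirmed settings.
object MetastoreReachability {
  def main(args: Array[String]): Unit = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress("myserver", 9083), 5000)
      println("Metastore port is reachable")
    } catch {
      case e: Exception => println(s"Cannot reach metastore: ${e.getMessage}")
    } finally {
      socket.close()
    }
  }
}
```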
05-23-2018
06:56 AM
Hi, I want to deploy NiFi and I have some questions about best-practice deployment. I have a lot of data (around 100B records per day), so I need the best performance. 1. Should I install NiFi on a VM or on a physical server? 2. Besides the following Sizing Guide, is there another one? Thanks.
Labels:
- Apache NiFi
05-16-2018
08:48 AM
Hi, I'm looking for a tutorial for the following flow:
1. Read messages from Kafka (JSON format)
2. Convert the JSON format to CSV format
3. Write the CSV to Hadoop
Is it possible to do this with NiFi? Thanks.
Labels:
- Apache Hadoop
- Apache Spark
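(Not part of the original post.) In NiFi itself this flow is usually built from processors rather than code (for example ConsumeKafka into a record-conversion processor into PutHDFS). Since the post is also labelled Apache Spark, here is a minimal, hypothetical Structured Streaming sketch of the same Kafka-JSON-to-CSV-on-HDFS flow; the broker address, topic name, schema fields, and output paths are all placeholder assumptions, and it needs the spark-sql-kafka package on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{StringType, StructType}

object KafkaJsonToCsvSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-json-to-csv-sketch")
      .getOrCreate()
    import spark.implicits._

    // Placeholder schema: the real JSON fields would go here.
    val schema = new StructType()
      .add("id", StringType)
      .add("message", StringType)

    // Broker list and topic name are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "my_topic")
      .load()

    // Kafka values arrive as bytes; cast to string and parse the JSON.
    val parsed = raw
      .selectExpr("CAST(value AS STRING) AS json")
      .select(from_json($"json", schema).as("rec"))
      .select("rec.*")

    // Write each micro-batch out as CSV files on HDFS.
    val query = parsed.writeStream
      .format("csv")
      .option("path", "hdfs:///data/csv/my_topic")
      .option("checkpointLocation", "hdfs:///checkpoints/my_topic")
      .start()

    query.awaitTermination()
  }
}
```

The checkpoint location lets the stream resume where it left off after a restart.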