Member since: 09-29-2015
Posts: 142
Kudos Received: 45
Solutions: 15

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1729 | 06-08-2017 05:28 PM |
| | 6265 | 05-30-2017 02:07 PM |
| | 1595 | 05-26-2017 07:48 PM |
| | 3923 | 04-28-2017 02:48 PM |
| | 2413 | 04-28-2017 02:41 PM |
10-04-2016
03:23 PM
I faced the same issue. I used Sqoop to import a table, and then the search function just hung. I re-imported the VM, and now I can't access the Atlas dashboard at all; I get a 503 error.
09-30-2016
06:40 PM
Sweet. Glad I could help.
09-30-2016
04:21 PM
Try this: ExecuteSQL > SplitAvro > ConvertAvroToJSON > EvaluateJsonPath. SplitAvro splits the result set into individual Avro records, ConvertAvroToJSON converts each Avro record to JSON, and EvaluateJsonPath lets you create new FlowFile attributes from JSON path expressions.
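For example (the field and attribute names here are just placeholders, not from your data): if each Avro record carries id and name fields, the FlowFile content after ConvertAvroToJSON would look like {"id": 1, "name": "alice"}, and in EvaluateJsonPath you could set Destination to flowfile-attribute and add dynamic properties such as record.id = $.id and record.name = $.name to pull those values out as attributes.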
09-30-2016
04:09 PM
1 Kudo
Try putting a SplitAvro processor in front of your ConvertAvroToJSON, so the flow looks like this: ExecuteSQL > SplitAvro > ConvertAvroToJSON > PutMongo
09-30-2016
02:03 PM
3 Kudos
Are you sure that only the first record was written? The NiFi doc says ConvertAvroToJSON converts to a single JSON object.
09-15-2016
02:44 PM
Oh, sorry I missed that.
09-13-2016
06:37 PM
I have done the following in my main method:

public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("Simple Application").setMaster("local");
    JavaSparkContext sc = new JavaSparkContext(conf);
    ...
}

My app needs a JSON file, so in my run configuration I just put the following on the Arguments > Program arguments tab:

/Users/bhagan/Documents/jsonfile.json

Also make sure you have all the dependencies you need in your pom.xml. Run it and check the output. Give it a shot and let us know if you get it working.
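For context, here is a minimal sketch of how the pieces fit together, assuming the JSON file path arrives as the first program argument; the class name and the line-count use of the file are my own illustration, not part of the original post:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SimpleApp {
    public static void main(String[] args) {
        // Local master so the app runs directly from the IDE run configuration
        SparkConf conf = new SparkConf().setAppName("Simple Application").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // The run configuration passes the JSON file path as the first program argument
        String jsonPath = args[0];

        // Illustrative use of the file: read it as text and report the line count
        JavaRDD<String> lines = sc.textFile(jsonPath);
        System.out.println("Read " + lines.count() + " lines from " + jsonPath);

        sc.stop();
    }
}
```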
08-23-2016
01:16 PM
@Sunile Manjee Yes, I did flatten the json. Here is what I used (all one line): {"enumTypes":[],"structTypes":[],"traitTypes": [{"superTypes":[],"hierarchicalMetaTypeName":"org.apache.atlas.typesystem.types.TraitType","typeName":"EXPIRES_ON","attributeDefinitions":[{"name":"expiry_date","dataTypeName":"string","multiplicity":"required","isComposite":false,"isUnique":false,"isIndexable":true,"reverseAttributeName": null}]}],"classTypes":[]} But for me, I had left out an attribute.
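For anyone finding this later: a flattened definition like the one above is normally submitted to Atlas over its types REST endpoint; on the Atlas releases of that era this was something along the lines of curl -u admin:admin -H "Content-Type: application/json" -d @trait.json http://atlas-host:21000/api/atlas/types, where the credentials, host name, and file name are placeholders, and the exact endpoint should be checked against the Atlas REST docs for your release.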
07-29-2016
05:21 PM
1 Kudo
I was reviewing some posts related to Pig and found the following question interesting: https://community.hortonworks.com/questions/47720/apache-pig-guarantee-that-all-the-value-in-a-colum.html#answer-47767

I wanted to share an alternative solution using Pentaho Data Integration (PDI), an open source ETL tool that provides visual MapReduce capabilities. PDI is YARN ready, so when you configure PDI to use your HDP cluster (or sandbox) and run the attached job, it runs as a YARN application.

The following image is your Mapper. Above, you see the main transformation. It reads input, which you configure in the Pentaho MapReduce job (seen below). The transformation follows a pattern: immediately split the delimited file into individual fields. Next, I use a Java Expression step to determine whether a field is numeric; if it is not, we set the value of the field to the String "null" (a sketch of that check appears below). Then, to prepare for MapReduce output, we concatenate the fields back together into a single value and pass the key/value pair to the MapReduce Output.

Once you have the main MapReduce transformation created, you wrap it in a PDI MapReduce job. If you're familiar with MapReduce, you will recognize the configuration options below, which you would otherwise set in code. Next, configure your Mapper. The job succeeds! And the file is in HDFS.
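For readers who want to see the shape of that numeric check, here is a small sketch in plain Java; the class name, method name, regex, and sample values are mine, not taken from the attached job:

```java
public class NumericOrNull {
    // Return the field unchanged if it looks numeric; otherwise return the literal string "null".
    static String numericOrNull(String field) {
        return (field != null && field.matches("-?\\d+(\\.\\d+)?")) ? field : "null";
    }

    public static void main(String[] args) {
        System.out.println(numericOrNull("42.5"));  // prints 42.5
        System.out.println(numericOrNull("abc"));   // prints null
    }
}
```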
07-26-2016
12:17 AM
1 Kudo
It is often the case that we need to install Hortonworks in environments with strict requirements. One such requirement may be that all HTTP traffic must go through a dedicated proxy server.

When installing Hortonworks HDP using Ambari, you can find instructions for configuring Ambari to use the proxy on the docs.hortonworks.com website. For example, here is the page for configuring Ambari 2.2: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_ambari_reference_guide/content/ch_setting_up_an_internet_proxy_server_for_ambari.html

Notice that the instructions mention that you must also configure yum to use the proxy. It's important to note that those instructions will configure all yum repositories to use the proxy, and you may not want this behavior. So while it is fine to set the proxy at yum's global level, you should review any existing repository configurations to determine whether any of them should bypass the proxy. If a repository should not use the proxy, you can update its configuration with the following option:

proxy=_none_

Additionally, while preparing for an HDP installation, you will also use the tools wget and curl. I suggest you confirm that these tools are also set up to use the proxy. If not, it's as easy as setting the proxy options in their configuration files.

Wget has a global file, /usr/local/etc/wgetrc. Wget options:

use_proxy = on
http_proxy = http://proxyhost:port

Curl does not have a global file, so you can create a .curlrc in your home directory:

proxy = [protocol://][user:password@]proxyhost[:port]

Once you have Ambari, yum, wget, and curl configured to use your proxy, you'll be ready to start the installation.
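To make the yum piece concrete (the host, port, and repository file are placeholders rather than anything from the Hortonworks docs): the global setting is a line like proxy=http://proxyhost:port in /etc/yum.conf, and a repository that should bypass the proxy gets proxy=_none_ added to its section in the corresponding file under /etc/yum.repos.d/.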