Member since 05-02-2016
154 Posts
54 Kudos Received
14 Solutions
My Accepted Solutions
Views | Posted |
---|---|
4207 | 07-24-2018 06:34 PM |
5802 | 09-28-2017 01:53 PM |
1439 | 02-22-2017 05:18 PM |
14405 | 01-13-2017 10:07 PM |
3982 | 12-15-2016 06:00 AM |
12-20-2017 06:47 AM
I was going through some OCR-based processing of PDFs. It looks like it works better if each page is split into a monochrome image. Examples online show Ghostscript as an option. I was able to leverage this processor to extract the images and, with a property change, convert them to grayscale if needed. I can now send these to an OCR processor for extraction. @Jeremy Dyer
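For anyone taking the Ghostscript route mentioned above instead of the processor, here is a minimal sketch that only builds the command line for rendering each PDF page to its own PNG. The function name, the `mode` values, the 300 dpi default, and the file names are my own assumptions for illustration; `png16m`, `pnggray`, and `pngmono` are standard Ghostscript output devices. This is not how the processor itself is implemented.

```python
from pathlib import Path
from typing import List

# Ghostscript output devices: 24-bit color, 8-bit grayscale, 1-bit monochrome.
DEVICES = {"color": "png16m", "gray": "pnggray", "mono": "pngmono"}


def ghostscript_page_command(pdf_path: str, out_dir: str,
                             mode: str = "gray", dpi: int = 300) -> List[str]:
    """Build a Ghostscript command that writes one PNG per PDF page."""
    if mode not in DEVICES:
        raise ValueError(f"mode must be one of {sorted(DEVICES)}")
    # Ghostscript expands %03d to the page number (page-001.png, page-002.png, ...).
    output_pattern = str(Path(out_dir) / "page-%03d.png")
    return [
        "gs",
        "-dNOPAUSE", "-dBATCH", "-dSAFER",   # run non-interactively and exit when done
        f"-sDEVICE={DEVICES[mode]}",
        f"-r{dpi}",                          # render resolution in dpi
        f"-sOutputFile={output_pattern}",
        pdf_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = ghostscript_page_command("scan.pdf", "pages", mode="mono")
    print(" ".join(cmd))
    # To actually run it (requires Ghostscript on PATH and an existing output dir):
    # import subprocess; subprocess.run(cmd, check=True)
```

The resulting page images can then be fed to whatever OCR step you use downstream.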
12-20-2017 06:44 AM
1 Kudo
https://github.com/rkarthik29/pdfprocessors
11-16-2017 12:51 PM
1 Kudo
@Muhammad idrees For 2.6.1 you have to use an mpack to install Solr. Try this: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_solr-search-installation/content/ch_hdp-search-install-ambari.html
11-16-2017 05:13 AM
@Muhammad idrees var/lib/ambari-server/resources/stacks/HDP/2.5/repos/repoinfo.xml, are you trying this with 2.5?
10-06-2017 07:42 PM
I tried it on 1.1 and it worked fine.
json2avro.xml
{
  "type": "record",
  "name": "testdata",
  "fields": [
    {
      "name": "timestamp",
      "type": {"type": "string", "logicalType": "timestamp-millis"}
    },
    {"name": "c1", "type": "double"},
    {"name": "c2", "type": "double"},
    {"name": "c3", "type": "double"},
    {"name": "c4", "type": "double"},
    {"name": "c5", "type": "double"},
    {"name": "c6", "type": "double"},
    {"name": "c7", "type": "double"},
    {"name": "c8", "type": "double"},
    {"name": "c9", "type": "double"},
    {"name": "c10", "type": "double"},
    {"name": "c11", "type": "double"},
    {"name": "postal_code", "type": "string"},
    {"name": "country", "type": "string"}
  ]
}
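If you want to sanity-check incoming JSON against this schema before it reaches the flow, a rough stdlib-only sketch could look like the following. It rebuilds the schema above programmatically, and the helper names, the sample record values, and the choice to accept Python ints for `double` fields are all my own assumptions. It is not a substitute for a real Avro library or for NiFi's own record validation.

```python
import json

# The schema from the post, rebuilt programmatically to keep the sketch short.
SCHEMA = {
    "type": "record",
    "name": "testdata",
    "fields": (
        [{"name": "timestamp",
          "type": {"type": "string", "logicalType": "timestamp-millis"}}]
        + [{"name": f"c{i}", "type": "double"} for i in range(1, 12)]
        + [{"name": "postal_code", "type": "string"},
           {"name": "country", "type": "string"}]
    ),
}

# Python types accepted for each Avro primitive used here (assumption: ints pass as double).
PYTHON_TYPES = {"string": (str,), "double": (int, float)}


def base_type(avro_type):
    """Return the primitive type name, unwrapping {"type": ..., "logicalType": ...}."""
    return avro_type["type"] if isinstance(avro_type, dict) else avro_type


def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field in schema["fields"]:
        name = field["name"]
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        expected = PYTHON_TYPES[base_type(field["type"])]
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


if __name__ == "__main__":
    # Hypothetical sample record, values made up for illustration.
    sample = {"timestamp": "2017-10-05T18:08:00", "postal_code": "00000", "country": "XX"}
    sample.update({f"c{i}": float(i) for i in range(1, 12)})
    record = json.loads(json.dumps(sample))
    print(validate(record))
```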
10-06-2017 03:41 AM
Hmm, strange. Which version of NiFi were you on?
10-05-2017 06:08 PM
As per the Avro doc, timestamp-millis is a logical type, so you have to use something like this: {"name":"timestamp","type": {"type": "string", "logicalType": "timestamp-millis"}} See if that works. https://github.com/mtth/avsc/wiki/Advanced-usage#logical-types
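One caveat worth knowing: the Avro specification defines timestamp-millis as an annotation on `long` (milliseconds since the Unix epoch), and says implementations should fall back to the underlying type when a logical type does not apply. So with a `string` base type the value is most likely handled as a plain string, which is my reading of the spec rather than something tested here. If you do want a true timestamp-millis field, a minimal sketch of the `long` variant and the conversion follows. The function name and the rule that naive timestamps are treated as UTC are my assumptions.

```python
from datetime import datetime, timedelta, timezone

# Field definition with the base type the Avro spec pairs with timestamp-millis.
TIMESTAMP_FIELD = {
    "name": "timestamp",
    "type": {"type": "long", "logicalType": "timestamp-millis"},
}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_millis(iso_text: str) -> int:
    """Convert an ISO-8601 string to milliseconds since the Unix epoch."""
    dt = datetime.fromisoformat(iso_text)
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer timedelta division avoids floating point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    print(to_timestamp_millis("2017-10-05T18:08:00"))
```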
10-03-2017 02:47 PM
@Arun A K Why not use the new record-oriented processors and store the schema locally in NiFi using the AvroSchemaRegistry? https://community.hortonworks.com/questions/113959/use-nifi-to-change-the-format-of-numeric-date-and.html
09-28-2017 01:53 PM
@james.jones
Hi, I'm not sure what it is called, but here is what I think has to happen: if the credentials you are using for your EC2 machine are xyz, you need to allow xyz to impersonate arn:aws:sts::7777777:assumed-role/role-hdf-node/i-03333330000. See if this helps: http://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html
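In IAM terms this usually comes down to two policy documents: a trust policy on the role that names xyz as a principal, and a permission policy on xyz that allows `sts:AssumeRole` on the role. Below is a rough sketch that only builds those JSON documents. The principal ARN for xyz is a made-up placeholder, the function names are mine, and deriving the role ARN from the assumed-role ARN this way only holds for roles created without a path. Treat it as my interpretation of the setup, not a verified fix.

```python
import json
import re


def role_arn_from_assumed_role(assumed_role_arn: str) -> str:
    """Map arn:aws:sts::ACCOUNT:assumed-role/ROLE/SESSION to the IAM role ARN."""
    match = re.fullmatch(r"arn:aws:sts::(\d+):assumed-role/([^/]+)/(.+)",
                         assumed_role_arn)
    if match is None:
        raise ValueError(f"not an assumed-role ARN: {assumed_role_arn}")
    account, role_name, _session = match.groups()
    return f"arn:aws:iam::{account}:role/{role_name}"


def trust_policy(principal_arn: str) -> dict:
    """Trust policy attached to the role: who is allowed to assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principal_arn},
            "Action": "sts:AssumeRole",
        }],
    }


def assume_role_permission(role_arn: str) -> dict:
    """Permission policy attached to the caller: which role it may assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": role_arn,
        }],
    }


if __name__ == "__main__":
    role_arn = role_arn_from_assumed_role(
        "arn:aws:sts::7777777:assumed-role/role-hdf-node/i-03333330000")
    # Hypothetical ARN for the "xyz" credentials; replace with the real principal.
    xyz_arn = "arn:aws:iam::111111111111:user/xyz"
    print(json.dumps(trust_policy(xyz_arn), indent=2))
    print(json.dumps(assume_role_permission(role_arn), indent=2))
```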
09-27-2017 04:35 PM
This will give you provenance in NiFi, which confirms how many bytes were extracted and sent to HDFS, so there is no need for this additional check.