Member since: 12-09-2015
Posts: 35
Kudos Received: 13
Solutions: 1

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 1874 | 06-27-2016 01:05 AM |
04-10-2017
10:38 PM
Thanks for clearing it up @Bryan Bende, much appreciated.
04-10-2017
10:36 PM
Thanks @Matt Clarke, much appreciated.
04-10-2017
04:15 AM
1 Kudo
Hi @Srini Nalluri. I am not 100% sure I have understood your question correctly, but are you mostly interested in manipulating the attributes of the data you are passing through as flowfiles, rather than the data itself? If so, have you seen the NiFi Expression Language Guide at https://nifi.apache.org/docs.html (under General -> Expression Language Guide)? It describes the ways in which we can manipulate attributes.

For 1, you could use an UpdateAttribute processor and the jsonPath function to pull individual values out of your JSON attribute and assign them to new attributes; there are some good examples in the language guide.

For 2, you could set the Put Response Body In Attribute property in your InvokeHTTP processor to store the response as an attribute, and then use the jsonPath expression language function in an UpdateAttribute processor to evaluate it.

For 3, in an UpdateAttribute processor you could use the append/prepend functions to merge or concatenate two attribute values together (where attribute1 and attribute2 have previously been set).

Or, there is an AttributesToJSON processor https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.AttributesToJSON/index.html if you need to get your new attributes back into JSON format. Hope it helps!
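To give a rough, untested illustration of those functions (the attribute names here are made up, not from your flow): for 1 or 2, an UpdateAttribute property value such as ${restResponse:jsonPath('$.id')} would pull the id field out of a JSON attribute named restResponse, and for 3, ${attribute1:append(${attribute2})} would concatenate the two previously-set attributes into a single value.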
04-10-2017
01:34 AM
1 Kudo
Currently, any NiFi templates that are created are not stored anywhere outside of flow.xml.gz unless they are explicitly downloaded using the NiFi UI. I can see them within my flow.xml.gz file and can still download and import them using the UI, so I am not experiencing any issues with the template functionality itself. However, it was my understanding that active templates would automatically be persisted in the conf/templates directory, or otherwise in a custom location set via the nifi.templates.directory property. Is this the correct behaviour? Or is the templates directory more of a convenience location for manually storing downloaded templates? I am on NiFi version 1.1.2. Thanks!
Labels:
- Apache NiFi
10-17-2016
12:51 AM
Hi @Ashish Vishnoi. In the sqoop-export doco, it says:

"The --input-null-string and --input-null-non-string arguments are optional. If --input-null-string is not specified, then the string "null" will be interpreted as null for string-type columns. If --input-null-non-string is not specified, then both the string "null" and the empty string will be interpreted as null for non-string columns. Note that, the empty string will be always interpreted as null for non-string columns, in addition to other string if specified by --input-null-non-string"

There is another discussion here around a similar issue. If it doesn't work even for string columns, it may be that a workaround of some kind is needed, e.g. converting blanks to another character (or set of characters that wouldn't normally be part of your data set) prior to export, then converting them back to blanks once in Teradata. Hope this helps.
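As a rough sketch of how those arguments might be applied on the export side (the connection details, table and paths below are made up, not taken from your job):

```bash
# Hypothetical sqoop export: treat the literal string "\N" and empty strings
# as SQL NULLs when loading the HDFS extract into the target Teradata table.
sqoop export \
  --connect "$TERADATA_JDBC_URL" \
  --username "$DB_USER" -P \
  --table TARGET_TABLE \
  --export-dir /data/export/target_table \
  --input-fields-terminated-by ',' \
  --input-null-string '\\N' \
  --input-null-non-string '\\N'
```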
10-10-2016
12:22 AM
1 Kudo
When specifying fully-qualified paths to copy data between two HA clusters with DistCp, e.g. hdfs://nn1:8020/foo/bar, is the address of nn1 really referring to where the active HDFS NameNode is, or is it looking for the active ResourceManager? Thanks!
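For reference, the full command I have in mind looks something like this (host names and paths are made up):

```bash
# DistCp with fully-qualified HDFS URIs on both sides; nn1 and nn2 stand for
# the NameNode addresses of the source and destination HA clusters.
hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/foo/bar
```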
Labels:
- Apache Hadoop
- Cloudera Manager
08-18-2016
05:32 AM
Hi @mqureshi. Thanks for your response. Personally I have no motivation to use Federation; I am just curious about it, as I see it mentioned occasionally and hadn't really come across a concrete example of its practical application and how that would work.
08-15-2016
12:24 AM
Hi @mqureshi. How are the clients divided up between the NameNodes? Can the whole cluster still interact fully? E.g. if Pig and Hive connect to different nameservices/NameNodes, can they still operate on the same data in HDFS?
07-29-2016
03:04 AM
Hi @Sridharan Govindaraj. Did running your command from the Unix shell instead of the HBase shell solve this issue? You might also want to fully qualify your HDFS file path:

$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns="HBASE_ROW_KEY,id,temp:in,temp:out,vibration,pressure:in,pressure:out" sensor hdfs:///user/hbase.csv
07-14-2016
01:34 AM
Hi @Sunile Manjee, @Steven O'Neill's solution above fixed the issue. Originally, trying to retrieve information for a cell which should be null looked like this:
- REST API (going to the HBase URL): Internet Explorer would download a 0 KB file
- HBase shell get command: 1 row returned, "timestamp=14856365476857, value="

After doing a Pig filter and store for each individual column, trying to retrieve the same cell looks like this:
- REST: HTTP error 404
- HBase shell get: 0 rows returned

So the empty cells were not being automatically dropped, and HBase was storing the key for cells with no value.
07-05-2016
05:36 AM
@Sunile Manjee thank you, yes I believe so. I have read both that every cell stores a full row key, and that empty cells do not exist at all in HBase. So I think I expect the behaviour of a get command for a specified cell to bring back either a key and a non-empty value, or nothing (no key) as it "does not exist". If I request a cell that I know does not exist e.g. a real key, real column family, fake column qualifier, I get no errors but 0 rows returned. However for a cell that "doesn't exist" based on a null value, I get one row returned (the key and empty value). How does the get command know to return the key if it isn't stored as a part of that cell? Does this come from metadata rather than the cell itself? In any case, I hope this explains my confusion.
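For reference, the kind of single-cell lookup I'm describing looks roughly like this from the HBase shell (the table, row key and column names here are only illustrative):

```bash
# Fetch a single cell; per the behaviour described above, a fake qualifier
# returns 0 rows, while a "null" cell comes back as a key with an empty value.
echo "get 'mytable', 'row1', 'i:i2'" | hbase shell
```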
07-05-2016
04:36 AM
We currently have a Pig script that just loads in image (blob) data using AvroStorage with a predefined Avro schema, then stores into HBase with HBaseStorage specifying which columns to use. Each row from the original DB consists of a row ID and 5 image columns, although any number of the image columns could be empty/NULL, e.g.:

| KEY | IMG1 | IMG2 | IMG3 |
|---|---|---|---|
| 1 | null | blob | blob |
| 2 | blob | null | blob |

In the HBase table, the column family for the images is i, with column names i1, i2, etc. It was my understanding that any cells containing a NULL value would automatically not be stored in HBase; however, those cells are being stored as a key-value pair with a full key and an empty value '', rather than not existing.

Is this the normal/expected behaviour? If not, what is the best way to get around it? Do I have to project the key with each of the five columns, filter out nulls and store them individually (i.e. store 5 times)? Or could it be related to a difference in the way Avro, Pig and HBase all represent null values? Is there a simple type conversion I could do on all the image columns so that they would automatically not be stored if they are empty?

Versions: we are running plain HDP 2.1, soon to be upgraded to 2.4.
Labels:
- Apache HBase
- Apache Pig
06-30-2016
12:21 AM
I haven't tested it, but I believe using -tagFile will prepend the file name, which will place it at position 0 instead of 1, with your data columns shifted along accordingly, i.e.:

GENERATE
    (chararray)$0 AS Filename,
    (chararray)$1 AS ID,
    etc.

Hope this solves it!
06-28-2016
12:21 AM
1 Kudo
Hi @João Souza, no problem. Yes, you should still be able to use SPLIT, just with IF (date=='2016-06-23') comparing a string type instead of a date type. Hope this helps!
06-27-2016
01:05 AM
3 Kudos
There's currently no mechanism to force the name of MapReduce output files. Once you've loaded all the data and added the extra column, you can split your alias into one per date, then store each one in a different directory, e.g.:

SPLIT Src INTO
    Src23 IF date==ToDate('2016-06-23', 'yyyy-MM-dd'),
    Src24 IF date==ToDate('2016-06-24', 'yyyy-MM-dd'),
    Src25 IF date==ToDate('2016-06-25', 'yyyy-MM-dd');
STORE Src23 INTO '/data/Src/2016-06-23' using PigStorage(' ');

This way, you could merge the output files in each date directory using -getmerge (and specify the resulting file name), and then copy them back onto HDFS.

Another option is to force a reduce job to occur (yours is map-only) and set PARALLEL 1. It will be a slower job, but you will get one output file. E.g.:

Ordered23 = ORDER Src23 BY somecolumn PARALLEL 1;
STORE Ordered23 INTO '/data/Src/2016-06-23' using PigStorage(' ');

You would still have to rename the files outside of this process.
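A rough sketch of the -getmerge step mentioned above (the merged file name and target directory are made up):

```bash
# Merge the part files for one date into a single, explicitly named CSV on the
# local file system, then copy the merged file back onto HDFS.
hdfs dfs -getmerge /data/Src/2016-06-23 /tmp/src_2016-06-23.csv
hdfs dfs -put /tmp/src_2016-06-23.csv /data/merged/src_2016-06-23.csv
```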
04-28-2016
11:51 PM
Nice solution @Predrag Minovic. Simple and neat, thanks! +1
03-18-2016
12:13 AM
1 Kudo
Thanks @Rushikesh Deshmukh for your response. Using these, how would you recommend 'correcting' an existing store of data - compactions reduce the number of files per region, but how would you reduce the number of existing regions? Is this possible with the current status of merge tools?
03-15-2016
05:33 AM
3 Kudos
http://hbase.apache.org/0.94/book/important_configurations.html suggests manually managing HBase region splits. Do others in the community do this? If so:
- Do you have a use case or example regarding the steps required (including setting hbase.hregion.max.filesize), and how/what you use to implement them?
- Have you found it worthwhile in terms of effort vs benefits?
Thanks
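For context, a rough, hedged sketch of the kind of manual intervention I mean (the table name and split point below are hypothetical): set hbase.hregion.max.filesize high enough in hbase-site.xml that automatic splits effectively never trigger, then issue splits yourself from the HBase shell at row keys of your choosing:

```bash
# Manually split a region of 'my_table' at a chosen row key via the HBase shell.
echo "split 'my_table', 'rowkey_split_point'" | hbase shell
```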
Labels:
- Apache HBase
12-16-2015
12:21 AM
Hi @Chris Nauroth, thanks for the confirmation, and great to know the option has been suggested 🙂
12-16-2015
12:18 AM
Hi @Neeraj Sabharwal, thank you for the script line - looks like I will be adding that in!
12-14-2015
04:28 AM
1 Kudo
When copying files from HDFS to a local file system:

hdfs dfs -copyToLocal <source> <dest>

you have the options -crc and -ignoreCrc to turn the checksum files on/off. I am merging/copying out to local using:

hdfs dfs -getmerge <sourceDir> <destFile>

and end up with a hidden .destFile.crc file for each destFile. Is there an equivalent way to turn this function off, or otherwise automatically remove the .destFile.crc if the corresponding destFile is deleted (from the local file system)? Thank you!
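For illustration, this is where the hidden checksum file ends up and the manual clean-up I would like to avoid (paths and file names here are made up):

```bash
# -getmerge writes the merged file plus a hidden .<name>.crc alongside it;
# deleting that .crc by hand is the step I'd like to switch off or automate.
hdfs dfs -getmerge /hdfs/output/path /local/dest/extract.csv
rm -f /local/dest/.extract.csv.crc
```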
Labels:
- Apache Hadoop
12-11-2015
05:37 AM
Thanks to @Deepesh for the workaround. Also wanted to add (for info) that these steps will not be required after the HDP upgrade. We will use

ALTER TABLE activeTable CONCATENATE;

to combine the many smaller ORC files into fewer larger ones (possible from Hive 0.14+). https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate
12-11-2015
12:28 AM
Hi Deepesh, gave this a try - worked perfectly! Thank you!
12-10-2015
11:31 PM
Hi Scott, it's HDP 2.1.11 (Hive 0.13.1), and the data type is "timestamp". The DDLs are identical. I am trying to avoid storing the data as a different type, but can do this until an upgrade if necessary.
12-10-2015
06:11 AM
1 Kudo
AIM: To grab a daily extract of data stored in HDFS/Hive, process it using Pig, then make the results available externally as a single CSV file (automated using a bash script).

OPTIONS:

1. Force the output from the Pig script to be stored as one file using 'PARALLEL 1', and then copy it out using '-copyToLocal':

extractAlias = ORDER stuff BY something ASC;
STORE extractAlias INTO '/hdfs/output/path' USING CSVExcelStorage() PARALLEL 1;

2. Allow default parallelism during the Pig STORE and use '-getmerge' when copying out the extract results:

hdfs dfs -getmerge '/hdfs/output/path' '/local/dest/path'

QUESTION: Which way is more efficient/practical and why? Are there any other ways?
Labels:
- Apache Hadoop
- Apache Pig
12-10-2015
04:36 AM
We have experienced an issue where (re)processing data in Hive overwrites timestamp data.
This occurs with HDP 2.1, but not 2.3.
We are using Hive to run an ad hoc 'reorg' or 'reprocess' on existing Hive tables to reduce the number of files stored - improving query performance and reducing pressure on the cluster (found a nice explanation from @david.streever here
https://community.hortonworks.com/questions/4024/how-many-files-is-too-many-on-a-modern-hdp-cluster.html).
The active Hive table is added to daily, creating at least one ORC file per day. The schema contains several timestamp columns (e.g. created_timestamp for when each record was originally created on the source system).
We then create a reorgTable with a schema identical to activeTable and copy the data from activeTable into reorgTable, which combines many of the smaller daily files and reduces the overall number.
However, this process edits/overwrites timestamp data (and does not touch other columns):
1. Contents of activeTable:

| ID | created_timestamp |
|---|---|
| 01 | 2000-01-01 13:08:21.110 |
| 02 | 1970-01-01 01:02:03.450 |
| 03 | 1990-10-08 03:09:02.780 |

2. Copy data from activeTable to reorgTable:

INSERT INTO TABLE reorgTable SELECT * FROM activeTable;

3. Contents of reorgTable:

| ID | created_timestamp |
|---|---|
| 01 | 1990-10-08 03:09:02.780 |
| 02 | 1990-10-08 03:09:02.780 |
| 03 | 1990-10-08 03:09:02.780 |
Has anyone else experienced this? Is there a solution other than upgrading?
Or an alternative way to reprocess the data that might not have the same effect?
Thank you!
Labels:
- Apache Hive