Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3173 | 12-25-2018 10:42 PM |
| | 14193 | 10-09-2018 03:52 AM |
| | 4764 | 02-23-2018 11:46 PM |
| | 2481 | 09-02-2017 01:49 AM |
| | 2914 | 06-21-2017 12:06 AM |
05-27-2016
11:48 PM
Try to set your location to: <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/> From the feed specification page: the granularity of the date pattern in the path should be at least that of the frequency of the feed.
05-27-2016
03:01 PM
In sqoop-action 0.2, the elements inside "action" must appear in a specific order (the xs:sequence below): job-tracker, name-node, prepare, job-xml, and so on. minOccurs="0" means an element is optional; minOccurs="1" means it is required. Full details here. <xs:sequence>
<xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="prepare" type="sqoop:PREPARE" minOccurs="0" maxOccurs="1"/>
<xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="configuration" type="sqoop:CONFIGURATION" minOccurs="0" maxOccurs="1"/>
<xs:choice>
<xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="arg" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
</xs:choice>
<xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
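Under that ordering, a complete sqoop action body might look like the sketch below. This is a minimal illustration, not from the original thread: the JDBC URL, paths, and action names are placeholders, and the optional job-xml, file, and archive elements are simply omitted (minOccurs="0" allows that).

```xml
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <!-- required elements, in the order the xs:sequence demands -->
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- optional elements must still respect the sequence order -->
        <prepare>
            <delete path="${nameNode}/tmp/sqoop-out"/>
        </prepare>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>default</value>
            </property>
        </configuration>
        <!-- the xs:choice: either one <command> or repeated <arg> elements -->
        <command>import --connect jdbc:mysql://db.example.com/mydb --table T1 --target-dir /tmp/sqoop-out</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Putting, say, <prepare> after <configuration> would fail schema validation, which is the usual cause of "Invalid content was found" errors in Oozie workflows.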
05-27-2016
01:45 PM
Can you upload your file? Just the first few lines will be enough. Thanks.
05-27-2016
12:42 PM
I suspect it's not in the correct format. Can you open the file in your favorite editor and count the characters on the first line between "abc" and "su", just moving the cursor to the right? If there are 6 characters, then the delimiter is the literal string \u0001 rather than a single Ctrl-A byte. In that case try this, assuming you are on Linux (make a copy of your file first): sed -i 's/\\u0001/^A/g' datafilename To type ^A, keep the "Control" key pressed and type "v" followed by "a". Replace datafilename with your data file name and retry listing your table from Hive.
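You can rehearse the replacement on a throwaway file first. This sketch uses GNU sed's \x01 escape in the replacement instead of the interactively typed ^A; the file name is made up:

```shell
# Create a demo file whose "delimiter" is the literal 6-character string \u0001
printf 'abc\\u0001su\n' > /tmp/delim_demo.txt
# Replace each literal \u0001 with a real Ctrl-A byte (\x01 is a GNU sed extension)
sed -i 's/\\u0001/\x01/g' /tmp/delim_demo.txt
# Inspect the bytes: the six-character sequence is now a single control character
od -c /tmp/delim_demo.txt
```

Note the doubled backslash in the pattern: a bare \u is not a literal backslash in a sed regex, so 's/\u0001/.../' would not match the six-character string.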
05-27-2016
12:09 PM
'\u0001' is a single character, Ctrl-A. What do you have in your data file as the delimiter: a single Ctrl-A, or the 6 characters '\u0001'? The delimiter in Hive must be a single character, and Ctrl-A is in fact the default. The best way to generate it is programmatically. If you open a file containing Ctrl-A's in vi you can see, for example: 1010^Abob^A2016-04-10 05:52:25.0 (here I have 3 fields: id, string, and timestamp).
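"Generate it programmatically" can be as simple as a printf octal escape; a sketch with an arbitrary output path:

```shell
# printf's \001 escape emits a real Ctrl-A byte between the three fields
printf '1010\001bob\0012016-04-10 05:52:25.0\n' > /tmp/ctrl_a_row.txt
# awk splits on the Ctrl-A delimiter and reports the field count
awk -F'\001' '{ print NF }' /tmp/ctrl_a_row.txt   # prints 3
```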
05-27-2016
12:34 AM
Yes, metadata is stored in special HBase tables in the 'hbase' namespace. In the latest versions of HBase you can inspect them by opening the hbase shell and running, for example: list_namespace_tables 'hbase'
scan 'hbase:meta' This approach won't work in distributed HBase as-is; some changes to "meta" would be required. Also, in your case it might be necessary to restore the backup directory to the same path on the target system, but most likely not. Just give it a try.
05-26-2016
11:56 PM
2 Kudos
Just stop HBase and copy the contents of your hbase.rootdir directory (in the HDFS version it is /apps/hbase/data; in your case it's somewhere on your local file system). That will be your backup artifact. Then, as a test, try to restore it to another environment and make sure it works by listing and scanning some tables. All HBase metadata is included there, and it should work as-is.
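The copy-and-verify idea can be rehearsed on plain directories. The paths below are made up for illustration; in real life the source is your actual hbase.rootdir and HBase must be fully stopped before copying:

```shell
# Stand-in for hbase.rootdir (hypothetical scratch paths)
ROOTDIR=/tmp/hbase_rootdir_demo
BACKUP=/tmp/hbase_backup_demo
rm -rf "$ROOTDIR" "$BACKUP"
mkdir -p "$ROOTDIR/data/default/t1"
echo "region contents" > "$ROOTDIR/data/default/t1/f1"
# The backup is just a recursive copy of the whole tree
cp -r "$ROOTDIR" "$BACKUP"
# Verify the restore candidate matches the original byte for byte
diff -r "$ROOTDIR" "$BACKUP" && echo "backup matches"
```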
05-24-2016
11:43 PM
1 Kudo
One more way to look at this: if you have 3 ZKs you can afford to lose one; if you have 5 you can afford to lose two. If your organization is aggressively applying security patches and other upgrades (firmware, kernel, Java, other packages used by Hadoop) and taking nodes down to do the job, then during those upgrades a 3-node ensemble runs with only two nodes, and if you are unlucky and one of those goes down, your whole cluster goes down with it. So in this case 5 are better. However, the more ZK nodes you have, the slower ZK becomes for writes.
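The arithmetic behind those tolerances: ZooKeeper needs a strict majority (quorum) to make progress, so an ensemble of n tolerates n minus quorum failures. A quick illustration:

```shell
# quorum = strict majority; failures tolerated = n - quorum
for n in 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  echo "n=$n quorum=$quorum tolerates $(( n - quorum )) failure(s)"
done
```

This also shows why even ensemble sizes buy nothing: 4 nodes need a quorum of 3 and tolerate only one failure, the same as 3 nodes.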
05-23-2016
10:16 PM
1 Kudo
There is no way to tell Sqoop to import all tables into HBase, because you have to use "--hbase-table", which is incompatible with "--import-all-tables". Note that HBase is not a general-purpose database/storage; it's meant to store a relatively small number of tables and provide real-time access to them, so it doesn't make sense to import hundreds of tables into HBase. For a reasonably small number of tables you can create a script: for t in t1 t2 t3; do
sqoop import --connect jdbc:mysql://... --table $t --hbase-table $t --hbase-create-table ...
done Note that it's a good idea to pre-create the HBase tables, for example to set splitting, compression, and so on, because Sqoop will not do that. Another approach for your project would be to import all your tables into Hive, create a few Hive tables mapped onto HBase, and populate them from your Hive-imported tables.