Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3173 | 12-25-2018 10:42 PM |
| | 14193 | 10-09-2018 03:52 AM |
| | 4764 | 02-23-2018 11:46 PM |
| | 2481 | 09-02-2017 01:49 AM |
| | 2914 | 06-21-2017 12:06 AM |
05-27-2016
11:48 PM
Try to set your location to: <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/> From the feed specification page: the granularity of the date pattern in the path should be at least that of the frequency of the feed.
05-27-2016
03:01 PM
In sqoop-action 0.2, the elements inside "action" must appear in a specific order (the xs:sequence below): job-tracker, name-node, prepare, job-xml, and so on. minOccurs="0" means an element is optional; minOccurs="1" means it is required. Full details here. <xs:sequence>
<xs:element name="job-tracker" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="name-node" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="prepare" type="sqoop:PREPARE" minOccurs="0" maxOccurs="1"/>
<xs:element name="job-xml" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="configuration" type="sqoop:CONFIGURATION" minOccurs="0" maxOccurs="1"/>
<xs:choice>
<xs:element name="command" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="arg" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
</xs:choice>
<xs:element name="file" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="archive" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
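Under that ordering, a complete sqoop action body might look like the sketch below. This is a minimal illustration, not from the original thread: the JDBC URL, paths, and action names are placeholders, and the optional job-xml, file, and archive elements are simply omitted (minOccurs="0" allows that).

```xml
<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <!-- required elements, in the order the xs:sequence demands -->
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- optional elements must still respect the sequence order -->
        <prepare>
            <delete path="${nameNode}/tmp/sqoop-out"/>
        </prepare>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>default</value>
            </property>
        </configuration>
        <!-- the xs:choice: either one <command> or repeated <arg> elements -->
        <command>import --connect jdbc:mysql://db.example.com/mydb --table T1 --target-dir /tmp/sqoop-out</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Putting, say, <prepare> after <configuration> would fail schema validation, which is the usual cause of "Invalid content was found" errors in Oozie workflows.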
05-27-2016
01:45 PM
Can you upload your file? Just the first few lines will be enough. Thanks.
05-27-2016
12:42 PM
I suspect it's not in the correct format. Can you open the file in your favorite editor and count the characters on the first line between "abc" and "su", just moving the cursor to the right? If there are 6 characters, then the delimiter is the literal string \u0001 rather than a single Ctrl-A byte. In that case try this, assuming you are on Linux (make a copy of your file first): sed -i 's/\\u0001/^A/g' datafilename To type ^A, keep the "Control" key pressed and type "v" followed by "a". Replace datafilename with your data file name and retry listing your table from Hive.
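You can rehearse the replacement on a throwaway file first. This sketch uses GNU sed's \x01 escape in the replacement instead of the interactively typed ^A; the file name is made up:

```shell
# Create a demo file whose "delimiter" is the literal 6-character string \u0001
printf 'abc\\u0001su\n' > /tmp/delim_demo.txt
# Replace each literal \u0001 with a real Ctrl-A byte (\x01 is a GNU sed extension)
sed -i 's/\\u0001/\x01/g' /tmp/delim_demo.txt
# Inspect the bytes: the six-character sequence is now a single control character
od -c /tmp/delim_demo.txt
```

Note the doubled backslash in the pattern: a bare \u is not a literal backslash in a sed regex, so 's/\u0001/.../' would not match the six-character string.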
05-27-2016
12:09 PM
'\u0001' is a single character, Ctrl-A. What do you have in your data file as the delimiter: a single Ctrl-A, or the 6 characters '\u0001'? The delimiter in Hive must be a single character, and Ctrl-A is in fact the default. The best way to generate it is programmatically. If you open a file containing Ctrl-A's in vi you can see, for example: 1010^Abob^A2016-04-10 05:52:25.0 (here I have 3 fields: id, string, and timestamp).
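"Generate it programmatically" can be as simple as a printf octal escape; a sketch with an arbitrary output path:

```shell
# printf's \001 escape emits a real Ctrl-A byte between the three fields
printf '1010\001bob\0012016-04-10 05:52:25.0\n' > /tmp/ctrl_a_row.txt
# awk splits on the Ctrl-A delimiter and reports the field count
awk -F'\001' '{ print NF }' /tmp/ctrl_a_row.txt   # prints 3
```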
05-27-2016
12:34 AM
Yes, metadata is stored in special HBase tables in the 'hbase' namespace. In the latest versions of HBase you can inspect them by opening the hbase shell and running, for example: list_namespace_tables 'hbase'
scan 'hbase:meta' This approach won't work in distributed HBase as-is; some changes to "meta" would be required. Also, in your case it might be necessary to restore the backup directory to the same path on the target system, but most likely not. Just give it a try.
05-26-2016
11:56 PM
2 Kudos
Just stop HBase and copy the contents of your hbase.rootdir directory (in the HDFS version it is /apps/hbase/data; in your case it's somewhere on your local file system). That will be your backup artifact. Then, as a test, try to restore it to another environment and make sure it works by listing and scanning some tables. All HBase metadata is included there, and it should work as-is.
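The copy-and-verify idea can be rehearsed on plain directories. The paths below are made up for illustration; in real life the source is your actual hbase.rootdir and HBase must be fully stopped before copying:

```shell
# Stand-in for hbase.rootdir (hypothetical scratch paths)
ROOTDIR=/tmp/hbase_rootdir_demo
BACKUP=/tmp/hbase_backup_demo
rm -rf "$ROOTDIR" "$BACKUP"
mkdir -p "$ROOTDIR/data/default/t1"
echo "region contents" > "$ROOTDIR/data/default/t1/f1"
# The backup is just a recursive copy of the whole tree
cp -r "$ROOTDIR" "$BACKUP"
# Verify the restore candidate matches the original byte for byte
diff -r "$ROOTDIR" "$BACKUP" && echo "backup matches"
```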
05-24-2016
11:43 PM
1 Kudo
One more way to look at this: if you have 3 ZKs you can afford to lose one; if you have 5 you can afford to lose two. If your organization is aggressively applying security patches and other upgrades (firmware, kernel, Java, other packages used by Hadoop) and taking nodes down to do the job, then during those upgrades a 3-node ensemble runs with only two nodes, and if you are unlucky and one of those goes down, your whole cluster goes down with it. So in this case 5 are better. However, the more ZK nodes you have, the slower ZK becomes for writes.
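The arithmetic behind those tolerances: ZooKeeper needs a strict majority (quorum) to make progress, so an ensemble of n tolerates n minus quorum failures. A quick illustration:

```shell
# quorum = strict majority; failures tolerated = n - quorum
for n in 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  echo "n=$n quorum=$quorum tolerates $(( n - quorum )) failure(s)"
done
```

This also shows why even ensemble sizes buy nothing: 4 nodes need a quorum of 3 and tolerate only one failure, the same as 3 nodes.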
05-23-2016
10:16 PM
1 Kudo
There is no way to tell Sqoop to import all tables into HBase, because you have to use "--hbase-table", which is incompatible with "--import-all-tables". Note that HBase is not a general-purpose database/storage; it's meant to store a relatively small number of tables and provide real-time access to them, so it doesn't make sense to import hundreds of tables into HBase. For a reasonably small number of tables you can create a script: for t in t1 t2 t3; do
sqoop import --connect jdbc:mysql://... --table $t --hbase-table $t --hbase-create-table ...
done Note that it's a good idea to pre-create the HBase tables, for example to set splitting, compression, and so on, because Sqoop will not do that. Another approach for your project would be to import all your tables into Hive, create a few Hive tables mapped onto HBase, and populate them from your Hive-imported tables.