Member since 09-29-2015
44 Posts
33 Kudos Received
8 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2451 | 03-07-2017 02:30 PM |
| | 1256 | 12-14-2016 02:53 AM |
| | 8721 | 12-07-2015 03:58 PM |
| | 3321 | 11-06-2015 07:40 PM |
| | 1904 | 10-26-2015 05:59 PM |
12-07-2015
03:58 PM
2 Kudos
Hey @Wes Floyd, not that I'm aware of, but if you look at RMP-3737, it looks like this is coming in DAL-M20 or HDP 2.3.4.
11-25-2015
08:43 PM
Yes, Syncsort is a partner. Look here.
11-25-2015
08:41 PM
@pbalasundaram I've seen a customer use JRecord to build a MapReduce InputFormat for use with HDFS, MapReduce, etc. Look around on GitHub and you should find examples. Obviously this takes more work than using something off the shelf. Besides Syncsort, Attunity also has capabilities here.
11-12-2015
01:40 AM
1 Kudo
Martin, I have some contacts over at PepperData that I can introduce you to. Also I do know of a customer successfully using PepperData. Give me a call tomorrow to discuss.
11-06-2015
07:40 PM
1 Kudo
@vsomani@hortonworks.com There is a similar question here; I've copied the answer below. Each HDFS block occupies ~250 bytes of RAM on the NameNode (NN), plus an additional ~250 bytes for each file and directory. The default block size is 128 MB, so you can calculate how much RAM will support how many files. To guarantee persistence of the filesystem metadata, the NN also has to keep a copy of its in-memory structures on disk: the NN directories hold the fsimage and the edit logs. The edit logs capture all changes happening to HDFS (such as new files and directories); think of the redo logs that most RDBMSs use. The fsimage is a full snapshot of the metadata state. The fsimage file will not grow beyond the memory allocated to the NN, and the edit logs get rotated once they reach a specific size. It is always safest to allocate significantly more capacity for the NN directory than needed, for example 4 times what is configured for NN memory, but if disk capacity isn't an issue, allocate 500 GB+ if you can spare it (more capacity is very common, especially when setting up a 3+3 or 4+4 RAID 10 mirrored set). Setting up RAID at the disk level, such as RAID 1 or RAID 1/0, makes sense, and with RAID in place a single NN directory is just fine.
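To make "you can do the calculation" concrete, here is a rough Python sketch. It takes the ~250-byte figures quoted above at face value; the function name and the example workload are made up for illustration, and real heap usage also depends on other things the NameNode keeps in memory.

```python
# Rule-of-thumb figures quoted in the post (approximations, not measurements).
BYTES_PER_BLOCK = 250                # ~250 bytes of NN RAM per HDFS block
BYTES_PER_FILE_OR_DIR = 250          # ~250 bytes per file or directory
DEFAULT_BLOCK_SIZE = 128 * 1024**2   # 128 MB default block size


def estimate_nn_heap_bytes(num_files, avg_file_size_bytes, num_dirs,
                           block_size=DEFAULT_BLOCK_SIZE):
    """Estimate NameNode RAM needed for the namespace (hypothetical helper)."""
    # Ceiling division: a 200 MB file with 128 MB blocks occupies 2 blocks.
    blocks_per_file = -(-avg_file_size_bytes // block_size)
    block_bytes = num_files * blocks_per_file * BYTES_PER_BLOCK
    object_bytes = (num_files + num_dirs) * BYTES_PER_FILE_OR_DIR
    return block_bytes + object_bytes


# Illustrative workload: 1M files of 200 MB each, spread over 10k directories.
estimate = estimate_nn_heap_bytes(1_000_000, 200 * 1024**2, 10_000)
print(f"{estimate / 1024**2:.0f} MB of NN RAM for the namespace")
```

Treat the result as a floor for sizing and leave generous headroom on top of it.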
10-26-2015
05:59 PM
Got this information from Sai Nukavarapu and wanted to add it back to the thread in case anyone else needs more context on what memorySeconds and vcoreSeconds mean.
1. memorySeconds - The amount of memory the application has allocated (megabyte-seconds): the aggregated amount of memory (in megabytes) the application has allocated, multiplied by the number of seconds the application has been running.
2. vcoreSeconds - The amount of CPU resources the application has allocated (virtual core-seconds): the aggregated number of vcores the application has allocated, multiplied by the number of seconds the application has been running.
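As a worked example of those two definitions, here is a small Python sketch. It assumes the aggregation is a sum over containers of (allocated resource x seconds the container was held); the class, the function name, and the container figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContainerAllocation:
    """One container's allocation (hypothetical structure for illustration)."""
    memory_mb: int   # memory allocated to the container, in MB
    vcores: int      # virtual cores allocated to the container
    seconds: float   # how long the container was held


def aggregate_usage(allocations):
    """Return (memory_seconds, vcore_seconds) summed over all containers."""
    memory_seconds = sum(a.memory_mb * a.seconds for a in allocations)
    vcore_seconds = sum(a.vcores * a.seconds for a in allocations)
    return memory_seconds, vcore_seconds


# Made-up application: one 1 GB container held for 120 s
# plus two 2 GB containers held for 60 s each.
allocations = [
    ContainerAllocation(memory_mb=1024, vcores=1, seconds=120),
    ContainerAllocation(memory_mb=2048, vcores=1, seconds=60),
    ContainerAllocation(memory_mb=2048, vcores=1, seconds=60),
]
memory_seconds, vcore_seconds = aggregate_usage(allocations)
print(memory_seconds, vcore_seconds)
```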
10-21-2015
03:38 AM
1 Kudo
@hfaouaz@hortonworks.com - Each HDFS block occupies ~250 bytes of RAM on the NameNode (NN), plus an additional ~250 bytes for each file and directory. The default block size is 128 MB, so you can calculate how much RAM will support how many files. To guarantee persistence of the filesystem metadata, the NN also has to keep a copy of its in-memory structures on disk: the NN directories, as you mentioned, hold the fsimage and the edit logs. The edit logs capture all changes happening to HDFS (such as new files and directories); think of the redo logs that most RDBMSs use. The fsimage is a full snapshot of the metadata state. The fsimage file will not grow beyond the memory allocated to the NN, and the edit logs get rotated once they reach a specific size. It is always safest to allocate significantly more capacity for the NN directory than needed, say 4 times what is configured for NN memory, but if disk capacity isn't an issue, allocate 500 GB+ if you can spare it (more capacity is very common, especially when setting up a 3+3 or 4+4 RAID 10 mirrored set). Setting up RAID at the disk level, such as RAID 1 or RAID 1/0, makes sense, and with RAID in place a single NN directory is just fine.
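Going the other way (from a given NN memory setting to a file count and a disk size), a back-of-the-envelope Python sketch might look like this. Both helper names are hypothetical, the ~250-byte figures and the 4x factor are simply the rules of thumb from this post, and the file count is an upper bound that ignores directories and everything else the NN holds in memory.

```python
BYTES_PER_BLOCK = 250          # rule of thumb from the post
BYTES_PER_FILE_OR_DIR = 250    # rule of thumb from the post


def max_files_for_heap(heap_bytes, blocks_per_file=1):
    """Upper bound on files a given amount of NN RAM can track (hypothetical helper)."""
    per_file = BYTES_PER_FILE_OR_DIR + blocks_per_file * BYTES_PER_BLOCK
    return heap_bytes // per_file


def nn_dir_capacity_bytes(nn_memory_bytes, safety_factor=4):
    """Disk to set aside for the NN directory: 4x NN memory, per the post."""
    return nn_memory_bytes * safety_factor


GIB = 1024**3
print(max_files_for_heap(1 * GIB))            # single-block files per GiB of RAM
print(nn_dir_capacity_bytes(8 * GIB) // GIB)  # GiB of disk for an 8 GiB NN
```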
10-20-2015
02:07 AM
2 Kudos
Hey @Randy Gelhausen, I went through an exercise like this for a client and took notes on what I did so I could repeat it. Note that this was with an earlier version of HDP and I haven't tried it on HDP 2.3 yet, but it may be worth a shot.
Step 1: Here is how I created the HBase table and placed data in it.
hbase shell
hbase(main):001:0> create 'short_urls', {NAME => 'u'}, {NAME => 's'}
hbase(main):062:0> put 'short_urls', 'bit.ly/aaaa', 's:hits', '100'
hbase(main):063:0> put 'short_urls', 'bit.ly/aaaa', 'u:url', 'hbase.apache.org'
hbase(main):062:0> put 'short_urls', 'bit.ly/abcd', 's:hits', '123'
hbase(main):063:0> put 'short_urls', 'bit.ly/abcd', 'u:url', 'example.com/foo'
hbase(main):064:0> scan 'short_urls'
ROW COLUMN+CELL
bit.lyaaaa column=s:hits, timestamp=1412121062283, value=100
bit.lyaaaa column=u:url, timestamp=1412121071821, value=hbase.apache.org
1 row(s) in 0.0080 seconds
Step 2: This is how I launched Hive and created an external table pointing at HBase.
hive --auxpath /usr/lib/hive-hcatalog/share/hcatalog/storage-handlers/hbase/lib/hive-hcatalog-hbase-storage-handler-0.13.0.2.1.1.0-385.jar,/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.1.0-385.jar,/usr/lib/hive/lib/zookeeper-3.4.5.2.1.1.0-385.jar,/usr/lib/hive/lib/guava-11.0.2.jar
CREATE EXTERNAL TABLE short_urls(short_url string, url string, hit_count string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,u:url,s:hits")
TBLPROPERTIES ("hbase.table.name" = "short_urls");
Step 3: From Hive you can now query HBase.
hive> select * from short_urls;
OK
bit.lyaaaa hbase.apache.org 100
bit.lyabcd example.com/foo 123
Time taken: 0.445 seconds, Fetched: 2 row(s)
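If you need to do this for more than one table, the DDL in Step 2 can be generated. Below is a minimal Python sketch, not part of my original notes: the helper name is hypothetical, every column is typed as string like in the example, and mapping entries are joined without spaces on the understanding that whitespace can be read as part of the HBase column name.

```python
def build_hbase_external_table_ddl(hive_table, hbase_table, key_column, columns):
    """Build a CREATE EXTERNAL TABLE statement for an HBase-backed Hive table.

    columns: list of (hive_column_name, "family:qualifier") pairs.
    The Hive key column is listed first and mapped to the HBase row key (:key).
    """
    hive_cols = [f"{key_column} string"] + [f"{name} string" for name, _ in columns]
    col_list = ", ".join(hive_cols)
    # Mapping entries must line up positionally with the Hive columns.
    mapping = ",".join([":key"] + [hbase_col for _, hbase_col in columns])
    return (
        f"CREATE EXTERNAL TABLE {hive_table}({col_list}) "
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' "
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}") '
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}");'
    )


ddl = build_hbase_external_table_ddl(
    hive_table="short_urls",
    hbase_table="short_urls",
    key_column="short_url",
    columns=[("url", "u:url"), ("hit_count", "s:hits")],
)
print(ddl)
```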
10-15-2015
02:08 PM
It looks like the -1 values appear because the application has already completed: while an application is running, the output of this call has much more detail. I think I'm OK at this point, but please share your experiences.
10-15-2015
01:57 PM
Some of the information in this link is useful, particularly the section "Elements of the app (Application) object", but it is a little vague about what the fields in the above output mean. Maybe the -1's are due to the application having already completed? I'm curious to hear whether, and how, others have handled a similar scenario for chargeback purposes.
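For the chargeback side, this is roughly the shape I have in mind, as a Python sketch only. The field names (memorySeconds, vcoreSeconds, allocatedMB) are elements of the app object in the ResourceManager REST response; the rates, the sample values, and the function name are made up, and reading -1 as "no live allocation" is just the working theory from this thread.

```python
def chargeback(app, rate_per_mb_second, rate_per_vcore_second):
    """Price one application from its aggregated usage (hypothetical helper).

    app: a dict shaped like one app object from the ResourceManager apps API.
    """
    memory_cost = app["memorySeconds"] * rate_per_mb_second
    cpu_cost = app["vcoreSeconds"] * rate_per_vcore_second
    # Working theory: live counters such as allocatedMB read -1 once the
    # application has completed, so only the aggregated fields are priced.
    live_counters_available = app.get("allocatedMB", -1) >= 0
    return {
        "id": app["id"],
        "cost": memory_cost + cpu_cost,
        "live_counters_available": live_counters_available,
    }


# Made-up sample record for a completed application.
sample_app = {
    "id": "application_0000000000000_0001",
    "memorySeconds": 368640,
    "vcoreSeconds": 240,
    "allocatedMB": -1,
}
print(chargeback(sample_app, rate_per_mb_second=0.000001, rate_per_vcore_second=0.001))
```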