Member since: 09-29-2015
Posts: 44
Kudos Received: 33
Solutions: 8
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 507 | 03-07-2017 02:30 PM |
 | 289 | 12-14-2016 02:53 AM |
 | 2994 | 12-07-2015 03:58 PM |
 | 874 | 11-06-2015 07:40 PM |
 | 443 | 10-26-2015 05:59 PM |
04-11-2017
01:10 PM
1 Kudo
@Naseem Rafique Hello Naseem, I'm not sure if you've seen the Hortonworks product documentation, which walks through how to install Ambari: http://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/ch_Getting_Ready.html This may be an easier route to go. Thanks, -Dan
... View more
03-07-2017
02:32 PM
Once you recreate the Hive table, try rerunning your Pig script. Don't forget to add the -useHCatalog argument when you run it.
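For reference, a minimal sketch of passing that flag from the command line (riskfactor.pig is just a placeholder script name; in the Ambari Pig View the same flag goes in the Pig arguments box):

# -useHCatalog pulls in the HCatalog jars so Pig can read/write Hive tables
pig -useHCatalog -f riskfactor.pig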
... View more
03-07-2017
02:30 PM
2 Kudos
Hello @voca voca, I ran into the same problem but realized that the totmiles column within Hive should be a DOUBLE and not an INT as described in the tutorial. So if you take this block of code below and rerun it in the Hive view, it should work for you.

drop table riskfactor;
CREATE TABLE riskfactor (driverid string, events bigint, totmiles double, riskfactor float)
STORED AS ORC;
... View more
03-02-2017
01:51 PM
Great to hear that you got things working @Prasanna G
... View more
03-01-2017
02:24 PM
1 Kudo
@Prasanna G
You'll have to sudo su first (or just use sudo docker ps). See the screenshot.
... View more
02-28-2017
09:10 PM
@Adedayo Adekeye Ambari is the web UI that is used to administer, monitor, and provision a Hadoop cluster. It also has the concept of Views, which allow for browsing the Hadoop Distributed File System (HDFS) as well as querying data through Hive and writing Pig scripts, among other things (it's even extensible so you can build something custom). Within Ambari (example link: http://<YOUR AZURE PUBLIC IP>:8080/#/main/views/FILES/1.0.0/AUTO_FILES_INSTANCE) you can log in as raj_ops (with password raj_ops) to get to the Files view. Don't just click the link; you'll have to change it to match your Azure sandbox's public IP address. This also assumes you have port 8080 open in Azure's Network Security Group settings. You may also want to follow the instructions on how to reset the Ambari admin password: https://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/#setup-ambari-admin-password Hope this helps.
... View more
02-28-2017
07:48 PM
@Adedayo Adekeye Ah, I see. Within the HDP sandbox on Azure (and the local sandbox, for that matter) everything runs inside Docker, so you'll need to execute CLI commands through the Docker container. See my answer to an HCC question here on this very topic. Let me know if this helps.
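As a rough sketch (assuming the container is named sandbox, as it is in the HDP sandbox image):

# Open an interactive shell inside the sandbox container...
docker exec -it sandbox bash
# ...or run a single command directly, e.g. list the HDFS root
docker exec sandbox hadoop fs -ls /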
... View more
02-28-2017
07:01 PM
Hello @Adedayo Adekeye It sounds like you are having problems using PuTTY to SSH into your Azure sandbox VM. If that's the case, I would look at your Network Security Group settings within Azure and make sure that you have port 22 open for SSH access, as shown in this screenshot (basically like a firewall rule; see this link for details: https://docs.microsoft.com/en-us/azure/virtual-network/virtual-networks-nsg). Also, when you created the HDP sandbox in Azure, did you specify a password or an SSH public key? If a password, you'd need to enter the username and password into PuTTY. If you used the public SSH key (the default), you'll need to load that key into PuTTY. I'll try to get screenshots uploaded for you, but the picture upload option on HCC seems to be having issues currently.
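For reference, the plain ssh equivalent of the PuTTY setup looks roughly like the lines below (the azureuser account name and key path are assumptions; substitute whatever you chose when creating the VM):

# Key-based login (default when an SSH public key was supplied at VM creation)
ssh -i ~/.ssh/azure_sandbox_key azureuser@<your-azure-public-ip> -p 22
# Password-based login simply prompts for the password
ssh azureuser@<your-azure-public-ip> -p 22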
... View more
02-28-2017
05:46 PM
1 Kudo
The issue seems to be related to impersonation. It would help to see your hue.ini file, but have a look at this doc and make sure you follow the steps there: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/configure_hdp_hue.html There are similar issues in another HCC article: https://community.hortonworks.com/questions/26792/hue-is-not-allowed-to-impersonate-403.html Question: why not use Ambari Views?
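As a quick sanity check (assuming a standard HDP layout under /etc/hadoop/conf), you can confirm that the proxyuser settings the linked doc has you configure are actually in place:

# Hue impersonation needs hadoop.proxyuser.hue.hosts and hadoop.proxyuser.hue.groups in core-site.xml
grep -A 1 "hadoop.proxyuser.hue" /etc/hadoop/conf/core-site.xml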
... View more
02-28-2017
03:14 PM
Hi @Prasanna G You'll have to first copy the file from the local filesystem into the Docker container, like below.

First I created a directory in my Docker container:

docker exec sandbox mkdir /dan/

(You can then run docker exec sandbox ls to see that your mkdir worked.)

Then I copied the file to the directory I just created:

docker cp /home/drice/test2.txt sandbox:dan/test2.txt

(Here sandbox is the name of the Docker container running HDP; you can get a list of containers by running docker ps.)

Once the file is in the Docker container you can then copy it to Hadoop:

docker exec sandbox hadoop fs -put /dan/test2.txt /test2.txt

[root@sandbox drice]# docker exec sandbox hadoop fs -ls /
Found 13 items
drwxrwxrwx - yarn hadoop 0 2016-10-25 08:10 /app-logs
drwxr-xr-x - hdfs hdfs 0 2016-10-25 07:54 /apps
drwxr-xr-x - yarn hadoop 0 2016-10-25 07:48 /ats
drwxr-xr-x - hdfs hdfs 0 2016-10-25 08:01 /demo
drwxr-xr-x - hdfs hdfs 0 2016-10-25 07:48 /hdp
drwxr-xr-x - mapred hdfs 0 2016-10-25 07:48 /mapred
drwxrwxrwx - mapred hadoop 0 2016-10-25 07:48 /mr-history
drwxr-xr-x - hdfs hdfs 0 2016-10-25 07:47 /ranger
drwxrwxrwx - spark hadoop 0 2017-02-28 15:05 /spark-history
drwxrwxrwx - spark hadoop 0 2016-10-25 08:14 /spark2-history
-rw-r--r-- 1 root hdfs 15 2017-02-28 15:04 /test.txt
drwxrwxrwx - hdfs hdfs 0 2016-10-25 08:11 /tmp
drwxr-xr-x - hdfs hdfs 0 2016-10-25 08:11 /user

NOTE: Another way to do this is to just use the Ambari Files view to copy files graphically.
... View more
12-14-2016
02:53 AM
@Kiran Kumar have a look at this page from SAS here; it looks like it requires HDP 2.5 or greater.
... View more
12-13-2016
04:17 PM
@PW186004 Based on the IBM documentation here, it looks like this version of IBM Information Server supports Hortonworks versions 2.1, 2.2, and 2.3. I'd always look first at what the partner has certified. HDP 2.4 could possibly work, but I'd stick with what is official on IBM's site.
... View more
09-19-2016
02:11 PM
2 Kudos
Repo Description: Simple sample program on how to read/write data to HDFS.
Repo Info:
GitHub Repo URL: https://github.com/drnice/HDFSReadWrite
GitHub account name: drnice
Repo name: HDFSReadWrite
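For quick experiments, a rough command-line analogue of what the sample program does (the file names below are placeholders, not part of the repo):

hdfs dfs -put ./localfile.txt /tmp/localfile.txt   # write a local file into HDFS
hdfs dfs -cat /tmp/localfile.txt                   # read the contents back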
... View more
08-22-2016
12:28 PM
3 Kudos
Repo Description: A utility for testing Avro file formats to ensure that they are well formed.
Repo Info:
GitHub Repo URL: https://github.com/drnice/AvroTest
GitHub account name: drnice
Repo name: AvroTest
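If you just want a quick sanity check from the command line, the Avro tools jar can also verify a file is readable (the jar version and file name below are placeholders, not from this repo):

java -jar avro-tools-1.8.2.jar getschema mydata.avro   # print the writer schema
java -jar avro-tools-1.8.2.jar tojson mydata.avro      # dumps records; fails if the file is malformed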
... View more
08-08-2016
04:52 PM
NOTE: the WAL is the write-ahead log for all puts/deletes executed against a table; the WAL is what gives HBase durable writes. Did you happen to check whether the file you purged had any content first? Since this is a sandbox, it may not matter so much for you.
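As a rough sketch of that check on an HDP sandbox (assuming the default hbase.rootdir of /apps/hbase/data; adjust if yours differs):

# List the RegionServer WAL files and their sizes before purging anything
hdfs dfs -ls -R /apps/hbase/data/WALs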
... View more
08-08-2016
03:29 PM
Hey Vincent. I noticed you posted this question twice by accident...see response here https://community.hortonworks.com/questions/50351/javaioioexception-error-or-interrupted-while-split.html
... View more
08-08-2016
03:26 PM
@Vincent Peres Sometimes just powering off the VM without first taking down the services gracefully can cause the underlying filesystem to want to be checked before it can start up successfully, kind of like a safety check. Sometimes this is taken care of automatically, but in some scenarios a manual check/fix is needed. Could you try an hbase hbck command to see if any tables have an issue (see the information here: http://hbase.apache.org/0.94/book/hbck.in.depth.html)? I'd also look at the sandbox HBase Master UI to see if it shows any glaring issues: http://sandbox.hortonworks.com:16010/master-status (or just hit the quick link through Ambari). If there is a problem with a table, the hbck command should help get you out of it. Hope this helps.
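For reference, a minimal sketch of that check (run on the sandbox as a user with HBase access):

# Report inconsistencies across all tables; -details prints per-region findings
hbase hbck -details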
... View more
06-17-2016
09:50 PM
2 Kudos
Repo Description: This Spark Streaming example connects a client application to a Spark stream. As data is written to the client application on a specific socket (9087), the Spark stream connects to that port, reads the data, word-counts each line entered, and outputs the result.
Repo Info:
GitHub Repo URL: https://github.com/drnice/Spark-Streaming
GitHub account name: drnice
Repo name: Spark-Streaming
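If you want to try the stream without the repo's client application, a quick stand-in (assuming the stream connects to localhost:9087 as described) is to open the socket with netcat and type lines into it:

# Listen on port 9087 and keep the socket open; each line typed becomes stream input
nc -lk 9087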
... View more
06-16-2016
05:35 AM
Same issue Mark reported, on the HDP 2.4 sandbox, using sqoop import on a single table. Example command:

sqoop import --connect jdbc:mysql://192.168.1.17:3306/test --username drice --password hadoop --table client --hive-table default.client --hive-import -m 1

NOTE: Mark's workaround worked. New command:

sqoop import --connect jdbc:mysql://192.168.1.17:3306/test --username drice --password hadoop --table client --hive-table default.client --hive-import -m 1 --driver com.mysql.jdbc.Driver
... View more
04-07-2016
01:38 PM
1 Kudo
Tom is correct, but if you're using a virtualization client such as VMware Fusion you'll have to do it differently. Grab the MAC address of the VM by selecting the VM -> Virtual Machine -> Settings -> Network Adapter -> Advanced Options and copying the MAC address. Then edit the DHCP configuration:

vi "/Library/Preferences/VMware Fusion/vmnet8/dhcp.conf"

and add this entry at the bottom of the conf file:

host hdp24 {
  hardware ethernet 00:0C:29:42:61:D7;
  fixed-address 192.168.245.133;
}
... View more
03-22-2016
04:39 PM
@Ram Note that disks are required for NN also. See post related to sizing of NN. https://community.hortonworks.com/questions/1692/any-recommendation-on-how-to-partition-disk-space.html#answer-1762
... View more
12-07-2015
03:58 PM
2 Kudos
Hey @Wes Floyd, not that I'm aware of, but if you look at RMP-3737 it looks like this is coming in DAL-M20 or HDP 2.3.4.
... View more
11-25-2015
08:43 PM
Yes, Syncsort is a partner; look here.
... View more
11-25-2015
08:41 PM
@pbalasundaram I've seen a customer use JRecord to build a MapReduce InputFormat for HDFS, MapReduce, etc. Look around within GitHub and you should see examples. Obviously this would take more work than just using something off the shelf. Besides Syncsort, there are also capabilities within Attunity.
... View more
11-12-2015
01:40 AM
1 Kudo
Martin, I have some contacts over at PepperData that I can introduce you to. Also I do know of a customer successfully using PepperData. Give me a call tomorrow to discuss.
... View more
11-06-2015
07:40 PM
1 Kudo
@vsomani@hortonworks.com Similar question here; the answer is copied below. Each HDFS block occupies ~250 bytes of RAM on the NameNode (NN), plus an additional ~250 bytes is required for each file and directory. Block size by default is 128 MB, so you can do the calculation of how much RAM will support how many files. For example, using those figures, 10 million single-block files would need roughly (10M files + 10M blocks) x 250 bytes, or about 5 GB of NN heap. To guarantee persistence of the filesystem metadata, the NN also has to keep a copy of its memory structures on disk in the NN directories, which hold the fsimage and edit logs. The edit logs capture all changes happening to HDFS (such as new files and directories); think of the redo logs most RDBMSs use. The fsimage is a full snapshot of the metadata state. The fsimage file will not grow beyond the allocated NN memory, and the edit logs get rotated once they hit a specific size. It is always safest to allocate significantly more capacity for the NN directory than needed, for example 4 times what is configured for NN memory, but if disk capacity isn't an issue allocate 500 GB+ if you can spare it (more capacity is very common, especially when setting up a 3+3 or 4+4 RAID 10 mirrored set). Setting up RAID at the disk level like RAID 1 or RAID 1/0 makes sense, and having RAID means a single directory is just fine.
... View more
10-26-2015
05:59 PM
Got this information from Sai Nukavarapu and wanted to add it back to the thread in case anyone else needs more context on these meanings - what memorySeconds and vcoreSeconds mean.
1. memorySeconds - the amount of memory the application has allocated (megabyte-seconds): the aggregate amount of memory (in megabytes) the application has allocated, multiplied by the number of seconds the application has been running.
2. vcoreSeconds - the amount of CPU resources the application has allocated (virtual core-seconds): the aggregate number of vcores the application has allocated, multiplied by the number of seconds the application has been running.
For example, a container holding 2048 MB and 1 vcore for 60 seconds contributes 122,880 memorySeconds and 60 vcoreSeconds.
... View more
10-21-2015
03:38 AM
1 Kudo
@hfaouaz@hortonworks.com - Each HDFS block occupies ~250 bytes of RAM on the NameNode (NN), plus an additional ~250 bytes is required for each file and directory. Block size by default is 128 MB, so you can do the calculation of how much RAM will support how many files. To guarantee persistence of the filesystem metadata, the NN also has to keep a copy of its memory structures on disk in the NN dirs as you mentioned, and they hold the fsimage and edit logs. The edit logs capture all changes happening to HDFS (such as new files and directories); think of the redo logs most RDBMSs use. The fsimage is a full snapshot of the metadata state. The fsimage file will not grow beyond the allocated NN memory, and the edit logs get rotated once they hit a specific size. It is always safest to allocate significantly more capacity for the NN directory than needed, say 4 times what is configured for NN memory, but if disk capacity isn't an issue allocate 500 GB+ if you can spare it (more capacity is very common, especially when setting up a 3+3 or 4+4 RAID 10 mirrored set). Setting up RAID at the disk level like RAID 1 or RAID 1/0 makes sense, and having RAID means a single directory is just fine.
... View more
10-20-2015
02:07 AM
2 Kudos
Hey @Randy Gelhausen, I went through an exercise like this for a client and took notes on what I did so I can repeat it. Note this was with an earlier version of HDP and I haven't tried it on HDP 2.3 yet, but it may be worth a shot.

Step 1: Here is how I created the HBase table and placed data in it.

hbase shell
hbase(main):001:0> create 'short_urls', {NAME => 'u'}, {NAME => 's'}
hbase(main):062:0> put 'short_urls', 'bit.ly/aaaa', 's:hits', '100'
hbase(main):063:0> put 'short_urls', 'bit.ly/aaaa', 'u:url', 'hbase.apache.org'
hbase(main):062:0> put 'short_urls', 'bit.ly/abcd', 's:hits', '123'
hbase(main):063:0> put 'short_urls', 'bit.ly/abcd', 'u:url', 'example.com/foo'
hbase(main):064:0> scan 'short_urls'
ROW            COLUMN+CELL
 bit.ly/aaaa   column=s:hits, timestamp=1412121062283, value=100
 bit.ly/aaaa   column=u:url, timestamp=1412121071821, value=hbase.apache.org
1 row(s) in 0.0080 seconds

Step 2: This is how I launched Hive and created an external table pointing at HBase.

hive --auxpath /usr/lib/hive-hcatalog/share/hcatalog/storage-handlers/hbase/lib/hive-hcatalog-hbase-storage-handler-0.13.0.2.1.1.0-385.jar,/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.1.0-385.jar,/usr/lib/hive/lib/zookeeper-3.4.5.2.1.1.0-385.jar,/usr/lib/hive/lib/guava-11.0.2.jar

CREATE EXTERNAL TABLE short_urls(short_url string, url string, hit_count string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,u:url,s:hits")
TBLPROPERTIES ("hbase.table.name" = "short_urls");

Step 3: From Hive you can now query HBase.

hive> select * from short_urls;
OK
bit.ly/aaaa    hbase.apache.org    100
bit.ly/abcd    example.com/foo     123
Time taken: 0.445 seconds, Fetched: 2 row(s)
... View more
10-15-2015
02:08 PM
Looks like it is because the application has completed; while an application is running there is much more detail in the output of this call. I think I'm OK at this point, but please share your experiences.
... View more