Member since
09-18-2015
3274
Posts
1159
Kudos Received
426
Solutions
My Accepted Solutions
Views | Posted |
---|---|
2143 | 11-01-2016 05:43 PM |
6537 | 11-01-2016 05:36 PM |
4168 | 07-01-2016 03:20 PM |
7119 | 05-25-2016 11:36 AM |
3456 | 05-24-2016 05:27 PM |
05-22-2016
05:50 PM
@atul kumar You are looking for this: http://hortonworks.com/solutions/ You will use Big Data tool sets to innovate and renovate. Innovation is to "make changes in something established"; renovation is the "process of improving an outdated structure."
05-22-2016
11:09 AM
1 Kudo
@atul kumar See this from my env. Please make sure that pig installation was done correctly. See this https://github.com/apache/pig/blob/27b153dbd688d8328e00d2d4bead84f3c879b2ae/RELEASE_NOTES.txt#L21
[root@ns02 ~]# pig -x local
WARNING: Use "yarn jar" to launch YARN applications.
16/05/22 04:07:29 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/05/22 04:07:29 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2016-05-22 04:07:29,702 [main] INFO org.apache.pig.Main - Apache Pig version 0.15.0.2.4.2.0-258 (rexported) compiled Apr 25 2016, 07:16:15
2016-05-22 04:07:29,703 [main] INFO org.apache.pig.Main - Logging error messages to: /root/pig_1463915249700.log
2016-05-22 04:07:29,728 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2016-05-22 04:07:29,948 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2016-05-22 04:07:30,220 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-4b347df8-5816-4fc7-84a8-3d207cae44b3
2016-05-22 04:07:30,716 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: ://ns02.cloud.hortonworks:8188/ws/v1/timeline/
2016-05-22 04:07:31,466 [main] INFO org.apache.pig.backend.hadoop.ATSService - Created ATS Hook
grunt>
05-22-2016
11:02 AM
1 Kudo
@Doron Veeder I would have done the same thing to fix the issue. This is a minor alteration. I won't change column names or anything drastic.
05-20-2016
03:55 AM
2 Kudos
Hive: Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis.
HBase: Apache HBase™ is the Hadoop database, a distributed, scalable big data store.
HAWQ: http://hawq.incubator.apache.org/
PXF: PXF is an extensible framework that allows HAWQ to query external system data.
Let's learn Query federation
This topic describes how to access Hive data using PXF. Link
Previously, in order to query Hive tables using HAWQ and PXF, you needed to create an external table in PXF that described the target table's Hive metadata. Since HAWQ is now integrated with HCatalog, HAWQ can use metadata stored in HCatalog instead of external tables created for PXF. HCatalog is built on top of the Hive metastore and incorporates Hive's DDL. This provides several advantages:
- You do not need to know the table schema of your Hive tables.
- You do not need to manually enter information about Hive table location or format.
- If Hive table metadata changes, HCatalog provides updated metadata. This is in contrast to the use of static external PXF tables to define Hive table metadata for HAWQ.
HAWQ retrieves table metadata from HCatalog using PXF. HAWQ creates in-memory catalog tables from the retrieved metadata. If a table is referenced multiple times in a transaction, HAWQ uses its in-memory metadata to reduce external calls to HCatalog. PXF queries Hive using table metadata that is stored in the HAWQ in-memory catalog tables. Table metadata is dropped at the end of the transaction.
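The caching behaviour described above can be sketched as a toy model. This is not HAWQ code: `TransactionCatalog`, `fake_hcatalog`, and the `default.sales` table are hypothetical names used only to illustrate "fetch once per transaction, reuse in memory, drop at the end".

```python
class TransactionCatalog:
    """Toy model of per-transaction metadata caching (illustrative, not HAWQ internals)."""

    def __init__(self, fetch_metadata):
        self._fetch = fetch_metadata  # stand-in for the PXF call to HCatalog
        self._cache = {}              # in-memory catalog for the current transaction
        self.external_calls = 0       # how many times the external source was hit

    def metadata(self, table):
        # Only the first reference to a table in a transaction goes to HCatalog.
        if table not in self._cache:
            self.external_calls += 1
            self._cache[table] = self._fetch(table)
        return self._cache[table]

    def end_transaction(self):
        # Metadata is dropped when the transaction ends.
        self._cache.clear()


def fake_hcatalog(table):
    # Hypothetical schema; a real lookup would go through PXF.
    return {"table": table, "columns": ["id", "amount"]}


catalog = TransactionCatalog(fake_hcatalog)
catalog.metadata("default.sales")
catalog.metadata("default.sales")   # served from the in-memory catalog
calls_in_first_txn = catalog.external_calls
catalog.end_transaction()
catalog.metadata("default.sales")   # new transaction, fetched again
print(calls_in_first_txn, catalog.external_calls)
```

The point of the sketch is only the call count: repeated references inside one transaction cost a single external lookup, while a new transaction always starts with fresh metadata.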
Demo
Tools used
Hive, HAWQ, Zeppelin
HBase tables
Follow this to create the HBase tables:
perl create_hbase_tables.pl
Create a table in HAWQ to access the HBase table.
Note: The port is 51200, not 50070.
Links: Gist, PXF docs
Must see this: Zeppelin interpreter settings
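Since the original statement did not survive in this post, here is a minimal sketch of what a HAWQ external table over HBase might look like, generated from Python so the port is explicit. The statement shape (`pxf://` location, `PROFILE=HBase`, `pxfwritable_import` formatter) follows the PXF documentation as I recall it, so verify it against the PXF docs for your version. The helper `hbase_external_table_ddl`, the host `namenode`, and the table and column names are all hypothetical.

```python
def hbase_external_table_ddl(hawq_table, hbase_table, columns, pxf_host, pxf_port=51200):
    """Build a HAWQ external-table statement that reads an HBase table through PXF.

    pxf_port defaults to 51200 (the PXF service), not 50070 (the NameNode web UI).
    """
    # Quote every column name so HBase-style "family:qualifier" names are valid.
    column_sql = ",\n    ".join(
        '"{}" {}'.format(name, sql_type) for name, sql_type in columns
    )
    location = "pxf://{}:{}/{}?PROFILE=HBase".format(pxf_host, pxf_port, hbase_table)
    return (
        "CREATE EXTERNAL TABLE {} (\n    {}\n)\n"
        "LOCATION ('{}')\n"
        "FORMAT 'CUSTOM' (formatter='pxfwritable_import');"
    ).format(hawq_table, column_sql, location)


# Hypothetical table and column names, for illustration only.
ddl = hbase_external_table_ddl(
    hawq_table="hbase_sales",
    hbase_table="sales",
    columns=[("recordkey", "bytea"), ("cf1:amount", "int")],
    pxf_host="namenode",
)
print(ddl)
```

Paste the printed statement into a Zeppelin paragraph bound to the HAWQ interpreter, or into psql, after adjusting the names to your own tables.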
05-20-2016
03:52 AM
@Ali Bajwa Just created this https://www.linkedin.com/pulse/hawqhdb-hadoop-hive-hbase-neeraj-sabharwal
05-18-2016
12:34 PM
@Raghu Ramamoorthi See this link and this. Look at this rpm repo: http://rpmfind.net/linux/rpm2html/search.php?query=libc.so.6
You did not mention the OS.
Error:
libc = ctypes.CDLL('/lib/x86_64-linux-gnu/libc.so.6')
File "/usr/lib64/python2.7/ctypes/__init__.py", line 360, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /lib/x86_64-linux-gnu/libc.so.6: cannot open shared object file: No such file or directory
This is from my env:
[root@amtest01 ~]# rpm -qa | grep libc
libcom_err-1.41.12-22.el6.x86_64
libcgroup-0.40.rc1-5.el6_5.1.x86_64
libcap-2.16-5.5.el6.x86_64
glibc-common-2.12-1.166.el6_7.7.x86_64
glibc-devel-2.12-1.166.el6_7.7.x86_64
libcom_err-1.41.12-22.el6.i686
glibc-2.12-1.166.el6_7.7.i686
libcap-ng-0.6.4-3.el6_0.1.x86_64
glibc-2.12-1.166.el6_7.7.x86_64
glibc-headers-2.12-1.166.el6_7.7.x86_64
libcurl-7.19.7-37.el6_5.3.x86_64
[root@amtest01 ~]# find / -name libc.so.6
/lib/libc.so.6
/lib/i686/nosegneg/libc.so.6
/lib64/libc.so.6
/root/initrd/lib64/libc.so.6
[root@amtest01 ~]# uname -a
Linux amtest01 2.6.32-431.20.3.el6.x86_64 #1 SMP Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@amtest01 ~]# cat /etc/redhat-release
CentOS release 6.5 (Final)
[root@amtest01 ~]#
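The path in the traceback, `/lib/x86_64-linux-gnu/libc.so.6`, is the Debian/Ubuntu multiarch location, while the CentOS output above shows the 64-bit library at `/lib64/libc.so.6`. If you control the Python code that makes the call, one possible workaround is to stop hardcoding the path. The sketch below is an assumption about how the failing code could be rewritten, not a patch for any particular tool; `load_libc` and `CANDIDATE_PATHS` are hypothetical names.

```python
import ctypes
import ctypes.util
import os

# The multiarch path from the traceback, plus paths reported by
# `find / -name libc.so.6` on the CentOS 6.5 host.
CANDIDATE_PATHS = [
    "/lib/x86_64-linux-gnu/libc.so.6",
    "/lib64/libc.so.6",
    "/lib/libc.so.6",
]


def load_libc():
    """Return a ctypes handle to the C library without assuming one distro's layout."""
    # Ask the platform first; this returns a name such as "libc.so.6", or None.
    name = ctypes.util.find_library("c")
    if name:
        try:
            return ctypes.CDLL(name)
        except OSError:
            pass
    # Fall back to known locations, skipping ones that are missing or unloadable
    # (for example a 32-bit library under a 64-bit interpreter).
    for path in CANDIDATE_PATHS:
        if os.path.exists(path):
            try:
                return ctypes.CDLL(path)
            except OSError:
                continue
    # Last resort on POSIX systems: symbols already loaded into this process.
    return ctypes.CDLL(None)


libc = load_libc()
# Sanity check: libc's getpid() should agree with Python's own view of the PID.
print(libc.getpid() == os.getpid())
```

If the code belongs to a third-party package that you cannot edit, check whether that package supports your OS before working around the path.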
05-18-2016
12:24 PM
@Salvatore Bel See this: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/_operating_systems_requirements.html
Ubuntu 16.0 is not supported.
@Utkarsh Sopan See this: https://github.com/apache/ambari/blob/e4418ee382aadc20d50a830ff19331d0da54739b/ambari-funtest/src/test/resources/os_family.json
05-18-2016
12:18 PM
@Harini Yadav
If the service is not managed by Ambari, then it's not possible. Please see this doc to go through the Kerberos setup. I am sure that you have seen this: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Security_Guide/content/ch_configuring_amb_hdp_for_kerberos.html
05-18-2016
12:09 PM
@Fazil Aijaz You will have access to the docs, but no internet access.
05-18-2016
12:07 PM
Chronos is a replacement for cron: a fault-tolerant job scheduler for Mesos that handles dependencies and ISO8601-based schedules.
Marathon is a framework for Mesos that is designed to launch long-running applications and, in Mesosphere, serves as a replacement for a traditional init system.
In Mesosphere, Chronos complements Marathon, as it provides another way to run applications: according to a schedule or other conditions, such as the completion of another job. It is also capable of scheduling jobs on multiple Mesos slave nodes, and it provides statistics about job failures and successes. Source
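To make "ISO8601 based schedules" concrete: a Chronos schedule is a repeating interval of the form `R[n]/<start>/<period>`, for example `R/2014-03-08T20:00:00Z/PT2H`. The sketch below is not Chronos code. It is a minimal parser that shows how such a string breaks down, and it assumes a UTC start time with a trailing `Z`, a period made only of days, hours, minutes, and seconds, and that `Rn` means n runs in total. `parse_schedule` and `first_runs` are hypothetical helper names.

```python
import re
from datetime import datetime, timedelta, timezone

# Supports a subset of ISO 8601 durations: PnDTnHnMnS (no weeks, months, or years).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_schedule(schedule):
    """Split 'R[n]/<start>/<period>' into (repeat count or None, start, period)."""
    repeat, start, period = schedule.split("/")
    if not repeat.startswith("R"):
        raise ValueError("schedule must start with R or Rn")
    count = int(repeat[1:]) if len(repeat) > 1 else None  # bare "R" repeats forever
    start_dt = datetime.strptime(start, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    match = _DURATION.match(period)
    parts = {k: int(v) for k, v in match.groupdict().items() if v} if match else {}
    if not parts:
        raise ValueError("unsupported period: " + period)
    return count, start_dt, timedelta(**parts)


def first_runs(schedule, limit):
    """Return up to `limit` scheduled run times, honouring the repeat count."""
    count, start, period = parse_schedule(schedule)
    n = limit if count is None else min(limit, count)
    return [start + i * period for i in range(n)]


for run in first_runs("R/2014-03-08T20:00:00Z/PT2H", 3):
    print(run.isoformat())
```

Jobs that should run after another job completes use Chronos dependencies instead of a schedule string; see the Chronos docs linked below for the job definition format.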
Install https://mesos.github.io/chronos/docs/ and gist