Member since 12-10-2015
48 Posts
27 Kudos Received
2 Solutions
04-21-2016
02:24 PM
If you are using VirtualBox, you can access Ambari simply by opening a browser on your host and pointing it to localhost:8080. If that doesn't work, you have to set up port forwarding for the VM's NAT adapter from Machine -> Settings -> Network -> Advanced -> Port Forwarding.
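To tell a missing port-forwarding rule apart from a browser problem, you can test the TCP port directly from the host. This is only a minimal sketch: the function name `port_is_open` is made up for illustration, and the host and port are the ones mentioned above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open("localhost", 8080)` on the host should return True once the forwarding rule is in place and Ambari is running inside the VM.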
03-15-2016
08:33 AM
Hi Tamás, I see you are using openjdk 1.7. Try using openjdk 1.8 instead. Davide
03-10-2016
10:46 AM
1 Kudo
Hi Ryan, could you check if you have the right permissions on the local directory?
[hawqadmin@hdpmaster01 ~]$ ls -ld /data01/hawq/masterdd/
drwx------ 16 hawqadmin hadoop 4096 Mar 1 09:19 /data01/hawq/masterdd/
[hawqadmin@hdpmaster01 ~]$ ls -l /data01/hawq/masterdd/
total 40
drwx------ 5 hawqadmin hawqadmin 38 Feb 29 15:38 base
drwx------ 2 hawqadmin hawqadmin 4096 Mar 1 09:19 global
drwx------ 2 hawqadmin hawqadmin 6 Feb 29 15:38 pg_changetracking
drwx------ 2 hawqadmin hawqadmin 17 Feb 29 15:38 pg_clog
drwx------ 2 hawqadmin hawqadmin 6 Feb 29 15:38 pg_distributedlog
drwx------ 2 hawqadmin hawqadmin 6 Feb 29 15:38 pg_distributedxidmap
-rw-rw-r-- 1 hawqadmin hawqadmin 4021 Feb 29 15:38 pg_hba.conf
-rw------- 1 hawqadmin hawqadmin 1636 Feb 29 15:38 pg_ident.conf
drwx------ 2 hawqadmin hawqadmin 156 Mar 1 00:00 pg_log
drwx------ 4 hawqadmin hawqadmin 34 Feb 29 15:38 pg_multixact
drwx------ 2 hawqadmin hawqadmin 6 Mar 1 09:19 pg_stat_tmp
drwx------ 2 hawqadmin hawqadmin 17 Feb 29 15:38 pg_subtrans
drwx------ 2 hawqadmin hawqadmin 6 Feb 29 15:38 pg_tblspc
drwx------ 2 hawqadmin hawqadmin 6 Feb 29 15:38 pg_twophase
drwx------ 2 hawqadmin hawqadmin 6 Feb 29 15:38 pg_utilitymodedtmredo
-rw------- 1 hawqadmin hawqadmin 4 Feb 29 15:38 PG_VERSION
drwx------ 3 hawqadmin hawqadmin 58 Feb 29 15:38 pg_xlog
-rw------- 1 hawqadmin hawqadmin 18393 Feb 29 15:38 postgresql.conf
-rw------- 1 hawqadmin hawqadmin 104 Feb 29 15:40 postmaster.opts
[hawqadmin@hdpmaster01 ~]$
Also, what are the permissions on the directory on HDFS?
[hawqadmin@hdpmaster01 ~]$ hdfs dfs -ls -d /hawq_default
drwxr-xr-x - hawqadmin hdfs 0 2016-02-29 15:38 /hawq_default
[hawqadmin@hdpmaster01 ~]$ hdfs dfs -ls -R /hawq_default
drwx------ - hawqadmin hdfs 0 2016-02-29 15:47 /hawq_default/16385
drwx------ - hawqadmin hdfs 0 2016-03-01 08:54 /hawq_default/16385/16387
drwx------ - hawqadmin hdfs 0 2016-03-01 08:55 /hawq_default/16385/16387/16513
-rw------- 3 hawqadmin hdfs 48 2016-03-01 08:55 /hawq_default/16385/16387/16513/1
-rw------- 3 hawqadmin hdfs 4 2016-02-29 15:47 /hawq_default/16385/16387/PG_VERSION
[hawqadmin@hdpmaster01 ~]$
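The local check can also be scripted. The sketch below reports the owner and permission bits of a directory so you can compare them with the listing above (owner hawqadmin, mode 700 on /data01/hawq/masterdd). The helper name `describe_dir` is made up for illustration, and the demo runs on a temporary directory rather than on a real HAWQ data directory.

```python
import os
import pwd
import stat
import tempfile


def describe_dir(path):
    """Return (owner_name, permission_bits) for path."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the numeric id
        owner = str(st.st_uid)
    return owner, stat.S_IMODE(st.st_mode)


# Demo on a throwaway directory; on the master you would pass /data01/hawq/masterdd
with tempfile.TemporaryDirectory() as demo:
    os.chmod(demo, 0o700)
    owner, mode = describe_dir(demo)
    print(owner, oct(mode))
```

On the HAWQ master, a result other than hawqadmin and 0o700 would be the first thing to fix.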
03-10-2016
10:30 AM
1 Kudo
We used the guides you posted and now all works right! Thank you!
03-09-2016
11:56 AM
3 Kudos
Hi all, we are developing a Storm topology to write streaming data into a Hive database, but the following errors occur during execution:
1) Using Hive library version 1.2.1 (http://search.maven.org/#artifactdetails|org.apache.hive|hive|1.2.1|pom) and the configuration in the attached pom1.xml file, the error is:
43088 [Thread-12-hiveBolt] ERROR b.s.d.executor -
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NoSuchFieldError: METASTORE_FILTER_HOOK
at org.apache.storm.hive.common.HiveWriter.callWithTimeout(HiveWriter.java:357) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.common.HiveWriter.newConnection(HiveWriter.java:226) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:69) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.common.HiveUtils.makeHiveWriter(HiveUtils.java:45) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.bolt.HiveBolt.getOrCreateWriter(HiveBolt.java:219) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.bolt.HiveBolt.execute(HiveBolt.java:102) [StormTopology-0.1.jar:?]
at backtype.storm.daemon.executor$fn__5694$tuple_action_fn__5696.invoke(executor.clj:690) [StormTopology-0.1.jar:?]
at backtype.storm.daemon.executor$mk_task_receiver$fn__5615.invoke(executor.clj:436) [StormTopology-0.1.jar:?]
at backtype.storm.disruptor$clojure_handler$reify__5189.onEvent(disruptor.clj:58) [StormTopology-0.1.jar:?]
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:132) [StormTopology-0.1.jar:?]
at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:106) [StormTopology-0.1.jar:?]
at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [StormTopology-0.1.jar:?]
at backtype.storm.daemon.executor$fn__5694$fn__5707$fn__5758.invoke(executor.clj:819) [StormTopology-0.1.jar:?]
at backtype.storm.util$async_loop$fn__545.invoke(util.clj:479) [StormTopology-0.1.jar:?]
at clojure.lang.AFn.run(AFn.java:22) [StormTopology-0.1.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_71]
Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchFieldError: METASTORE_FILTER_HOOK
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_71]
at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_71]
at org.apache.storm.hive.common.HiveWriter.callWithTimeout(HiveWriter.java:337) ~[StormTopology-0.1.jar:?]
... 15 more
Caused by: java.lang.NoSuchFieldError: METASTORE_FILTER_HOOK
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.loadFilterHooks(HiveMetaStoreClient.java:240) ~[StormTopology-0.1.jar:?]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:192) ~[StormTopology-0.1.jar:?]
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:181) ~[StormTopology-0.1.jar:?]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.getMetaStoreClient(HiveEndPoint.java:448) ~[StormTopology-0.1.jar:?]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:274) ~[StormTopology-0.1.jar:?]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243) ~[StormTopology-0.1.jar:?]
at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180) ~[StormTopology-0.1.jar:?]
at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:229) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:226) ~[StormTopology-0.1.jar:?]
at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:332) ~[StormTopology-0.1.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_71]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_71]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_71]
... 1 more
2) Using Hive library version 2.0.0 (http://search.maven.org/#artifactdetails|org.apache.hive|hive|2.0.0|pom) and configuration as in the attached pom2.xml file, the error returned is:
32028 [Thread-12-hiveBolt] ERROR b.s.d.executor -
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.conf.HiveConf
at org.apache.storm.hive.common.HiveWriter.callWithTimeout(HiveWriter.java:357) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.common.HiveWriter.newConnection(HiveWriter.java:226) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:69) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.common.HiveUtils.makeHiveWriter(HiveUtils.java:45) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.bolt.HiveBolt.getOrCreateWriter(HiveBolt.java:219) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.bolt.HiveBolt.execute(HiveBolt.java:102) [storm-hive-0.10.0.jar:0.10.0]
at backtype.storm.daemon.executor$fn__5694$tuple_action_fn__5696.invoke(executor.clj:690) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.daemon.executor$mk_task_receiver$fn__5615.invoke(executor.clj:436) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.disruptor$clojure_handler$reify__5189.onEvent(disruptor.clj:58) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:132) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:106) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.daemon.executor$fn__5694$fn__5707$fn__5758.invoke(executor.clj:819) [storm-core-0.10.0.jar:0.10.0]
at backtype.storm.util$async_loop$fn__545.invoke(util.clj:479) [storm-core-0.10.0.jar:0.10.0]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_31]
Caused by: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.conf.HiveConf
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_31]
at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_31]
at org.apache.storm.hive.common.HiveWriter.callWithTimeout(HiveWriter.java:337) ~[storm-hive-0.10.0.jar:0.10.0]
... 15 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.conf.HiveConf
at org.apache.hive.hcatalog.streaming.HiveEndPoint.createHiveConf(HiveEndPoint.java:842) ~[hive-hcatalog-streaming-0.14.0.jar:0.14.0]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:268) ~[hive-hcatalog-streaming-0.14.0.jar:0.14.0]
at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243) ~[hive-hcatalog-streaming-0.14.0.jar:0.14.0]
at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180) ~[hive-hcatalog-streaming-0.14.0.jar:0.14.0]
at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157) ~[hive-hcatalog-streaming-0.14.0.jar:0.14.0]
at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:229) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:226) ~[storm-hive-0.10.0.jar:0.10.0]
at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:332) ~[storm-hive-0.10.0.jar:0.10.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_31]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_31]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_31]
... 1 more
32029 [Thread-14-__acker] INFO b.s.d.executor - BOLT ack TASK: 1 TIME: TUPLE: source: parserBolt:5, stream: __ack_ack, id: {}, [820336490148731685 6454746331808199053]
32029 [Thread-14-__acker] INFO b.s.d.executor - Execute done TUPLE source
We also included the external configuration files (hive-site.xml and hive-env.sh) in the project, as indicated in the Hortonworks guidelines. This is the Hive bolt code:
private void createHiveBolt(TopologyBuilder builder)
{
    try
    {
        // Record Writer configuration
        DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper()
                .withColumnFields(DataScheme.GetHiveFields());
        HiveOptions hiveOptions;
        hiveOptions = new HiveOptions(topologyConf.HiveMetastore, topologyConf.HiveDbName, topologyConf.HiveTableName, mapper)
                .withTxnsPerBatch(2)
                .withBatchSize(100)
                .withIdleTimeout(10);
        builder.setBolt(HIVE_BOLT_ID, new HiveBolt(hiveOptions), topologyConf.ParallelHint).shuffleGrouping(PARSER_BOLT_ID);
    }
    catch(Exception ex)
    {
        logger.error(ex.getMessage());
    }
}
How can we solve these issues? Thank you.
Labels: Apache Hive, Apache Storm
03-01-2016
11:57 AM
5 Kudos
In this article, we will install Apache HAWQ 2.0.0.0_beta on a cluster composed of 2 masters (1 active, 1 standby) and 3 segments (slaves).
On each node of the cluster, install the repository needed to fetch libhdfs3:
[root@hdpmaster01~]# curl -s -L "https://bintray.com/wangzw/rpm/rpm" -o /etc/yum.repos.d/bintray-wangzw-rpm.repo
Install the EPEL repository:
[root@hdpmaster01~]# yum -y install epel-release
Install the missing dependencies:
[root@hdpmaster01~]# yum -y install man passwd sudo tar which git mlocate links make bzip2 net-tools autoconf automake libtool m4 gcc gcc-c++ gdb bison flex cmake gperf maven indent libuuid-devel krb5-devel libgsasl-devel expat-devel libxml2-devel perl-ExtUtils-Embed pam-devel python-devel libcurl-devel snappy-devel thrift-devel libyaml-devel libevent-devel bzip2-devel openssl-devel openldap-devel protobuf-devel readline-devel net-snmp-devel apr-devel libesmtp-devel xerces-c-devel python-pip json-c-devel libhdfs3-devel apache-ivy java-1.7.0-openjdk-devel openssh-clients openssh-server
Install postgresql-devel, which is needed to compile the Python dependencies:
[root@hdpmaster01~]# yum install postgresql-devel
Now install the Python dependencies with pip:
pip install pg8000 simplejson unittest2 pycrypto pygresql pyyaml lockfile paramiko psi
You can now remove the postgresql-* packages; be careful not to erase any existing psql instances.
Download the source code from GitHub:
[root@hdpmaster01~]# cd /root
[root@hdpmaster01~]# git clone https://github.com/apache/incubator-hawq.git
Cloning into 'incubator-hawq'...
remote: Counting objects: 34883, done.
remote: Total 34883 (delta 0), reused 0 (delta 0), pack-reused 34883
Receiving objects: 100% (34883/34883), 144.95 MiB | 30.04 MiB/s, done.
Resolving deltas: 100% (21155/21155), done.
Before compiling HAWQ, you need to compile and install libyarn, the C/C++ interface to YARN that ships with the HAWQ source code:
[root@hdpmaster01~]# cd /root/incubator-hawq/depends/libyarn/ && mkdir build/ && cd build
[root@hdpmaster01 build]# pwd
/root/incubator-hawq/depends/libyarn/build
[root@hdpmaster01 build]#
[root@hdpmaster01 build]# ../bootstrap
[...]
bootstrap success.
Run "make" to build.
[root@hdpmaster01 build]# make -j && make install
[...]
-- Installing: /root/incubator-hawq/depends/libyarn/dist/include/libyarn/records/YARN_containermanagement_protocol.pb.h
-- Installing: /root/incubator-hawq/depends/libyarn/dist/include/libyarn/records/YARNSecurity.pb.h
-- Installing: /root/incubator-hawq/depends/libyarn/dist/include/libyarn/libyarncommon/Token.h
[root@hdpmaster01 build]#
Copy the include dir and the lib dir to the correct file system paths, and make the library visible to the operating system with ldconfig:
[root@hdpmaster01 build]# cp -R /root/incubator-hawq/depends/libyarn/dist/include/libyarn/ /usr/include/
[root@hdpmaster01 build]# cp /root/incubator-hawq/depends/libyarn/dist/lib/libyarn.so.0.1.13 /usr/lib64/
[root@hdpmaster01 build]#
[root@hdpmaster01 build]# ln -s /usr/lib64/libyarn.so.0.1.13 /usr/lib64/libyarn.so.1
[root@hdpmaster01 build]# ln -s /usr/lib64/libyarn.so.1 /usr/lib64/libyarn.so
[root@hdpmaster01 build]# ldconfig && ldconfig -p | grep libyarn
libyarn.so.1 (libc6,x86-64) => /lib64/libyarn.so.1
libyarn.so (libc6,x86-64) => /lib64/libyarn.so
[root@hdpmaster01 build]#
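The ldconfig step above can be double-checked from Python, which asks the system how it would resolve a library name. This is a sketch only: the helper names `find_shared_library` and `report` are made up for illustration, and the result depends on the platform's linker tooling.

```python
import ctypes.util


def find_shared_library(name: str):
    """Return the file name the system resolves for lib<name>, or None if not found."""
    return ctypes.util.find_library(name)


def report(name: str) -> str:
    """Build a one-line, human-readable status for lib<name>."""
    resolved = find_shared_library(name)
    if resolved is None:
        return f"lib{name}: not found (check the symlinks and rerun ldconfig)"
    return f"lib{name}: resolved as {resolved}"


# "yarn" corresponds to the libyarn.so installed in the previous step
print(report("yarn"))
```

If the library is reported as not found, the HAWQ configure step that follows will most likely fail to detect libyarn as well.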
Now we can compile and install Apache HAWQ. I use /opt/hawq as the installation directory:
[root@hdpmaster01 build]# cd /root/incubator-hawq
[root@hdpmaster01 incubator-hawq]# ./configure --prefix=/opt/hawq
[...]
[root@hdpmaster01 incubator-hawq]# make -j8 && make install
[...]
make[2]: Leaving directory `/root/incubator-hawq/tools/gpnetbench'
make[1]: Leaving directory `/root/incubator-hawq/tools'
HAWQ installation complete.
Create the user hawqadmin and change the ownership of the HAWQ installation directory:
[root@hdpmaster01 incubator-hawq]# useradd -s /bin/bash hawqadmin
[root@hdpmaster01 incubator-hawq]# passwd hawqadmin
Changing password for user hawqadmin.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@hdpmaster01 incubator-hawq]# chown -R hawqadmin.hawqadmin /opt/hawq/
Repeat the previous steps on all hosts in your cluster.
Now, on the primary master, create the SSH key for user hawqadmin and distribute the public key to the other hosts (do not set a passphrase on your private key). As the hawqadmin user:
[hawqadmin@hdpmaster01~]$ ssh-keygen
[...]
[hawqadmin@hdpmaster01~]$ for i in hdpmaster01 hdpmaster02 hdpslave01 hdpslave02 hdpslave03; do
> ssh-copy-id $i
> done
[...]
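Once the keys are distributed, it is worth confirming that every host accepts a login without a password prompt. The sketch below is one possible way to do that: the helper names are made up for illustration, the host list is the one used in the loop above, and `BatchMode=yes` is the standard OpenSSH option that makes ssh fail instead of asking for a password.

```python
import subprocess

# Hosts taken from the ssh-copy-id loop above
HOSTS = ["hdpmaster01", "hdpmaster02", "hdpslave01", "hdpslave02", "hdpslave03"]


def ssh_check_command(host: str) -> list:
    """Build an ssh command that runs `true` remotely and never prompts."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def passwordless_ok(host: str) -> bool:
    """Return True if the key-based login to host succeeds."""
    try:
        result = subprocess.run(ssh_check_command(host), capture_output=True, timeout=15)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Running `[h for h in HOSTS if not passwordless_ok(h)]` as hawqadmin should return an empty list on both the primary and the standby master.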
Repeat the previous loop on the standby master.
On the primary master host, edit /opt/hawq/etc/hdfs-client.xml and /opt/hawq/etc/yarn-client.xml to fit your needs (e.g. for NameNode and ResourceManager high availability, or for Kerberos authentication), then edit the following properties in hawq-site.xml:
<property>
<name>hawq_master_address_host</name>
<value>hdpmaster01</value>
<description>The host name of hawq master.</description>
</property>
<property>
<name>hawq_master_address_port</name>
<value>5432</value>
<description>The port of hawq master.</description>
</property>
<property>
<name>hawq_standby_address_host</name>
<value>hdpmaster02</value>
<description>The host name of hawq standby master.</description>
</property>
<property>
<name>hawq_segment_address_port</name>
<value>40000</value>
<description>The port of hawq segment.</description>
</property>
<property>
<name>hawq_dfs_url</name>
<value>hdfsha/hawq_default</value>
<description>URL for accessing HDFS.</description>
</property>
<property>
<name>hawq_master_directory</name>
<value>/data01/hawq/masterdd</value>
<description>The directory of hawq master.</description>
</property>
<property>
<name>hawq_segment_directory</name>
<value>/data01/hawq/segmentdd</value>
<description>The directory of hawq segment.</description>
</property>
<property>
<name>hawq_global_rm_type</name>
<value>yarn</value>
</property>
<property>
<name>hawq_rm_yarn_address</name>
<value>hdpmaster02:8032</value>
</property>
<property>
<name>hawq_rm_yarn_scheduler_address</name>
<value>hdpmaster02:8030</value>
</property>
<property>
<name>hawq_rm_yarn_queue_name</name>
<value>default</value>
<description>The YARN queue name to register hawq resource manager.</description>
</property>
<property>
<name>hawq_rm_yarn_app_name</name>
<value>hawq</value>
<description>The application name to register hawq resource manager in YARN.</description>
</property>
You can leave the other options unchanged. NOTE: if you have a PostgreSQL instance running on the master nodes, you must change the property hawq_master_address_port, since 5432 is also the PostgreSQL default port.
Write the FQDNs of the slaves in the /opt/hawq/etc/slaves file, e.g.:
[hawqadmin@hdpmaster01 etc]$ echo -e "hdpslave01\nhdpslave02\nhdpslave03" > slaves
[hawqadmin@hdpmaster01 etc]$ cat slaves
hdpslave01
hdpslave02
hdpslave03
Copy the configuration files to the /opt/hawq/etc/ directory on all other hosts.
Now, as the hdfs user, create the hawqadmin home directory and the data directory on HDFS:
[hdfs@hdpmaster01~]$ hdfs dfs -mkdir /user/hawqadmin && hdfs dfs -chown hawqadmin /user/hawqadmin
[hdfs@hdpmaster01~]$ hdfs dfs -mkdir /hawq_default && hdfs dfs -chown hawqadmin /hawq_default
On both masters, create the master data directory:
mkdir -p /data01/hawq/masterdd && chown -R hawqadmin /data01/hawq
Create the segment data directory on all slaves:
mkdir -p /data01/hawq/segmentdd && chown -R hawqadmin /data01/hawq
Initialize the cluster as the hawqadmin user. Remember to source the environment file (/opt/hawq/greenplum_path.sh) before executing any action:
[hawqadmin@hdpmaster01 hawq]$ cd /opt/hawq/
[hawqadmin@hdpmaster01 hawq]$ source greenplum_path.sh
[hawqadmin@hdpmaster01 hawq]$ hawq init cluster -av
[...]
20160229:15:42:40:158114 hawq_init:hdpmaster01:hawqadmin-[INFO]:-Init HAWQ cluster successfully
The init command also starts the cluster, so you can now check the cluster state with the following command:
hawq state cluster
You can also see the running application on YARN:
[hawqadmin@hdpmaster01 ~]$ yarn application -list | awk '/application_/ {printf ("%s\t%s\t%s\t%s\t%s\n", $1,$2,$3,$4,$5)}'
application_1456240841318_0026 hawq YARN hawqadmin default
Now connect to the database and create a sample table:
[hawqadmin@hdpmaster01 hawq]$ psql -d postgres
psql (8.2.15)
Type "help" for help.
postgres=# \d
No relations found.
postgres=# create table test (field1 int, field2 varchar(30));
CREATE TABLE
postgres=# \d+ test
Append-Only Table "public.test"
Column | Type | Modifiers | Storage | Description
--------+-----------------------+-----------+----------+-------------
field1 | integer | | plain |
field2 | character varying(30) | | extended |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: f
Has OIDs: no
Options: appendonly=true
Distributed randomly
postgres=# insert into test (field1, field2) values (1, 'May the hawq be with you');
INSERT 0 1
postgres=# select * from test;
field1 | field2
--------+---------------------------
1 | May the hawq be with you
(1 row)
postgres=#
That's all! 🙂
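As an optional follow-up to the hawq-site.xml step earlier in this article, a short script can list the properties you set and flag any that are missing before you run the init. This is a sketch under assumptions: it supposes the file uses a Hadoop-style `<configuration>` root element, which the excerpt above does not show; the helper names and the `REQUIRED` list are made up for illustration; and the embedded sample stands in for the real file.

```python
import xml.etree.ElementTree as ET

# Stand-in for /opt/hawq/etc/hawq-site.xml (root element is an assumption)
SAMPLE = """<configuration>
  <property><name>hawq_master_address_host</name><value>hdpmaster01</value></property>
  <property><name>hawq_master_address_port</name><value>5432</value></property>
</configuration>"""

# Illustrative subset of the properties edited in this article
REQUIRED = ["hawq_master_address_host", "hawq_master_address_port", "hawq_dfs_url"]


def read_properties(xml_text: str) -> dict:
    """Map each <property> name to its value (empty string if no value)."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing(props: dict, required=REQUIRED) -> list:
    """Return the required property names that are absent or empty."""
    return [key for key in required if not props.get(key)]


props = read_properties(SAMPLE)
print(missing(props))
```

To check the real file, read its contents with `open(...).read()` and pass them to `read_properties` instead of `SAMPLE`.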
02-23-2016
03:47 PM
1 Kudo
Hi all, I get a java.lang.IndexOutOfBoundsException while trying to execute a select distinct(...) on a big Hive table (about 60 GB). This is the log of the Tez vertex:
2016-02-23 16:35:03,039 [ERROR] [TezChild] |tez.TezProcessor|: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:326)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:141)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
... 16 more
Caused by: java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkBounds(Buffer.java:567)
at java.nio.ByteBuffer.get(ByteBuffer.java:686)
at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:285)
at org.apache.hadoop.hdfs.BlockReaderLocal.readWithBounceBuffer(BlockReaderLocal.java:609)
at org.apache.hadoop.hdfs.BlockReaderLocal.read(BlockReaderLocal.java:569)
at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:737)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:793)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:853)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:59)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:91)
at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:208)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:246)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:48)
at org.apache.hadoop.hive.ql.exec.Utilities.skipHeader(Utilities.java:3911)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:337)
... 22 more
I have already tried disabling vectorization and increasing the Tez container size, but nothing changed. If I execute the query on the same table with less data in it, everything works. Have you seen this kind of error before? Thank you, D.
Labels: Apache Hadoop, Apache Hive, Apache Tez
02-16-2016
10:28 AM
You're right, I had not thought about config groups! Sorry, and thanks a lot! 🙂
02-16-2016
08:53 AM
1 Kudo
Hi all, I have an Ambari-managed cluster with 2 ingestion servers on which Flume runs. Since I need to have different Flume agents, I have to define them in Ambari and make each one run on only one server. So I need to start a single agent on server ingestion1 and stop it on server ingestion2. As a result, Ambari sees the Flume service as stopped and sends me a notification about it. Is there a way to monitor Flume in this configuration, or can I tell Ambari not to define the single agent on both servers? Thank you, D.
Labels: Apache Ambari, Apache Flume
02-05-2016
10:44 AM
3 Kudos
After installing Zeppelin everything may seem fine, but when you try to connect you may see a "disconnected" status on the notebook tab and be unable to create any new note, while the interpreter tab works fine. If you have verified that you can reach the port on which Zeppelin is listening, the cause may be your content-filtering firewall, and you should see an entry like this in your firewall log:
2016-02-04 15:47:58 Deny 192.168.0.128 40.112.76.49 http/tcp 57772 9995 1-Ecube 0-Internet ProxyDeny: HTTP Invalid Request-Line Format (TCP-UDP-isoardi OUT-00) HTTP-Client.isoardi proc_id="http-proxy" rc="594" msg_id="1AFF-0005" proxy_act="HTTP-Client.isoardi" line="\x81\x8d\xf2\x9eW\xfe\x89\xbc8\x8e\xd0\xa4u\xae\xbb\xd0\x10\xdc\x8f\x81\x8d\xf8]\xb8z\x83\x7f\xd7\x0a" Traffic
The solution is to disable the content filter for the domain on which Zeppelin is running.
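The bytes in the `line` field of that log entry do not look like an HTTP request line at all. They appear to be a masked WebSocket frame as defined in RFC 6455, which would explain why an HTTP proxy rejects them. The sketch below decodes the first frame, transcribed byte for byte from the log. The function name `decode_masked_frame` is made up for illustration, and the code handles only short masked frames (payload under 126 bytes).

```python
def decode_masked_frame(data: bytes):
    """Decode one short client-to-server WebSocket frame (RFC 6455)."""
    fin = bool(data[0] & 0x80)      # final-fragment flag
    opcode = data[0] & 0x0F         # 0x1 means a text frame
    masked = bool(data[1] & 0x80)   # client frames must be masked
    length = data[1] & 0x7F         # 7-bit payload length
    if not masked or length >= 126:
        raise ValueError("this sketch only handles short masked frames")
    key = data[2:6]                 # 4-byte masking key
    payload = data[6:6 + length]
    # Unmask: XOR every payload byte with the key, cycling over its 4 bytes
    text = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return fin, opcode, text


# First frame from the firewall log entry above
logged = b"\x81\x8d\xf2\x9eW\xfe\x89\xbc8\x8e\xd0\xa4u\xae\xbb\xd0\x10\xdc\x8f"
print(decode_masked_frame(logged))  # (True, 1, b'{"op":"PING"}')
```

The decoded payload is a small JSON message from the notebook's WebSocket connection. That matches the symptom: the page itself loads over HTTP, but the notebook stays "disconnected" because its WebSocket traffic is dropped.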