Member since: 11-12-2015
Posts: 90
Kudos Received: 1
Solutions: 8
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4168 | 06-09-2017 01:52 PM
 | 10106 | 02-24-2017 02:32 PM
 | 7677 | 11-30-2016 02:48 PM
 | 3013 | 03-02-2016 11:14 AM
 | 1992 | 12-16-2015 07:11 AM
05-14-2019
10:16 AM
Hello,
I'm installing a new Cloudera 6.2 cluster. I used to use the Sentry policy file for creating roles, groups, and users, and now I'm trying to migrate that to the Sentry service configuration. But I'm stuck on this issue, and I think I missed a step.
This is what I did:
1. Enable the Sentry service in Hive and Impala.
2. Enable Sentry Synchronization in HDFS.
3. Create an admin user (in my case I used the impala user).
4. Create a test group (group_testdb_admin) in the "Manage Users" section in Hue.
5. Create a test role (testdb_admin_role) in the Security section (server=server1, db=testdb, action=ALL).
6. Assign the role to the group.
7. Create a testuser1 and assign the group I just created to the user.
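For reference, steps 5 and 6 correspond roughly to the following Sentry SQL, issued from an admin session (a sketch using the names above; impala-shell is assumed as the client):
impala-shell -q "CREATE ROLE testdb_admin_role"
impala-shell -q "GRANT ALL ON DATABASE testdb TO ROLE testdb_admin_role"    # action=ALL on testdb
impala-shell -q "GRANT ROLE testdb_admin_role TO GROUP group_testdb_admin"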
I can confirm that Sentry is synchronized with HDFS:
sudo -u hdfs hdfs dfs -getfacl /user/hive/warehouse/testdb.db
group:group_testdb_admin:rwx
Also, the roles and groups were created:
SHOW ROLE GRANT GROUP group_testdb_admin;
testdb_admin_role
But here is my problem: when I log in as testuser1 and try to access the testdb database, I get an AuthorizationException:
show tables in testdb;
AuthorizationException: User 'usertest1' does not have privileges to access: testdb.*.*
Considerations:
- I'm not using a Kerberized Cluster.
- I didn't create the user in the local FS.
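On that second point: without Kerberos or LDAP, Hadoop and Sentry typically resolve a user's groups through the default shell-based mapping, so the user and group generally have to exist on the local FS of the hosts running the Impala daemons and HiveServer2. A minimal sketch, assuming that default mapping:
sudo groupadd group_testdb_admin            # same group name granted the role above
sudo useradd -G group_testdb_admin testuser1
id testuser1                                # shows the membership Sentry will resolve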
So, what step am I missing?
Regards,
Silva
05-08-2019
10:21 AM
I'm having the same issue. This is my agent configuration:
[General]
server_host=cloudera-1
server_port=7182
max_collection_wait_seconds=10.0
metrics_url_timeout_seconds=30.0
task_metrics_timeout_seconds=5.0
monitored_nodev_filesystem_types=nfs,nfs4,tmpfs
local_filesystem_whitelist=ext2,ext3,ext4,xfs
impala_profile_bundle_max_bytes=1073741824
stacks_log_bundle_max_bytes=1073741824
stacks_log_max_uncompressed_file_size_bytes=5242880
orphan_process_dir_staleness_threshold=5184000
orphan_process_dir_refresh_interval=3600
scm_debug=INFO
dns_resolution_collection_interval_seconds=60
dns_resolution_collection_timeout_seconds=30
This is my Cloudera Manager Server config (link to the image): https://imgur.com/SIciDFr
Regards,
Silva
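EDIT: as a quick sanity check of the agent-to-server path (a sketch; 7182 is the heartbeat port from the config above):
ping -c 3 cloudera-1
nc -vz cloudera-1 7182    # should report the port as open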
04-12-2019
08:31 AM
Bumping this up. I'm adding the Flume conf:
tier1.sources = source1
tier1.channels = channel1
tier1.sinks = sink1
# For each source, channel, and sink, set
# standard properties.
#Source
tier1.sources.source1.channels=channel1
tier1.sources.source1.type=exec
tier1.sources.source1.command=/root/flume-source.sh
#Channel
tier1.channels.channel1.type = memory
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 1000
#Sink
tier1.sinks.sink1.type = org.apache.flume.sink.kudu.KuduSink
tier1.sinks.sink1.masterAddresses = localhost
tier1.sinks.sink1.tableName = stats
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.batchSize = 50
tier1.sinks.sink1.producer = org.apache.kudu.flume.sink.SimpleKuduEventProducer
And the code of the SimpleKuduEventProducer: https://gist.github.com/JoaquinSV/de9432e8ac0478934d3affdacd463762
01-30-2019
03:03 PM
Hello,
I'm trying to use the Kudu Sink, but I'm getting this error when I start Flume:
Unhandled error
java.lang.NoSuchMethodError: org.apache.flume.Context.getSubProperties(Ljava/lang/String;)Lorg/apache/kudu/shaded/com/google/common/collect/ImmutableMap;
at org.apache.kudu.flume.sink.KuduSink.configure(KuduSink.java:226)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
These are my software versions:
Flume NG: 1.6.0+cdh5.13.0+169
Kudu Sink: kudu-flume-sink-1.8.1-SNAPSHOT.jar
Kudu: 1.5.0+cdh5.13.0+0
Also, I tried adding this:
<relocation>
<pattern>com.google.common</pattern>
<shadedPattern>org.apache.kudu.shaded.com.google.common</shadedPattern>
<excludes>
<exclude>com.google.common.collect.ImmutableMap*</exclude>
<exclude>com.google.common.collect.ImmutableEnumMap*</exclude>
</excludes>
</relocation>
to the pom.xml (KUDU-2241), but it didn't work and I got the same error.
Regards,
Silva
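EDIT: for reference, this is roughly how the patched sink jar would be rebuilt and redeployed after editing the relocation (a sketch; the module path matches the Kudu source tree of that era, and the plugins.d location is illustrative):
cd kudu/java
mvn -pl kudu-flume-sink -am clean package -DskipTests
cp kudu-flume-sink/target/kudu-flume-sink-1.8.1-SNAPSHOT.jar /var/lib/flume-ng/plugins.d/kudu-sink/lib/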
Labels: Apache Flume, Apache Kudu
01-10-2019
08:01 AM
But I need to know which specific queries spill to disk, generating the scratch files. Is it possible to get that kind of information?
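One low-tech check (a sketch; /tmp/impala-scratch is Impala's default scratch location, but confirm it against the --scratch_dirs impalad flag):
watch -n 5 'du -sh /tmp/impala-scratch'    # growth here while queries run points at the spillers
Correlating the growth window with the CM Impala queries page would then narrow it to specific queries.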
01-09-2019
11:59 AM
Hello,
A simple question: how can I know which queries generate scratch files? I'm inspecting the impalad logs and I couldn't find any information about scratch file generation.
Regards,
Silva
Labels: Apache Impala
07-09-2018
01:33 PM
I'm using this one:
/api/v18/clusters/cluster/services/impala/impalaQueries?from=2018-05-31T0%3A0%3A0&filter=(user=userX)
Also, I solved this issue by restarting the Monitoring Service.
Regards,
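EDIT: for anyone reproducing it, the same call via curl looks like this (a sketch; host and credentials are placeholders):
curl -u admin:admin 'http://cm-host:7180/api/v18/clusters/cluster/services/impala/impalaQueries?from=2018-05-31T0%3A0%3A0&filter=(user=userX)'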
05-31-2018
03:14 PM
Hello,
I was using the CM API and I think I reached the maximum number of requests. What is the maximum number of requests, and how can I increase this value?
/api/v7/clusters/cluster/services/impala/impalaQueries?from=2018-05-31T0%3A0%3A0
{
"queries" : [ ],
"warnings" : [ "Impala query scan limit reached. Last end time considered is 2018-05-31T16:21:46.409Z" ]
}
Regards,
Joaquin
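EDIT: one workaround is narrowing the from/to window and paging through the results (a sketch; limit and offset are standard impalaQueries parameters, and the host and credentials are placeholders):
curl -u admin:admin 'http://cm-host:7180/api/v7/clusters/cluster/services/impala/impalaQueries?from=2018-05-31T00%3A00%3A00&to=2018-05-31T06%3A00%3A00&limit=200&offset=0'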
Labels: Apache Impala, Cloudera Manager
10-15-2017
06:24 AM
Hello,
In one of our Cloudera nodes the OS disk failed. We replaced that disk and re-installed the agent, setting:
server_host=scm-server-host
But the agent is not recognized as the same node as before. Instead, a new node is created with no role information. How can I merge the two hosts' information?
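One detail that may explain it (a sketch; OLD-HOST-UUID is a placeholder for the identifier recoverable from the CM hosts API): the agent identifies itself to CM by the UUID stored in /var/lib/cloudera-scm-agent/uuid, which was lost with the failed disk. Restoring the old value before starting the agent should let CM match the reinstalled host to the existing record.
sudo systemctl stop cloudera-scm-agent     # service commands vary by OS/CM version
echo -n 'OLD-HOST-UUID' | sudo tee /var/lib/cloudera-scm-agent/uuid
sudo systemctl start cloudera-scm-agent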
Labels: Cloudera Manager
10-06-2017
11:55 AM
Hello,
I have this issue: the duration of some queries is very high. I'm seeing it in the CM -> Impala -> Queries table. For example, use <database>; took 45 min in some cases. I don't know if this is a problem with Impala, CM, or Kudu. I'm using:
Impala: 2.7.0-cdh5-IMPALA_KUDU-cdh5
CDH: 5.8.0-1.cdh5.8.0.p0.42
Kudu: 1.2.0-1.cdh5.10.0.p0.55
Regards,
Labels: Apache Impala, Apache Kudu, Cloudera Manager
08-10-2017
08:20 AM
If you want to use spark2-shell and spark2-submit, you don't have to set those ENV variables. I set them because I wanted to point the existing spark-shell/spark-submit to Spark 2. This should be done on every node where you want to use the shell and/or spark-submit. I forgot to add the changes I made for spark-submit. In these files:
/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/bin/spark-submit
/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/bin/spark-submit
Add this ENV var:
SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2
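A quick way to confirm the wrappers now resolve to Spark 2 (a sketch):
spark-submit --version    # should report 2.1.0 after the change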
06-09-2017
01:52 PM
1 Kudo
@saranvisa Thanks, that worked. Also, in order to point to the new Spark, I had to change some symbolic links and ENV variables:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
ln -sf /opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2/bin/spark-shell /etc/alternatives/spark-shell
export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904/lib/spark2
06-09-2017
09:06 AM
I forgot to mention that Spark 1.6 came with CDH 5.8. I don't know how CDH installed Spark.
06-09-2017
08:33 AM
Hello,
I want to remove Spark 1.6 in order to install Spark 2.1, but when I try to remove it with this command:
sudo yum remove spark-core spark-master spark-worker spark-history-server spark-python
(source: https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_ig_cdh_comp_uninstall.html)
the packages are not found. What should I do in order to remove Spark 1.6 from my cluster? Also, in a previous step I deleted it from my services. I'm using CDH 5.8.0.
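One thing worth checking first (a sketch): if CDH 5.8 was installed through Cloudera Manager as a parcel rather than as packages, Spark is not a separate yum package at all, which would explain why the packages are not found; in that case deleting the Spark service in CM is the whole uninstall.
ls /opt/cloudera/parcels/    # a CDH-5.8.0... entry here indicates a parcel install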
Labels: Apache Spark, Cloudera Manager
04-19-2017
09:20 AM
Hello,
I'm getting this error very often from the Cloudera Monitor:
[19/Apr/2017 12:56:07 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 902, in _get_completed_query_profiles
self._query_monitor.get_completed_queries(query_log_file)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 582, in get_completed_queries
completed_query_report_limit)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 457, in get_completed_queries
last_accessed_file_timestamp)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 257, in _get_completed_queries
file_filter=filters)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/event_streamer.py", line 114, in __init__
self.__filtered_file_list = self.__apply_file_filter()
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsLogStreaming-UNKNOWN-py2.7.egg/clusterstats/log/streaming/event_streamer.py", line 186, in __apply_file_filter
self.__file_filter(filter_context)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/ClusterStatsCommon-0.1-py2.7.egg/clusterstats/common/chain.py", line 22, in __call__
succ = command(context)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 96, in __call__
self.__set_end_offset(f, evt2.get_datetime())
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 116, in __set_end_offset
f.set_end_offset(event.get_offset())
AttributeError: 'NoneType' object has no attribute 'get_offset'
This is the frequency at which I'm getting this error:
[16/Apr/2017 16:52:04 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[16/Apr/2017 20:44:04 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[18/Apr/2017 08:42:05 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[18/Apr/2017 15:08:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[18/Apr/2017 22:22:05 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 00:10:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 00:46:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 01:02:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 02:02:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 02:34:07 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 03:22:07 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 03:42:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 04:02:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 04:16:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 05:38:07 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 06:32:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 07:22:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 07:48:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 08:04:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 10:14:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 10:20:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 11:24:07 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 12:56:07 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
[19/Apr/2017 13:06:06 +0000] 118840 ImpalaDaemonQueryMonitoring query_monitor ERROR Error fetching completed queries from '/var/log/impalad/profiles/impala_profile_log'
I'm using CDH 5.8.0-1.cdh5.8.0.p0.42 and IMPALA_KUDU 2.7.0-1.cdh5.9.0.p0.11.
Regards,
Joaquín Silva
Labels: Apache Impala, Cloudera Manager
02-28-2017
04:26 AM
Thanks Todd, I will try that.
Regards,
Joaquín Silva
02-27-2017
07:16 AM
Hello,
I'm getting this warning at the startup of impala-shell:
/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/impala-shell/ext-py/sasl-0.1.1-py2.7-linux-x86_64.egg/_saslwrapper.py:3: UserWarning: Module backports was already imported from None, but /usr/lib/python2.7/site-packages is being added to sys.path
I tried with the Impala administrator user and a client user, and I got the same warning. What is this warning? I'm using Impala_kudu 2.7.0-1.cdh5.9.0.p0.11.
Regards,
Joaquín Silva
Labels: Apache Impala, Apache Kudu
02-24-2017
01:42 PM
This is the result:
/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/parquet/bin/parquet-tools cat /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0
File /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0 does not exist
But the file exists:
sudo -u hdfs hdfs dfs -ls /user/spot/flow/hive/y=2017/m=02/d=24/h=20
Found 12 items
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:05 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:10 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_1
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:55 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_10
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 18:00 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_11
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:15 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_2
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:20 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_3
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:25 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_4
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:30 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_5
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:35 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_6
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:40 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_7
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:45 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_8
-rwxr-xr-x 3 spot supergroup 440 2017-02-24 17:50 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_9
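EDIT: a possible explanation (a sketch): parquet-tools resolves paths through the Hadoop FileSystem API, so without the cluster configuration on its classpath it can fall back to the local filesystem. Fully qualifying the URI with the namenode from the table's Location should sidestep that:
/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/parquet/bin/parquet-tools cat hdfs://HDFS-namenode:8020/user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0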
02-24-2017
01:08 PM
This is the result:
show partitions spotdb.flow;
OK
y=2017/m=02/d=23/h=20
y=2017/m=02/d=23/h=21
y=2017/m=02/d=23/h=22
y=2017/m=02/d=23/h=23
y=2017/m=02/d=24/h=00
y=2017/m=02/d=24/h=01
y=2017/m=02/d=24/h=02
y=2017/m=02/d=24/h=03
y=2017/m=02/d=24/h=04
y=2017/m=02/d=24/h=05
y=2017/m=02/d=24/h=06
y=2017/m=02/d=24/h=07
y=2017/m=02/d=24/h=08
y=2017/m=02/d=24/h=09
y=2017/m=02/d=24/h=10
y=2017/m=02/d=24/h=11
y=2017/m=02/d=24/h=12
y=2017/m=02/d=24/h=13
y=2017/m=02/d=24/h=14
y=2017/m=02/d=24/h=15
y=2017/m=02/d=24/h=16
y=2017/m=02/d=24/h=17
y=2017/m=02/d=24/h=18
y=2017/m=02/d=24/h=19
y=2017/m=02/d=24/h=20
As I can see, it recognizes the partitions.
02-24-2017
12:53 PM
Hello,
I have a table that is pointing to a location in HDFS, like this:
# col_name data_type comment
treceived string
unix_tstamp bigint
tryear int
trmonth int
trday int
trhour int
trminute int
trsec int
tdur float
sip string
dip string
sport int
dport int
proto string
flag string
fwd int
stos int
ipkt bigint
ibyt bigint
opkt bigint
obyt bigint
input int
output int
sas int
das int
dtos int
dir int
rip string
# Partition Information
# col_name data_type comment
y int
m int
d int
h int
# Detailed Table Information
Database: spotdb
Owner: spot
CreateTime: Thu Feb 23 16:41:20 CLST 2017
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://HDFS-namenode:8020/user/spot/flow/hive
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
avro.schema.literal {\n \"type\": \"record\"\n , \"name\": \"FlowRecord\"\n , \"namespace\" : \"com.cloudera.accelerators.flows.avro\"\n , \"fields\": [\n {\"name\": \"treceived\", \"type\":[\"string\", \"null\"]}\n , {\"name\": \"unix_tstamp\", \"type\":[\"long\", \"null\"]}\n , {\"name\": \"tryear\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"trmonth\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"trday\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"trhour\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"trminute\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"trsec\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"tdur\", \"type\":[\"float\", \"null\"]}\n , {\"name\": \"sip\", \"type\":[\"string\", \"null\"]}\n , {\"name\": \"sport\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"dip\", \"type\":[\"string\", \"null\"]}\n , {\"name\": \"dport\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"proto\", \"type\":[\"string\", \"null\"]}\n , {\"name\": \"flag\", \"type\":[\"string\", \"null\"]}\n , {\"name\": \"fwd\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"stos\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"ipkt\", \"type\":[\"bigint\", \"null\"]}\n , {\"name\": \"ibytt\", \"type\":[\"bigint\", \"null\"]}\n , {\"name\": \"opkt\", \"type\":[\"bigint\", \"null\"]}\n , {\"name\": \"obyt\", \"type\":[\"bigint\", \"null\"]}\n , {\"name\": \"input\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"output\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"sas\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"das\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"dtos\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"dir\", \"type\":[\"int\", \"null\"]}\n , {\"name\": \"rip\", \"type\":[\"string\", \"null\"]}\n ]\n}
transient_lastDdlTime 1487878880
# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim ,
serialization.format ,
But when I select that table, it says it's empty. I checked that HDFS location and it has Parquet files:
sudo -u hdfs hdfs dfs -ls /user/spot/flow/hive/y=2017/m=02/d=23/h=23
Found 12 items
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:05 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:10 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_1
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:55 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_10
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 21:00 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_11
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:15 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_2
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:20 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_3
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:25 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_4
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:30 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_5
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:35 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_6
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:40 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_7
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:45 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_8
-rwxr-xr-x 3 spot supergroup 440 2017-02-23 20:50 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_9
What did I do wrong?
Regards,
Joaquín Silva
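EDIT: two quick checks for an external partitioned table that reads as empty (a sketch; both are standard Hive statements):
hive -e 'MSCK REPAIR TABLE spotdb.flow;'    # registers partition directories the metastore may not know about
hive -e 'SELECT * FROM spotdb.flow WHERE y=2017 AND m=2 AND d=23 AND h=23 LIMIT 5;'    # reads one partition directly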
Labels: Apache Hive, HDFS
02-21-2017
08:21 AM
Hello,
I have a string like "0010011" and I want to convert it to int; it's the inverse of the bin(a int) function. I think cast("0010011" as binary) doesn't exist. How can I do that?
Regards,
Joaquín Silva
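EDIT: one option (a sketch): Impala's conv(string, from_base, to_base) does base conversion, which is effectively the inverse of bin(). It returns a string, so the result still needs a cast:
impala-shell -q "select cast(conv('0010011', 2, 10) as int)"    # -> 19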
Labels: Apache Impala
11-30-2016
02:48 PM
Solved. I had to install the JCE Policy files.
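For reference, a minimal sketch of that fix (the JDK path is illustrative; use the JDK your cluster actually runs): the JCE Unlimited Strength policy jars, extracted from Oracle's JCE zip, go into jre/lib/security on every node.
sudo cp local_policy.jar US_export_policy.jar /usr/java/jdk1.7.0_67-cloudera/jre/lib/security/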
11-30-2016
01:32 PM
Hello,
I'm having this error when I try to start HDFS after enabling Kerberos:
java.io.IOException: Login failure for hdfs/levante.akainix.local@lebeche.akainix.local from keytab hdfs.keytab: javax.security.auth.login.LoginException: No supported encryption types listed in default_tkt_enctypes
The encryption type I'm using is aes256-cts-hmac-sha1-96, as shown in the /etc/krb5.conf file:
default_tgs_enctypes = aes256-cts-hmac-sha1-96
default_tkt_enctypes = aes256-cts-hmac-sha1-96
permitted_enctypes = aes256-cts-hmac-sha1-96
Another thing: the node that contains the KDC started correctly, but the rest of them showed the error.
Thanks,
Joaquín
Tags: Kerberos
Labels: Cloudera Manager, Kerberos
11-26-2016
06:57 AM
No one stopped or uninstalled the agent manually, because I'm the only one who manages that server. What I did that day was reinstall a MySQL server; I don't know if that is related to this issue. Trying to run cloudera-scm-agent suggests it was uninstalled:
Failed to start cloudera-scm-agent.service: Unit cloudera-scm-agent.service failed to load: No such file or directory.
So I reinstalled the agent and now it's working.
Thanks
11-24-2016
07:00 AM
Hello,
One of the Cloudera Agents shut down with this error:
[15/Nov/2016 19:48:05 +0000] 14910 MainThread agent INFO Stopping agent...
[15/Nov/2016 19:48:05 +0000] 14910 MainThread agent INFO No extant cgroups; unmounting any cgroup roots
[15/Nov/2016 19:48:05 +0000] 14910 MainThread agent INFO 10 processes are being managed; Supervisor will continue to run.
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPING
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('totoro.akainix.local', 9000)) shut down
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Stopped thread '_TimeoutMonitor'.
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPED
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPING
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('totoro.akainix.local', 9000)) already shut down
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE No thread running for None.
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPED
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus EXITING
[15/Nov/2016 19:48:05 +0000] 14910 MainThread _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus EXITED
[15/Nov/2016 19:48:05 +0000] 14910 MainThread agent INFO Cleaning up daemon
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent INFO Stopping agent...
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent INFO No extant cgroups; unmounting any cgroup roots
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent ERROR Shutdown callback failed.
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/agent.py", line 2777, in stop
f()
File "/usr/lib64/python2.7/asyncore.py", line 409, in close
self.socket.close()
File "/usr/lib64/python2.7/asyncore.py", line 636, in close
os.close(self.fd)
OSError: [Errno 9] Bad file descriptor
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent INFO 10 processes are being managed; Supervisor will continue to run.
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent ERROR Shutdown callback failed.
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/agent.py", line 2777, in stop
f()
File "/usr/lib64/python2.7/asyncore.py", line 409, in close
self.socket.close()
File "/usr/lib64/python2.7/asyncore.py", line 636, in close
os.close(self.fd)
OSError: [Errno 9] Bad file descriptor
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPING
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('totoro.akainix.local', 9000)) already shut down
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE No thread running for None.
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPED
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPING
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('totoro.akainix.local', 9000)) already shut down
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE No thread running for None.
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus STOPPED
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus EXITING
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 _cplogging INFO [15/Nov/2016:19:48:05] ENGINE Bus EXITED
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent ERROR Shutdown callback failed.
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.1-py2.7.egg/cmf/agent.py", line 2777, in stop
f()
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/pyinotify-0.9.3-py2.7.egg/pyinotify.py", line 1424, in stop
self._pollobj.unregister(self._fd)
KeyError: 15
[15/Nov/2016 19:48:05 +0000] 14910 Dummy-14 agent INFO Cleaning up daemon
And I can't restart the agent because it seems that it was uninstalled. All the services work fine; only the Cloudera agent stopped working. Please help, I'm very lost with this.
Regards,
Joaquín
Tags: cloudera agent
Labels: Cloudera Manager
07-11-2016
08:15 AM
Hello, I'm trying to run a spark-submit job, but I get this error:
WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 70, totoro.akainix.local): java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
ERROR cluster.YarnScheduler: Lost executor 18 on totoro.akainix.local: Container marked as failed: container_1468247436212_0003_01_000019 on host: totoro.akainix.local. Exit status: 50. Diagnostics: Exception from container-launch.
Container id: container_1468247436212_0003_01_000019
Exit code: 50
Stack trace: ExitCodeException exitCode=50:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 50
I'm using Spark 1.6.0 on YARN, and this is the tutorial that I'm following.
Please help me, I'm completely lost.
EDIT: here is more info about the error:
16/07/11 12:32:23 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 (TID 72)
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/07/11 12:32:23 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
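For context, an AbstractMethodError on org.apache.spark.Logging usually means the spark-streaming-twitter artifact was built against a different Spark version than the one the cluster runs. A sketch of pinning it to match this cluster's Spark 1.6.0 (the class and jar names are placeholders):
spark-submit --packages org.apache.spark:spark-streaming-twitter_2.10:1.6.0 --class MyApp my_app.jar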
Labels: Apache Spark, Apache YARN
06-28-2016
10:30 AM
Hello,
This is my problem: I have a string column with values separated by ';', and I want to see it as an array using cast. Here is what I want to do:
select cast("hello;how;are;you" as ARRAY(separated by ";"));
Is it possible to do this? I'm using Impala 2.5 on CDH 5.7.
Regards,
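EDIT: as far as I can tell there is no ARRAY cast for this in Impala 2.5, but split_part() extracts the ';'-separated fields by position, which may cover the use case (a sketch):
impala-shell -q "select split_part('hello;how;are;you', ';', 2)"    # -> 'how'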