Member since 07-10-2017
68 Posts
30 Kudos Received
5 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4186 | 02-20-2018 11:18 AM |
| | 3415 | 09-20-2017 02:59 PM |
| | 18056 | 09-19-2017 02:22 PM |
| | 3614 | 08-03-2017 10:34 AM |
| | 2291 | 07-28-2017 10:01 AM |
09-19-2017
08:34 AM
@Funamizu Koshi

import re
from pyspark.sql.functions import UserDefinedFunction
from pyspark.sql.types import *

# UDF that strips every comma from a string value
udf = UserDefinedFunction(lambda x: re.sub(',','',x), StringType())
# Apply the UDF to every column, keeping the original column names
new_df = df.select(*[udf(column).alias(column) for column in df.columns])
new_df.collect()
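One caveat with the lambda above: re.sub raises a TypeError when it is handed None or a non-string value, so null cells or numeric columns would make the job fail. Below is a minimal null-safe sketch of the same cleaning step, written in plain Python so it can be tried without a cluster. The name strip_commas is hypothetical, and passing it to UserDefinedFunction in place of the lambda is an assumption about how you would wire it in.

```python
def strip_commas(value):
    """Remove every comma from a value; pass None through unchanged."""
    if value is None:
        return None
    # str() lets numeric values through; the result is always a string,
    # which matches the StringType() return type declared for the UDF.
    return str(value).replace(",", "")


# Sample rows standing in for DataFrame rows (made-up data).
rows = [("1,234", "a,b"), (None, "x"), (5, "no commas")]
cleaned = [tuple(strip_commas(v) for v in row) for row in rows]
print(cleaned)
```

Note that every non-null value comes back as a string, the same as with the original UDF.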
09-19-2017
07:28 AM
1 Kudo
Hi Jasper,

I don't think the problem is with logging in.
https://github.com/streamsets/datacollector/blob/master/apache-kafka_0_9-lib/src/main/java/org/apache/kafka/common/security/kerberos/Login.java

After skimming through the above link, I'd guess that if there were any error with having a valid ticket, you would have seen a log entry from KerberosLogin itself. What principal are you using? Can you check the contents of kafka_client_jaas.conf? Is it of the form below:

KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;
};

Or do you have a keytab configuration? If the former, please look at the Client section in kafka_jaas.conf and kinit with the user/keytab mentioned there. Try running the command again as:

/usr/hdp/2.5.5.0-157/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $BROKER_LIST --security-protocol PLAINTEXTSASL --new-consumer --describe --group spoutconsumer -Djava.security.auth.login.config=/etc/kafka/kafka_jaas.conf

If the above command does not work, try exporting that variable.
09-04-2017
06:59 AM
2 Kudos
Hi @pp z

- Check whether the console consumer works with --bootstrap-server = broker instead of --zookeeper.
- Check the advertised listener in server.properties. Is it of the form PLAINTEXTSASL://host:port?
- Check whether security_protocol = PLAINTEXTSASL exists in server.properties.
- Are you getting the exception below?
  WARN SASL configuration failed:javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file.
  If yes, modify kafka_client_jaas.conf to include a Client section; you'll most likely find one in kafka_jaas.conf. If not, ignore this.
- Use --security-protocol PLAINTEXTSASL instead of SASL_PLAINTEXT.
- Check whether you're able to start an authenticated connection to zookeeper-client. If not, give the path of the conf file that has the Client section as a JVM param. For example:
  export JVMFLAGS="-Djava.security.auth.login.config=/usr/hdp/2.6.1.0-129/kafka/conf/kafka_client_jaas.conf"

Try these and let me know.
Thanks
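If you want a quick way to see which sections a JAAS file actually defines before chasing the "No JAAS configuration section named 'Client'" warning, something like the sketch below may help. It is only a rough check using a regular expression, not a real JAAS parser; jaas_sections is a hypothetical helper name, and the sample text is the KafkaClient example quoted earlier in this thread.

```python
import re


def jaas_sections(text):
    """Return the names of the top-level sections found in JAAS config text."""
    # A section looks like:  Name { ... };
    return re.findall(r"^\s*([A-Za-z_][\w.]*)\s*\{", text, flags=re.MULTILINE)


sample = """
KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true;
};
"""

sections = jaas_sections(sample)
print(sections)
print("Client section present:", "Client" in sections)
```

To check a real file, read its contents first (for example with open(path).read(), where path is wherever your installation keeps the JAAS file) and pass the text to the helper.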
08-13-2017
08:50 PM
@Sofian Benabdelhak Please check the answer here; it may help.
https://community.hortonworks.com/questions/114024/invalid-kdc-administrator-credentials.html?childToView=117774#answer-117774
Also, use that API and set the credentials to persisted.
08-12-2017
02:20 AM
@Mugdha One more thing: I was looking through the config versions for YARN, and in v1 (just after installation) yarn.acl.enable is set to false. Is this property set to true while Kerberizing?
08-11-2017
11:31 AM
Hi,

I set up Ambari-server security following
https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Ambari_Security_Guide/content/_configuring_http_authentication_for_HDFS_YARN_MapReduce2_HBase_Oozie_Falcon_and_Storm.html
and then followed
https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_ambari_views_guide/content/section_kerberos_setup_tez_view.html
to set up the Tez view with Kerberos.

However, I'm not seeing any data in the query tab anymore, though the same query gets listed in DAGs and all the information is available there. Before setting up my cluster for Kerberos, I saw the query tab populated with old and new queries. Does someone know why this may be happening?

Thanks

screen-shot-2017-08-11-at-45052-pm.png
screen-shot-2017-08-11-at-45032-pm.png
screen-shot-2017-08-11-at-45121-pm.png
Labels:
- Apache Hive
- Apache Tez
08-03-2017
10:34 AM
IPC is a generic concept. It's not particular to Hive. In fact, several Hadoop services communicate this way.
https://wiki.apache.org/hadoop/ipc
Using IPC, clients can connect to server components on a certain port and invoke methods exposed by the server. See the properties related to ipc.client here:
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
08-03-2017
06:56 AM
@Ann A You can use the concept of a time window. These two links may help you:
http://blog.madhukaraphatak.com/introduction-to-spark-two-part-5/
https://stackoverflow.com/questions/37632238/how-to-group-by-time-interval-in-spark-sql
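To make the time-window idea concrete, here is a small plain-Python sketch of what grouping by a fixed (tumbling) window computes: each timestamp is floored to the start of its window and the rows are then grouped on that value. In Spark itself you would normally use the window function from pyspark.sql.functions inside a groupBy instead; the helper name window_start and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def window_start(ts, width):
    """Floor a timestamp to the start of its fixed-width window."""
    epoch = datetime(1970, 1, 1)
    buckets = (ts - epoch) // width  # whole windows elapsed since the epoch
    return epoch + buckets * width


# Made-up event times.
events = [
    datetime(2017, 8, 3, 6, 1),
    datetime(2017, 8, 3, 6, 4),
    datetime(2017, 8, 3, 6, 12),
]

# Count events per 10-minute window.
counts = Counter(window_start(ts, timedelta(minutes=10)) for ts in events)
for start, n in sorted(counts.items()):
    print(start, n)
```

The first two events fall into the window starting at 06:00 and the third into the one starting at 06:10.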
08-01-2017
02:58 PM
Naveen,

Can you check the Kerberos ACL?

RHEL/CentOS/Oracle Linux: vi /var/kerberos/krb5kdc/kadm5.acl
SLES: vi /var/lib/kerberos/krb5kdc/kadm5.acl
Ubuntu/Debian: vi /etc/krb5kdc/kadm5.acl

The default setting would be similar to:
*/admin@EXAMPLE.COM *
or in your case:
*/admin@DEV.DATAQUEST.COM *

This means that only principals matching the above wildcard pattern are treated as admins. So try changing your principal to kadmin/admin@DEV.DATAQUEST.COM instead, or add a line to the ACL giving permissions to kadmin. Let me know if this works.
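To see which principals the principal pattern in that ACL entry would cover, here is a rough sketch using Python's shell-style wildcard matching. This is only an approximation of how kadmind evaluates ACL entries, not its actual matching code, and the two naveen principals are hypothetical examples.

```python
from fnmatch import fnmatchcase

# Principal pattern from the ACL entry (the part before the permissions).
acl_pattern = "*/admin@DEV.DATAQUEST.COM"

candidates = [
    "kadmin/admin@DEV.DATAQUEST.COM",
    "naveen@DEV.DATAQUEST.COM",        # hypothetical: no /admin instance
    "naveen/admin@DEV.DATAQUEST.COM",  # hypothetical: has the /admin instance
]

# Case-sensitive wildcard match of each principal against the pattern.
matches = {p: fnmatchcase(p, acl_pattern) for p in candidates}
for principal, ok in matches.items():
    print(principal, "matches" if ok else "does not match")
```

Only principals with the /admin instance in the right realm match, which is why switching to kadmin/admin is worth trying.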
07-29-2017
04:23 AM
Hi,

I'm not entirely sure, but you could use Flume to get the data into HDFS by using an HDFS sink.
https://flume.apache.org/FlumeUserGuide.html

The location in HDFS is set in the flume-agent.conf file, for example:
agent_foo.sinks.hdfs-Cluster1-sink.hdfs.path = hdfs://namenode/flume/webdata

You could write a script that changes this directory to include a timestamp and restarts the Flume agent, and then run that script every week through cron.
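The config-rewriting part of such a script could look roughly like the sketch below. It only rewrites the hdfs.path line in the config text; reading and writing the actual file, restarting the agent, and the cron entry are left out. The function name with_dated_path and the date-suffixed directory layout are assumptions for illustration.

```python
import re
from datetime import date


def with_dated_path(conf_text, key, base_path, day):
    """Return conf_text with `key` pointing at base_path/<YYYY-MM-DD>."""
    new_line = f"{key} = {base_path}/{day.isoformat()}"
    pattern = rf"^{re.escape(key)}\s*=.*$"
    # A function replacement avoids backslash-escape surprises in new_line.
    updated, n = re.subn(pattern, lambda m: new_line, conf_text, flags=re.MULTILINE)
    if n == 0:
        raise ValueError(f"{key} not found in config")
    return updated


key = "agent_foo.sinks.hdfs-Cluster1-sink.hdfs.path"
conf = key + " = hdfs://namenode/flume/webdata\n"
new_conf = with_dated_path(conf, key, "hdfs://namenode/flume/webdata", date(2017, 7, 29))
print(new_conf)
```

In a real run you would pass date.today() and write the result back to flume-agent.conf before restarting the agent.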