Member since: 10-20-2017
Posts: 59
Kudos Received: 0
Solutions: 0
02-14-2019
10:07 PM
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)
try:
    df = sqlContext.createDataFrame(jsonobj)
except IOError:
    logger.exception(jsonobj)
schema = df.printSchema()
sc.stop()
return schema

The above code throws "can not infer schema from empty dataset" for some datasets. What does this error mean, and how do I fix it?

df = sqlContext.createDataFrame(jsonobj)
File "/remote/vgrnd104/guntaka/anaconda3/lib/python3.6/site-packages/pyspark/sql/context.py", line 302, in createDataFrame
return self.sparkSession.createDataFrame(data, schema, samplingRatio, verifySchema)
File "/remote/vgrnd104/guntaka/anaconda3/lib/python3.6/site-packages/pyspark/sql/session.py", line 691, in createDataFrame
rdd, schema = self._createFromLocal(map(prepare, data), schema)
File "/remote/vgrnd104/guntaka/anaconda3/lib/python3.6/site-packages/pyspark/sql/session.py", line 410, in _createFromLocal
struct = self._inferSchemaFromList(data, names=schema)
File "/remote/vgrnd104/guntaka/anaconda3/lib/python3.6/site-packages/pyspark/sql/session.py", line 337, in _inferSchemaFromList
raise ValueError("can not infer schema from empty dataset")
ValueError: can not infer schema from empty dataset
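For reference, the inference step fails because createDataFrame samples the input rows to guess column types, and an empty collection gives it nothing to sample. A common fix is to pass an explicit schema so nothing needs to be inferred; a minimal sketch, where the field names and types are placeholders for your actual JSON structure:

from pyspark.sql.types import StructType, StructField, LongType, StringType

# Placeholder schema for illustration; match it to your real JSON fields.
schema = StructType([
    StructField("id", LongType(), True),
    StructField("payload", StringType(), True),
])

# With an explicit schema, an empty jsonobj no longer raises ValueError.
df = sqlContext.createDataFrame(jsonobj, schema=schema)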
01-13-2019
06:00 AM
I was able to GET a schema fine, but to POST a new schema I need some clarification. How do I post a schema object with a schema name using the REST API? The command below throws an error:
curl -X POST "http://<registry-host>:7788/api/v1/schemaregistry/schemas" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"type\": \"avro\", \"schemaGroup\": \"myproject\", \"name\": \"testschema\", \"description\": \"testing schema\", \"compatibility\": \"NONE\", \"validationLevel\": \"LATEST\"}"
Unmatched ".
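The "Unmatched "." message comes from the shell's quoting rules (csh/tcsh in particular report unbalanced quotes this way), not from the Schema Registry itself. A sketch of the same request with the JSON payload in single quotes, which sidesteps the escaping entirely:

curl -X POST "http://<registry-host>:7788/api/v1/schemaregistry/schemas" -H "accept: application/json" -H "Content-Type: application/json" -d '{"type": "avro", "schemaGroup": "myproject", "name": "testschema", "description": "testing schema", "compatibility": "NONE", "validationLevel": "LATEST"}'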
01-06-2019
07:04 AM
I have registered the schema for my data manually using the Registry UI and then started ingesting data. If I need to validate incoming data against the registered version, or identify new schemas that may arrive during ingestion, how do I automate the schema generation part (producing the schema file)?
01-02-2019
11:04 PM
After trying an extensive set of things, what worked was adding the path to the mysql-connector jar to the CLASSPATH in the .env file for streamline.
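For anyone hitting the same error, the change amounts to one line in streamline's .env file; a sketch, where the JAR path is a placeholder for wherever your connector actually lives:

# Append the MySQL connector JAR to the classpath streamline starts with.
# The path below is illustrative; substitute your actual connector location.
export CLASSPATH=$CLASSPATH:/usr/share/java/mysql-connector-java.jar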
01-08-2019
04:30 PM
@Swaapnika Guntaka Can you provide more details on which connector and classpath you used? I am seeing this same error today. I have tried putting the connector into my classpath and classpath/bin, as well as a few other locations, but the error remains. TIA for your reply.
12-11-2018
11:57 PM
I found that root is trying to execute the below command. Why is root executing this command, or how do I make it execute successfully? My service account for schemaregistry is 'registry', which is already in the sudoers file.
Execute['export JAVA_HOME=/SCRATCH/jdk1.8.0_91 ; source /usr/hdf/current/registry/conf/registry-env.sh ; /usr/hdf/current/registry/bootstrap/bootstrap-storage.sh migrate'] {'user': 'root'}
09-26-2018
09:01 PM
If you have existing data in HBase, it may not be serialized the way Phoenix deserializes it for reading, hence the illegal data exception. It is therefore recommended that you declare all your string fields as VARCHAR and your integer fields as UNSIGNED_INT. If you want to use other data types, it is better to insert and read the data through Phoenix only.
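To illustrate, a Phoenix table declared along those lines might look like the following; the table, column family, and column names here are hypothetical:

-- VARCHAR and UNSIGNED_INT match the bytes HBase stores for strings and
-- non-negative integers written outside Phoenix.
CREATE TABLE "events" (
    "id" VARCHAR PRIMARY KEY,
    "cf"."name" VARCHAR,
    "cf"."count" UNSIGNED_INT
);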
04-23-2018
07:29 AM
@Swaapnika Guntaka Hey, don't panic, the files are right in there; .Trash hides some subdirectories (/Current/user), see below. Replace {xxx} with the HDFS user who deleted the file, and after the last / you will see all the files that were deleted and are not yet expunged from HDFS, in your case 360. As that user or hdfs, run the below:
$ hdfs dfs -ls /user/{xxx}/.Trash/Current/user/{xxx}/
and to restore the file:
$ hdfs dfs -cp /user/{xxx}/.Trash/Current/user/{xxx}/deleted_file /user/{xxx}/
Hope that helps.
03-23-2018
12:58 AM
What HDP version are you using?
03-17-2018
06:56 AM
Is it an issue with a variable used in the code but not defined? It is very difficult to say with the limited details you have provided.
02-28-2018
12:31 AM
I think I found the answer. The dfs.datanode.data.dir was reported inconsistent in the logs. I added a healthy datanode, balanced the cluster, then deleted the data directories from the other inconsistent nodes after taking a backup at /tmp. Restarting after that works fine now.
02-05-2018
06:48 PM
I'm using an existing MySQL DB for the Hive metastore (I have created the hive user, created the hive DB, and set up the Ambari server using the JDBC driver). When trying to install HiveServer2, I get the below error:
Caught an exception while executing custom service command: <type 'exceptions.OSError'>: [Errno 2] No such file or directory: '/var/lib/ambari-agent/cred/conf/hive_server/hive-site.jceks'; [Errno 2] No such file or directory: '/var/lib/ambari-agent/cred/conf/hive_server/hive-site.jceks'
02-02-2018
11:01 PM
I still face the issue. I'm doing a non-root installation. I built psutil as root, which went fine. But when I try to restart the Metrics Monitor, it fails.
02-10-2018
03:00 PM
@Swaapnika Guntaka, The blueprint itself works fine with 2.6. You will have to update the versions of Ambari and the HDP repositories to deploy it on 2.6. It is not sufficient to just change the stack version in the blueprint.
04-12-2018
08:15 AM
@Swaapnika Guntaka Yes, that error may occur if "zookeeper-client" is a directory instead of a symlink. In that case you need to do this:
# mv /usr/hdp/current/zookeeper-client /usr/hdp/current/zookeeper-client_BAK
Then you can either try creating that symlink on your own (see the sketch below) or try installing the client again.
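If you create the symlink manually, it should point at the versioned ZooKeeper directory; a sketch, where <version> is a placeholder for whatever HDP version directory exists on your host (check with ls /usr/hdp):

# Recreate the expected symlink after moving the stray directory aside.
ln -s /usr/hdp/<version>/zookeeper /usr/hdp/current/zookeeper-client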
05-16-2018
02:26 PM
Don't understand why this is needed, but it did it. Thanks.
01-16-2018
04:08 AM
@Swaapnika Guntaka, Can you try adding a space between "X-Requested-By: ambari" and -X PUT and trying again:
curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server.com>:8080/api/v1/stacks/HDP/versions/2.6/operating_systems/redhat6/repositories/HDP-2.6 -d @repo.json
Thanks, Aditya
11-14-2017
05:25 PM
Setting it to 6667 worked. Thanks.
11-15-2017
07:29 PM
Confluent is the support company for Kafka. I personally would trust their code more than someone else's.
11-10-2017
10:52 PM
I'm using Flume to get data from Kafka to HDFS (Kafka source and HDFS sink). These are the versions I'm using:
HDP 2.6.2.0-205
Flume 1.5.2.2.6.2-205
This is my flume.conf:
agent1.sources = kafka-source
agent1.channels = memory-channel
agent1.sinks = hdfs-sink
agent1.sources.kafka-source.type = org.apache.flume.source.kafka.KafkaSource
agent1.sources.kafka-source.batchSize = 5
agent1.sources.kafka-source.kafka.consumer.timeout.ms = 100
agent1.sources.kafka-source.kafka.topics = test
agent1.sources.kafka-source.kafka.bootstrap.servers = localhost:9092
agent1.channels.memory-channel.type = memory
agent1.channels.memory-channel.capacity = 10000
agent1.channels.memory-channel.transactionCapacity = 1000
agent1.sinks.hdfs-sink.type = hdfs
agent1.sinks.hdfs-sink.hdfs.path = /tmp/kafka/%{topic}/%y-%m-%d
agent1.sinks.hdfs-sink.channel = memory-channel
I'm getting the below output. It keeps repeating, and I don't see data in HDFS.
2017-11-10 14:36:21,713 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:409)] Closing /tmp/kafka/test/17-11-10/FlumeData.1510353261651.tmp
2017-11-10 14:36:21,713 (hdfs-hdfs-sink-call-runner-1) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:675)] Renaming /tmp/kafka/test/17-11-10/FlumeData.1510353261651.tmp to /tmp/kafka/test/17-11-10/FlumeData.1510353261651
2017-11-10 14:36:21,716 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:284)] Creating /tmp/kafka/test/17-11-10/FlumeData.1510353261652.tmp
2017-11-10 14:36:21,722 (hdfs-hdfs-sink-call-runner-0) [INFO - org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas(AbstractHDFSWriter.java:184)] FileSystem's output stream doesn't support getNumCurrentReplicas; --HDFS-826 not available; fsOut=org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer; err=java.lang.NoSuchMethodException: org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.getNumCurrentReplicas()
2017-11-10 14:36:21,722 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.getRefIsClosed(BucketWriter.java:233)] isFileClosed() is not available in the version of the distributed filesystem being used. Flume will not attempt to re-close files if the close fails on the first attempt
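One observation from the log: the sink is closing and renaming files under /tmp/kafka/test/..., so data may actually be landing there as many small files rather than not at all (worth checking with hdfs dfs -ls /tmp/kafka/test/). If the problem is small files rather than delivery, the HDFS sink's roll settings are the usual knobs; the values below are illustrative, not recommendations:

# Defaults roll a file every 30 seconds / 10 events / 1 KB, producing many
# tiny files; raising them yields fewer, larger files.
agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink.hdfs.rollInterval = 300
agent1.sinks.hdfs-sink.hdfs.rollCount = 10000
agent1.sinks.hdfs-sink.hdfs.rollSize = 0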
10-30-2017
05:03 PM
I have started the Ambari server and agents as a non-root user. I've configured /etc/sudoers to give the necessary permissions. But due to the LDAP setup I had to create the service users like hdfs, hive, etc. manually as a sudo user again (sudo useradd -G hadoop hdfs). I'm unable to start the Hadoop services. When I do su hdfs from the user that created them, it asks me for a password. Is this causing trouble starting HDFS? Or, if I have to start HDFS manually as the hdfs user, is it OK to su hdfs from root and do that (su hdfs from root doesn't ask me for a password)?
10-26-2017
02:30 AM
@Swaapnika Guntaka The doc states that: Configuring Ambari Agents to run as non-root requires that you manually install agents on all nodes in the cluster. For these details, see Installing Ambari Agents Manually. After installing each agent, you must configure the agent to run as the desired non-root user. https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-security/content/how_to_configure_an_ambari_agent_for_non-root.html
10-26-2017
02:52 AM
@Swaapnika Guntaka This thread looks like a duplicate of another thread: https://community.hortonworks.com/questions/142515/ambari-as-non-root-failing-to-register.html Can we close one of them and continue on the other?
10-26-2017
03:00 AM
Hi @Swaapnika Guntaka, I hope this link might help you configure ambari-agent and ambari-server using two different users: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-security/content/configuring_ambari_for_non-root.html
10-24-2017
06:34 PM
@Swaapnika Guntaka Looks like the Ambari Agent, which is basically a Python script, is not able to determine the hostname (FQDN). You can try running the following command on the problematic host to see whether it returns the correct hostname:
[root@sandbox ~]# python -c "import socket; print socket.getfqdn()"
sandbox.hortonworks.com
Also please check that you have set up the hostname as mentioned in the following links:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-installation-ppc/content/set_the_hostname.html
https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-installation-ppc/content/edit_the_host_file.html
On CentOS or RHEL you can also use the following command to set the hostname:
# sysctl kernel.hostname=agent1.example.com
10-24-2017
02:39 AM
@Swaapnika Guntaka Can you please share the output of the following command:
# grep 'jdbc' /etc/ambari-server/conf/ambari.properties
Also, I just want to double-check that you have followed the below steps as-is, including the "FLUSH PRIVILEGES":
# mysql -u root -p
mysql> CREATE DATABASE ambari;
mysql> USE ambari;
mysql> SOURCE /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
Then:
mysql> USE mysql;
mysql> CREATE USER 'ambari'@'%' IDENTIFIED BY 'bigdata';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'%';
mysql> CREATE USER 'ambari'@'localhost' IDENTIFIED BY 'bigdata';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'localhost';
mysql> CREATE USER 'ambari'@'standaloneambari.example.com' IDENTIFIED BY 'bigdata';   <---- Replace the host name
mysql> GRANT ALL PRIVILEGES ON *.* TO 'ambari'@'standaloneambari.example.com';   <---- Replace the host name
mysql> FLUSH PRIVILEGES;
Also please check whether MySQL is listening on all addresses:
# netstat -tnlpa | grep 3306
tcp        0      0 :::3306      :::*      LISTEN      314/mysqld
Also please double-check your Ambari Server hostname (FQDN) to verify that it has a proper entry inside its /etc/hosts file:
# cat /etc/hosts
# hostname -f
Setting up a proper FQDN is one of the requirements for an Ambari-managed cluster.
10-20-2017
11:55 PM
Even changing it to localhost didn't help. I also have all the ports accessible from all the other hosts to the ambari-server host.
01-18-2018
09:33 PM
With HDP-2.6, I'm facing an issue with the zookeeper-server and client install with the above config. I tried removing and re-installing, but that didn't work either.
mkdir: cannot create directory `/usr/hdp/current/zookeeper-client': File exists