Member since: 08-19-2016
Posts: 19
Kudos Received: 7
Solutions: 1

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 47188 | 12-04-2017 10:57 AM |
05-24-2019
10:00 AM
I'm running RHEL and ran into similar problems due to the fun configuration of Python 2, Python 3, and SCL on Red Hat. The root cause is that the /usr/bin/hdp-select script was written for Python 2; the differences between Python 2 and 3 mean the script is unfortunately not compatible with both versions. To resolve this, we had to modify the hdp-select script so that it works under both. I would attach mine, but it might break your environment because it contains a lot of hardcoded values, such as your HDP component versions, so you'll need to do these steps manually.

Steps:

1. Make a backup of the file: sudo cp -p /usr/bin/hdp-select /usr/bin/hdp-select_original
2. As root, edit the file.
3. Add parentheses around all print statements. For example, change all occurrences of: print "a", "b", var, 123 to: print("a", "b", var, 123). Be careful of multi-line print statements, lines ending with \, and multi-line strings. I recommend editing in a text editor that supports syntax highlighting to avoid mistakes. Also be aware that Python is sensitive to indentation, so don't change any spaces/tabs at the start of a line.
4. Change os.mkdir from: os.mkdir(current, 0755) to: os.mkdir(current, 0o755)
5. Comment out the packages.sorted() call, from: packages.sorted() to: #packages.sorted()
(There are online tools for converting code from Python 2 to 3, but they miss some of the above steps.)
6. Save and close the file.
7. Test that hdp-select still works from a shell. If so, you should be able to run spark-submit without issue. A condensed before/after sketch of these edits is shown below.

A word of caution: while these changes should be backwards compatible with Python 2, I am not sure what the longer-term impacts are; they may cause problems with other HDP components (though it seems highly unlikely). Making changes to scripts outside of Ambari has other risks: Ambari or some other installation or upgrade process might replace the script with the one from your HDP software bundle, so your spark-submit could stop working if/when that happens. I would file a bug report, but we don't have Cloudera support at this time.
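To make steps 3 to 5 more concrete, here is a condensed illustration. These are hypothetical fragments written in the style of the script, not the actual contents of hdp-select (which differ per HDP version), so don't paste them over your copy:

```python
import os
import tempfile

# Hypothetical fragments illustrating the three kinds of edits described above.
# The variable names and values are stand-ins, not the real hdp-select code.
count = 0                                               # placeholder value for the print example
current = os.path.join(tempfile.mkdtemp(), "current")   # placeholder path for the mkdir example

# 1. print statements gain parentheses (required by Python 3):
#    before: print "ERROR:", "no packages installed", count
print("ERROR:", "no packages installed", count)

# 2. Octal literals get the 0o prefix (Python 3 rejects the bare 0755 form):
#    before: os.mkdir(current, 0755)
os.mkdir(current, 0o755)

# 3. The stray packages.sorted() call is simply commented out:
# packages.sorted()
```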
02-28-2019
01:23 PM
In the OP's case, it might be that the hdfs-site files need to be available when trying to connect to HBase. If I recall correctly, some HBase clients (such as the NiFi processor) need the Hadoop configuration files core-site.xml and hdfs-site.xml to be specified. If they can't be found, or don't contain the attributes above, it might cause the same error.
02-28-2019
01:21 PM
I had the same issue with Spark2 on HDP 3.1, using Isilon/OneFS as storage instead of HDFS. The OneFS service management pack doesn't provide configuration for some of the HDFS parameters that Spark2 expects (they aren't available at all in Ambari), such as dfs.datanode.kerberos.principal. Without these parameters the Spark2 History Server may fail to start and report errors such as "Failed to specify server's principal name".

I added the following properties to OneFS under Custom hdfs-site:

dfs.datanode.kerberos.principal=hdfs/_HOST@<MY REALM>
dfs.datanode.keytab.file=/etc/security/keytabs/hdfs.service.keytab
dfs.namenode.kerberos.principal=hdfs/_HOST@<MY REALM>
dfs.namenode.keytab.file=/etc/security/keytabs/hdfs.service.keytab

This resolved the initial error. After that, I was getting an error of the following form:

Server has invalid Kerberos principal: hdfs/<isilon>.my.realm.com@my.realm.com, expecting: hdfs/somewhere.else.entirely@my.realm.com

This was related to cross-realm authentication, and was resolved by adding the following setting to Custom hdfs-site:

dfs.namenode.kerberos.principal.pattern=*

(Reposting my answer from https://stackoverflow.com/questions/35325720/connecting-to-kerberrized-hdfs-java-lang-illegalargumentexception-failed-to-s/54926487#54926487 )
10-30-2018
01:10 PM
Hi Sherrine, the ZKFC expects to run on the same host as the NameNode service. Assuming that is what has been configured, the issue might be networking related. Please check the hostname/IP configuration on your hosts, and confirm that DNS is also returning the same, correct result.
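As a quick sanity check, a minimal sketch of what I mean (the host names here are placeholders; substitute your own NameNode/ZKFC hosts) is to confirm that forward and reverse DNS lookups agree on every host:

```python
import socket

# Minimal sketch: check that forward and reverse DNS agree for each cluster host.
# The host names below are placeholders; replace them with your NameNode/ZKFC hosts.
hosts = ["nn1.example.com", "nn2.example.com"]

for host in hosts:
    try:
        ip = socket.gethostbyname(host)        # forward lookup: name -> IP
        name, _, _ = socket.gethostbyaddr(ip)  # reverse lookup: IP -> name
    except socket.error as err:
        print(host, "lookup failed:", err)
        continue
    print(host, "->", ip, "->", name)
    if name != host:
        print("  WARNING: reverse lookup does not match the original host name")
```

Run the same check from each node (and compare against /etc/hosts) so that every host resolves every other host consistently.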
07-10-2018
06:03 AM
In 2018 I'm having a similar issue: a missing block alert is displayed in Ambari although fsck reports the file system as healthy. This thread came up first on Google, but the support article below is what resolved my issue; linking it for the benefit of others. https://community.hortonworks.com/content/supportkb/185879/missing-block-alert-displayed-on-ambari-although-t.html
02-19-2018
06:40 AM
To pull data from the insecure cluster to the secure cluster, connect to the secure cluster and run the following:

1. kinit as your user of choice, then klist to confirm.
2. hdfs dfs -D ipc.client.fallback-to-simple-auth-allowed=true -ls hdfs://<insecure cluster>/
3. hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://<insecure cluster>/path/to/source destination

Note the space after the -D parameter. I prefer this approach to modifying the global HDFS configuration to allow simple auth.

If the insecure cluster is not using HA, you can use its active NameNode address and port as the value for <insecure cluster>. If it is using HA, you can use the insecure cluster's nameservice as the value for <insecure cluster>, but only if the secure cluster's HDFS has already been configured to know about that nameservice; a rough sketch of that configuration is below. Let me know if you need help with that also.
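For reference, a rough sketch of the extra client-side properties on the secure cluster that make the insecure cluster's nameservice known (the nameservice names, host names, and port here are placeholders, not values from your clusters), added to the secure cluster's hdfs-site:

```
dfs.nameservices=securens,insecurens
dfs.ha.namenodes.insecurens=nn1,nn2
dfs.namenode.rpc-address.insecurens.nn1=insecure-nn1.example.com:8020
dfs.namenode.rpc-address.insecurens.nn2=insecure-nn2.example.com:8020
dfs.client.failover.proxy.provider.insecurens=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
```

With something like this in place, hdfs://insecurens/ can be used as the <insecure cluster> value in the commands above.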
12-05-2017
08:05 AM
@Abdelkrim Hadjidj I can't seem to find a way to do that
12-04-2017
10:57 AM
@mayki wogno, I was using NiFi 1.2.0 with HDF 3.0.0.0-453. You can check the component versions of the processors in the screenshot and compare them to yours. I found that there were many combinations of "type" and "logical type" that did not work as expected for timestamp values. Perhaps check the exact type of data you have and then try some of the different options above; one might work for you. In my case the input timestamp was in the format "YYYYMMDD HH:MM:SS" and I wanted a valid Avro timestamp (long milliseconds) as the output.
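For illustration, the conversion I was after amounts to roughly the following (plain Python outside NiFi, with a made-up sample value and an assumed UTC timezone):

```python
from datetime import datetime, timezone

# Plain-Python illustration (not NiFi itself): turn a "YYYYMMDD HH:MM:SS" string into
# the long-milliseconds value that Avro's timestamp-millis logical type expects.
# The sample value and the UTC assumption are made up for this example.
raw = "20171204 10:57:00"

dt = datetime.strptime(raw, "%Y%m%d %H:%M:%S").replace(tzinfo=timezone.utc)
millis = int(dt.timestamp() * 1000)

print(millis)  # 1512385020000
```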
08-31-2017
11:59 AM
I know this is an older thread, but I had this issue recently with HDP 2.6.0. This error can also happen because region servers failed to start. A common cause of that is the system clock on one or more region servers not matching the system clock on the HBase Master, in which case the region server service will not start.
07-14-2017
07:01 AM
3 Kudos
The above answer is not fully correct. A schema can be modified, but a schema + version appears to be immutable. Since delete functionality does not exist, can it be added? If a mistake is made when adding a schema, it cannot be deleted. The schema evolution compatibility rules prevent me from completely replacing an existing schema, and I can't rename it. So instead it seems we need to create a new schema each time and change the configuration elsewhere to use the new schema name. The old schema still appears in the list, which over time makes it confusing to tell which schema is correct and in use. For example, in development environments I end up with a lot of schemas that are no longer needed, e.g. my_schema, my_schema_1, test_my_schema, test_1, schema, working_schema, ... which causes confusion.