Member since: 03-01-2017
Posts: 62
Kudos Received: 7
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3515 | 02-07-2019 02:28 PM |
07-30-2024
03:55 AM
1 Kudo
For those who are interested: I figured out a solution. Earlier I tried to use a "Classification" and define a Hyperlink attribute there, and that does not work. But if you use "Business Metadata" and assign it to your data asset, it does allow you to define an attribute with a Hyperlink.
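For anyone who wants to script this rather than use the UI, below is a minimal sketch (my own, not from the post) of attaching a Business Metadata attribute that holds a hyperlink to an existing data asset via the Atlas v2 REST API. The host, credentials, entity GUID, business metadata name ("documentation") and attribute name ("wiki_url") are all placeholders, and the Business Metadata type with its hyperlink-style string attribute is assumed to exist already (for example created through the Atlas UI).

```python
# Hypothetical sketch: assign a Business Metadata attribute (a hyperlink value)
# to an existing Atlas entity. All names, GUIDs and credentials are placeholders.
import requests

ATLAS_URL = "https://atlasnode.domain.com:21000"
ENTITY_GUID = "00000000-0000-0000-0000-000000000000"   # GUID of the data asset

payload = {
    "documentation": {                                  # business metadata name
        "wiki_url": "https://wiki.example.com/datasets/mytable"  # hyperlink attribute
    }
}

resp = requests.post(
    f"{ATLAS_URL}/api/atlas/v2/entity/guid/{ENTITY_GUID}/businessmetadata",
    params={"isOverwrite": "false"},
    json=payload,
    auth=("myaccount", "mypassword"),                   # placeholder credentials
)
resp.raise_for_status()
print("business metadata assigned, HTTP", resp.status_code)
```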
05-16-2024
05:48 AM
1 Kudo
Because I ran into this thread while looking for a way to solve this error, and because we found a solution, I thought it might still help some people if I share it. We needed HWC to profile Hive managed + transactional tables from Ataccama (a data quality solution). We found someone who had successfully got spark-submit working, checked their settings and changed our spark-submit as follows:

COMMAND="$SPARK_HOME/bin/$SPARK_SUBMIT \
  --files $MYDIR/$LOG4J_FILE_NAME $SPARK_DRIVER_JAVA_OPTS $SPARK_DRIVER_OPTS \
  --jars {{ hwc_jar_path }} \
  --conf spark.security.credentials.hiveserver2.enabled=false \
  --conf "spark.sql.hive.hiveserver2.jdbc.url.principal=hive/_HOST@{{ ad_realm }}" \
  --conf spark.dynamicAllocation.enable=false \
  --conf spark.hadoop.metastore.catalog.default=hive \
  --conf spark.yarn.maxAppAttempts=1 \
  --conf spark.sql.legacy.parquet.int96RebaseModeInRead=CORRECTED \
  --conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED \
  --conf spark.sql.legacy.parquet.datetimeRebaseModeInRead=CORRECTED \
  --conf spark.sql.legacy.timeParserPolicy=LEGACY \
  --conf spark.sql.legacy.typeCoercion.datetimeToString.enabled=true \
  --conf spark.sql.parquet.int96TimestampConversion=true \
  --conf spark.sql.extensions=com.hortonworks.spark.sql.rule.Extensions \
  --conf spark.sql.extensions=com.qubole.spark.hiveacid.HiveAcidAutoConvertExtension \
  --conf spark.kryo.registrator=com.qubole.spark.hiveacid.util.HiveAcidKyroRegistrator \
  --conf spark.sql.sources.commitProtocolClass=org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol \
  --conf spark.datasource.hive.warehouse.read.mode=DIRECT_READER_V2 \
  --class $CLASS $JARS $MYLIB $PROPF $LAUNCH $*";
exec $COMMAND

The difference was probably the spark.hadoop.metastore.catalog.default=hive setting. The example above contains some Ansible variables: hwc_jar_path is "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p1000.24102687/jars/hive-warehouse-connector-assembly-1.0.0.7.1.7.1000-141.jar" and ad_realm is our LDAP realm. Hope it helps anyone.
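As a side note, here is a rough sketch (assumptions on my part, not the actual Ataccama job) of what a PySpark read through this setup could look like: with spark.datasource.hive.warehouse.read.mode=DIRECT_READER_V2 and the HiveAcid extensions on the classpath, a plain spark.sql() against a managed, transactional table should go through the direct reader. The database and table names are placeholders.

```python
# Hypothetical sketch of a Spark job launched with the spark-submit above.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hwc-direct-reader-example").getOrCreate()

# With DIRECT_READER_V2 and the HiveAcid extensions configured on spark-submit,
# a managed/transactional Hive table can be read with plain Spark SQL.
df = spark.sql("SELECT * FROM testdb.my_managed_table LIMIT 100")
df.printSchema()
print("rows read:", df.count())
```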
02-07-2019
02:28 PM
1 Kudo
This is now the winning REST API query:

curl -u myaccount -i -H "Content-Type: application/json" -X GET "https://atlasnode.domain.com:21000/api/atlas/discovery/search/dsl?query=hive_table+where+qualifiedName%3D%27testdb.mytable.id@CLUSTERNAME%27"

It gives a list of all columns for a table, including deleted ones. In my Python code I pick the columns with ACTIVE state.
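For what it's worth, here is a rough sketch (my own, not the actual code behind the post) of the Python side: run the same DSL query with requests and keep only the results whose state is ACTIVE. The credentials are placeholders, and the exact JSON key names ("results", "$id$", "state") can differ between Atlas versions, so adjust them to what your instance actually returns.

```python
# Hypothetical sketch: query the Atlas DSL search and skip logically deleted entries.
import requests

ATLAS_URL = "https://atlasnode.domain.com:21000"
DSL_QUERY = "hive_table where qualifiedName='testdb.mytable.id@CLUSTERNAME'"

resp = requests.get(
    f"{ATLAS_URL}/api/atlas/discovery/search/dsl",
    params={"query": DSL_QUERY},        # requests takes care of the URL encoding
    auth=("myaccount", "mypassword"),   # placeholder credentials
)
resp.raise_for_status()

# Key names are assumptions; deleted columns typically carry a DELETED state.
for result in resp.json().get("results", []):
    if result.get("$id$", {}).get("state") == "ACTIVE":
        print(result.get("name"), result.get("$id$", {}).get("id"))
```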
02-01-2019
03:30 PM
@Sandeep Nemuri I think we responded at almost the same time. When someone clicks submit, there is no logic that checks whether a similar answer has already been given 🙂 Maybe you should have added that he needs to run the script as the Atlas admin user, as illustrated, which he wasn't aware of 🙂
10-09-2018
06:45 PM
Yes, you are right, that is what I meant. I haven't played with the sandbox, but the key is to make sure that the user NiFi runs as has access and permissions to the resources you are adding to the processor. The ExecuteStreamCommand processor will have the same issue if the path is wrong or doesn't exist on the server. I would first find out which user runs NiFi (ps -ef would be your friend here). Then I would make sure that that user has access to the path in the console (use 'ls -l path' from the home directory). Path in this case means both the path of the executable and the path of the working directory; make sure both are accessible. Lastly, try to execute your script from the command line. Thanks! Regards
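Not from the original reply, but here is a small sketch of the kind of check I mean: given the service account NiFi runs as and the paths configured on the processor, verify that the account can actually read and traverse them. The username and paths are placeholders, and the check only looks at classic owner/group/other mode bits (it ignores ACLs and parent-directory traversal).

```python
# Hypothetical sketch: check whether a given user can read+execute a path,
# using only the classic mode bits. Unix-only (pwd/grp modules).
import os
import pwd
import grp
import stat

def can_access(path: str, username: str) -> bool:
    st = os.stat(path)                      # raises FileNotFoundError if the path is wrong
    user = pwd.getpwnam(username)
    mode = st.st_mode
    if st.st_uid == user.pw_uid:            # user owns the path
        return bool(mode & stat.S_IRUSR) and bool(mode & stat.S_IXUSR)
    group_ids = {g.gr_gid for g in grp.getgrall() if username in g.gr_mem}
    group_ids.add(user.pw_gid)
    if st.st_gid in group_ids:              # user is in the owning group
        return bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IXOTH)

# Placeholder executable and working directory from the processor configuration.
for p in ["/opt/scripts/myscript.sh", "/opt/scripts"]:
    print(p, "accessible to nifi user:", can_access(p, "nifi"))
```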
09-06-2018
02:24 PM
Awesome, @Marcel-Jan Krijgsman, glad we got it working 🙂 and thank you for sharing the trimmed result!
08-07-2018
09:12 AM
@Felix Albani Thanks for that answer. Looks like I'm facing an interesting choice:

Change hive.server2.enable.doAs to false, so Hive accesses HDFS as the HiveServer2 process user. Then I can restrict access to columns for users in Hive, without giving them access to the HDFS files. But that makes the choice of Hive permissions I make much more important.

Keep hive.server2.enable.doAs=true, in which case I will not be able to do column-based access in Hive, but I have the comfort that someone who gets access to a Hive table without the HDFS access still cannot get to the data.

I'll have to think about this.
02-28-2018
04:03 PM
It seems these messages only occur in my sandbox environment. In my customer's HDP 2.6.3 environment I haven't seen any ATLAS-500-00-007 errors yet.
12-14-2017
12:36 PM
2 Kudos
I tried this and it returns only the queried table:

curl -X GET \
  'http://sandbox.hortonworks.com:21000/api/atlas/v2/search/dsl?typeName=hive_table&query=where%20name%3D%22asteroids%22' \
  -H 'authorization: Basic YWRtaW46YWRtaW4='

Since the thread was long, I put the correct answer separately. You will find an "Accept" button beside this answer; please click it to mark this as the best answer. Thanks a lot.
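A small, hypothetical Python equivalent of the curl call above, mainly to show that requests can build the URL-encoded query string (where name="asteroids") for you. The response field names ("entities", "attributes") are what I would expect from the v2 search API but may vary per Atlas version.

```python
# Hypothetical sketch: the same v2 DSL search, with requests doing the encoding.
import requests

resp = requests.get(
    "http://sandbox.hortonworks.com:21000/api/atlas/v2/search/dsl",
    params={"typeName": "hive_table", "query": 'where name="asteroids"'},
    auth=("admin", "admin"),            # same credentials as the Basic header above
)
resp.raise_for_status()
for entity in resp.json().get("entities", []):
    print(entity.get("typeName"), entity.get("attributes", {}).get("qualifiedName"))
```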