Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 532 | 06-04-2025 11:36 PM |
|  | 1072 | 03-23-2025 05:23 AM |
|  | 553 | 03-17-2025 10:18 AM |
|  | 2061 | 03-05-2025 01:34 PM |
|  | 1289 | 03-03-2025 01:09 PM |
12-16-2019 05:53 AM
@Bindal Thanks for sharing the screenshot. I can see from it that your riskfactor and riskfactor1 are directories, not files! Can you double-click on either of them and check the contents? I have mounted an old HDP 2.6.x sandbox for illustration: whatever filesystem you see under the Ambari Files view is in HDFS. My screenshots show, in order: the local filesystem; my Ambari view before the creation of /Bindal/data (the equivalent of your /tmp/data); creating the directory in HDFS; the local filesystem again; copying riskfactor1.csv from the local filesystem /tmp/data; and checking the copied file in HDFS.

To walk through it from the Linux CLI: as the root user I created a directory /tmp/data and placed riskfactor1.csv in there, then created the directory /Bindal/data/ in HDFS and copied the file from the local Linux box into it. I hope that explains the difference between the local filesystem and HDFS; the last screenshot shows the difference again side by side. Once the file is in HDFS your Zeppelin notebook should run successfully. As reiterated, in the screenshot you shared you need to double-click on riskfactor and riskfactor1, which are directories, to see how they differ from my screenshots. HTH
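For reference, the same walkthrough condensed into CLI commands (a sketch matching the paths above; the HDFS commands need to run as a user with write access there, e.g. hdfs):

```
# local Linux filesystem, as root
mkdir -p /tmp/data
cp riskfactor1.csv /tmp/data/

# HDFS: create the target directory and copy the file in
hdfs dfs -mkdir -p /Bindal/data
hdfs dfs -copyFromLocal /tmp/data/riskfactor1.csv /Bindal/data/
hdfs dfs -ls /Bindal/data   # the file now lives in HDFS, not on the local disk
```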
12-13-2019 12:13 AM
@Shaneg To understand your problem you should know the difference between a view and a table. Views are definitions built on top of other tables (or other views) and do not hold data themselves; if data changes in the underlying table, the same change is reflected in the view. A view can be built on top of a single table or multiple tables. Now, to answer your question: a table contains data, while a view is just a SELECT statement that has been saved in the database or metastore, depending on your database. That explains why the columns are getting resolved: the definition of the view exists, but the NullPointerException tells you there is no data behind it, because the view holds ONLY the definition, not the data! I haven't tried it in Oracle yet, but I think it's not possible to import data from a view either. However, a materialized view is a physical copy of the base table, so it could work for the export; I am not sure, you could test that. Happy hadooping
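To make the difference concrete, here is a minimal sketch against HiveServer2 (the beeline URL, table, and view names are hypothetical):

```
beeline -u jdbc:hive2://localhost:10000 \
  -e "CREATE TABLE base_t (id INT)" \
  -e "CREATE VIEW v AS SELECT id FROM base_t" \
  -e "SELECT * FROM v"
```

The view stores only its SELECT definition; the final query resolves v's columns from that definition but reads the rows from base_t at query time, so an empty or missing base table gives you columns with no data.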
12-11-2019 10:50 AM
@Bindal Spark expects the riskfactor1.csv file to be in the HDFS path /tmp/data/, but it seems you have riskfactor1.csv on your local filesystem under /tmp/data. I have run the below from a sandbox. Please follow these steps to resolve the "Path does not exist" error.

Log on to the CLI on your sandbox as the root user, then switch to the hdfs user:

```
[root@sandbox-hdp ~]# su - hdfs
```

Check the current HDFS directory layout:

```
[hdfs@sandbox-hdp ~]$ hdfs dfs -ls /
Found 13 items
drwxrwxrwt+  - yarn   hadoop  0 2019-10-01 18:34 /app-logs
drwxr-xr-x+  - hdfs   hdfs    0 2018-11-29 19:01 /apps
drwxr-xr-x+  - yarn   hadoop  0 2018-11-29 17:25 /ats
drwxr-xr-x+  - hdfs   hdfs    0 2018-11-29 17:26 /atsv2
drwxr-xr-x+  - hdfs   hdfs    0 2018-11-29 17:26 /hdp
drwx------+  - livy   hdfs    0 2018-11-29 17:55 /livy2-recovery
drwxr-xr-x+  - mapred hdfs    0 2018-11-29 17:26 /mapred
drwxrwxrwx+  - mapred hadoop  0 2018-11-29 17:26 /mr-history
drwxr-xr-x+  - hdfs   hdfs    0 2018-11-29 18:54 /ranger
drwxrwxrwx+  - spark  hadoop  0 2019-11-24 22:41 /spark2-history
drwxrwxrwx+  - hdfs   hdfs    0 2018-11-29 19:01 /tmp
drwxr-xr-x+  - hdfs   hdfs    0 2019-09-21 13:32 /user
```

Create the directory in HDFS. It usually goes under /user/xxxx depending on the user, but here we create /tmp/data and give it open 777 permissions so any user can execute the Spark job:

```
[hdfs@sandbox-hdp ~]$ hdfs dfs -mkdir -p /tmp/data/
[hdfs@sandbox-hdp ~]$ hdfs dfs -chmod 777 /tmp/data/
```

Now copy riskfactor1.csv from the local filesystem to HDFS; here I am assuming the file is in /tmp:

```
[hdfs@sandbox-hdp tmp]$ hdfs dfs -copyFromLocal /tmp/riskfactor1.csv /tmp/data
```

The above copies riskfactor1.csv from the local /tmp to the HDFS location /tmp/data. You can validate that by running:

```
[hdfs@sandbox-hdp ~]$ hdfs dfs -ls /tmp/data
Found 1 items
-rw-r--r--   1 hdfs hdfs   0 2019-12-11 18:40 /tmp/data/riskfactor1.csv
```

Now you can run your Spark job in Zeppelin; it should succeed. Please revert!
12-09-2019 05:41 PM
@saivenkatg55 Please don't forget to vote for helpful answers and accept the best one. If this answer addressed your initial question, please take a moment to log in and click "Accept" on it.
12-09-2019 10:14 AM
@eswarloges From HDP 3.x onwards, to work with Hive databases from Spark you should use the HiveWarehouseConnector library (/usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar), as shown in the example below:

```
spark-shell \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://<FQDN or IP>:10000/" \
  --conf spark.datasource.hive.warehouse.load.staging.dir="/staging_dir" \
  --conf spark.hadoop.hive.zookeeper.quorum="<zk_quorum_ips>:2181" \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.0.0.0-1634.jar
```

Then, inside the shell:

```
val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()
hive.showDatabases().show(100, false)
```

Could you try that and revert?
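If the session builds, a quick follow-up read is a good sanity check. This is only a sketch continuing the spark-shell session above; the database and table names are placeholders, not from the original post:

```
hive.setDatabase("default")                                    // switch to a database
hive.showTables().show()                                       // list its tables
hive.executeQuery("SELECT * FROM some_table LIMIT 10").show()  // read rows through the connector
```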
12-04-2019 04:18 AM
@RobertCare Nothing stupid 🙂 The credentials are tricky, and they are documented in "Learning the Ropes of the HDP Sandbox". As the screenshots below show, the Atlas user and password are holger_gov/holger_gov. [Screenshots: Atlas user & password, with explanation] Hope that helps
12-02-2019 05:15 PM
@SushantRao Just tried it now and got: "Access Restricted. You must be a CDP Data Center customer to access these downloads. If you believe you should have this entitlement then please reach out to support or your customer service representative."
12-02-2019 02:06 PM
@mike_bronson7 You can change the ownership of the HDFS directory to airflow:hadoop, but you didn't run the -chown command on /, did you??? It should target something like /users/airflow/xxx. Please let me know.
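A minimal sketch of the targeted command (the path is illustrative, following the /users/airflow/xxx pattern above; substitute the actual directory from your setup):

```
# change ownership of the specific Airflow directory only, never /
hdfs dfs -chown -R airflow:hadoop /users/airflow/xxx
# confirm the new owner
hdfs dfs -ls /users/airflow
```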
12-02-2019 01:59 PM
@rwinters You could be right that your issue is an incompatibility. Can you check exactly which version of Postgres you upgraded to? It's usually advisable to check the Hortonworks support matrix before launching any upgrade; I only hope this is a dev environment. See the screenshot: there you can see that PostgreSQL 10.7 is only compatible with Ambari 2.7.4 and HDP 3.1.4. A very useful tool. Happy hadooping
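To check quickly which version you are actually running (a sketch; it assumes you can reach the database with psql, and the -U user may differ in your setup):

```
# client version
psql --version
# server version, as reported by the database itself
psql -U postgres -c "SELECT version();"
```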
12-02-2019 02:25 AM
@mike_bronson7 The hadoop group encapsulates all the service users, including hdfs. If you run

```
# cat /etc/group
```

you should see something like:

```
hadoop:x:1007:yarn-ats,hive,storm,infra-solr,zookeeper,oozie,atlas,ams,ranger,tez,zeppelin,kms,accumulo,livy,druid,spark,ambari-qa,kafka,hdfs,sqoop,yarn,mapred,hbase,knox
```

So running the -chown should only target the directory flagged in the Diagnostics logs. NEVER run the -chown command on /, which is the root directory!! Can you share your log, please?
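If you prefer not to scan the whole file, a quick alternative sketch (both commands are standard on Linux):

```
# list the hadoop group's members; hdfs should appear in the output
getent group hadoop
# or list every group the hdfs user belongs to
id hdfs
```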