Member since: 09-13-2017
Posts: 4
Kudos Received: 0
Solutions: 0
10-12-2017
01:03 AM
I have two tables, A(id, name) and B(id, age), that I want to join. I tried both SELECT * FROM A INNER JOIN B ON A.id=B.id and SELECT * FROM A INNER JOIN B USING(id). In both cases I get a table in which the key column "id" appears twice, once from each source table: (id, name, id, age). What I want is (id, name, age), so the key columns should be merged. EDIT: I know I could write "SELECT A.id, name, age..." instead of "SELECT * ...", but I have many columns, so I would rather avoid that workaround.
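One way to avoid typing a long column list by hand is to generate it. The following is only a minimal sketch, not a Hive or Impala feature: `build_join_select` is a hypothetical helper, the column lists are assumed to be known already (for example copied from the table definitions), and the demo runs against an in-memory SQLite database purely to show the resulting column order.

```python
import sqlite3


def build_join_select(left, left_cols, right, right_cols, key):
    """Build a SELECT that lists the join key once, then all other columns.

    Hypothetical helper: it assumes the key is the only column name
    shared by both tables; other shared names would still collide.
    """
    select_list = [f"{left}.{key}"]
    select_list += [f"{left}.{c}" for c in left_cols if c != key]
    select_list += [f"{right}.{c}" for c in right_cols if c != key]
    return (
        f"SELECT {', '.join(select_list)} "
        f"FROM {left} INNER JOIN {right} ON {left}.{key} = {right}.{key}"
    )


def demo():
    """Run the generated query on tables shaped like A and B from the post."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE B (id INTEGER, age INTEGER)")
    conn.execute("INSERT INTO A VALUES (1, 'alice')")  # made-up sample row
    conn.execute("INSERT INTO B VALUES (1, 30)")       # made-up sample row
    query = build_join_select("A", ["id", "name"], "B", ["id", "age"], "id")
    cursor = conn.execute(query)
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    conn.close()
    return query, columns, rows
```

The generated string is plain SQL with explicit column names, so it could be pasted into a query editor; whether it behaves identically on a given Hive or Impala version would need to be checked there.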
Labels: Apache Hive, Apache Impala, Cloudera Hue
09-19-2017
06:11 AM
Hello Zsolt, thanks for the reply. The problem was that I don't have permission to install Python packages like pydoop. I was not sure whether there is a native way, but I will ask the sysadmin to install some packages.
09-13-2017
04:26 AM
Hello, I have some standalone Python files that access data through the usual pattern:
with open("filename") as f:
    for lines in f:
        [...]
I want to make these Python scripts runnable without changing too much of the code and, if possible, without extra dependencies. Right now I start the files as Spark programs in the Workflow editor in HUE. Are there built-in packages I can use? I tried to import pydoop and hdfs, but they are not installed. My goal is to get these scripts running and able to read and write files on HDFS. Thanks for the help.
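A dependency-free option that keeps the `with ... as f` loop almost unchanged is to stream the file through the `hdfs dfs -cat` command using only the standard library. This is a rough sketch under assumptions: it presumes the `hdfs` command-line client is on the PATH of the node where the script runs, and `hdfs_cat_command` and `open_via_command` are hypothetical helper names, not part of any Hadoop package.

```python
import subprocess
from contextlib import contextmanager


def hdfs_cat_command(path):
    """Command line that prints an HDFS file to stdout (assumes `hdfs` is on PATH)."""
    return ["hdfs", "dfs", "-cat", path]


@contextmanager
def open_via_command(command):
    """Yield the stdout of `command` as a line-iterable text stream."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    try:
        yield proc.stdout
    finally:
        proc.stdout.close()
        proc.wait()
    # Reached only if the caller's block finished without an exception.
    if proc.returncode != 0:
        raise RuntimeError(f"command failed with exit code {proc.returncode}")


# Intended usage, mirroring the original loop (the path is a placeholder):
# with open_via_command(hdfs_cat_command("/path/on/hdfs")) as f:
#     for lines in f:
#         ...
```

Note that leaving the loop early closes the pipe, which can make the child process exit with a non-zero code and trigger the `RuntimeError`. Writing could presumably be handled in a similar way by piping into `hdfs dfs -put - <target>`, which reads from stdin, but that is not shown here.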
Labels: Apache Oozie, Apache Spark, Cloudera Hue, HDFS