Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3173 | 12-25-2018 10:42 PM |
| | 14195 | 10-09-2018 03:52 AM |
| | 4764 | 02-23-2018 11:46 PM |
| | 2481 | 09-02-2017 01:49 AM |
| | 2914 | 06-21-2017 12:06 AM |
05-14-2016
02:37 AM
The error says that you have declared m columns in your Hive table but n cf:column mappings in your hbase.columns.mapping string, and that m != n. Can you check which of these numbers is wrong? As I'm sure you know, you can declare Hive columns in a free-text block over many lines and with spaces, but the hbase.columns.mapping string is very restrictive: it allows no "beautifying" spaces, only the key and cf:column parts separated by commas. The Hive HBase Integration page doesn't mention any limit on the length of the string, though it admits that the string is somewhat cumbersome and restrictive. Alternatively, you can map all columns of an HBase column family "cf" using the ":key,cf:" string. They will map into a Hive map<...> element composed of (column, value) pairs, one per key; see an example here. You can then keep working with the map, or explode it using Hive's explode(map) UDF.
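As a quick sanity check, here is a minimal sketch of counting the entries in an hbase.columns.mapping string; the mapping value below is a made-up example, not taken from your table. The count (including the :key entry) must equal the number of columns declared in the Hive DDL.

```shell
# Hypothetical mapping string -- example values, not from the original post.
MAPPING=':key,cf:a,cf:b,cf:c'

# Count the comma-separated entries; this must equal the number of
# columns declared in the Hive table (the :key entry counts as one).
NUM=$(echo "$MAPPING" | tr ',' '\n' | wc -l)
echo "$NUM"   # -> 4, so the Hive table must declare exactly 4 columns
```

If the printed count differs from the number of Hive columns, that tells you which side of the m != n mismatch to fix.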
05-14-2016
01:20 AM
1 Kudo
You can upgrade to any version, including Pivotal, by taking the so-called "data migration" approach: set up your new cluster and transfer data from the old one to the new one. You can copy data directly from one cluster to another using distcp for HDFS files and CopyTable for HBase tables, and you can copy Hive tables using table export/import. Obviously this isn't practical for a large amount of data. For "in-place migration", upgrading binaries while keeping data as-is, I agree with Artem and Timothy that it's best to engage Support; otherwise you can run into a lot of trouble, and even damage or lose your data.
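The command shapes for the three copy paths look roughly like this; all hostnames, the ZooKeeper quorum, the table name, and the export path are placeholders, and the commands are only printed here (not executed) since they need live clusters.

```shell
# Hypothetical cluster endpoints and paths -- placeholders, not real hosts.
SRC=hdfs://old-nn.example.com:8020/data
DST=hdfs://new-nn.example.com:8020/data

# HDFS files: copy directly between clusters with distcp.
echo "hadoop distcp $SRC $DST"

# HBase tables: copy a table to the new cluster with CopyTable,
# pointing --peer.adr at the new cluster's ZooKeeper quorum.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=new-zk.example.com:2181:/hbase mytable"

# Hive tables: export on the old cluster, then import on the new one
# (the exported directory is moved across with distcp in between).
echo "hive -e \"EXPORT TABLE mytable TO '/tmp/mytable_export';\""
echo "hive -e \"IMPORT TABLE mytable FROM '/tmp/mytable_export';\""
```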
05-14-2016
12:25 AM
The mappings you cited are the ones in Ambari. Based on the settings of the respective services, Ambari will replace the {{...}} parameters with real hostnames, ports, etc. You can inspect the real values used by Knox by checking the files under /etc/knox/conf, such as gateway-site.xml and the topologies/*.xml files. If some settings are not right you can try to fix them, for example by replacing the mappings in Ambari with specific values: you can replace {{rm_host}}:{{jt_rpc_port}} with my-rm-fqdn.hadoop.com:{{jt_rpc_port}}, and you can replace the port as well.
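To make the substitution concrete, here is a minimal sketch of pinning just the host part of the templated value while leaving the port templated; my-rm-fqdn.hadoop.com is a hypothetical FQDN standing in for your real ResourceManager host.

```shell
# The templated value as it appears in Ambari.
LINE='{{rm_host}}:{{jt_rpc_port}}'

# Replace only the host placeholder with a concrete (hypothetical) FQDN;
# {{jt_rpc_port}} stays templated and Ambari keeps filling it in.
echo "$LINE" | sed 's/{{rm_host}}/my-rm-fqdn.hadoop.com/'
# -> my-rm-fqdn.hadoop.com:{{jt_rpc_port}}
```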
05-13-2016
12:39 PM
Hi @mark doutre, I checked your files (2 days ago, but couldn't post sooner), and my conclusion is that Hive cannot handle Avro files without a schema. The AvroSerDe page shows which Avro versions are supported (1.5.3 to 1.7.5), and the Avro spec says: "Avro data is always serialized with its schema. Files that store Avro data should always also include the schema for that data in the same file." And it has been so since version 1. So it's very clear that "standard" Avro files must include the schema, and Hive supports only such files. With schema-less files you are on your own: you would have to read the "value" from HBase, apply your schema to read the data, and store such records in Hive. You can also include the schema, which will work, but you will waste some space in HBase by storing the same schema in each record. Hope this helps.
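A quick way to tell whether a file is a standard Avro object container (and therefore carries its schema in the header) is to look at the first bytes: per the Avro spec, container files start with the 4-byte magic 'O' 'b' 'j' 0x01. The file path below is hypothetical, and the file is faked for illustration only.

```shell
# Write just a fake 4-byte Avro magic header for illustration (a real file
# would be produced by an Avro writer and carry the schema as file metadata).
printf 'Obj\001' > /tmp/sample.avro

# A standard Avro object container file begins with the bytes 'O' 'b' 'j' 0x01;
# anything else is not a self-describing Avro file and Hive won't read it.
head -c 3 /tmp/sample.avro   # prints: Obj
```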
05-13-2016
04:16 AM
Well, I'm not sure; if curl from the CLI works, it should work. Can you try restarting ambari-server and all ambari-agents?
05-13-2016
01:08 AM
Can you try to run just this command and make sure you get HTTP status 200 and no errors: curl -iv 'http://xxxx:50070/webhdfs/v1/ats/done?op=GETFILESTATUS&user.name=hdfs' If /ats/done doesn't exist, replace it with /tmp in the command. Last time I saw this issue the reason was an HTTP proxy, so make sure you have no proxy to servers in the cluster.
05-11-2016
02:03 PM
To repeat, 1 is the default (see defaults here), but just to be sure you can go ahead and set it explicitly in hdfs-site, and restart HDFS and dependent services.
05-11-2016
12:25 PM
After you restart HDFS and Yarn, Ambari will show you which other services to restart, like MapRed, Hive and Oozie.
05-10-2016
03:26 PM
Okay, then set "dfs.namenode.safemode.threshold-pct=0.999f" in Ambari, that's all you need to do.
05-10-2016
03:03 PM
Okay, please upload your files somewhere (one of your existing questions, or a new one), and I'll try to read them with Hive.