Member since: 07-10-2017
Posts: 78
Kudos Received: 6
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4440 | 10-17-2017 12:17 PM
 | 7016 | 09-13-2017 12:36 PM
 | 5397 | 07-14-2017 09:57 AM
 | 3521 | 07-13-2017 12:52 PM
12-01-2020
03:47 AM
I'm trying to run a DAG with Airflow 1.10.12 and HDP 3.0.0. When I run the DAG it gets stuck at

```
Connecting to jdbc:hive2://[Server2_FQDN]:2181,[Server1_FQDN]:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
```

When I run

```
beeline -u "jdbc:hive2://[Server1_FQDN]:2181,[Server2_FQDN]:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
```

from the shell, it connects to Hive with no problem. I've also created a connection like this:

```
Conn Id *       hive_jdbc
Conn Type
Connection URL  jdbc:hive2://centosserver.son.ir:2181,centosclient.son.ir:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
Login           hive
Password        ******
Driver Path     /usr/hdp/3.0.0.0-1634/hive/jdbc/hive-jdbc-3.1.0.3.0.0.0-1634-standalone.jar
Driver Class    org.apache.hive.jdbc.HiveDriver
```

I'm not using Kerberos. I've also added ```hive.security.authorization.sqlstd.confwhitelist.append``` to ```Custom hive-site``` in Ambari:

```
radoop\.operation\.id|mapred\.job\.name|airflow\.ctx\.dag_id|airflow\.ctx\.task_id|airflow\.ctx\.execution_date|airflow\.ctx\.dag_run_id|airflow\.ctx\.dag_owner|airflow\.ctx\.dag_email|hive\.warehouse\.subdir\.inherit\.perms|hive\.exec\.max\.dynamic\.partitions|hive\.exec\.max\.dynamic\.partitions\.pernode|spark\.app\.name
```

Any suggestions? I'm desperate; I've tried everything I know but still nothing. @nsabharwal @agillan @msumbul1 @deepesh1
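For illustration, a minimal sketch of a DAG task that would use a connection like this (the DAG/task names and the query are placeholders, and it assumes the connection above is saved with Conn Type `jdbc` and that the JayDeBeApi dependency used by Airflow's JDBC hook is installed):

```python
# Placeholder sketch, not the actual DAG: assumes the "hive_jdbc" connection above
# is saved with Conn Type "jdbc" and that jaydebeapi (used by Airflow's JdbcHook)
# is installed on the worker.
from datetime import datetime

from airflow import DAG
from airflow.operators.jdbc_operator import JdbcOperator

with DAG(
    dag_id="hive_jdbc_test",            # placeholder name
    start_date=datetime(2020, 12, 1),
    schedule_interval=None,
) as dag:
    hive_query = JdbcOperator(
        task_id="run_hive_query",       # placeholder name
        jdbc_conn_id="hive_jdbc",       # the connection shown above
        sql="SELECT 1",                 # placeholder query; hangs at "Connecting to jdbc:hive2://..."
        autocommit=True,
    )
```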
10-22-2019
03:33 AM
Hey @axk, thanks for letting us know. I'm glad it was helpful 🙂
06-14-2018
01:00 PM
@rajat puchnanda Merging a group of flowfiles (or records) is possible with the MergeContent/MergeRecord processors. For example, if flowfile ff1 has the records 123 and ff2 has 345, the MergeContent/MergeRecord processors can merge these flowfiles into one, i.e. 123345. Merging means combining the group of records/flowfiles (a union all); if you then want to remove duplicates (e.g. 3 is a duplicate record) from the combined flowfile content, you can use the QueryRecord processor with a row_number window function to eliminate them. This scenario is possible in NiFi without using the lookup record processors.

But as you mentioned in one of the answers, Scenario 2:

InputFile 1

deptid | firstname | lastname
---|---|---
1 | Aman | Sharma
2 | Raman | Verma

InputFile 2

deptid | salary | email
---|---|---
1 | 20000 | abc@gmail.com
2 | 30000 | bgf@gmail.com

OutputFile (by merging file1 and file2):

deptid | firstname | lastname | salary | email
---|---|---|---|---
1 | Aman | Sharma | 20000 | abc@gmail.com
2 | Raman | Verma | 30000 | bgf@gmail.com

This is not possible with MergeContent/MergeRecord, but you can try the QueryRecord processor by implementing a group-by and collect-as-set (or similar SQL logic) to transpose the data into your desired format. That query would be intensive if you are running it over a larger number of records.
04-12-2018
01:35 PM
As of NiFi 1.5.0 (via NIFI-4684), you can now specify the prefix in ConvertJSONToSQL. The property defaults to "sql" to maintain existing behavior, but can be changed to "hiveql" for use with PutHiveQL.
12-03-2018
12:01 PM
Looking for this info as well. Sorry to bump the thread, but is there any news on this request?
09-21-2017
07:44 AM
@nallen Is pcap_replay installed as a service by default with HCP 1.2? If not, how can it be installed manually? Thanks
04-13-2018
06:35 PM
I am also getting this error. I have checked the config in Ambari, and the DB port and HTTP ports are both set. It seems almost as if, when Ambari runs Superset, the config file in /etc/superset/conf/ isn't being used.
08-16-2017
07:10 AM
@msumbul Please mark the question as answered if sufficiently answered
08-16-2017
08:11 AM
@msumbul Please mark the question as answered if sufficiently answered
08-01-2018
11:31 AM
To properly troubleshoot Elasticsearch, you first need to make sure that Elasticsearch is actually running correctly. Go to QuickLinks and open Elasticsearch Health; the status must be green.

Tail your Elasticsearch master node log file while restarting to see if there are any issues: `tail -f /var/log/elasticsearch/elasticsearch.log`

Additionally, you can edit /etc/elasticsearch/log4j2.properties and set logger.action.level to debug for more verbose logging.

Based on the log output, you will likely need to adjust config settings in Advanced elastic-site from Ambari. Here are my settings for a Master Node + 2 Data Nodes:

Setting | Value
---|---
bootstrap_memory_lock | true
cluster_name | elasticsearch
cluster_routing_allocation_disk_threshold_enabled | true
cluster_routing_allocation_disk_watermark_high | 0.99
cluster_routing_allocation_disk_watermark_low | .97
cluster_routing_allocation_node_concurrent_recoveries | 4
discovery_zen_fd_ping_interval | 15s
discovery_zen_fd_ping_retries | 5
discovery_zen_fd_ping_timeout | 60s
discovery_zen_ping_timeout | 3s
expected_data_nodes | 0
gateway_recover_after_data_nodes | 1
http_cors_enabled | "true"
http_port | 9200
index_merge_scheduler_max_thread_count | 5
index_number_of_replicas | 2
index_number_of_shards | 4
index_refresh_interval | 1s
index_translog_flush_threshold_size | 5g
indices_cluster_send_refresh_mapping | false
indices_fielddata_cache_size | 25%
indices_memory_index_buffer_size | 10%
indices_memory_index_store_throttle_type | none
masters_also_are_datanodes | "true"
network_host | [ 0.0.0.0 ]
network_publish_host | []
path_data | "/hadoop/elasticsearch/es_data"
recover_after_time | 15m
threadpool_bulk_queue_size | 3000
threadpool_index_queue_size | 1000
transport_tcp_port | 9300
zen_discovery_ping_unicast_hosts | [ "fqdn.hostname1.com", "fqdn.hostname2.com", "fqdn.hostname3.com" ]