Member since: 09-15-2015
Posts: 457
Kudos Received: 507
Solutions: 90
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 15662 | 11-01-2016 08:16 AM |
| | 11080 | 11-01-2016 07:45 AM |
| | 8559 | 10-25-2016 09:50 AM |
| | 1918 | 10-21-2016 03:50 AM |
| | 3822 | 10-14-2016 03:12 PM |
12-02-2015 09:37 PM
1 Kudo
+1 .......
12-02-2015 09:27 PM
4 Kudos
Blueprint:
{
"Blueprints" : {
"stack_name" : "HDP",
"stack_version" : "2.3"
},
"configurations" : [
{
"hive-site" : {
"properties" : {
"javax.jdo.option.ConnectionDriverName" : "com.mysql.jdbc.Driver",
"javax.jdo.option.ConnectionURL" : "jdbc:mysql://%HOSTGROUP::hg_master_node_3%/hive?createDatabaseIfNotExist=true",
"javax.jdo.option.ConnectionUserName": "hive"
}
}
},
{
"oozie-site" : {
"properties" : {
"oozie.service.JPAService.jdbc.driver" : "com.mysql.jdbc.Driver",
"oozie.service.JPAService.jdbc.url" : "jdbc:mysql://%HOSTGROUP::hg_master_node_3%/oozie",
"oozie.service.JPAService.jdbc.username" : "oozie",
"oozie.db.schema.name" : "oozie",
}
}
},
{
"oozie-env" : {
"properties" : {
"oozie_hostname" : "%HOSTGROUP::hg_master_node_3%",
"oozie_database" : "Existing MySQL Database"
}
}
},
{
"hive-env" : {
"properties" : {
"hive_database" : "Existing MySQL Database",
"hive_database_name" : "hive",
"hive_database_type" : "mysql",
"hive_hostname" : "%HOSTGROUP::hg_master_node_3%"
}
}
}
],
"host_groups" : [
...
]
}

Hostgroup-Mapping/Cluster:
{
"blueprint" : "bigdata_blueprint",
"default_password" : "my-super-secret-password",
"configurations" : [
{
"hive-site" : {
"properties" : {
"javax.jdo.option.ConnectionPassword" : "my-super-secret-password_2"
}
}
},
{
"oozie-site" : {
"properties" : {
"oozie.service.JPAService.jdbc.password" : "my-super-secret-password_3" }
}
}
],
"host_groups" :[
...
]
}
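For completeness, a minimal sketch of how these two JSON documents are typically submitted to the Ambari REST API. The host name, credentials, and file names (blueprint.json, cluster.json) are placeholders, not taken from the post above:

# Register the blueprint under the name referenced by the cluster template
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @blueprint.json http://ambari.example.com:8080/api/v1/blueprints/bigdata_blueprint

# Create the cluster from the hostgroup-mapping/cluster template
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d @cluster.json http://ambari.example.com:8080/api/v1/clusters/bigdata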
12-02-2015 03:51 PM
Update regarding the HDFS replication configuration for Solr files: there is an open Jira for this, SOLR-6305 ("Ability to set the replication factor for index files created by HDFSDirectoryFactory").
12-02-2015 03:42 PM
Thanks for adding this; it is a good source. We covered a lot of replication and SolrCloud topics in there 🙂
12-02-2015 03:41 PM
1 Kudo
@Jeremy Dyer Solr writes all of its index and data files to HDFS with a replication factor of 1, so they are not protected by HDFS replication; Solr's own replication should be used instead. I am not sure whether there is a Solr configuration parameter to set the HDFS replication factor for Solr files. Does that help?
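To verify this on a running cluster, you can list the index files Solr wrote to HDFS; the second column of hdfs dfs -ls output is the replication factor. A small sketch, where the /solr home directory and the collection/core layout are placeholders:

# Files written by Solr's HdfsDirectoryFactory show up with replication 1
hdfs dfs -ls -R /solr/myCollection/core_node1/data/index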
12-02-2015 12:35 PM
Agree with @Neeraj Sabharwal, this is an API-specific command; I don't think this option is available via "solr create_collection ...".
12-02-2015 12:09 PM
1 Kudo
You have to use the option "createNodeSet". It defines the nodes to spread the new collection across; if not provided, the CREATE operation will create shard-replicas across all live Solr nodes. The format is a comma-separated list of node_names, such as localhost:8983_solr,localhost:8984_solr,localhost:8985_solr. Alternatively, use the special value EMPTY to create no shard-replicas initially, and later use the ADDREPLICA operation to add shard-replicas when and where required. For example:

http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&createNodeSet=localhost:8983_solr,localhost:8984_solr,localhost:8985_solr

See the full documentation here.
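A quick sketch of the EMPTY plus ADDREPLICA flow, assuming the same example hosts and collection name as above:

# Create the collection without placing any replicas yet
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&createNodeSet=EMPTY"

# Later, place a replica of shard1 on a specific node
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=newCollection&shard=shard1&node=localhost:8984_solr"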
12-01-2015 06:48 PM
Exactly, it's basically the content of these two fields from the Ambari UI. As you pointed out, the actual root cause is usually available in the individual service log -> /var/log/...
12-01-2015 03:07 PM
Can you share the commands that you used to set up the cluster?
12-01-2015 09:15 AM
1 Kudo
@pankaj singh Another option would be to get the log files directly from the individual nodes. For example, use the following request to get all the task information for a request:

http://horton01.example.com:8080/api/v1/clusters/bigdata/requests/111?fields=*,tasks/Tasks/request_id,tasks/Tasks/command,tasks/Tasks/command_detail,tasks/Tasks/host_name,tasks/Tasks/id,tasks/Tasks/role,tasks/Tasks/status&minimal_response=true

The response looks something like this:

"tasks" : [
{
"Tasks" : {
"command" : "CUSTOM_COMMAND",
"command_detail" : "RESTART MAPREDUCE2/MAPREDUCE2_CLIENT",
"host_name" : "horton01.example.com",
"id" : 1157,
"request_id" : 111,
"role" : "MAPREDUCE2_CLIENT",
"status" : "COMPLETED"
}
},
{
"Tasks" : {
"command" : "CUSTOM_COMMAND",
"command_detail" : "RESTART MAPREDUCE2/HISTORYSERVER",
"host_name" : "horton02.example.com",
"id" : 1158,
"request_id" : 111,
"role" : "HISTORYSERVER",
"status" : "COMPLETED"
}
},

The important parts are the id and the host_name. Now you can log into the host and retrieve the error and output logfiles from:
error: /var/lib/ambari-agent/data/errors-<id>.txt
output: /var/lib/ambari-agent/data/output-<id>.txt

For example, on host horton02.example.com:

error log: /var/lib/ambari-agent/data/errors-1158.txt
output log: /var/lib/ambari-agent/data/output-1158.txt

Note: You might also be able to get the request and task ids from the Ambari database.
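A small sketch tying the two steps together, using the example hosts and request id from above; the admin:admin credentials and the use of jq are assumptions:

# List host_name and task id pairs for request 111
curl -s -u admin:admin \
  "http://horton01.example.com:8080/api/v1/clusters/bigdata/requests/111?fields=tasks/Tasks/id,tasks/Tasks/host_name&minimal_response=true" \
  | jq -r '.tasks[].Tasks | "\(.host_name) \(.id)"'

# Then, on the reported host, read the matching log pair
ssh horton02.example.com cat /var/lib/ambari-agent/data/errors-1158.txt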