Member since: 04-04-2016
Posts: 147
Kudos Received: 40
Solutions: 16
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 271 | 07-22-2016 12:37 AM
 | 868 | 07-21-2016 11:48 PM
 | 568 | 07-21-2016 11:28 PM
 | 399 | 07-21-2016 09:53 PM
 | 535 | 07-08-2016 07:56 PM
03-10-2017
07:22 PM
1 Kudo
Adding TTL on Solr: cd to this directory.
Step 1:
Step 2:
Step 3: vi managed-schema and add these 3 lines:

<field name="_timestamp_" type="date" indexed="true" stored="true" multiValued="false" />
<field name="_ttl_" type="string" indexed="true" multiValued="false" stored="true" />
<field name="_expire_at_" type="date" multiValued="false" indexed="true" stored="true" />

Step 4: vi solrconfig.xml and replace these 3 lines:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>

with:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">_timestamp_</str>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+30SECONDS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="ttlFieldName">_ttl_</str>
    <str name="ttlParamName">_ttl_</str>
    <int name="autoDeletePeriodSeconds">30</int>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory" />
Things that might be useful:
Make sure to start Solr like this, so that the Solr-related configs go under /solr in ZooKeeper:
1. /opt/lucidworks-hdpsearch/solr/bin/solr start -c -z lake1.field.hortonworks.com:2181,lake2.field.hortonworks.com:2181,lake3.field.hortonworks.com:2181/solr
2. Create the collection like this:
/opt/lucidworks-hdpsearch/solr/bin/solr create -c tweets -d data_driven_schema_configs -s 1 -rf 1
3. To delete the collection:
http://testdemo.field.hortonworks.com:8983/solr/admin/collections?action=DELETE&name=tweets
4. Also remove it from zkCli.sh with: rmr /solr/config/tweets
Thanks, Sujitha Sanku. Please ping me or email me at ssanku@hortonworks.com in case of any issues.
- Find more articles tagged with:
- How-ToTutorial
- Sandbox & Learning
- solr
- solrcloud
11-21-2016
09:37 PM
Special thanks to Michael Young for his help and mentoring.

Step 1: cd /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/
screen-shot-2016-11-21-at-100524-am.png

Step 2: vi managed-schema and add these 3 lines:

<field name="_timestamp_" type="date" indexed="true" stored="true" multiValued="false" />
<field name="_ttl_" type="string" indexed="true" multiValued="false" stored="true" />
<field name="_expire_at_" type="date" multiValued="false" indexed="true" stored="true" />

screen-shot-2016-11-21-at-100929-am.png

Step 3: vi solrconfig.xml in the same directory. Replace these 3 lines:
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>

with:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">_timestamp_</str>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+30SECONDS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="ttlFieldName">_ttl_</str>
    <str name="ttlParamName">_ttl_</str>
    <int name="autoDeletePeriodSeconds">30</int>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory" />

screen-shot-2016-11-21-at-101045-am.png

Hope that helps. Thanks, Sujitha
- Find more articles tagged with:
- Data Processing
- help
- How-ToTutorial
- solr
- solrcloud
11-03-2016
01:23 AM
@Artem Ervits, this solution still gives me the same issue. Also, I have made these changes on the edge node; that is correct, right? @brahmasree b, did you find a solution to this question? If so, can you please post it?
10-25-2016
12:35 AM
1 Kudo
Hi @Bryan Bende, Thanks for the reply. Yes, I realized the error and I followed these steps: https://community.hortonworks.com/articles/26551/accessing-kerberos-enabled-kafka-topics-using-getk.html Also, I named my principal "nifi/iotdemo.field.hortonworks.com@LAKE". Do I also need to mention these lines in my zookeeper.properties?
"3. Added 3 additional properties to the bottom of the zookeeper.properties file you have configured per the linked procedure above:"
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
requireClientAuthScheme=sasl
Right now my error is: "Caused by: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner authentication information from the user" Please find attached my PutKafka processor configurations. Any help is highly appreciated. screen-shot-2016-10-24-at-53412-pm.png screen-shot-2016-10-24-at-53535-pm.png Thanks a lot, Sujitha
10-24-2016
07:44 PM
Hi, I have an error while trying to stream data using a NiFi flow in a Kerberized environment with LDAP integrated. The error is "failed while waiting for acks from Kafka"; I attached screenshots of the error and the processor properties. By the way, there is a configuration property called "Kerberos Service Name"; could that be the cause of the error? Any help is highly appreciated. Thanks, Sujitha screen-shot-2016-10-24-at-124017-pm.png screen-shot-2016-10-24-at-124006-pm.png
08-29-2016
08:18 PM
Hi @mqureshi, Many thanks for the response. I will let the customer know about this. Thanks, Sujitha
08-26-2016
10:10 PM
Hi, There is a customer looking for an answer to this question, and I am not completely sure what they are looking for: What is Hortonworks' position on native drivers, as compared to the emulated drivers currently used with Spark, Hive, etc.? MapR has a native driver, which is really helpful for I/O use cases, making them faster with less resource expenditure. What is Hortonworks' position or strategy going forward? Not sure if this question makes sense. Any answer is highly appreciated. Thanks, Sujitha
08-18-2016
05:43 AM
1 Kudo
Solr indexing the MySQL database table on HDP 2.5 Tech Preview:

Solr version used: Solr 4.9.0

Step 1: Downloaded solr-4.9.0.zip from https://archive.apache.org/dist/lucene/solr/4.9.0/
Step 2: Extract the file:
Step 3: Modify solrconfig.xml and schema.xml, and add db-data-config.xml at
Step 4: Add the jar at this location

a. vi solrconfig.xml: add these lines between the <config> tags:

<lib dir="../../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-dataimporthandler-\d.*\.jar" />
<lib dir="../../../lib/" regex="mysql-connector-java-5.0.8-bin.jar" />

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

b. vi schema.xml and add the line below:

<dynamicField name="*_name" type="text_general" multiValued="false" indexed="true" stored="true" />
c. Create a file called db-data-config.xml at the same path (later in this post I create an "employees" database in the MySQL server) and add:

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/employees"
              user="root"
              password="hadoop" />
  <document>
    <entity name="id" query="select emp_no as 'id', first_name, last_name from employees limit 1000;" />
  </document>
</dataConfig>

After this is complete, run the command below (d) to start Solr and check that it is up and running at the URL below (8983 is Solr's default port):

d. java -jar start.jar
   http://localhost:8983/solr/#/
e. Select collection1 in the core selector.
f. Click on Data Import, expand Configuration, and check that it is pointing to the db-data-config.xml file we created.
g. After completing Step 5 below, click Execute on the page (or trigger the import over HTTP, as shown below).
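Alternatively, the import can be triggered and monitored over HTTP instead of clicking Execute in the UI. A small sketch, assuming the collection1 core and the /dataimport handler registered above:

```bash
# kick off a full import through the DataImportHandler
curl 'http://localhost:8983/solr/collection1/dataimport?command=full-import'

# check progress and the number of documents processed
curl 'http://localhost:8983/solr/collection1/dataimport?command=status'
```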
Step 5: Set up the database. Import an already available database into MySQL (ref: https://dev.mysql.com/doc/employee/en/employees-installation.html):

shell> tar -xjf employees_db-full-1.0.6.tar.bz2
shell> cd employees_db/
shell> mysql -t < employees.sql

With this, the installation of the employees db in MySQL is complete.

Step 6: With this, our indexing using Solr is complete.

To do: I will try indexing the MySQL tables using the latest version of Solr.

Reference: http://blog.comperiosearch.com/blog/2014/08/28/indexing-database-using-solr/

Hope this helps…. Thanks, Sujitha
- Find more articles tagged with:
- How-ToTutorial
- indexing
- solr
- solutions
08-15-2016
11:03 PM
Hi, I am trying to log in to the MySQL prompt on HDP 2.5, but it doesn't allow me to log in, and I am not sure what password is set on it. Please find the screenshot attached. Any help is highly appreciated. Thanks, Sujitha
08-03-2016
06:47 PM
Hi, I am looking for a guide to scaling the compute nodes: any detailed process or steps. Any response is highly appreciated. Thanks, Sujitha
08-03-2016
04:41 PM
1 Kudo
Hi, I am looking for the list of supported HTTP servers, specifically Oracle OHS, IBM IHS, Apache HTTP Server, and NGINX. If they are supported, do we have a demo or use case showing this? I am working on an RFP and would appreciate any help. Many thanks in advance. Thanks, Sujitha
07-29-2016
08:29 PM
Hi @milind pandit, Thanks for the info. Yes, apart from NiFi, could I focus on something like http://hortonworks.com/partner/sap/ and http://hortonworks.com/partner/informatica/? Would it make sense to add these as examples? Thanks again for the reply. Thanks, Sujitha
07-29-2016
05:59 PM
Hi there, I am looking for a better way of answering this question, ideally with references and documentation: Is the platform architected for "ease of integration" with other applications or technologies? This is from one of my RFP questions. Any help is highly appreciated. Thanks, Sujitha
07-28-2016
10:54 PM
Hi, I am looking for references and demos that show text and data mining capabilities on our platform. I am trying to answer one of the RFP questions. Any help is highly appreciated. Thanks, Sujitha
07-22-2016
07:38 PM
Hi @srinivasa rao, glad that you are satisfied with the answer provided by Benjamin Leonhardi. Please let me know in case of any issues.
07-22-2016
12:37 AM
Hi @Juan Manuel Nieto, The /tmp directory mainly provides temporary storage during the MapReduce phases: MapReduce keeps its intermediate data under /tmp, and those files are automatically cleared out when the MapReduce job completes. Temporary files are also created by Pig, since it runs on MapReduce, and their deletion happens at the end of the script. However, Pig does not delete its temporary files if the script execution fails or is killed; then we have to handle the cleanup ourselves. This is best handled by adding the cleanup lines or changes to the script itself. For further details I found an article here: Hope that helps. Thanks, Sujitha
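A rough illustration of that manual cleanup, assuming the default pig.temp.dir of /tmp and Pig's usual temp* directory naming (verify the exact names on your cluster before deleting anything):

```bash
# list candidate leftover Pig temp directories under /tmp in HDFS
hdfs dfs -ls /tmp | grep temp

# remove a leftover directory from a failed or killed run (hypothetical name, double-check first)
hdfs dfs -rm -r -skipTrash /tmp/temp-1234567890
```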
07-22-2016
12:18 AM
Hi @srinivasa rao, Answer 1: The ApplicationMaster negotiates with the ResourceManager for resources, not for containers. A container can be thought of as a box with the resources for running an application. Resources are negotiated with the ResourceManager, through the ResourceManager protocol, by the ApplicationMaster, based on the user code. Since it is essentially user code, the ApplicationMaster is not trusted, i.e. an ApplicationMaster is not a privileged service. The YARN system (ResourceManager and NodeManager) has to protect itself from faulty or malicious ApplicationMasters, and the resources granted to them, at all costs. Answer 2: Each job is performed within a container; it could be multiple jobs or one job done in a container, based on the resources granted by the RM through the AM. Answer 3: The internals of how resources are allocated or scheduled are always taken care of by the ResourceManager. Whether it is 20% or the remaining 80%, it is always the ResourceManager's job to allocate the resources to the ApplicationMaster, working along with the NodeManager on that particular node. It is always the responsibility of the NodeManager and the ResourceManager to check the status of the allocated resources. Hope that helps. For more information, here is an article which explains it in simple terms: http://hortonworks.com/blog/apache-hadoop-yarn-concepts-and-applications/ Thanks, Sujitha
07-21-2016
11:48 PM
Hi @Johnny Fugers and @Suyog Nagaokar, I tried to provide the answer here. Please let me know if you still have issues. https://community.hortonworks.com/questions/46444/convert-millseconds-into-unix-timestamp.html#answer-46590 Thanks, Sujitha
07-21-2016
11:28 PM
1 Kudo
Hi @Johnny Fugers, Input file data (dataset.csv):

563355,1388481000000
563355,1388481000000
563355,1388481000000
563356,1388481000000

This gives the answer in CET:

a = load '/tmp/dataset.csv' using PigStorage(',') as (id:chararray, at:chararray);
b = foreach a generate id, ToString(ToDate((long)at), 'yyyy-MM-dd HH:mm:ss');
c = group b by id;
dump c;

This is how it works in GMT:

a = load '/tmp/dataset.csv' using PigStorage(',') as (id:chararray, at:chararray);
b = foreach a generate id, ToDate(ToString(ToDate((long)at), 'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss', 'GMT');
c = group b by id;
dump c;

Hope that helps, Thanks, Sujitha
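For a quick sanity check of the epoch value used above (1388481000000 ms, i.e. 1388481000 seconds), GNU date prints the expected GMT result:

```bash
# convert the epoch seconds to a human-readable UTC timestamp
date -u -d @1388481000
# Tue Dec 31 09:10:00 UTC 2013
```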
07-21-2016
09:53 PM
Hi @Rajinder Kaur, Step 8 should not take more than 3 seconds. Can you make sure you followed all the steps as instructed? Also, I wanted to check whether you created the sandbox in Azure; if so, there are certain configurations that need to be changed, which is also covered in the lab instructions. Please rerun the steps and let me know if that works. I attached the output screenshots just for reference. Thanks, Sujitha
07-21-2016
09:21 PM
Hi @Dmitry Kondratiev, My Maven build was successful. Here are my configurations: HDP 2.4, Apache Maven 3.2.5, Java 1.7. Can you please share your configurations and the detailed error, with the logs from mvn clean package -e? Also, in the logs you mentioned I see an exception: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target. I read a few blogs about this Java error and it seems one of these solutions might help: http://stackoverflow.com/questions/21076179/pkix-path-building-failed-and-unable-to-find-valid-certification-path-to-requ Can you please test and let me know. Thanks, Sujitha
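If it turns out to be the common missing-certificate variant of that error, the usual workaround is to import the repository's certificate into the JVM truststore. A hedged sketch only; the host, alias and file names below are placeholders:

```bash
# fetch the server certificate (replace the host with the repository the build is hitting)
openssl s_client -connect repo.maven.apache.org:443 -showcerts </dev/null \
  | openssl x509 -outform PEM > repo.pem

# import it into the JDK's default truststore (default keystore password is "changeit")
keytool -import -trustcacerts -alias maven-repo \
  -file repo.pem -keystore "$JAVA_HOME/jre/lib/security/cacerts"
```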
07-19-2016
08:40 PM
Hi, I am working on an RFP and looking for an answer to: "Ability to recalculate and alert when there are changes to historical data within a time period within your solution." What I don't understand is that we cannot modify data in HDFS, since it is immutable. So does a change to historical data even apply? Any help is highly appreciated. Thanks, Sujitha
07-18-2016
07:44 PM
Hi, I am working on an RFP and looking for an answer to: "Specify the administration tools recommended and their functionality in HDP", in short. I am looking for a way to keep this simple. Any help is most appreciated. Thanks, Sujitha
07-14-2016
06:53 PM
Hi @Marcus Matthews, If you think one of my answers helped, can you please vote for it as the best answer? Thanks, Sujitha
07-14-2016
06:52 PM
Hi @ghost k, If this resolved your problem, can you please vote for the best answer? Thanks, Sujitha
07-12-2016
05:54 PM
Hi @Ian Li, If this issue is resolved, can you pick the best answer so we can consider it resolved? Thanks, Sujitha
07-08-2016
07:56 PM
Hi @ghost k,
Step 1: Edit the pg_hba.conf file in /var/lib/pgsql/data and add the following line as the first line of pg_hba.conf. It allows access to all databases for all users with an encrypted password:

host all all 0.0.0.0/0 md5

On the postgres side:

su postgres
psql
\c ambari
-- list all tables
\dt ambari.*
select * from ambari.hosts;

Step 2: Download the JDBC driver jar:

curl -L 'http://jdbc.postgresql.org/download/postgresql-9.2-1002.jdbc4.jar' -o postgresql-9.2-1002.jdbc4.jar

Step 3: List the tables through Sqoop:

sqoop list-tables --connect jdbc:postgresql://127.0.0.1/ambari --username ambari --password bigdata

Step 4: Sqoop the Ambari hosts table into HDFS (password: bigdata):

sqoop import --connect jdbc:postgresql://127.0.0.1/ambari --username ambari -P --table hosts --target-dir /user/guest/ambari_hosts_table

Hope this helps... Thanks, Sujitha
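A quick way to sanity-check that the import landed in HDFS (same target directory as above; the part file name is the usual Sqoop output naming):

```bash
# list the files Sqoop wrote, then peek at the imported rows
hdfs dfs -ls /user/guest/ambari_hosts_table
hdfs dfs -cat /user/guest/ambari_hosts_table/part-m-00000 | head
```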
07-07-2016
04:13 AM
2 Kudos
Story: From the documentation I was able to add a service to an existing stack definition in Ambari. Issue: I was not able to stop the service, or to delete the service if needed. https://cwiki.apache.org/confluence/display/AMBARI/Defining+a+Custom+Stack+and+Services How did I solve the problem?
1. Create and add the stack:
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services
2. Create a directory that contains the service definition for SAMPLESRV:
mkdir /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV
3. Create a metainfo.xml as shown in the link above.
4. With this we have a service named SAMPLESRV, and it contains SAMPLESRV_MASTER, SAMPLESRV_SLAVE and SAMPLESRV_CLIENT.
5. Next we need to create the command scripts:
mkdir -p /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV/package/scripts
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV/package/scripts
6. Browse to the scripts directory and create the .py command scripts master.py, slave.py and sample_client.py under /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV/package/scripts. master.py and slave.py are where the issue was: the documentation does not mention the dummy PID file that needs to be created. Since we have not installed a real service, there is no PID file created by it. Therefore, we are going to artificially create the PID, remove the PID, and check the process status of the dummy PID.
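Conceptually, the start, stop and status logic in master.py and slave.py only needs to do the equivalent of the following shell steps (the PID path here is a made-up example; the real scripts wrap these actions in Ambari's resource_management calls):

```bash
# hypothetical PID location for the dummy service
PIDFILE=/var/run/samplesrv/samplesrv-master.pid

# start: pretend a daemon is running by writing a PID file
mkdir -p /var/run/samplesrv
echo $$ > "$PIDFILE"

# status: Ambari polls this; a missing PID file means the component is stopped
test -f "$PIDFILE" && echo RUNNING || echo STOPPED

# stop: remove the PID file so the next status check reports the component as stopped
rm -f "$PIDFILE"
```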
7. Then restart Ambari (ambari-server restart) and add the service to the stack as shown in the document; I just don't want to duplicate the steps of that process here. Hope this helps....
- Find more articles tagged with:
- ambari-service
- Cloud & Operations
- How-ToTutorial
- service
07-06-2016
09:05 PM
1 Kudo
Hi @Ian Li, I was able to get this to stop and start by adding these lines in master.py and slave.py. I added a dummy PID: since we have not installed a real service, there is no PID file created by it. Therefore we artificially create the PID, remove the PID, and check the process status of the dummy PID. screen-shot-2016-07-06-at-20304-pm.png screen-shot-2016-07-06-at-20322-pm.png screen-shot-2016-07-06-at-20340-pm.png Hope this helps. Let me know in case of any other issues. Thanks, Sujitha