Member since 04-04-2016
147 Posts
40 Kudos Received
16 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 490 | 07-22-2016 12:37 AM
 | 1431 | 07-21-2016 11:48 PM
 | 902 | 07-21-2016 11:28 PM
 | 811 | 07-21-2016 09:53 PM
 | 1052 | 07-08-2016 07:56 PM
03-10-2017
07:22 PM
1 Kudo
Adding TTL (time-to-live) on Solr:

Step 1: cd to the configset conf directory (on HDP Search: /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/, the same directory used in the article below).

Step 2: vi managed-schema and add these 3 lines:

<field name="_timestamp_" type="date" indexed="true" stored="true" multiValued="false" />
<field name="_ttl_" type="string" indexed="true" multiValued="false" stored="true" />
<field name="_expire_at_" type="date" multiValued="false" indexed="true" stored="true" />

Step 3: vi solrconfig.xml. Replace the 3 lines below:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>

with:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">_timestamp_</str>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+30SECONDS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="ttlFieldName">_ttl_</str>
    <str name="ttlParamName">_ttl_</str>
    <int name="autoDeletePeriodSeconds">30</int>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory" />
Things that might be useful:

1. Make sure to start Solr like this so that the Solr configs go under /solr in ZooKeeper:
/opt/lucidworks-hdpsearch/solr/bin/solr start -c -z lake1.field.hortonworks.com:2181,lake2.field.hortonworks.com:2181,lake3.field.hortonworks.com:2181/solr

2. Create the collection like this:
/opt/lucidworks-hdpsearch/solr/bin/solr create -c tweets -d data_driven_schema_configs -s 1 -rf 1

3. To delete the collection:
http://testdemo.field.hortonworks.com:8983/solr/admin/collections?action=DELETE&name=tweets

4. Also remove it via zkCli.sh: rmr /solr/config/tweets

Thanks, Sujitha Sanku. Please ping me or email me at ssanku@hortonworks.com in case of any issues.
11-21-2016
09:37 PM
Special thanks to Michael Young for mentoring me through this.

Step 1: cd /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/
(screenshot: screen-shot-2016-11-21-at-100524-am.png)

Step 2: vi managed-schema and add these 3 lines:

<field name="_timestamp_" type="date" indexed="true" stored="true" multiValued="false" />
<field name="_ttl_" type="string" indexed="true" multiValued="false" stored="true" />
<field name="_expire_at_" type="date" multiValued="false" indexed="true" stored="true" />

(screenshot: screen-shot-2016-11-21-at-100929-am.png)

Step 3: vi solrconfig.xml in the same directory. Replace the 3 lines below:
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory"/>

with:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">_timestamp_</str>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+30SECONDS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="ttlFieldName">_ttl_</str>
    <str name="ttlParamName">_ttl_</str>
    <int name="autoDeletePeriodSeconds">30</int>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
  <processor class="solr.UUIDUpdateProcessorFactory" />

(screenshot: screen-shot-2016-11-21-at-101045-am.png)

Hope that helps. Thanks, Sujitha
11-03-2016
01:23 AM
@Artem Ervits, this solution still gives me the same issue. Also, I have made these changes on the edge node; that is correct, right? @brahmasree b, did you find a solution to this question? If so, can you please post it?
10-25-2016
12:35 AM
1 Kudo
Hi @Bryan Bende, thanks for the reply. Yes, I realized the error and followed these steps: https://community.hortonworks.com/articles/26551/accessing-kerberos-enabled-kafka-topics-using-getk.html I named my principal "nifi/iotdemo.field.hortonworks.com@LAKE". Also, do I need to include these lines in my zookeeper.properties (step 3 of the linked procedure adds 3 additional properties to the bottom of the zookeeper.properties file)?

authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
requireClientAuthScheme=sasl

Right now my error is: "Caused by: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner authentication information from the user"

Please find attached my PutKafka processor configurations. Any help is highly appreciated. screen-shot-2016-10-24-at-53412-pm.png screen-shot-2016-10-24-at-53535-pm.png

Thanks a lot, Sujitha
10-24-2016
07:44 PM
Hi, I am getting an error while trying to stream data using a NiFi flow in a Kerberized environment with LDAP integrated. The error is "failed while waiting for acks from Kafka"; I attached screenshots of the error and the processor properties. By the way, there is a property called "Kerberos Service Name"; is that the cause of the error? Any help is highly appreciated. Thanks, Sujitha screen-shot-2016-10-24-at-124017-pm.png screen-shot-2016-10-24-at-124006-pm.png
Labels: Apache Kafka, Apache NiFi
08-18-2016
05:43 AM
1 Kudo
Solr indexing a MySQL database table on HDP 2.5 Tech Preview. Solr version used: 4.9.0.

Step 1: Download solr-4.9.0.zip from https://archive.apache.org/dist/lucene/solr/4.9.0/
Step 2: Extract the file.
Step 3: Modify solrconfig.xml and schema.xml, and add db-data-config.xml.
Step 4: Add the MySQL connector jar (mysql-connector-java-5.0.8-bin.jar) at the location referenced by the third <lib> directive below.

a. vi solrconfig.xml: add these lines between the <config> tags:

<lib dir="../../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-dataimporthandler-\d.*\.jar" />
<lib dir="../../../lib/" regex="mysql-connector-java-5.0.8-bin.jar" />

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>
b. vi schema.xml and add the line below:

<dynamicField name="*_name" type="text_general" multiValued="false" indexed="true" stored="true" />

c. Create a file called db-data-config.xml at the same path (the employees database it points to is created in MySQL in Step 5 below) and add:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/employees" user="root" password="hadoop" />
  <document>
    <entity name="id" query="select emp_no as 'id', first_name, last_name from employees limit 1000;" />
  </document>
</dataConfig>

After this is complete, run the command below (d) to start Solr and check that it is up and running at the URL below (8983 is Solr's default port):

d. java -jar start.jar
http://localhost:8983/solr/#/
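Once the employees database from Step 5 below is loaded, the /dataimport handler configured above can also be driven from the command line instead of the admin UI. A sketch (collection1 is the default core used in the following steps):

# kick off a full import, clearing the index first
curl 'http://localhost:8983/solr/collection1/dataimport?command=full-import&clean=true&commit=true'
# poll the import status (rows fetched / documents processed)
curl 'http://localhost:8983/solr/collection1/dataimport?command=status'
# spot-check that the employee documents are searchable
curl 'http://localhost:8983/solr/collection1/select?q=*:*&rows=5&wt=json'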
e. In the admin UI, select collection1 in the core selector.
f. Click on Data Import, expand Configuration, and check that it points to the db-data-config.xml file we created.
g. After completing Step 5 below, click Execute on that page (or trigger the import from the command line as sketched above).

Step 5: Setting up the database. Import an already available sample database into MySQL.
Ref: https://dev.mysql.com/doc/employee/en/employees-installation.html

shell> tar -xjf employees_db-full-1.0.6.tar.bz2
shell> cd employees_db/
shell> mysql -t < employees.sql

With this, the installation of the employees db in MySQL is complete.

Step 6: With this, our indexing using Solr is complete.

To do: I will try indexing MySQL tables using the latest version of Solr.

Reference: http://blog.comperiosearch.com/blog/2014/08/28/indexing-database-using-solr/

Hope this helps. Thanks, Sujitha
08-15-2016
11:03 PM
Hi, I am trying to log in to the MySQL prompt on HDP 2.5, but it doesn't allow me to log in. I am not sure what password is set on it. Please find the attachment. Any help is highly appreciated. Thanks, Sujitha
07-29-2016
08:29 PM
Hi @milind pandit, thanks for the info. Apart from NiFi, can I focus on something like http://hortonworks.com/partner/sap/ and http://hortonworks.com/partner/informatica/? Would it make sense to add these as examples? Thanks again for the reply. Thanks, Sujitha
07-29-2016
05:59 PM
Hi there, I am looking for a better way of answering this question, with any references and documentation: Is the platform architected for "ease of integration" with other applications or technologies? This is one of my RFP questions. Any help is highly appreciated. Thanks, Sujitha
07-28-2016
10:54 PM
Hi, I am looking for references and demos that show text and data mining capabilities on our platform. I am trying to answer one of the RFP questions. Any help is highly appreciated. Thanks, Sujitha
07-22-2016
07:38 PM
Hi @srinivasa rao, glad that you are satisfied with the answer provided by Benjamin Leonhardi. Please let me know in case of any issues.
07-22-2016
12:37 AM
Hi @Juan Manuel Nieto, generally the /tmp directory is mainly used for temporary storage during MapReduce phases: MapReduce keeps its intermediate data under /tmp, and those files are automatically cleared out when the job execution completes. Temporary files are also created by Pig, since it runs on MapReduce, and they are deleted at the end of the run. However, Pig does not delete its temporary files if the script execution fails or is killed; then we have to handle the cleanup ourselves. This is better handled by adding the cleanup to the script itself or to a wrapper around it, for example by pointing pig.temp.dir at a known location and removing it when the run fails (a sketch follows below). For further details I found an article here: Hope that helps. Thanks, Sujitha
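A minimal wrapper sketch for the cleanup idea above. pig.temp.dir is a standard Pig property, but the directory and script name here are only placeholders:

# keep Pig's intermediate data in a known HDFS directory for this run
TMPDIR=/tmp/pig-tmp-$$
pig -Dpig.temp.dir=$TMPDIR myscript.pig
STATUS=$?
# Pig cleans up after a successful run; if the script failed or was killed, remove the leftovers ourselves
if [ $STATUS -ne 0 ]; then
  hdfs dfs -rm -r -f -skipTrash "$TMPDIR"
fi
exit $STATUS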
07-22-2016
12:18 AM
Hi @srinivasa rao,

Answer 1: The ApplicationMaster negotiates with the ResourceManager for resources, not for containers directly. A container can be thought of as a box of resources for running part of an application. The ApplicationMaster requests resources from the ResourceManager through the ResourceManager protocol, based on the user code it runs. Since it is essentially user code, the ApplicationMaster is not a privileged service, and the YARN system (ResourceManager and NodeManager) has to protect itself from faulty or malicious ApplicationMasters and the resources granted to them at all costs.

Answer 2: Work is performed within containers; depending on the resources granted by the ResourceManager through the ApplicationMaster, a job may run in one container or across multiple containers.

Answer 3: The internals of how resources are allocated and scheduled are always handled by the ResourceManager. Whether it is 20% or the remaining 80%, it is always the ResourceManager's job to allocate resources to the ApplicationMaster, working together with the NodeManager on that particular node. It is the responsibility of the NodeManager and the ResourceManager to track the status of the allocated resources (you can inspect this from the YARN CLI; see the sketch below).

Hope that helps. For more information, here is an article that explains it in simple terms: http://hortonworks.com/blog/apache-hadoop-yarn-concepts-and-applications/ Thanks, Sujitha
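If you want to see these allocations in practice, the YARN CLI can list applications, their attempts, and the containers granted to each attempt. A sketch (the IDs are placeholders):

# running applications and the resources they currently hold
yarn application -list
# attempts for one application, then the containers granted to an attempt
yarn applicationattempt -list application_1468958063764_0001
yarn container -list appattempt_1468958063764_0001_000001
# per-node view of used vs. available resources reported by the NodeManagers
yarn node -list -all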
07-21-2016
11:48 PM
Hi @Johnny Fugers and @Suyog Nagaokar, I tried to provide the answer here: https://community.hortonworks.com/questions/46444/convert-millseconds-into-unix-timestamp.html#answer-46590 Please let me know if you still have issues. Thanks, Sujitha
07-21-2016
11:28 PM
1 Kudo
Hi @Johnny Fugers,

Input file data (dataset.csv):

563355,1388481000000
563355,1388481000000
563355,1388481000000
563356,1388481000000

This gives the answer in CET:

a = load '/tmp/dataset.csv' using PigStorage(',') as (id:chararray, at:chararray);
b = foreach a generate id, ToString( ToDate( (long)at), 'yyyy-MM-dd hh:ss:mm' );
c = group b by id;
dump c;

This is how it works in GMT:

a = load '/tmp/dataset.csv' using PigStorage(',') as (id:chararray, at:chararray);
b = foreach a generate id, ToDate(ToString(ToDate((long) at), 'yyyy-MM-dd hh:ss:mm'), 'yyyy-MM-dd hh:ss:mm', 'GMT');
c = group b by id;
dump c;

Hope that helps. Thanks, Sujitha
07-21-2016
09:53 PM
Hi @Rajinder Kaur, step 8 should not take more than 3 seconds. Can you make sure you followed all the steps as instructed? Also, I wanted to check whether you created the sandbox in Azure; if so, there are certain configurations that need to be changed, which is also covered in the lab. Please rerun the steps and let me know if that works. I attached the output screenshots for reference. Thanks, Sujitha
07-19-2016
08:40 PM
Hi, I am working on an RFP and looking for an answer to: "Ability to recalculate and alert when there are changes to historical data within a time period within your solution." What I don't understand is that we cannot modify data in HDFS; it is immutable. So does the notion of changing historical data even apply? Any help is highly appreciated. Thanks, Sujitha
07-18-2016
07:44 PM
Hi, I am working on an RFP and looking for an answer to: "Specify the recommended administration tools and their functionality in HDP, in short." I am looking for a way to keep this simple. Any help is most appreciated. Thanks, Sujitha
07-14-2016
06:52 PM
Hi @ghost k, if this resolved your problem, can you please vote for the best answer? Thanks, Sujitha
07-12-2016
05:54 PM
Hi @Ian Li, if this issue was resolved, can you pick the best answer so we can consider it closed? Thanks, Sujitha
07-08-2016
07:56 PM
Hi @ghost k,

Step 1: Edit the pg_hba.conf file in /var/lib/pgsql/data and add the following line as the first line. It allows access to all databases for all users with an encrypted password (reload PostgreSQL afterwards so the change takes effect):

host all all 0.0.0.0/0 md5

On the Postgres side, you can check the ambari database:

su postgres
psql
\c ambari
# list all tables
\dt ambari.*
select * from ambari.hosts;

Step 2: Download the JDBC driver jar:

curl -L 'http://jdbc.postgresql.org/download/postgresql-9.2-1002.jdbc4.jar' -o postgresql-9.2-1002.jdbc4.jar

Step 3: List the tables with Sqoop:

sqoop list-tables --connect jdbc:postgresql://127.0.0.1/ambari --username ambari --password bigdata

Step 4: Sqoop the Ambari hosts table into HDFS (password: bigdata):

sqoop import --connect jdbc:postgresql://127.0.0.1/ambari --username ambari -P --table hosts --target-dir /user/guest/ambari_hosts_table

You can verify the import afterwards; see the sketch below.

Hope this helps. Thanks, Sujitha
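To confirm the import landed where expected, a quick check (a sketch; the target directory is the one used in Step 4 above):

# list the generated part files and peek at a few imported rows
hdfs dfs -ls /user/guest/ambari_hosts_table
hdfs dfs -cat /user/guest/ambari_hosts_table/part-m-* | head -5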
07-07-2016
04:13 AM
2 Kudos
Story: From the documentation I was able to add a service to an existing stack definition in Ambari.

Issue: I was not able to stop the service, or delete it if needed.

https://cwiki.apache.org/confluence/display/AMBARI/Defining+a+Custom+Stack+and+Services

How did I solve the problem?

1. Create and add the stack:
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services

2. Create a directory that contains the service definition for SAMPLESRV:
mkdir /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV

3. Create a metainfo.xml as shown in the link above.

4. With this we have a service named SAMPLESRV, and it contains SAMPLESRV_MASTER, SAMPLESRV_SLAVE and SAMPLESRV_CLIENT.

5. Next we need to create the command scripts:
mkdir -p /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV/package/scripts
cd /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV/package/scripts

6. In the scripts directory, create the .py command scripts master.py, slave.py and sample_client.py under /var/lib/ambari-server/resources/stacks/HDP/2.4/services/SAMPLESRV/package/scripts. master.py and slave.py are where the issue was: the documentation doesn't mention the dummy PID file that needs to be created. Since we have not installed a real service, there is no PID file created by it. Therefore we artificially create the PID file, remove it on stop, and check the status of the dummy PID (a shell-level sketch of what those handlers do follows this step).
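A shell-level sketch of what the dummy start/stop/status handlers in master.py and slave.py need to accomplish (the real scripts wrap these steps in Python; the path and file name here are made up):

PIDFILE=/var/run/samplesrv/samplesrv.pid
# start: there is no real daemon, so artificially create the PID file
mkdir -p /var/run/samplesrv && touch "$PIDFILE"
# status: Ambari shows the component as started as long as this check succeeds
test -f "$PIDFILE"
# stop: remove the PID file so the status check reports the component as stopped
rm -f "$PIDFILE"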
7. Then restart Ambari (ambari-server restart) and add the service to the stack as shown in the document; I don't want to duplicate those steps here.

Hope this helps....
07-06-2016
09:05 PM
1 Kudo
Hi @Ian Li, I was able to get this to stop and start by adding these lines in master.py and slave.py. I added a dummy PID: since we have not installed a real service, there is no PID file created by it, so we artificially create the PID file, remove it on stop, and check the process status of the dummy PID. screen-shot-2016-07-06-at-20304-pm.png screen-shot-2016-07-06-at-20322-pm.png screen-shot-2016-07-06-at-20340-pm.png Hope this helps. Let me know in case of any other issues. Thanks, Sujitha
07-06-2016
01:23 AM
Hi @Ian Li, I followed the steps from https://cwiki.apache.org/confluence/display/AMBARI/Defining+a+Custom+Stack+and+Services and was able to launch the service; it shows green. Please find the attachments. My assumption is that the steps in the documentation are written for HDP 2.0.6 -> 2.1. Can you try the steps in /var/lib/ambari-server/resources/stacks/HDP/2.4/services? new-sample-service.png screenshot1.png Hope that helps. Let me know in case of issues. Thanks, Sujitha
07-05-2016
09:39 PM
Hi @Saurabh Kumar,

Step 1: Can you please confirm that you can see your NiFi UI at http://127.0.0.1:9090/nifi/ or http://sandbox.hortonworks.com:9090/nifi/? If not, please ensure that port forwarding is applied as described in http://hortonworks.com/hadoop-tutorial/how-to-refine-and-visualize-sentiment-data/

Step 2: Can you make sure you created the app and that the access token, access secret, consumer key and consumer secret are all correct?

Step 3: In case of an installation issue, please follow these steps to add the NiFi service: https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html

Hope this helps. Please let me know in case of issues. Thanks, Sujitha
07-05-2016
08:01 PM
1 Kudo
Hi @Mukesh Kumar, the guide below explains the step-by-step process to install NiFi and run a simple Twitter demo. Hope this helps. https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html Thanks, Sujitha
07-05-2016
07:03 PM
Hi @ghost k, can you please share the queries you used for importing and creating the tables, and also the complete logs from /var/log/sqoop? When you run sqoop list-tables successfully, it should show the list of tables in the database server, which in this case is Postgres. Thanks, Sujitha
07-05-2016
06:32 PM
I echo Bala: this error is occurring due to a connectivity issue with the remote database. Thanks, Sujitha
07-01-2016
12:40 AM
Hi @Ravi Mutyala, thanks for the response. From this I understand that the whole Hadoop ecosystem uses UTF-8; is that correct? Can you confirm? Thanks, Sujitha