Member since: 01-10-2020
Posts: 26
Kudos Received: 0
Solutions: 0
01-23-2020
07:35 AM
It's a bug in Oozie. CoordActionCheckXCommand doesn't handle the SUSPENDED state; it only handles SUCCEEDED, FAILED, and KILLED:

protected Void execute() throws CommandException {
    try {
        InstrumentUtils.incrJobCounter(getName(), 1, getInstrumentation());
        Status slaStatus = null;
        CoordinatorAction.Status initialStatus = coordAction.getStatus();
        if (workflowJob.getStatus() == WorkflowJob.Status.SUCCEEDED) {
            coordAction.setStatus(CoordinatorAction.Status.SUCCEEDED);
            // set pending to false as the status is SUCCEEDED
            coordAction.setPending(0);
            slaStatus = Status.SUCCEEDED;
        }
        else {
            if (workflowJob.getStatus() == WorkflowJob.Status.FAILED) {
                coordAction.setStatus(CoordinatorAction.Status.FAILED);
                slaStatus = Status.FAILED;
                // set pending to false as the status is FAILED
                coordAction.setPending(0);
            }
            else {
                if (workflowJob.getStatus() == WorkflowJob.Status.KILLED) {
                    coordAction.setStatus(CoordinatorAction.Status.KILLED);
                    slaStatus = Status.KILLED;
                    // set pending to false as the status is KILLED
                    coordAction.setPending(0);
                }
                else {
                    LOG.warn("Unexpected workflow " + workflowJob.getId() + " STATUS " + workflowJob.getStatus());
                    coordAction.setLastModifiedTime(new Date());
                    CoordActionQueryExecutor.getInstance().executeUpdate(
                            CoordActionQueryExecutor.CoordActionQuery.UPDATE_COORD_ACTION_FOR_MODIFIED_DATE,
                            coordAction);
                    return null;
                }
            }
        }
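For illustration only, here is a hedged sketch of the kind of extra branch the method above would need in order to cover SUSPENDED. This is an assumption about the shape of a fix, not the actual Oozie patch; the enum constants WorkflowJob.Status.SUSPENDED and CoordinatorAction.Status.SUSPENDED do exist, but the handling below is illustrative:

        else if (workflowJob.getStatus() == WorkflowJob.Status.SUSPENDED) {
            // Hypothetical handling: propagate the suspension to the coordinator action
            // so CoordActionCheckXCommand does not leave it stuck.
            coordAction.setStatus(CoordinatorAction.Status.SUSPENDED);
            // Leave the action pending so it can still be resumed later (assumption).
            slaStatus = null;
        }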
... View more
01-23-2020
07:28 AM
You can find the positions of the header columns with re.finditer from the re package, then use those offsets to slice the remaining lines:

import re
import json

thefile = open("file.txt")
# Read the header line and record each column's name and character span
line = thefile.readline()
columns = [(m.group(0), m.start(0), m.end(0)) for m in re.finditer(r"\w+\s+", line)]

records = []
# Read the data lines and slice each one at the header's column boundaries
line = thefile.readline()
while line:
    record = {}
    for col in columns:
        record[col[0].strip()] = line[col[1]:col[2]].strip()
    records.append(record)
    line = thefile.readline()

print(json.dumps(records))
... View more
01-22-2020
06:10 AM
This looks like a Java memory error. The reason a select * works while a select of a specific column doesn't is that select * just pulls rows straight from HDFS, whereas selecting a column actually executes a MapReduce job.
You might be able to solve the problem by increasing the maximum heap size:
export HADOOP_CLIENT_OPTS="-Xmx512m"
would set the heap size to 512m, for example.
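A quick end-to-end sketch (the heap size, column, and table names below are placeholders, not values from the original question):

export HADOOP_CLIENT_OPTS="-Xmx2048m"
hive -e "SELECT some_column FROM some_table LIMIT 10;"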
I hope this helps!
Regards,
Lewis
Tech-consultant
... View more
01-22-2020
06:06 AM
SHORT: Cloudera has broken ZooKeeper 3.4.5-cdh5.4.0 in several places. The service is working, but the CLI is dead. There is no workaround other than a rollback.

LONG: Assign a bounty on this ;-). I have stepped on this mine too and was angry enough to find the reason. ZooKeeper checks for JLine during ZooKeeperMain.run(). There is a try-catch block that loads a number of classes; any exception during class loading fails the whole block, and JLine support is reported as disabled. Here is why this happens with CDH 5.4.0: the current open-source ZooKeeper 3.4.6 works against jline-0.9.94 and has no such issue, but in CDH 5.4 Cloudera applied the following patch:

roman@node4:$ diff zookeeper-3.4.5-cdh5.3.3/src/java/main/org/apache/zookeeper/ZooKeeperMain.java zookeeper-3.4.5-cdh5.4.0/src/java/main/org/apache/zookeeper/ZooKeeperMain.java
305,306c305,306
< Class consoleC = Class.forName("jline.ConsoleReader");
< Class completorC =
---
> Class consoleC = Class.forName("jline.ConsoleReader");
> Class completorC =
316,317c316,317
< Method addCompletor = consoleC.getMethod("addCompletor",
< Class.forName("jline.Completor"));
---
> Method addCompletor = consoleC.getMethod("addCompleter",
> Class.forName("jline.console.completer.Completer"));
CDH 5.4 uses jline-2.11.jar for ZooKeeper, and that jar has no jline.ConsoleReader class (from 2.11 on it is jline.console.ConsoleReader). JLine 0.9.94, in turn, has no jline.console.completer.Completer. So the patched code is incompatible with either existing JLine. Any Cloudera CDH 5.4 user can run zookeeper-client on their cluster and find that it does not work. Open-source zookeeper-3.4.6 depends on jline-0.9.94, which has no such patches. I don't know why Cloudera's engineers laid such a mine, and I see no clean way to fix it with 3.4.5-cdh5.4.0; I stayed with the 3.4.5-cdh5.3.3 dependency where I need the CLI and have production clusters. It seemed to me that putting both jline-0.9.94.jar and jline-2.11.jar on ZooKeeper's classpath would fix the problem, but I have just found that Cloudera made another 'fix' in ZK for CDH 5.4.0: they renamed org.apache.zookeeper.JLineZNodeCompletor to org.apache.zookeeper.JLineZNodeCompleter. Yet here is the code from ZooKeeperMain.java:

Class<?> completorC = Class.forName("org.apache.zookeeper.JLineZNodeCompletor");

And of course this means it is practically impossible to start the ZK CLI in CDH 5.4.0 the proper way. Awful work. 😞
... View more
01-22-2020
06:01 AM
Look in the nifi-app.log and the nifi-bootstrap.log; they'll provide more information on why NiFi didn't start up. By default, these files are located in the nifi-root-directory/logs directory.
... View more
01-22-2020
06:00 AM
You should be using the sys.columns catalog view. syscolumns is included only for backwards compatibility; it is really a SQL Server 2000 system table that shouldn't be used in SQL Server 2008 R2.

select * from sys.columns where object_id = object_id('MyTable') order by column_id

That should return your columns in order. Note, though, that these column IDs might not be sequential.
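If you also want a portable way to get the same ordering, the standard INFORMATION_SCHEMA view works too (MyTable is just the example table name from above):

select COLUMN_NAME, ORDINAL_POSITION
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'MyTable'
order by ORDINAL_POSITION;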
... View more
01-21-2020
06:04 AM
You cannot upgrade to Cloudera Manager or CDH 6.0.0 from Cloudera Manager or CDH 5.15 or 5.16.
... View more
01-21-2020
05:59 AM
To set up the Kerberos configuration file in the default location:

1. Obtain a krb5.conf configuration file. You can obtain this file from your Kerberos administrator, or from the /etc/krb5.conf folder on the machine that is hosting the Hive Server 2 instance.
2. Rename the configuration file from krb5.conf to krb5.ini.
3. Copy the krb5.ini file to the C:\ProgramData\MIT\Kerberos5 directory and overwrite the empty sample file.

To set up the Kerberos configuration file in a custom location:

1. Obtain a krb5.conf configuration file. You can obtain this file from your Kerberos administrator, or from the /etc/krb5.conf folder on the machine that is hosting the Hive Server 2 instance.
2. Place the krb5.conf file in an accessible directory and make note of the full path name.
3. Open the System window: if you are using Windows 7 or earlier, click Start, right-click Computer, and then click Properties; if you are using Windows 8 or later, right-click This PC on the Start screen, and then click Properties.
4. Click Advanced System Settings.
5. In the System Properties dialog box, click the Advanced tab and then click Environment Variables.
6. In the Environment Variables dialog box, under the System Variables list, click New.
7. In the New System Variable dialog box, type KRB5_CONFIG in the Variable Name field, and type the full path to the krb5.conf file in the Variable Value field. Click OK to save the new variable.
8. Make sure that the variable is listed in the System Variables list.
9. Click OK to close the Environment Variables dialog box, and then click OK to close the System Properties dialog box.
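If you prefer to set the variable from an elevated Command Prompt instead of the System Properties dialog, something like the following should work; the path here is only an example:

setx KRB5_CONFIG "C:\Kerberos\krb5.conf" /M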
... View more
01-21-2020
05:57 AM
Found two values for the search "java configuration options for node manager"; we copied/pasted to make them the same (we had added JMX parameters). This seems to have fixed it, but it needs verification.
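For context, a sketch of the kind of JMX options that typically end up in that "Java configuration options for NodeManager" field; the port and security settings below are illustrative, not the values from the cluster in question:

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=8004
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false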
... View more
01-21-2020
05:40 AM
I have faced the same issue while using Spark on Google Cloud Dataproc. If you access the Spark job UI not directly through port 4040 but through the YARN web UI (port 8088), you will see correctly rendered pages. To work around this issue when accessing the Spark UI directly through port 4040, you need to reset the spark.ui.proxyBase property inside your Spark job (not in the CLI/job-submission command), because it gets overridden by the Spark UI proxy:

sys.props.update("spark.ui.proxyBase", "")
... View more
01-16-2020
06:27 AM
Download HDP Sandbox
MySQL database (Should already be present in the sandbox)
NiFi 0.6 or later (download and install a new version of NiFi, or use Ambari to install NiFi in the sandbox)
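The inserts below assume a cdc_test table with four columns; a hypothetical definition that matches them (the actual tutorial table may differ) is:

CREATE TABLE cdc_test (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  created_at DATETIME,
  updated_at DATETIME
);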
Use this code:
insert into cdc_test values (3, 'cdc3', null, null);
insert into cdc_test values (4, 'cdc3', null, null);
insert into cdc_test values (5, 'cdc3', null, null);
insert into cdc_test values (6, 'cdc3', null, null);
insert into cdc_test values (7, 'cdc3', null, null);
insert into cdc_test values (8, 'cdc3', null, null);
insert into cdc_test values (9, 'cdc3', null, null);
insert into cdc_test values (10, 'cdc3', null, null);
insert into cdc_test values (11, 'cdc3', null, null);
insert into cdc_test values (12, 'cdc3', null, null);
insert into cdc_test values (13, 'cdc3', null, null);
I hope this helps!
Regards,
Lewis
... View more
01-16-2020
06:00 AM
I think you should raise a ticket for this issue; the support team can help you with an exact solution.

Regards,
Lewis
... View more
01-16-2020
05:56 AM
SELECT ID, ItemName, NetPrice, [NetPrice] + [NetPrice] / 100 * 19 AS GrossPrice
FROM tblItems;

For example, a NetPrice of 100 gives a GrossPrice of 119 (19% added). I hope this piece of code works for you!

Regards,
Lewis
... View more
01-16-2020
05:52 AM
I was also facing a similar kind of issue, and this helped me. Try this link: https://github.com/mnemonic-no/act/blob/master/example-config/scio-act-workflow-2019-11-22.xml It might help you!

Regards,
Lewis
... View more
01-15-2020
06:37 AM
I cloned the Knox git repo (commit 92b1505a), which includes KNOX-895 (2d236e78), and ran it locally with a WebSocket service added to the sandbox topology.

[tulinski]$ wscat -n --auth 'user:password' -c wss://localhost:8443/gateway/sandbox/echows
[tulinski]$ sudo ngrep -W byline host echo.websocket.org
#
T 192.168.0.16:59952 -> 174.129.224.73:80 [AP]
GET / HTTP/1.1.
Host: echo.websocket.org.
Upgrade: websocket.
Connection: Upgrade.
Sec-WebSocket-Key: Z4Qa9Dxwr6Qvq2QAicsT5Q==.
Sec-WebSocket-Version: 13.
Pragma: no-cache.
Cache-Control: no-cache.
Authorization: Basic dXNlcjpwYXNzd29yZA==.
.
##
T 174.129.224.73:80 -> 192.168.0.16:59952 [AP]
HTTP/1.1 101 Web Socket Protocol Handshake.
Connection: Upgrade.
Date: Mon, 16 Oct 2017 14:23:49 GMT.
Sec-WebSocket-Accept: meply+6cIyjbH+Vk2OsAqKJDWic=.
Server: Kaazing Gateway.
Upgrade: websocket.
.

The Authorization header is passed to the backend service.
... View more
01-15-2020
06:33 AM
Try it after this modification; I succeeded like this.

[You] jdbc:hive2://zk=hadoopcluster01:2181,hadoopcluster02:2181/hiveserver2
[My proposal] jdbc:hive2://zk=hadoopcluster01:2181/hiveserver2,hadoopcluster02:2181/hiveserver2
... View more
01-15-2020
06:30 AM
Most web applications run into latency problems because they process data discretely instead of in streams. ndjsonstream() converts a ReadableStream of raw NDJSON data into a ReadableStream of JavaScript objects.
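A minimal sketch of how that looks in practice, assuming the can-ndjson-stream package (the URL below is a placeholder):

import ndjsonStream from "can-ndjson-stream";

async function readNdjson(url: string): Promise<void> {
  const response = await fetch(url);
  if (!response.body) {
    throw new Error("No response body");
  }
  // Convert the raw NDJSON byte stream into a stream of parsed objects
  const stream = ndjsonStream(response.body);
  const reader = stream.getReader();
  let result = await reader.read();
  while (!result.done) {
    console.log(result.value); // one parsed record at a time
    result = await reader.read();
  }
}

readNdjson("/api/events.ndjson");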
... View more
01-13-2020
06:09 AM
The first step is to create a root Secure Sockets Layer (SSL) certificate. This root certificate can then be used to sign any number of certificates you might generate for individual domains. If you aren't familiar with the SSL ecosystem, this article from DNSimple does a good job of introducing root SSL certificates.

Generate an RSA-2048 key and save it to a file named rootCA.key. This file will be used as the key to generate the root SSL certificate. You will be prompted for a passphrase, which you'll need to enter each time you use this particular key to generate a certificate.

openssl genrsa -des3 -out rootCA.key 2048

You can use the key you generated to create a new root SSL certificate. Save it to a file named rootCA.pem. This certificate will have a validity of 1,024 days; feel free to change that to any number of days you want. You'll also be prompted for other optional information.

openssl req -x509 -new -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.pem

I hope this information helps!

Regards,
Lewis
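As a follow-on sketch only (the domain and file names are placeholders, not part of the original steps), the root key and certificate can then sign a certificate for an individual domain roughly like this:

openssl req -new -newkey rsa:2048 -nodes -keyout mydomain.key -out mydomain.csr
openssl x509 -req -in mydomain.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -out mydomain.crt -days 365 -sha256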
... View more
01-13-2020
06:06 AM
After Kafka is deployed and running, validate the installation. You can use the command-line interface to create a Kafka topic, send test messages, and consume the messages; see the sketch after the steps below.

1. Click the Ambari "Services" tab.
2. In the Ambari "Actions" menu, select "Add Service." This starts the Add Service wizard, displaying the Choose Services page. Some of the services are enabled by default.
3. Scroll through the alphabetic list of components on the Choose Services page and select "Kafka". Click Next to continue.
4. On the Assign Masters page, review the node assignments for Kafka nodes. A single-node Kafka cluster has one broker assignment; if you want Kafka to run with high availability, you must assign more than one node for Kafka brokers, so that brokers run on multiple nodes. Click the "+" symbol to add more broker nodes to the cluster, then click Next to continue.
5. On the Assign Slaves and Clients page, choose the nodes on which you want to run ZooKeeper clients. Click Next to continue.
6. Ambari displays the Customize Services page, which lists a series of services. For your initial configuration, you should use the default values set by Ambari. If Ambari prompts you with the message "Some configurations need your attention before you can proceed," review the list of properties and provide the required information. For information about optional settings that are useful in production environments, see Configuring Apache Kafka for a Production Environment. Click Next to continue.
7. When the wizard displays the Review page, ensure that all HDP components correspond to HDP 2.5 or later. Click Deploy to begin the installation.
8. Ambari displays the Install, Start and Test page. Monitor the status bar and messages for progress updates.
9. When the wizard presents a summary of results, click "Complete" to finish installing Kafka.

I hope this information helps!

Regards,
Lewis
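A hedged sketch of that command-line validation (host names, ports, and the topic name are placeholders; HDP brokers commonly listen on port 6667):

# Create a test topic (ZooKeeper-based creation, as used by HDP-era Kafka releases)
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

# Send a few test messages
/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list localhost:6667 --topic test

# Consume them back
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --bootstrap-server localhost:6667 --topic test --from-beginning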
... View more
01-13-2020
06:02 AM
As far as I know, there is no such path available, but if I find anything relevant I will let you know. You can also contact the support team for information on this; they may be able to help.

Regards,
Lewis
... View more
01-10-2020
01:54 AM
I uninstalled Cloudera Manager by following the Cloudera Manager uninstall documentation, then installed Cloudera Manager again, and Hive now works well. I hope this works for you as well.

Regards,
Lewis
... View more
01-10-2020
01:49 AM
There is always a resource pool named root.default. By default, all YARN applications run in this pool. You create additional pools when your workload includes identifiable groups of applications (such as from a particular application, or a particular group within your organization) that have their own requirements.
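For illustration, a hypothetical Fair Scheduler allocation snippet that adds one such pool alongside root.default (the pool name and weight are made up; in Cloudera Manager these pools are normally configured through the Dynamic Resource Pools UI rather than by editing the file by hand):

<allocations>
  <queue name="root">
    <queue name="default"/>
    <!-- Hypothetical pool for a team with its own resource requirements -->
    <queue name="analytics">
      <weight>2.0</weight>
      <schedulingPolicy>fair</schedulingPolicy>
    </queue>
  </queue>
</allocations>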
... View more
01-10-2020
01:37 AM
There is no official news about SP5 support yet, but you can look for other ways to install it on SUSE. If I find something on this, I will let you know!

Regards,
Lewis
... View more
01-10-2020
01:16 AM
Thanks for this information, it is quite helpful!
... View more