Member since: 11-30-2020
Posts: 33
Kudos Received: 0
Solutions: 0
08-18-2021
03:31 AM
@apathan thank you! This is exactly what I needed and it saved my time. I had the same error while bringing up the CM UI: I had Java 7 before and the jar file wasn't compatible. After updating to Java 8 it worked like a charm.
... View more
06-16-2021
10:26 PM
Yes, I was running both the append and the incremental lastmodified imports against the warehouse directory, which did not help in merging the records. When I switched to a target directory it worked fine. Now new records land in the Hive table through the incremental append job, and updated records are brought in through the lastmodified job on updated_date, but with the target directory as the parameter and without the hive-import parameter. Thanks for suggesting it; that's what worked.
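For reference, a rough sketch of the two jobs described above; the connection string, table, directory, key and column names are placeholders, not the actual ones used here:
# New rows: incremental append, which Sqoop does allow together with --hive-import.
sqoop import --connect jdbc:mysql://dbhost/sales --table orders \
  --hive-import --hive-table orders \
  --incremental append --check-column created_date --last-value '2021-06-01'
# Updated rows: incremental lastmodified on updated_date, written to a target
# directory (no --hive-import), merging on the table's key column.
sqoop import --connect jdbc:mysql://dbhost/sales --table orders \
  --target-dir /user/etl/orders_updates --merge-key order_id \
  --incremental lastmodified --check-column updated_date --last-value '2021-06-01 00:00:00'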
... View more
06-13-2021
10:42 PM
Hi All, I am trying to achieve incremental Sqoop import into Hive both when a new record is created and when existing records are updated. Our tables have a created date column and a modified date column, but the modified date column is populated only when an existing record is updated; it stays NULL when a new record is inserted. Only the created date is set when new records are created.
So when we try the incremental lastmodified option, it always looks at the modified date column, but our source tables don't always have values there: it is NULL for new records and only gets a value when an existing row is modified. For source tables with columns like this, is there any solution with Sqoop import?
Also, Sqoop incremental with lastmodified doesn't work with Hive tables; we have to create the table beforehand and then run the incremental lastmodified import, otherwise we get: "INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc. --incremental lastmodified option for hive imports is not supported. Please remove the parameter --incremental lastmodified."
So please advise which approach suits an incremental import of this data into Hive tables. I badly need a solution for this scenario; please help if you have done this before.
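To make the quoted error concrete, the command shape that triggers it looks roughly like the sketch below (connection string, table and column names are placeholders):
sqoop import --connect jdbc:mysql://dbhost/sales --table orders \
  --hive-import \
  --incremental lastmodified --check-column modified_date --last-value '2021-06-01 00:00:00'
# Sqoop rejects the --hive-import / --incremental lastmodified combination with:
#   --incremental lastmodified option for hive imports is not supported.
#   Please remove the parameter --incremental lastmodified.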
... View more
Labels:
- Apache Hive
- Apache Sqoop
03-23-2021
10:20 AM
Hi All, I need help in fixing the below errors in our newly created CDSW cluster. All the services of the CDSW cluster are running. I guess I need to bring the Kubernetes service up.
"The connection to the server 10.127.116.126:6443 was refused - did you specify the right host or port?"
Health errors:
CDSW Status
* Failed to run CDSW Nodes Check.
* Failed to run CDSW system pods check.
* Failed to run CDSW application pods check.
* Failed to run CDSW services check.
* Failed to run CDSW secrets check.
* Failed to run CDSW persistent volumes check.
* Failed to run ...
--------------------------------------------
[root@server ~]# cdsw logs -x
Generating Cloudera Data Science Workbench diagnostic bundle...
Collecting basic system info...
Collecting kernel parameters...
Collecting kernel messages...
Collecting the list of kernel modules...
Collecting the list of systemd units...
Collecting cdsw details...
Collecting application configuration...
Collecting disks information...
Collecting Hadoop configuration...
Collecting network information...
Collecting system service statuses...
Collecting nfs information...
Collecting Docker info...
Collecting Kubernetes info...
Collecting Helm info...
Collecting custom patches...
cp: cannot stat ‘/etc/cdsw/patches’: No such file or directory
Collecting Kubelet logs...
Collecting CDSW Host Controller logs...
Collecting system logs...
Collecting Kubernetes cluster info dump...
ls: cannot access cdsw-logs-cpcinchdv010813-2021-03-23--20-47-00/k8s-cluster-info/*/*/logs.txt: No such file or directory
Exporting user ids...
The connection to the server 10.127.116.126:6443 was refused - did you specify the right host or port?
The connection to the server 10.127.116.126:6443 was refused - did you specify the right host or port?
error: pod or type/name must be specified
error: pod or type/name must be specified
Collecting health logs...
Collecting event logs...
The connection to the server 10.127.116.126:6443 was refused - did you specify the right host or port?
error: Filespec must match the canonical format: [[namespace/]pod:]file/path
Exporting metrics...
The connection to the server 10.127.116.126:6443 was refused - did you specify the right host or port?
ERROR:: Unable to get service account credentials. Provide SERVICE_ACCOUNT_SECRET or run on master node.: 2
Producing logs tarball...
Logs saved to: cdsw-logs-cpcinchdv010813-2021-03-23--20-47-00.tar.gz
Cleaning up...
-------------------------------------------------------------------------------------------
[root@server ~]# cdsw validate
[Validating host configuration]
> Prechecking OS Version........[OK]
> Prechecking kernel Version........[OK]
> Prechecking that SELinux is disabled........[OK]
> Prechecking scaling limits for processes........[OK]
> Prechecking scaling limits for open files........
WARNING: Cloudera Data Science Workbench recommends that all users have a max-open-files limit set to 1048576. It is currently set to [1024] as per 'ulimit -n'
Press enter to continue
> Loading kernel module [ip_tables]...
> Loading kernel module [iptable_nat]...
> Loading kernel module [iptable_filter]...
> Prechecking that iptables are not configured........[OK]
> Prechecking kernel parameters........[OK]
> Prechecking to ensure kernel memory accounting disabled:........[OK]
> Prechecking Java distribution and version........[OK]
> Checking unlimited Java encryption policy for AES........[OK]
> Prechecking size of root volume........
WARNING: The recommended minimum root volume size is 100G.
Press enter to continue
[Validating networking setup]
> Checking if kubelet iptables rules exist
The following chains are missing from iptables: [KUBE-EXTERNAL-SERVICES, WEAVE-NPC-EGRESS, WEAVE-NPC, WEAVE-NPC-EGRESS-ACCEPT, KUBE-SERVICES, WEAVE-NPC-INGRESS, WEAVE-NPC-EGRESS-DEFAULT, WEAVE-NPC-DEFAULT, WEAVE-NPC-EGRESS-CUSTOM]
WARNING:: Verification of iptables rules failed: 1
> Checking if DNS server is running on localhost
> Checking the number of DNS servers in resolv.conf
> Checking DNS entries for CDSW main domain
> Checking reverse DNS entries for CDSW main domain
WARNING:: DNS doesn't resolve 10.127.116.126 to cdsw.cts.com; DNS is not configured properly: 1
> Checking DNS entries for CDSW wildcard domain
> Checking that firewalld is disabled
> Checking if ipv6 is enabled
[Validating Kubernetes versions]
> Checking kubernetes client version
> Checking kubernetes server version
WARNING:: Kubernetes server is not running, version couldn't be checked.: 1
[Validating NFS and Application Block Device setup]
> Checking if nfs or nfs-server is active and enabled
> Checking if rpcbind.socket is active and enabled
> Checking if rpcbind.service is active and enabled
> Checking if the project folder is exported over nfs
WARNING:: The projects folder /var/lib/cdsw/current/projects must be exported over nfs: 1
> Checking if application mountpoint exists
> Checking if the application directory is on a separate block device
> Checking the root directory (/) free space
WARNING:: The directory has less then 10% free capacity: 1
> Checking the application directory (/var/lib/cdsw) free space
WARNING:: The directory has less then 10% free capacity: 1
[Validating Kubernetes cluster state]
> Checking if we have exactly one master node
WARNING:: There must be exactly one Kubernetes node labelled 'stateful=true': 1
> Checking if the Kubernetes nodes are ready
> Checking kube-apiserver pod
WARNING: Unable to reach k8s pod kube-apiserver.
WARNING: [kube-apiserver] pod(s) are not ready under kube-system namespace.
WARNING: Unable to bring up kube-apiserver in the kube-system cluster. Skipping other checks..
[Validating CDSW application]
> Checking connectivity over ingress
WARNING:: Could not curl the application over the ingress controller: 7
--------------------------------------------------------------------------
Errors detected. Please review the issues listed above. Further details can be collected by capturing logs from all nodes using "cdsw logs".
--------------------------------------------------------------------------
cdsw status
Sending detailed logs to [/tmp/cdsw_status_rwPmR6.log] ...
CDSW Version: [1.9.0.7802354:5a39a73]
Installed into namespace 'default'
OK: Application running as root check
OK: NFS service check
OK: System process check for CSD install
OK: Sysctl params check
OK: Kernel memory slabs check
Failed to run CDSW Nodes Check.
Failed to run CDSW system pods check.
Failed to run CDSW application pods check.
Failed to run CDSW services check.
Failed to run CDSW secrets check.
Failed to run CDSW persistent volumes check.
Failed to run CDSW persistent volumes claims check.
Failed to run CDSW Ingresses check.
Checking web at url: http://cdsw.cts.com
Web is not yet up.
Cloudera Data Science Workbench is not ready yet
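As a rough first-pass check for the refused 10.127.116.126:6443 connection (a sketch only, assuming a single-master CDSW install on RHEL/CentOS 7; exact service and container names may differ):
systemctl status docker kubelet                 # both must be active on the master
docker ps | grep -i apiserver                   # the kube-apiserver container should be running
journalctl -u kubelet --since "-1h" | tail -50  # recent kubelet errors, if any
# The validate output also flags the open-files limit; raising nofile to 1048576
# (for example via /etc/security/limits.d/) and re-running 'cdsw validate' is worth doing.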
... View more
03-19-2021
12:13 PM
Hello, Could someone help me create a Docker block device on my CDSW master node? I am getting the below errors:
Checking if [conntrack-tools.x86_64] is installed...OK
Setting up docker storage.
ERROR:: Error in pvcreate for [/dev/sdb]: 5
ERROR:: Unable to setup docker storage.: 5
ERROR:: Unable to create storage for docker.: 5
Installed:
conntrack-tools.x86_64 0:1.4.4-7.el7
Dependency Installed:
libnetfilter_cthelper.x86_64 0:1.0.0-11.el7
libnetfilter_cttimeout.x86_64 0:1.0.0-7.el7
libnetfilter_queue.x86_64 0:1.0.2-2.el7_2
Complete!
Setting up docker storage.
ERROR:: Entries in DOCKER_BLOCK_DEVICES must only be block devices: [/dev/sde]: 1
ERROR:: Unable to create storage for docker.: 1
Here's what my server disks look like. Should this be done only on the master CDSW node, or on the worker CDSW node as well? I have a 2 TB sdb disk; can I carve a new partition like /dev/sdb3 out of it, and how do I do that? Can you please share the commands for creating a new 500 GB-1 TB partition for the Docker image block device without mounting it? Also, would that Docker image space be usable later, or should I follow the method of creating a file as the Docker image block device given in this solution (not sure if it would work or is recommended): https://community.cloudera.com/t5/Support-Questions/CDSW-installation-Entries-in-DOCKER-BLOCK-DEVICES-must-only/m-p/80978
Also, please let me know if setting up the DNS wildcard means adding the wildcard (*.domain) name to /etc/hosts or some other DNS file on the Linux server, and whether it is needed only on the master CDSW node or on the worker CDSW node as well.
Also, should Spark2 be installed manually, or can it be added as a service from the cluster? I don't see the option in the cluster to add it as a service. Where are the parcels to be downloaded for Spark2, or should I install it as shown in this doc: https://docs.cloudera.com/cdsw/1.9.0/installation/topics/cdsw-configure-apache-spark-2-cdh6-or-7.html
Please help, I'm stuck with this. Thanks!
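For illustration only, a sketch of carving an unformatted, unmounted partition out of a spare disk; it assumes /dev/sdb is the unused 2 TB device, and the commands are destructive, so verify lsblk output before running anything:
lsblk /dev/sdb                                   # confirm this is the empty 2 TB disk, not in use
parted -s /dev/sdb mklabel gpt                   # assumes the disk carries no data
parted -s /dev/sdb mkpart primary 0% 500GB       # creates /dev/sdb1, left raw
wipefs -a /dev/sdb1                              # make sure no filesystem signature remains
# The resulting raw partition (here /dev/sdb1) is what goes into the Docker block
# device setting; it must not be formatted or mounted, which is why the earlier
# error complained that /dev/sde was not a plain block device.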
... View more
03-15-2021
12:14 AM
Hi All, I am trying to install CDP. The CM server was successfully installed along with the CM agent, and it got added to the cluster. But the Add Cluster wizard fails for the other 4 hosts of the cluster while installing the CM agents. I suspect the CM server is unable to push the packages to the other hosts. It fails with the error "yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true Cannot find a valid baseurl for repo: cloudera-manager". I have internet access open only on my CM server. Do I need internet on all the cluster hosts, or is internet on the CM server enough to push the packages to the other hosts? I won't be able to get internet access opened on all the hosts, so in that case should I create a local repo on the CM host and install the cluster from it? But I remember installing a CDH dev cluster with internet only on the CM server and without using a local repo. Please advise whether we need internet on all the servers, or whether the CM server with internet will push the packages to the other hosts. Thanks!
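As a sketch of the local-repo option mentioned above (the host name and paths are placeholders): mirror the Cloudera Manager repo on the CM host into a directory served by httpd, then on each of the other 4 hosts point yum at that mirror:
cat > /etc/yum.repos.d/cloudera-manager.repo <<'EOF'
[cloudera-manager]
name=Cloudera Manager (local mirror)
baseurl=http://cm-host.example.com/cloudera-repos/cm7/
enabled=1
gpgcheck=0
EOF
yum clean all && yum makecache   # confirm the baseurl resolves before re-running the wizard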
... View more
03-08-2021
12:11 PM
Hello All, I am trying to install a CDP cluster. After my Cloudera Manager is up, all my 6 hosts are detected and the user is authenticated, but at the step of installing agents on the nodes, only the CM server succeeds and the rest of the nodes fail with the error:
"yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true Cannot find a valid baseurl for repo: cloudera-manager"
I can sense that the Cloudera Manager server is not able to connect to the other hosts because something is missing, but I completed all the prerequisites, including passwordless SSH between the hosts; that's why all the servers were detected and the installation proceeded with the user credentials.
The Cloudera agent logs show the below error on the successful CM node:
1446 Monitor-HostMonitor throttling_logger ERROR (359 skipped) ntpq: ntpq -np: not synchronized to any server
1446 MainThread agent INFO Stopping agent...
From the research I did, I feel that ntpd has some config issues and is not synchronized with the other servers, and I checked a few things on it. I see the stratum is 16, which means it is not in sync, and there is no star next to the network host name.
ntpq -c pe
     remote           refid      st t  when poll reach   delay   offset  jitter
==============================================================================
 timesync.mts.co .XFAC.          16 u     - 1024     0    0.000    0.000   0.000
 timesync.mts.co .XFAC.          16 u     - 1024     0    0.000    0.000   0.000
ntpq -c "rv 9464"
associd=9464 status=8011 conf, sel_reject, 1 event, mobilize,
srcadr=ctsinpunsxud.cts.com, srcport=123, dstadr=10.127.116.119, dstport=123,
leap=11, stratum=16, precision=-24, rootdelay=0.000, rootdisp=0.000, refid=XFAC,
I could see that /etc/yum.repos.d/ on the other cluster nodes hasn't gotten the cloudera-manager repo yet.
Questions: What could be the issue preventing CM from connecting to the cluster nodes? Is it ntpd or some other issue? Please help me connect the other hosts from my CM server so I can install the parcels using the wizard, or should I manually install the agent packages? If I install them manually, how will future components and the CM agent communicate within the cluster? Thank you!
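A quick way to check whether NTP sync is actually the blocker, as a sketch assuming ntpd on RHEL/CentOS 7:
systemctl restart ntpd && sleep 60
ntpq -pn    # a '*' in the first column and a stratum below 16 mean the host is synced
ntpstat     # "synchronised to NTP server ..." confirms it
# Note that the "Cannot find a valid baseurl" failure is most likely a separate,
# yum-repo-reachability issue on the other hosts rather than an NTP symptom.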
... View more
02-26-2021
11:30 AM
Hello, I am trying to install the Cloudera Manager of the CDP cluster and I am getting an SSL certificate error while importing the signing GPG key for the CDP installation. This is the command I am running:
sudo rpm --import https://[username]:[password]@archive.cloudera.com/p/cm7/7.2.4/redhat7/yum/RPM-GPG-KEY-cloudera
I tried adding sslverify=false to the /etc/yum.repos.d file, still no luck. Is this something to be fixed on the server side, or something I need to fix on my side?
Error:
curl: (60) Peer's certificate issuer has been marked as not trusted by the user. More details here: http://curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle" of Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an alternate file using the --cacert option. If this HTTPS server uses a certificate signed by a CA represented in the bundle, the certificate verification probably failed due to a problem with the certificate (it might be expired, or the name might not match the domain name in the URL). If you'd like to turn off curl's verification of the certificate, use the -k (or --insecure) option.
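Following the -k/--cacert hint in the error text, one possible workaround (a sketch; the URL is the same one from the post, with the credential placeholders left as-is) is to fetch the key separately and import it from a local file:
curl -k -o RPM-GPG-KEY-cloudera 'https://[username]:[password]@archive.cloudera.com/p/cm7/7.2.4/redhat7/yum/RPM-GPG-KEY-cloudera'
rpm --import RPM-GPG-KEY-cloudera
# If a corporate proxy re-signs TLS traffic, the cleaner fix is to add its CA
# certificate under /etc/pki/ca-trust/source/anchors/ and run 'update-ca-trust'.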
... View more
02-21-2021
10:06 PM
Thanks for answering it, @GangWar. Given that we shouldn't remove these packages, since that would affect the Cloudera agent:
mysql-community-client        x86_64  5.7.25-1.el7  @mysql57-community  107 M
mysql-community-common        x86_64  5.7.25-1.el7  @mysql57-community  2.6 M
mysql-community-libs          x86_64  5.7.25-1.el7  @mysql57-community  9.5 M
mysql-community-libs-compat   x86_64  5.7.25-1.el7  @mysql57-community  9.2 M
To what version should I upgrade this MySQL, and if I upgrade it, will the agent pick up the newly installed packages without any issues, or do I need to edit some config file? Our server team doesn't want us to keep the below MySQL version on the server as it is old:
mysql -V
mysql  Ver 14.14 Distrib 5.7.21, for Linux (x86_64) using EditLine wrapper
And these are the package details from another server in the cluster; which of these packages are used by the cloudera-agent now?
mysql -V
mysql  Ver 15.1 Distrib 5.5.68-MariaDB, for Linux (x86_64) using readline 5.1
rpm -qa | grep -i mysql
akonadi-mysql-1.9.2-4.el7.x86_64
perl-DBD-MySQL-4.023-6.el7.x86_64
MySQL-python-1.2.5-1.el7.x86_64
qt-mysql-4.8.7-9.el7_9.x86_64
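One way to see which installed packages actually depend on the MySQL libraries before upgrading or removing anything (a sketch; the package names are taken from the lists above):
rpm -q --whatrequires mysql-community-libs mysql-community-libs-compat
rpm -q --whatrequires MySQL-python
# Anything the Cloudera agent or CM-managed roles rely on will show up as a
# dependent; packages with no dependents are the safer ones to touch.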
... View more
02-19-2021
11:29 AM
Hello All, we received an alert from our server team to upgrade the current version of PostgreSQL to the latest one. When I checked, the package is not on the DB server of the cluster (which uses PostgreSQL as its DB) but on another server of the cluster, where I found this package: "postgresql-libs-9.2.24-4.el7_8.x86_64". I am just curious why this package exists on only 1 of the 5 other cluster servers apart from the main DB server. Can I go ahead and remove this package, and will it affect the cluster in any way? If I can't remove it, how do I upgrade this version of PostgreSQL and reflect the change in the cluster so that everything keeps working without issues? Thank you!
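A quick dependency check to run on that host before deciding whether removal is safe (a sketch; the package name is the one quoted above):
rpm -e --test postgresql-libs-9.2.24-4.el7_8.x86_64   # reports dependent packages, removes nothing
rpm -q --whatrequires postgresql-libs                  # the same question asked the other way round
# If any Cloudera or database client component shows up as a dependent,
# upgrading the library in place is safer than removing it.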
... View more
Labels:
- Cloudera Manager