Member since 05-05-2014 · 21 Posts · 0 Kudos Received · 0 Solutions
10-29-2018 07:39 PM
Hi Team,
We are trying to access Azure Blob Storage from HDFS, and so far we have been unable to do so.
We have a secured environment with proxies, so all outgoing traffic passes through the proxy. I have already whitelisted the blob URL, and I can access and upload files to Blob Storage from the local Linux shell on the same machine where Hadoop is installed.
However, when I try to access Azure Blob Storage with an hdfs command, it just hangs and does not give any error.
The following is the command and its output:
hdfs dfs -ls wasbs://xxxx@xxxxxxxx.blob.core.windows.net/
It gets stuck after these steps:
16/12/05 15:45:57 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
16/12/05 15:45:57 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 60 second(s).
16/12/05 15:45:57 INFO impl.MetricsSystemImpl: azure-file-system metrics system started
Even with debug logging enabled:
export HADOOP_ROOT_LOGGER=DEBUG,console
18/10/29 10:49:47 DEBUG util.Shell: setsid exited with exit code 0
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: i :: Ignore failures during copy ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: p [ARG] :: preserve status (rbugpcaxt)(replication, block-size, user, group, permission, checksum-type, ACL, XATTR, timestamps). If -p is specified with no <arg>, then preserves replication, block size, user, group, permission, checksum type and timestamps. raw.* xattrs are preserved when both the source and destination paths are in the /.reserved/raw hierarchy (HDFS only). raw.* xattrpreservation is independent of the -p flag. Refer to the DistCp documentation for more details. ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: update :: Update target, copying only missingfiles or directories ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: delete :: Delete from target, files missing in source ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: mapredSslConf [ARG] :: Configuration for ssl config file, to use with hftps://. Must be in the classpath. ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: numListstatusThreads [ARG] :: Number of threads to use for building file listing (max 40). ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: m [ARG] :: Max number of concurrent maps to use for copy ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: f [ARG] :: List of files that need to be copied ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: atomic :: Commit all changes or none ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: tmp [ARG] :: Intermediate work path to be used for atomic commit ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: log [ARG] :: Folder on DFS where distcp execution logs are saved ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: v :: Log additional info (path, size) in the SKIP/COPY log ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: strategy [ARG] :: Copy strategy to use. Default is dividing work based on file sizes ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: skipcrccheck :: Whether to skip CRC checks between source and target paths. ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: overwrite :: Choose to overwrite target files unconditionally, even if they exist. ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: append :: Reuse existing data in target files and append new data to them if possible ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: diff [ARG...] :: Use snapshot diff report to identify the difference between source and target ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: async :: Should distcp execution be blocking ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: filelimit [ARG] :: (Deprecated!) Limit number of files copied to <= n ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: sizelimit [ARG] :: (Deprecated!) Limit number of files copied to <= n bytes ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: bandwidth [ARG] :: Specify bandwidth per map in MB ]
18/10/29 10:49:47 DEBUG tools.OptionsParser: Adding option [ option: filters [ARG] :: The path to a file containing a list of strings for paths to be excluded from the copy. ]
18/10/29 10:49:47 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
18/10/29 10:49:47 DEBUG security.Groups: Creating new Groups object
18/10/29 10:49:47 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
18/10/29 10:49:47 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
18/10/29 10:49:47 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
18/10/29 10:49:47 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
18/10/29 10:49:48 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
18/10/29 10:49:48 DEBUG security.UserGroupInformation: hadoop login
18/10/29 10:49:48 DEBUG security.UserGroupInformation: hadoop login commit
18/10/29 10:49:48 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: svc_hdfs
18/10/29 10:49:48 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: svc_hdfs" with name svc_hdfs
18/10/29 10:49:48 DEBUG security.UserGroupInformation: User entry: "svc_hdfs"
18/10/29 10:49:48 DEBUG security.UserGroupInformation: Assuming keytab is managed externally since logged in from subject.
18/10/29 10:49:48 DEBUG security.UserGroupInformation: UGI loginUser:svc_hdfs (auth:SIMPLE)
18/10/29 10:49:48 DEBUG gcs.GoogleHadoopFileSystemBase: GHFS version: 1.8.1.2.6.5.0-292
18/10/29 10:49:48 DEBUG configuration.ConfigurationUtils: ConfigurationUtils.locate(): base is null, name is hadoop-metrics2-azure-file-system.properties
18/10/29 10:49:48 DEBUG configuration.ConfigurationUtils: ConfigurationUtils.locate(): base is null, name is hadoop-metrics2.properties
18/10/29 10:49:48 DEBUG configuration.ConfigurationUtils: Loading configuration from the context classpath (hadoop-metrics2.properties)
18/10/29 10:49:48 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
18/10/29 10:49:48 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
18/10/29 10:49:48 INFO impl.MetricsSystemImpl: azure-file-system metrics system started
18/10/29 10:49:48 DEBUG azure.AzureNativeFileSystemStore: AzureNativeFileSystemStore init. Settings=8,false,90,{3000,3000,30000,30},{true,1.0,1.0}
18/10/29 10:49:48 DEBUG azure.AzureNativeFileSystemStore: Page blob directories:
18/10/29 10:49:48 DEBUG azure.AzureNativeFileSystemStore: Block blobs with compaction directories:
18/10/29 10:49:48 DEBUG azure.AzureNativeFileSystemStore: Atomic rename directories: /hbase
18/10/29 10:49:48 DEBUG azure.NativeAzureFileSystem: NativeAzureFileSystem. Initializing.
18/10/29 10:49:48 DEBUG azure.NativeAzureFileSystem: blockSize = 536870912
18/10/29 10:49:48 DEBUG azure.NativeAzureFileSystem: Getting the file status for wasbs:// /user
18/10/29 10:49:48 DEBUG azure.AzureNativeFileSystemStore: Retrieving metadata for user
18/10/29 10:49:48 DEBUG azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest: threadId=1, requestType=read , isFirstRequest=true, sleepDuration=0
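One thing we have not ruled out: the WASB driver goes through the Azure Storage SDK, which typically issues requests via java.net.HttpURLConnection, and the JVM does not pick up shell variables such as http_proxy. If no proxy is configured at the JVM level, the connection attempt can hang exactly like this. Would passing the standard JVM proxy properties to the client help? A sketch, where proxy.example.com:8080 is a placeholder for the actual proxy:

# proxy.example.com:8080 is a placeholder; substitute the real proxy host/port.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS} -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080"
hdfs dfs -ls wasbs://xxxx@xxxxxxxx.blob.core.windows.net/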
Can anyone please help with this? Let me know if any additional configuration is required.
Regards,
Vishal
Labels:
- Apache Hadoop
06-12-2018 06:57 AM
knoxsso.xml:

<topology>
    <gateway>
        <provider>
            <role>federation</role>
            <name>pac4j</name>
            <enabled>true</enabled>
            <param>
                <name>pac4j.callbackUrl</name>
                <value>https://knoxhost:8443/gateway/knoxsso/api/v1/websso</value>
            </param>
            <param>
                <name>clientName</name>
                <value>SAML2Client</value>
            </param>
            <param>
                <name>saml.identityProviderMetadataPath</name>
                <value>https://xxxxxxxx/app/exk1bs9c6clt0ttLo2p7/sso/saml/metadata</value>
            </param>
            <param>
                <name>saml.serviceProviderMetadataPath</name>
                <value>/tmp/sp-metadata.xml</value>
            </param>
            <param>
                <name>saml.serviceProviderEntityId</name>
                <value>https://knoxhost:8443/gateway/knoxsso/api/v1/websso?pac4jCallback=true&amp;client_name=SAML2Client</value>
            </param>
        </provider>
        <provider>
            <role>identity-assertion</role>
            <name>Default</name>
            <enabled>true</enabled>
            <param>
                <name>principal.mapping</name>
                <value>test1@jmfamily.com=tester,admin=admin</value>
            </param>
        </provider>
    </gateway>
    <service>
        <role>KNOXSSO</role>
        <param>
            <name>knoxsso.cookie.secure.only</name>
            <value>true</value>
        </param>
        <param>
            <name>knoxsso.token.ttl</name>
            <value>30000</value>
        </param>
        <param>
            <name>knoxsso.redirect.whitelist.regex</name>
            <value>^https:\/\/(xxxxx\.xxxxx\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
        </param>
    </service>
</topology>
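As a rough sanity check of the whitelist regex (Knox evaluates it as a Java regex, so GNU grep -E is only an approximation), something like this can confirm that a redirect URL would match; the host and path below are placeholders:

# Placeholder URL; prints 'whitelisted' if the URL matches the regex (GNU grep assumed).
echo 'https://xxxxx.xxxxx.com:8443/gateway/knoxsso/api/v1/websso' \
  | grep -E '^https:\/\/(xxxxx\.xxxxx\.com|localhost|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$' \
  && echo 'whitelisted'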
06-12-2018 06:57 AM
Hi there,
Following the document below from Hortonworks, we have configured KnoxSSO using Okta (SAML). But while accessing the Ambari web UI via Okta single sign-on, the redirect URL is unable to access the Knox endpoint. Could you please share your thoughts on troubleshooting the issue shown in the logs below?
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_security/content/ch02s09s01.html#saml_based_idp

Federation provider: pac4j
SAML IDP provider: Okta
Service provider: KnoxSSO

gateway-audit.log error:
18/06/07 17:01:39 ||2c5194ce-fb4e-4049-bdb9-dac767934214|audit|172.20.100.241|KNOXSSO||||access|uri|/gateway/knoxsso/api/v1/websso?pac4jCallback=true&client_name=SAML2Client|failure|

gateway.log:
2018-06-07 17:01:39,605 ERROR hadoop.gateway (GatewayServlet.java:service(146)) - Gateway processing failed: javax.servlet.ServletException: org.pac4j.saml.exceptions.SAMLException: Error decoding saml message
javax.servlet.ServletException: org.pac4j.saml.exceptions.SAMLException: Error decoding saml message
    at org.apache.hadoop.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:70)
    at org.apache.hadoop.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:332)
    at org.apache.hadoop.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:232)
    at org.apache.hadoop.gateway.GatewayFilter.doFilter(GatewayFilter.java:139)
    at org.apache.hadoop.gateway.GatewayFilter.doFilter(GatewayFilter.java:91)
    at org.apache.hadoop.gateway.GatewayServlet.service(GatewayServlet.java:141)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.apache.hadoop.gateway.trace.TraceHandler.handle(TraceHandler.java:51)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.apache.hadoop.gateway.filter.CorrelationHandler.handle(CorrelationHandler.java:39)
    at org.eclipse.jetty.servlets.gzip.GzipHandler.handle(GzipHandler.java:479)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.apache.hadoop.gateway.filter.PortMappingHelperHandler.handle(PortMappingHelperHandler.java:92)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.pac4j.saml.exceptions.SAMLException: Error decoding saml message
    at org.pac4j.saml.sso.impl.SAML2WebSSOMessageReceiver.receiveMessage(SAML2WebSSOMessageReceiver.java:43)
    at org.pac4j.saml.sso.impl.SAML2WebSSOProfileHandler.receive(SAML2WebSSOProfileHandler.java:35)
    at org.pac4j.saml.client.SAML2Client.lambda$clientInit$0(SAML2Client.java:110)
    at org.pac4j.core.client.BaseClient.retrieveCredentials(BaseClient.java:61)
    at org.pac4j.core.client.IndirectClient.getCredentials(IndirectClient.java:125)
    at org.pac4j.core.engine.DefaultCallbackLogic.perform(DefaultCallbackLogic.java:79)
    at org.pac4j.j2e.filter.CallbackFilter.internalFilter(CallbackFilter.java:77)
    at org.pac4j.j2e.filter.AbstractConfigFilter.doFilter(AbstractConfigFilter.java:81)
    at org.apache.hadoop.gateway.pac4j.filter.Pac4jDispatcherFilter.doFilter(Pac4jDispatcherFilter.java:220)
    at org.apache.hadoop.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:332)
    at org.apache.hadoop.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:232)
    at org.apache.hadoop.gateway.filter.XForwardedHeaderFilter.doFilter(XForwardedHeaderFilter.java:30)
    at org.apache.hadoop.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:61)
    ... 32 more
Caused by: org.opensaml.messaging.decoder.MessageDecodingException: This message decoder only supports the HTTP POST method
    at org.pac4j.saml.transport.Pac4jHTTPPostDecoder.doDecode(Pac4jHTTPPostDecoder.java:57)
    at org.opensaml.messaging.decoder.AbstractMessageDecoder.decode(AbstractMessageDecoder.java:58)
    at org.pac4j.saml.sso.impl.SAML2WebSSOMessageReceiver.receiveMessage(SAML2WebSSOMessageReceiver.java:40)
    ... 44 more
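One observation: the root "Caused by" says the decoder only supports HTTP POST, so the callback appears to be arriving as a GET. That usually means the IdP is responding with the HTTP-Redirect binding (or the callback URL is being revisited directly) instead of posting the SAML response. A first check, as a sketch, is to confirm which bindings Okta advertises for this app (metadata URL as configured above):

# Show which SSO bindings the IdP metadata advertises; expect HTTP-POST among them.
curl -s 'https://xxxxxxxx/app/exk1bs9c6clt0ttLo2p7/sso/saml/metadata' | grep -o 'Binding="[^"]*"'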
05-20-2014 11:14 AM
Hi Helen, thanks for the reply. Yes, I am using Cloudera Manager. Thanks & Regards, Vishal
05-12-2014 06:03 AM
Unhandled error
java.lang.NoSuchMethodError: twitter4j.conf.Configuration.isStallWarningsEnabled()Z
    at twitter4j.TwitterStreamImpl.<init>(TwitterStreamImpl.java:60)
    at twitter4j.TwitterStreamFactory.<clinit>(TwitterStreamFactory.java:40)
    at com.cloudera.flume.source.TwitterSource.<init>(TwitterSource.java:64)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at java.lang.Class.newInstance0(Class.java:355)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:42)
    at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:327)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Hi All,
I am using Cloudera Manager to configure Flume for Twitter analysis. After making all the required changes, I get the above error when restarting the agent.
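A possible avenue: a NoSuchMethodError like this usually points to a twitter4j version conflict on the agent's classpath, since TwitterSource is built against twitter4j 3.x (which introduced isStallWarningsEnabled) while an older 2.x jar may be loaded first. Duplicate jars can be listed with something like the following; the search paths are typical CDH locations, not confirmed for this install:

# Look for duplicate twitter4j jars the Flume agent might load; paths are assumptions.
find /usr/lib /opt/cloudera /var/lib/flume-ng -name 'twitter4j*.jar' 2>/dev/null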
Labels:
- Apache Flume
- Cloudera Manager