About khaslbeck

khaslbeck · ‎06-10-2016

Predict Stock Portfolio Gains Using Monte Carlo Why? Why create yet another VaR example? To demonstrate VaR running on a modern architecture that has no vertical limit. This is a functional, immutable, scaleable interpretation of a basic technique commonly used in finance. Code Available here and on github. https://github.com/kirkhas/zeppelin-notebooks/ link to Vlad's article for history of Monte Carlo and VaR - https://community.hortonworks.com/articles/36321/predicting-stock-portfolio-losses-using-monte-carl.html Some modifications from original posting include: scala calling Yahoo API directly, alleviating the need for shell scripting and adding interopability between variables. All data loaded dynamically in memory, removing the need to store files (which inherently adds manual customizations to a generic process). Code all in Zeppelin for readability. Visualizations in Zeppelin. Inputs built in using Zep forms so the user can interact with the model. Percentiles not only on what's at risk each day but also on final portfolio value. Figure 1.0 shows the risk you would take on per each day holding these 3 stocks. Figure 2.0 shows what a reasonable projected outcome might be after holding this position for 100 days. Checkout the code it has a lot more visuals. Key takeaway: "You should have purchased shares of HDP in mid Feb 2016!" Code View https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL2tpcmtoYXMvemVwcGVsaW4tbm90ZWJvb2tzL21hc3Rlci9Nb250ZUNhcmxvVmFyL25vdGUuanNvbg Report View https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL2tpcmtoYXMvemVwcGVsaW4tbm90ZWJvb2tzL21hc3Rlci9Nb250ZUNhcmxvVmFyL1JlcG9ydFZpZXcvbm90ZS5qc29u

khaslbeck · ‎05-26-2016

Thanks all @Artem Ervits @Tom McCuch for the comments. I did get it resolved by passing all the S3 jars properly on the classpath. The articles included in your threads helped.

khaslbeck · ‎05-26-2016

Unable to execute the queries on S3 data using SPARK and PYSPARK. It is throwing below error. : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195) at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2638) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651) …. …. Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193) we have tried it by adding below parameters but no luck. Parameter name: fs.s3a.impl Parameter value: org.apache.hadoop.fs.s3a.S3AFileSystem Added this paramter in hdfs.site.xml, core-site.xml, hive-site.xml and also added the aws jar files in mapred-site.xml (added to classpath)files.

khaslbeck · ‎05-25-2016

The final resolution was this: Ambari was only showing a "SERVER ERROR" msg on final step with no stack trace. After reading the log I saw there was a primary key constraint on the table "clusterservices". remove this row from table. Then re-install via ambari and it was successful. My hunch is that we got into this state by first trying to remove or edit a service that was already running.

khaslbeck · ‎05-25-2016

@Artem Ervits looks like you were right, once I got a hold of the logs looks like they did not stop the service first. 20 May 2016 11:05:21,732 INFO [ambari-heartbeat-processor-0] HeartbeatProcessor:603 - State of service component NODEMANAGER of service YARN of cluster HDPPOC2 has changed from UNKNOWN to STARTED at host ip-10-228-210-131 according to STATUS_COMMAND report 20 May 2016 11:05:33,535 ERROR [qtp-ambari-client-41] AbstractResourceProvider:338 - Caught AmbariException when modifying a resource org.apache.ambari.server.AmbariException: Cannot remove ZEPPELIN. Desired state STARTED is not removable. Service must be stopped or disabled. at org.apache.ambari.server.controller.internal.ServiceResourceProvider.deleteServices(ServiceResourceProvider.java:869) at org.apache.ambari.server.controller.internal.ServiceResourceProvider$3.invoke(ServiceResourceProvider.java:247) at org.apache.ambari.server.controller.internal.ServiceResourceProvider$3.invoke(ServiceResourceProvider.java:244) at org.apache.ambari.server.controller.internal.AbstractResourceProvider.invokeWithRetry(AbstractResourceProvider.java:450) 20 May 2016 11:06:52,501 ERROR [qtp-ambari-client-42] AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason: Local Exception Stack: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: org.postgresql.util.PSQLException: ERROR: update or delete on table "servicecomponentdesiredstate" violates foreign key constraint "hstcmpnntdesiredstatecmpnntnme" on table "hostcomponentdesiredstate" Detail: Key (component_name,cluster_id,service_name)=(ZEPPELIN_MASTER,2,ZEPPELIN) is still referenced from table "hostcomponentdesiredstate". Error Code: 0 Call: DELETE FROM servicecomponentdesiredstate WHERE (((cluster_id = ?) AND (component_name = ?)) AND (service_name = ?)) bind => [3 parameters bound] at org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:340)

khaslbeck · ‎05-23-2016

Removed all components related to Zeppelin from Ambari, and tried to reinstall again but everytime its failing with the error "Server error." used this cmd to remove the service. is there something else that required cleaning up? curl -u admin:admin -X DELETE -H 'X-Requested-By:1' http://10.228.210.175:80/api/v1/clusters/HDPPOC2/services/ZEPPELIN

khaslbeck · ‎05-16-2016

Results from using sqoop to move data from HAWQ to HIVE. @Artem Ervits and @cstanca HAWQ Hive Result int int worked text string worked date string write=string, onRead date operations work timestamp string write=string, onRead ts operations work bit boolean conversion does not work decimal double mostly works, precision loss > 9 double precision double works real double works interval Breaks! sqoop mapping error bit varying Breaks! sqoop mapping error time string write=string, onRead time operations work char string write=string, onRead you need wildcard expression, recommend trimming char varying string write=string, onRead holds whitespace, recommend trimming varchar string works boolean boolean works numeric double works %sh sqoop import --username zeppelin --password zeppelin --connect jdbc:postgresql://jdbcurl --query 'SELECT id,name,join_date,age,a,b,i FROM kirk WHERE $CONDITIONS' -m 1 --target-dir /user/zeppelin/kirk/t6 --map-column-java a=String,i=String,b=String -- select * select * from kirk ; -- int check between inclusive select age from kirk where age between 25 and 27; -- decimal check select dec from kirk where dec > 33.32; -- string like and wildcard select address from kirk where address like '%Rich%'; -- date is a string but operates like date select join_date from kirk where join_date between '2007-12-13' and '2007-12-15'; -- timestamp, works string on write but operates like TS select ts from kirk where ts > '2016-02-22 08:01:22' -- BIT NOT CORRECT select a from kirk where a =false or a = 1 -- character varying, without white space matches select cv from kirk where cv = 'sdfsadf'; -- character varying, with white space select cv from kirk where cv = 'white space'; -- not matching select cv from kirk where cv = 'white space '; -- matching -- character, doesn't match unless wildcard select c from kirk where c like 'we%'; -- boolean, both true/false and 1/0 are converted properly select id, isactive from kirk where isactive = true or isactive = 0

khaslbeck · ‎05-12-2016

Great feature @bbende . This is much easier to visualize now.

khaslbeck · ‎05-12-2016

If the case where NiFi is reading from 30 database tables in a single flow what is the best way to visually identify which processor is connecting to each database and table?

khaslbeck · ‎05-11-2016

You can now visualize any Zeppelin notebook using Zeppelinhub viewer. https://www.zeppelinhub.com/viewer personal likes: 1. No need to sign up or register just paste a link 2. I've been posting my zeppelin notebooks to github but everyone that wants to visualize them or interact with them needs to download, move to environment, import into their instance of zeppelin. Not anymore just paste the link. 3. Less of a need to take screenshots and create a powerpoint just send the hyperlink examples: Stock Variance Notebook github - https://github.com/kirkhas/zeppelin-notebooks/blob/master/stock-variance/note.json vs Stock Variance Notebook zephub - https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL2tpcmtoYXMvemVwcGVsaW4tbm90ZWJvb2tzL21hc3Rlci9zdG9jay12YXJpYW5jZS9ub3RlLmpzb24 Credit Card Fraud Transactions git - https://github.com/vakshorton/CreditCardTransactionMonitor/blob/master/Zeppelin/notebook/2BGDWYZV9/note.json vs https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL3Zha3Nob3J0b24vQ3JlZGl0Q2FyZFRyYW5zYWN0aW9uTW9uaXRvci9tYXN0ZXIvWmVwcGVsaW4vbm90ZWJvb2svMkJHRFdZWlY5L25vdGUuanNvbg

Online	Offline
Last Visited	‎08-02-2018 08:10 PM

Member Since	‎02-23-2016 02:08 AM
Last Visited	‎08-02-2018 08:10 PM
Posts	51
Kudos received	90

Cloudera Community

Re: Remove Zeppelin from Ambari then Add it back

Re: HAWQ to HIVE data type mapping

Re: Storage data in HDFS - What's next?

Re: HDP-2.3.4.0-3485 upgrade failed due to HDP-UTI...

Predicting Stock Portfolio Gains using Monte Carlo...

Re: Spark on S3

Spark on S3

Re: Remove Zeppelin from Ambari then Add it back

Re: Remove Zeppelin from Ambari then Add it back

Remove Zeppelin from Ambari then Add it back

Re: HAWQ to HIVE data type mapping

Re: What is the best way to organize many NiFi Pro...

What is the best way to organize many NiFi Process...

Zeppelinhub Viewer