
Money management and retirement planning are extremely important issues. Conventional wisdom says to hire a financial advisor who will invest your money and help you reach your financial goals. Recently, however, financial institutions have begun to offer services generally referred to as Digital Advisors or "Robo Advisors". These services aim to offer very low-cost, broad-spectrum, long-term investment advice across a limited set of financial products. The advice is personalized based on basic input from the customer, such as risk tolerance, initial investment, and years to invest. The basic premise behind the offering is diversification. Statistics have shown that few money managers are able to beat the S&P 500 over the long term, especially after fees. The Digital Advisor aims to spread the initial investment across a broad spectrum of securities at a much lower cost.

The Digital Advisor provides another major advantage over the human financial advisor. Using distributed processing and Monte Carlo simulation, the Digital Advisor can project the returns of the recommended portfolio and show a confidence range for the future portfolio value. Depending on its level of sophistication, the Digital Advisor could use the same methodology to re-run the projections periodically and determine whether the portfolio is on track. If not, the portfolio can be adjusted to address unforeseen market conditions.

The financial industry has been using the Monte Carlo method to project investment returns and evaluate risk for some time. The basic principle behind the Monte Carlo method is to establish a model (a mathematical equation) that represents the target scenario. The model should include a representation of the range of possible outcomes, which is typically derived from historical performance and current market data. The next step is to develop a pseudo-random number generator that creates data to plug into the model. The generator is constrained so that it only produces numbers that fit the range of outcomes represented in the model. The last step is to generate pseudo-random numbers and plug them into the model over and over again. The idea is that the outcomes that are most likely according to the model will occur most frequently. Thus, the Monte Carlo simulation can project the likely outcome of an investment strategy.
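
The three steps described above (model, generator, repetition) can be sketched in a few lines of Python. This is a minimal illustration and not the notebook's code: the normally distributed return, the 7% mean, the 15% standard deviation, and all function names are assumptions chosen for the example.

```python
import random
from collections import Counter

# Step 1: the model. Assumed here: next year's value is the current value
# grown by a single annual return.
def model(value, annual_return):
    return value * (1.0 + annual_return)

# Step 2: the generator. Assumed here: returns follow a normal distribution
# whose mean and standard deviation would come from historical data.
def draw_return(rng, mean=0.07, stdev=0.15):
    return rng.gauss(mean, stdev)

# Step 3: plug generated values into the model over and over again.
def simulate(initial_value, trials, seed=42):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [model(initial_value, draw_return(rng)) for _ in range(trials)]

def histogram(outcomes, bucket_size=1000):
    # Count outcomes per bucket; each bucket is labelled by its lower bound.
    return Counter(int(v // bucket_size) * bucket_size for v in outcomes)

outcomes = simulate(10_000, trials=20_000)
most_common_bucket, count = histogram(outcomes).most_common(1)[0]
```

The bucket that collects the most outcomes is the one the model considers most likely, which is the sense in which repetition reveals the likely result.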

Let's take a look at an example of a simplified Digital Advisor implementation.

This demonstration is implemented as an Apache Zeppelin notebook using Apache Spark on the Hortonworks Data Platform. Apache Spark is well suited to Monte Carlo simulation: it has a very user-friendly API, it can spread complex computations across the processors of many hosts, and it can reduce completion time by caching intermediate result sets in memory. Apache Zeppelin is a great complement to Spark. Zeppelin provides a clean and simple interface that can leverage Spark SQL to visualize complex result sets.
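
Monte Carlo runs are independent of one another, which is why they spread well across many processors. The sketch below shows the idea in plain Python, without Spark: the requested runs are split into batches, each batch gets its own seed, and the partial results are merged. On a cluster, the built-in `map` would be replaced by a distributed one (for example, a map over a Spark RDD). The batch layout, seeds, one-year model, and function names are assumptions for illustration only.

```python
import random

def run_batch(batch):
    """Run one batch of trials; batch is a (seed, trial_count) pair."""
    seed, trial_count = batch
    rng = random.Random(seed)  # one generator per batch, so batches share no state
    # Hypothetical one-year model: 10,000 grown by a normally distributed return.
    return [10_000 * (1.0 + rng.gauss(0.07, 0.15)) for _ in range(trial_count)]

def make_batches(total_trials, batch_count, base_seed=0):
    """Split total_trials into batch_count batches with distinct seeds."""
    size, remainder = divmod(total_trials, batch_count)
    return [(base_seed + i, size + (1 if i < remainder else 0))
            for i in range(batch_count)]

batches = make_batches(total_trials=10_000, batch_count=8)
# Plain map runs the batches one after another; a distributed map
# would hand each batch to a different worker.
partial_results = map(run_batch, batches)
all_outcomes = [value for part in partial_results for value in part]
```

Giving each batch a distinct seed keeps the batches from repeating one another's random sequence, which would otherwise bias the combined result.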

The first two sections create a form for user input and gather historical data from Yahoo. This data is required to create the model that the simulation is based on. Next, the Digital Advisor generates a portfolio based on the user's indicated risk tolerance. Real-world portfolio selection can be extremely involved. For the purposes of this demonstration, the implementation only ensures that the user's risk tolerance is honored.

Finally, the model that underpins the simulation is created. The time scale is adjusted to work with annual rather than daily returns. Each instrument in the portfolio is assigned a likely range of return based on its historical standard deviation. When the simulation starts, the algorithm calculates the annual rate of return for each portfolio instrument by plugging in a pseudo-random value within the standard deviation. The value of each instrument is then derived from that rate of return. This step is repeated for the number of years requested by the user, keeping track of the total value year over year. The result of this entire process represents a single possible final outcome for the portfolio.

To have any confidence in the prediction, it is necessary to repeat the entire process many times. Each repeated simulation provides a new possible outcome. When the requested number of simulations is complete, Zeppelin leverages Spark SQL to create visualizations of each simulated portfolio path and of the possible range and confidence of the final portfolio value.
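
The simulation loop described above can be sketched as follows. This is a plain-Python approximation under stated assumptions, not the notebook's Spark code: it assumes 252 trading days per year, scales the daily standard deviation by the square root of time (which presumes independent daily returns), and reads "within the standard deviation" as a uniform draw within one standard deviation of the mean. The instrument names and daily statistics are made up.

```python
import math
import random

TRADING_DAYS = 252  # assumed number of trading days per year

def annualize(daily_mean, daily_stdev):
    """Scale daily return statistics to annual ones (assumes independent daily returns)."""
    return daily_mean * TRADING_DAYS, daily_stdev * math.sqrt(TRADING_DAYS)

def simulate_path(holdings, stats, years, rng):
    """Return the total portfolio value at year 0..years for one simulated outcome.

    holdings: {symbol: starting value}; stats: {symbol: (annual_mean, annual_stdev)}.
    """
    values = dict(holdings)
    path = [sum(values.values())]
    for _ in range(years):
        for symbol, (mean, stdev) in stats.items():
            # Assumption: draw the return uniformly within one standard deviation of the mean.
            annual_return = rng.uniform(mean - stdev, mean + stdev)
            values[symbol] *= 1.0 + annual_return
        path.append(sum(values.values()))
    return path

def percentile(sorted_values, fraction):
    """Simple index-based percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]

# Hypothetical two-instrument portfolio with made-up daily statistics.
holdings = {"STOCK_FUND": 6_000.0, "BOND_FUND": 4_000.0}
stats = {
    "STOCK_FUND": annualize(0.0004, 0.010),
    "BOND_FUND": annualize(0.0001, 0.003),
}

rng = random.Random(7)  # fixed seed so the run is repeatable
paths = [simulate_path(holdings, stats, years=20, rng=rng) for _ in range(5_000)]
final_values = sorted(path[-1] for path in paths)
low, median, high = (percentile(final_values, f) for f in (0.05, 0.50, 0.95))
```

Each entry in `paths` is one simulated portfolio path, and `low`, `median`, and `high` give one way to express the range and confidence of the final value: under these assumptions, about 90% of the simulated outcomes fall between `low` and `high`.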


Hi, this link in the article

isn't working. Is there an alternate resource or link? Thanks.

Version history
Last update:
‎10-02-2016 04:37 PM