Monday, 21 March 2016

Machine Learning Algorithms in Stock Market - Part II

Thank you for reading my LinkedIn posts and for showing interest in the technical details of Markov chain modeling.

Traders and analysts use several different approaches to prediction, based on fundamental analysis, technical analysis (TA), psychological analysis, and so on. The technical analysis paradigm holds that the information relevant to price and volume is contained in the market price itself. Researchers and traders take metrics and indicators from fundamental or technical analysis as their features, and apply machine learning algorithms or statistical models such as linear regression, logistic regression, and decision trees to predict prices.

In this post, I would like to share my experience of predicting the close price range (not the exact close price) from the previous days' historical OHLC data using Markov chain modeling. Many researchers have used this technique in the past to predict stock price movement based on three states, viz., Up, Down, and Same: the stock closes higher than the previous day's close, lower than it, or at the same level.
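To make the traditional three-state approach concrete, here is a minimal Python sketch (not the R code used for this post). It labels each day as Up, Down, or Same and estimates the transition probabilities by counting consecutive pairs. The function names and the closing prices are made up for illustration only.

```python
from collections import Counter

STATES = ("Up", "Down", "Same")

def label_moves(closes):
    """Label each day's close relative to the previous day's close."""
    labels = []
    for prev, curr in zip(closes, closes[1:]):
        if curr > prev:
            labels.append("Up")
        elif curr < prev:
            labels.append("Down")
        else:
            labels.append("Same")
    return labels

def transition_matrix(labels):
    """Estimate P(next state | current state) by counting consecutive label pairs."""
    counts = Counter(zip(labels, labels[1:]))
    matrix = {}
    for s in STATES:
        row_total = sum(counts[(s, t)] for t in STATES)
        # A state never observed as a starting point gets an all-zero row.
        matrix[s] = {t: (counts[(s, t)] / row_total if row_total else 0.0)
                     for t in STATES}
    return matrix

# Made-up closing prices, for illustration only (not real market data).
closes = [100.0, 101.5, 101.5, 99.0, 100.2, 102.0, 101.0, 101.0, 103.0]
labels = label_moves(closes)
matrix = transition_matrix(labels)

for state, row in matrix.items():
    print(state, row)
```

With so few observations the estimated probabilities are very noisy; a real fit would need a much longer history.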

I have defined a new model based on new transition states (not the traditional ones). These states are derived from the historical OHLC levels of the stock as well as its volume, and are used within a Markov chain process. I am working on the next steps of refining the model, and I am even considering whether I can patent it.

For example, please see the illustrations below.

I took a sample of Strides Shasun Limited (STAR) from NSE and BSE (thanks to NSE and BSE for providing daily Bhavcopy data) for a specific date range, i.e., from 1st Feb 2016 to 17th March 2016, and calculated the Pivot Point, Resistance 1, Resistance 2, and Resistance 3. I also defined the transition state for each day in a CSV file, as per the definition given above.
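The post does not say which pivot point variant was used. As an illustration only, the sketch below assumes the classic floor-trader formulas, computed from the previous day's high, low, and close. The function name and the input prices are made up.

```python
def pivot_levels(high, low, close):
    """Classic floor-trader pivot point and resistance levels.

    Assumption: the post does not state which formula variant it used;
    these are the widely used classic definitions.
    """
    pivot = (high + low + close) / 3.0
    r1 = 2 * pivot - low          # Resistance 1
    r2 = pivot + (high - low)     # Resistance 2
    r3 = high + 2 * (pivot - low) # Resistance 3
    return {"P": pivot, "R1": r1, "R2": r2, "R3": r3}

# Made-up previous-day prices, for illustration only.
levels = pivot_levels(high=110.0, low=100.0, close=105.0)
print(levels)
```

Each day's levels are calculated from the prior day's prices, so they are known before the market opens.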

The attached R code snippet calculates the Markov chain fit for these transition states and the expected high price range for 18th March 2016.

As highlighted in blue, the expected transition state for 18th March 2016 is R1, as its probability is 0.5 (i.e., 50%). The high price for STAR on 18th March 2016 can therefore be expected within the range of 1042.58 to 1062.66, as highlighted in maroon in the snapshot.
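The step from a fitted chain to a price range can be sketched in Python as follows. Only the probability of 0.5 for R1, the R1 range, and the actual highs reported further down are taken from this post. The other state names and probabilities are placeholders chosen so that the row sums to 1, and the function name is hypothetical.

```python
def most_likely_next_state(row):
    """Return the (state, probability) pair with the highest probability in a row."""
    state = max(row, key=row.get)
    return state, row[state]

# Transition probabilities out of the current state.
# Only the 0.5 for "R1" comes from the post; the rest are placeholders.
row = {"P": 0.2, "R1": 0.5, "R2": 0.2, "R3": 0.1}

# Price range attached to each state. Only the "R1" range is from the post.
ranges = {"R1": (1042.58, 1062.66)}

state, prob = most_likely_next_state(row)
low, high = ranges[state]
print(f"Expected state: {state} (p={prob}), high expected in [{low}, {high}]")

# Actual highs for STAR on 18th March 2016, as reported in the post.
actual_highs = {"NSE": 1043.3, "BSE": 1043.0}
hits = {exchange: low <= value <= high for exchange, value in actual_highs.items()}
print(hits)
```

Picking the single most probable state is the simplest choice; at 50% the other states together are just as likely, so the range is an expectation rather than a guarantee.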


On 18th March 2016, the actual high for STAR was 1043.3 on NSE and 1043 on BSE, both within the expected range.

Using similar transition states (defined by me), I have been able to build a model in R that predicts the closing range of a specific stock based on the previous day's data as well as historical data.

The Markov process on its own may not be powerful enough to predict the closing price accurately. It is important to define an appropriate classification of states, to characterise each state (transient or closed), and to ensure that the Markov chain is composed of several transient states and a few closed states. A Markov model also contains a random component, so the state of the system at any point in time is not wholly determined by the previous event or events. When applied to the stock market, which is volatile from minute to minute, this aspect of the model does not produce the desired results. Many researchers and analysts fail at defining the states because they lack the necessary business knowledge or make inappropriate assumptions about the data.
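One way to check how the states of a fitted chain are characterised is sketched below in Python. It assumes that "closed" here means recurrent, i.e., the state belongs to a closed communicating class that the chain cannot leave. In a finite chain, a state is recurrent exactly when every state it can reach can also reach it back. The three-state matrix is a toy example, not one of my transition states.

```python
def reachable(matrix, start):
    """States reachable from `start` in zero or more steps with positive probability."""
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for t, p in matrix[s].items():
            if p > 0 and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def classify_states(matrix):
    """Label each state of a finite Markov chain as 'recurrent' or 'transient'."""
    reach = {s: reachable(matrix, s) for s in matrix}
    result = {}
    for s in matrix:
        # Recurrent iff every state reachable from s can reach s again.
        recurrent = all(s in reach[t] for t in reach[s])
        result[s] = "recurrent" if recurrent else "transient"
    return result

# Toy chain, for illustration only: A and B eventually drain into C.
toy = {
    "A": {"A": 0.5, "B": 0.5, "C": 0.0},
    "B": {"A": 0.0, "B": 0.5, "C": 0.5},
    "C": {"A": 0.0, "B": 0.0, "C": 1.0},
}
kinds = classify_states(toy)
print(kinds)
```

If a fitted chain shows every state as recurrent, or has a state that absorbs everything, that is a hint the state definitions need another look.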

However, when this technique is applied to minute-level data, takes into account any break of a support or resistance level, and uses the right transition states, it can indicate the closing price range of a particular day well in advance.
The chart below shows the findings of the observation study (the percentages are not to be taken as actual values).
If the closing price range changes, traders can either convert to a delivery position if the expected loss is large, or strategize in such a way that they do not make a loss on the intraday position.

Finally, I want to mention that readers should not apply these models or their findings and outcomes directly in actual trading or spend money based on them. The blog posts cover only a snippet of the actual model, and nothing here (stocks, price movements, or otherwise) should be treated as a trading recommendation.

Thursday, 18 February 2016

Machine Learning Algorithms in Stock Market - Part I


Between the day the new Modi government took the oath of office and the recent market crash (i.e., 15th February 2016), the market went through a roller coaster journey, with the Sensex touching the 29300+ mark and the Nifty the 9000+ mark, before falling to a level slightly lower than where it stood when the Narendra Modi-led government came to power.

For people who entered the market (through direct equity or mutual fund purchases) around the 29000+ or 9000+ mark, and for new entrants to the market, this volatile behaviour has pressed the panic button. A recent news article reveals that some amateur investors have stopped paying their SIPs and have formed the opinion that “Stock market is unpredictable; it’s a gamble; better to stay away from it”.

The reality everyone should understand is that if somebody loses on a trade (whether buying or selling), then somebody else gains on the same trade, because exchanges do not hold any stocks. The stock market is not a gamble; trading has both an art and a science component. Researchers and many analysts have been working on various techniques for predicting the stock market in terms of price movement, trend, crashes, etc., using various parameters.

This post talks about the science element of trading. It focuses on the use of advanced technologies such as big data, machine learning, and data science / mining, on their applications in the stock market, and on the challenges in achieving accurate predictions of price movement or trend.

Applications of machine learning and big data in the stock market include (but are not limited to):

·        Stock Market Prediction

o   Price Movement – Stock, Options, Futures

o   Trend / Direction

o   Volume Prediction

o   Momentum

o   F&O sentiments on Equity movement

·        High Frequency Trading Algorithms

·        High Frequency Trading Simulators

·        Prediction of stock market Crash and associated events / news

·        Traders Sentiment Psychology during the Trade cycle

·        Integration of Social media & News feed to predict market behaviors

·        Maintenance & retrieval of historical data

·        Etc. :)

Let me share my experience of the challenges in achieving accuracy in such predictions of price movement and trend/direction in day trading.

1.     Existence of many factors influencing the stock market: There are many underlying factors that have a strong influence on the movement of stock / derivative prices and the associated indices, such as economic growth, inflation, other related markets (say the US, UK, Japanese, and Chinese markets), national events (budgets, election results, government performance and announcements, bills), and global events (US Fed hikes, oil prices, economic growth results of other nations).

2.     Availability of historical data: Historical price and volume data of the stock market provides detailed patterns and trends for prediction. If you observe closely, on most days the prediction holds in only one direction, i.e., the predicted low is reached but not the high or the close. This could be because certain events caused a downtrend, yet the critical historical data on such events may not be available to define the underlying patterns, for example:

a.     Historical Minute Level data

b.    Event data (for example, for May 16, 2014, the stock price and volume data might be available, but the event data, in this case the announcement of the Lok Sabha election results, might not be available for prediction)

c.      Global Market news / feed on respective dates

3.     Need for real-time integration: Based on yesterday's close price and traded volume, you can predict the stock price to some extent. However, events such as a gap up or gap down, or the breaking of a support/resistance level, can set off a different pattern, which needs to be factored into a revised prediction for that particular day. Hence there is a strong need for real-time integration and for appropriate advanced techniques to achieve it.

4.     Usage of hybrid models: One size does not fit all; no single model (logistic regression, SVM, random forest, decision tree, cointegration, etc.) fits the entire stock prediction cycle, which includes:

a.     Stock selection

b.    Stock Price Prediction

c.      Stock Buy vs Sell direction

Hence it is important to adopt hybrid machine learning techniques and integrate them in a more structured approach.

5.     Back testing and continuous improvement: Any analytical model requires back testing and continuous improvement to raise its accuracy. For stock market prediction this is even more critical, and it should be performed on an appropriate sample of stocks (index stocks, F&O stocks, penny stocks, etc.) and derivatives (futures and options) before the corresponding hybrid analytical models are confirmed. Even the availability and preparation of data for back testing can be challenging.

6.     Overconfidence in the model: Most analysts and predictors are overconfident in their model and focus only on the prediction methodologies/models, rather than staying cognizant of market sentiment and the wisdom of moving with the market trend.
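Going back to challenge 3 above, the kind of intraday event that should trigger a revised prediction can be detected with very little code. The Python sketch below uses one common convention: a gap up is an open above the previous day's high, and a gap down is an open below the previous day's low. Other conventions compare the open with the previous close instead. The function names and prices are made up for illustration.

```python
def classify_open(prev_high, prev_low, today_open):
    """Classify today's open against yesterday's range (full-gap convention)."""
    if today_open > prev_high:
        return "gap up"
    if today_open < prev_low:
        return "gap down"
    return "no gap"

def broke_resistance(last_price, resistance):
    """True once the latest traded price is above the resistance level."""
    return last_price > resistance

def broke_support(last_price, support):
    """True once the latest traded price is below the support level."""
    return last_price < support

# Made-up prices, for illustration only.
print(classify_open(prev_high=105.0, prev_low=100.0, today_open=106.0))
print(broke_resistance(last_price=110.5, resistance=110.0))
```

In a real-time setup, checks like these would run on every incoming tick or minute bar, and any change in their result would be the signal to recompute the day's prediction.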

Many people in the market strongly believe that the stock market cannot be predicted, while data scientists and statisticians claim to have achieved close to 80-90% accuracy in theory (and sometimes prove it programmatically or statistically). Those who believe the stock market cannot be predicted argue that if analysts could predict stock movement with around 80-90% accuracy, they should have made corresponding profits and be rich by now.

Where is the issue? In the coming posts, I plan to share more detail on this and on how big data techniques and machine learning models/algorithms can be leveraged to address it.

To be continued…