Thursday, 18 February 2016

Machine Learning Algorithms in Stock Market - Part I


Between the day the new Modi government took the oath of office and the recent market crash (15 February 2016), the market went on a roller-coaster ride: the Sensex touched 29,300+ and the Nifty 9,000+, and the market then fell back to a level slightly lower than where it stood when the Narendra Modi-led government came to power.

For people who entered the market (through direct equity or mutual fund purchases) around the 29,000+ or 9,000+ mark, and for new entrants to the market, this volatile behaviour has pressed the panic button. A recent news article reveals that some amateur investors have stopped paying their SIPs and have formed the opinion that “Stock market is unpredictable; it’s a gamble; better to stay away from it”.

The reality everyone should understand is that if somebody is losing on a trade (whether by buying or selling), then somebody else is gaining on the same trade, as exchanges do not hold any stocks. The stock market is not a gamble; trading has both an art and a science component. Researchers and many analysts have been working on various techniques for predicting the stock market in terms of price movement, trend, crashes etc., using various parameters.

This post talks about the science element of trading. It focuses on the use of advanced technologies such as Big Data, Machine Learning and Data Science / Mining, the applications of such technologies in the stock market, and the challenges in obtaining accuracy in stock market predictions of price movement or trend.

Applications of Machine Learning & Big Data in the stock market include (but are not limited to):

·        Stock Market Prediction

o   Price Movement – Stock, Options, Futures

o   Trend / Direction

o   Volume Prediction

o   Momentum

o   F&O sentiments on Equity movement

·        High Frequency Trading Algorithms

·        High Frequency Trading Simulators

·        Prediction of stock market Crash and associated events / news

·        Traders Sentiment Psychology during the Trade cycle

·        Integration of Social media & News feed to predict market behaviors

·        Maintenance & retrieval of historical data

·        Etc.

Let me share my experience of the challenges in obtaining accurate predictions of price movement and trend/direction in day trading.

1.     Existence of many factors influencing the stock market: There are many underlying factors that have a strong influence on the movement of stock / derivative prices and the associated indices, such as economic growth, inflation, other related markets (say the US, UK, Japanese and Chinese markets), national events (budgets, election results, government performance and announcements, bills) and global events (US Fed hikes, oil prices, economic growth results of other nations).

2.     Availability of Historical Data: Historical price and volume data of the stock market provide the detailed patterns and trends for predictions. If you observe, on most days the prediction seems to be fine in one direction only, i.e., the predicted low would have been achieved but not the high or the close. This could be because certain events caused a downtrend, while the critical historical data on such events, needed to define the underlying patterns, was not available. The data in question includes:

a.     Historical Minute Level data

b.    Event data (for example, for May 16, 2014, the stock price & volume data might be available, but the event data for that day, in this case the announcement of the Lok Sabha election results, may or may not be available for prediction)

c.      Global Market news / feed on respective dates

3.     Need for Real-time Integration: Based on yesterday’s close price and traded volume, you can predict the stock price to some extent. However, events such as a gap up / gap down or the breaking of a support/resistance level can set off a different pattern, which needs to be factored into a revised prediction for that particular day. Hence there is a strong need for real-time integration and for the use of appropriate advanced techniques to achieve it.

4.     Usage of hybrid models: One size does not fit all; hence one specific model (logistic regression, SVM, random forest, decision tree, cointegration etc.) does not fit the entire stock prediction cycle, which includes:

a.     Stock selection

b.    Stock Price Prediction

c.      Stock Buy vs Sell direction

Hence it is important to adopt hybrid machine learning techniques and integrate them in a more structured approach.

5.     Back Testing & Continuous Improvement: Any analytical model requires back testing & continuous improvement to improve its accuracy. For stock market prediction this is even more critical, and it has to be performed on an appropriate sample of stocks (index stocks, F&O stocks, penny stocks etc.) & derivatives (futures & options) before confirming the corresponding hybrid analytical models. Even the availability of data and the data preparation for back testing can be challenging.

6.     Over confidence on the Model: Most analysts or predictors tend to be overconfident in their model and focus only on the prediction methodologies/models, rather than staying cognizant of market sentiment and the wisdom of being with the market trend.
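To make point 3 above a little more concrete, here is a minimal sketch in Python of how a gap up / gap down and a support/resistance break could be flagged so that a prediction can be revised during the day. The `Bar` structure, the function names, the 0.5% gap threshold and the sample prices are all assumptions made up for illustration; they are not market conventions or real data.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One trading day's prices (hypothetical structure for illustration)."""
    open: float
    high: float
    low: float
    close: float


def classify_open(prev: Bar, today_open: float, threshold: float = 0.005) -> str:
    """Label today's open relative to yesterday's close.

    threshold is an assumed cutoff (0.5%), chosen only for this example.
    """
    change = (today_open - prev.close) / prev.close
    if change >= threshold:
        return "gap_up"
    if change <= -threshold:
        return "gap_down"
    return "flat"


def broke_level(price: float, support: float, resistance: float) -> str:
    """Report whether a live price has moved outside the support/resistance band."""
    if price > resistance:
        return "above_resistance"
    if price < support:
        return "below_support"
    return "inside_range"


# Made-up numbers, not real market data.
yesterday = Bar(open=100.0, high=104.0, low=98.0, close=102.0)
print(classify_open(yesterday, 103.5))
print(broke_level(97.5, support=98.0, resistance=104.0))
```

In a real-time setup, labels like these would be one extra input to whichever model produces the revised prediction for the day; how they are used is left open here.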
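The back testing mentioned in point 5 above can be sketched as a simple walk-forward loop: train on a window of past days, predict the next day, then slide the window forward. This is only an illustration under stated assumptions. The function names are hypothetical, the "model" is a naive placeholder that repeats the last observed direction, and the closing prices are made up.

```python
def walk_forward_splits(n_rows, train_size, test_size):
    """Yield (train_indices, test_indices) pairs in time order, without overlap
    between a training window and the test days that follow it."""
    start = 0
    while start + train_size + test_size <= n_rows:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


def naive_direction_model(train_closes):
    """Placeholder model (an assumption): predict +1 (up) if the last move
    in the training window was up or flat, otherwise -1 (down)."""
    return 1 if train_closes[-1] >= train_closes[-2] else -1


def backtest(closes, train_size=5, test_size=1):
    """Return the fraction of test days whose direction was predicted correctly."""
    hits = 0
    total = 0
    for train, test in walk_forward_splits(len(closes), train_size, test_size):
        train_closes = [closes[i] for i in train]
        prediction = naive_direction_model(train_closes)
        for i in test:
            actual = 1 if closes[i] >= closes[i - 1] else -1
            hits += prediction == actual
            total += 1
    return hits / total if total else 0.0


# Made-up closing prices, not real market data.
sample_closes = [100, 101, 102, 101, 103, 104, 103, 102, 104, 105]
print(backtest(sample_closes))
```

The point of the walk-forward layout is that each prediction only ever sees data from before the day being predicted. The same loop can be repeated across index stocks, F&O stocks and penny stocks to see whether a model holds up on each sample.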

Many people in the market strongly believe that the stock market is something that can’t be predicted. Data scientists and statisticians, however, claim that they have predicted it with close to 80-90% accuracy theoretically (and sometimes even prove it programmatically or statistically). People who believe that the stock market cannot be predicted argue that if analysts could predict stock movements with around 80-90% accuracy, then they should have made profits (80-90%) and should be rich by now.

Where is the issue? In the coming posts, I plan to share the details on this, and how Big Data techniques and machine learning models/algorithms can be leveraged to address it.

To be continued…