In asset management, sentiment analysis deals with evaluating the opinions of market participants. It is derived from the field of behavioural finance and attempts to understand investors’ psychology in order to draw inferences about the behaviour of market participants.

Sentiment can be understood as the aggregate of all investors’ opinions on a market, a sector or a specific asset. Alongside classic methods such as technical or fundamental analysis, sentiment analysis offers a useful supplement for identifying yield drivers. Sentiment analysis is a relatively new field of research and application in asset management. However, due to the rise of social media and new methods such as natural language processing, it has gained considerably in relevance in the recent past.

Sentiment Indicators

Numerous empirical studies show that investor sentiment is a good indicator of short-term price developments, as emotions have a strong influence on human decision making. Since the 1990s, market sentiment indicators have therefore been available that try to reflect the sentiment of investors towards an entire market, such as the S&P 500. We would like to briefly touch on three such heuristics:

  • VIX
  • Put/Call Ratio
  • Safe Havens

VIX

The CBOE Volatility Index (VIX) measures the implied volatility of options on the US S&P 500 Index. Investors buy options to hedge their portfolios. If they expect volatility in the market to increase, demand for this protection grows, options become more expensive and the index rises. The VIX therefore gives a good indication of the level of fear in the market. A high VIX value indicates that investors expect large price fluctuations; a low VIX value indicates that they expect small ones. This is why the VIX is also called the fear barometer.

Put/Call Ratio

The Put/Call Ratio offers a similar approach. It measures the ratio of put options to call options, typically based on trading volume or open interest. If there are more puts than calls (a ratio above 1), investors tend to expect the market to decline, and vice versa.
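
The calculation itself is simple and can be sketched in a few lines of Python. This is only an illustration: the function name put_call_ratio and the volume figures are assumptions made for this example, and the sketch assumes the ratio is computed on traded volume.

```python
def put_call_ratio(put_volume: float, call_volume: float) -> float:
    """Return the put volume divided by the call volume."""
    if call_volume <= 0:
        raise ValueError("call_volume must be positive")
    return put_volume / call_volume


# Hypothetical daily volumes, for illustration only (not real market data).
ratio = put_call_ratio(put_volume=1_200_000, call_volume=1_000_000)
print(f"Put/Call ratio: {ratio:.2f}")  # Put/Call ratio: 1.20
```

A value above 1, as in this made-up example, would point to a rather pessimistic mood; a value below 1 to a rather optimistic one.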

Safe Havens

When investors are anxious, they tend to shift their money into supposedly safe assets such as precious metals, government bonds or relatively stable currencies. Strong price increases in such assets often result from investors’ fear that the stock market could go down.

Limitations of such heuristics

Such heuristics allow a quick overview of the general sentiment in the market. However, this is also their main limitation: they only describe the market as a whole. If you want to obtain detailed information on individual sectors or specific assets, you have to use more complex methods.

NLP in Asset Management

State-of-the-art methods from the field of artificial intelligence make it possible to draw conclusions about future price developments from investors’ sentiment towards whole markets, single sectors or specific assets. A simple introduction to the topic of artificial intelligence can be found in our blog post Artificial Intelligence for Beginners. Through the automated analysis of social media and forum posts, ad hoc news, annual reports or other text and voice data, we now have an immense variety of relevant sources at our disposal. This makes sentiment analysis an increasingly important instrument in asset management. However, machine learning models operate exclusively on numerical values, so the information must first be represented in numerical form before it can be processed at all. How does that work? This is where natural language processing comes into play. We introduce you to the most common methods.

What is NLP?

Natural Language Processing (NLP) is the automatic analysis and representation of text and speech. Among other things, NLP can be used to identify positive and negative statements in documents in order to derive sentiment indicators. Basically, one can differentiate between methods that examine lexical structures and methods that examine semantic structures.

Lexical features

Bag of Words

The simplest NLP approach is the bag of words method. After an initial preparation of the text, this approach captures the frequency of each word in a document.
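
A minimal sketch of this idea in Python, using only the standard library. The function name bag_of_words, the example sentence and the very simple preparation step (lower-casing and splitting into alphabetic tokens) are assumptions chosen for illustration; real pipelines usually prepare the text more thoroughly.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text, split it into word tokens and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Made-up example sentence.
doc = "Profits rise as costs fall and profits beat forecasts"
counts = bag_of_words(doc)
print(counts.most_common(1))  # [('profits', 2)]
```

The resulting counts are already a numerical representation of the document, although the order of the words is lost.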

n-gram

In this approach, not only the individual word but also the context in which it appears can be taken into account. For example, a bi-gram analyzes not one word but two consecutive words. The n in n-gram stands for the number of consecutive words that are considered together. Why does this make it easier to understand the context? Let us look at a simple example: if you only looked at the word “bank” in the sentence “I’m going to deposit money at the bank”, it would be difficult to infer the meaning of the word. After all, “bank” can refer both to the edge of a river and to a financial institution, depending on the context. However, if you also consider the nearby word “money”, you can better infer the intended meaning.
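
To make this concrete, here is a small sketch that builds n-grams from the example sentence. The helper name ngrams is an assumption made for this example. Note that “money” and “bank” are not direct neighbours in the sentence, so a bi-gram alone does not link them; a larger n is needed to capture both words in one unit.

```python
def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return all runs of n consecutive tokens, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "i am going to deposit money at the bank".split()

bigrams = ngrams(tokens, 2)
print(bigrams[-1])    # ('the', 'bank')

fourgrams = ngrams(tokens, 4)
print(fourgrams[-1])  # ('money', 'at', 'the', 'bank')
```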

Dictionaries

Another possibility is to define a dictionary with words and their corresponding associations. This way words can be linked with positive, neutral or negative associations. Either a separate dictionary can be defined for the task or existing, industry-specific dictionaries can be used.
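
A dictionary-based score could look roughly like the following sketch. The word list, its weights (+1 for positive, -1 for negative, 0 for everything else) and the function name dictionary_score are purely hypothetical and only serve to illustrate the principle; they do not come from an existing dictionary.

```python
import re

# Hypothetical mini-dictionary, invented for this example.
SENTIMENT_DICTIONARY = {
    "gain": 1, "growth": 1, "strong": 1,
    "loss": -1, "weak": -1, "decline": -1,
}


def dictionary_score(text: str, dictionary: dict[str, int] = SENTIMENT_DICTIONARY) -> int:
    """Sum the dictionary weights of all words; unknown words count as neutral (0)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(dictionary.get(token, 0) for token in tokens)


print(dictionary_score("Strong growth offsets a small loss"))  # 1
```

A positive sum suggests a predominantly positive text, a negative sum a predominantly negative one.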

Limitations of lexical features

Approaches based on lexical features can be very helpful in certain application areas. However, information is necessarily lost when the text is reduced to word counts or word lists. As explained in the example, the same word can have different meanings. In the context of the financial market, the word “bank” clearly denotes a credit institution, but in other contexts it can mean something else entirely. In order to understand the context even better, methods are needed that capture semantic features.

Semantic features

Word Embedding

According to the distributional hypothesis in linguistics, words that occur in similar contexts usually have a similar meaning. Thus, the meaning of a word can be identified through its semantic similarities and relations to other words. This approach is similar to transfer learning: a model is trained on different text corpora and learns the context of words. In order to capture domain-specific contexts, large text corpora from the respective domain are necessary. If the model is then applied to a new text document, it can reproduce the context it has learned. In practice, this is made possible by methods such as Word2Vec or Global Vectors (GloVe). These algorithms represent all the words in the text corpus as vectors and arrange them in a multidimensional space based on their frequency and their surrounding words. As a result, words that frequently appear in a similar context also have similar vectors.
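
How “similar vectors” can be measured is sketched below using cosine similarity, a common similarity measure for word vectors. The three-dimensional vectors are invented for illustration and do not come from a trained Word2Vec or GloVe model; real embeddings typically have far more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, NOT the output of a trained model.
embeddings = {
    "stock": [0.90, 0.80, 0.10],
    "share": [0.85, 0.75, 0.20],
    "river": [0.10, 0.20, 0.90],
}

close = cosine_similarity(embeddings["stock"], embeddings["share"])
far = cosine_similarity(embeddings["stock"], embeddings["river"])
print(close > far)  # True
```

In this toy setup, “stock” and “share” point in almost the same direction, while “river” points elsewhere, which mirrors the idea that words from similar contexts end up close together.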

With the NLP methods described, lexical and semantic information can be converted into a numerical representation. Machine learning models can then be trained on this data and designed for their specific tasks, such as predicting the price movements of stocks.

Conclusion

In asset management, sentiment analysis deals with evaluating the opinions of investors in order to identify the appropriate course of action. Alongside fundamental or technical analysis, it offers a useful supplement for identifying yield drivers. For a long time, sentiment analysis was characterized by market indicators that reflect the sentiment of investors towards an entire market. Nowadays, however, AI enables the automated evaluation of huge text and voice data sets and their use for predicting the price movements of entire markets, individual sectors or specific assets. For this purpose, the information must first be transformed into a numerical representation so that machine learning models can process it. This is where natural language processing comes into play. With the methods described, thousands of social media messages, forum posts, annual reports or other text or voice data can be automatically analyzed and used to identify trading signals for stocks. Due to the enormous increase in available data (Big Data) and computing power, as well as the new possibilities offered by natural language processing, sentiment analysis is becoming more and more important.