In North America, Vainu has been partnering with some of the largest institutions in the commercial banking, insurance, and telecom sectors to solve a challenging task: predicting customer retention, and its flip side, churn, across their extensive client bases.
In tackling this complex problem, we've been working in close collaboration with the companies, offering them a machine learning approach that involves a key component of our global company database: millions of news articles collected, labeled, and indexed from the web.
In this post, we'll explain our approach and the results we've seen.
Machine learning to understand the news
Vainu’s global company database currently holds more than 130 million business entities, ranging from the smallest businesses to publicly listed corporations in every market we operate in. As a part of building that database, we aim to read, collect, and understand everything that is written about these companies on a daily basis.
One part of this project is collecting information from news articles globally (currently 1.5 million per day), and applying Natural Language Processing (NLP) to further understand what is being written about companies in those articles. In addition to verifying the articles' business relevancy, we use industry-leading Named-Entity Recognition (NER) to identify where companies are being mentioned, and which companies are being talked about.
Once the businesses in an article are identified, we use our ML-driven classification system to tag each article with the business changes it describes: growth or expansion, mergers or acquisitions, and changes in personnel, to name a few examples. This tagging system has been trained on a massive internal dataset, collected through several creative crowdsourcing methods, including Amazon's Mechanical Turk.
Using news articles to predict churn
After many years of refining our understanding of what is being said about companies around the web, looking at millions of companies and the thousands of news stories associated with them, Vainu has now demonstrated a correlation between the collected news articles and future business decisions.
In our approach, we work side by side with a company's data scientists to build custom models that predict a future business outcome, such as whether their clients will continue the customer relationship or not. In the example of client retention, the company provides us with internal historical data on customer churn. This internal data is combined with our news articles about those customers, published prior to the historical outcome, to form a training set: the basis on which the model learns to predict future outcomes.
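The pairing of historical outcomes with earlier news articles can be sketched roughly as follows. This is a minimal illustration rather than Vainu's actual pipeline: the `Article` and `ChurnRecord` types, the `build_training_set` function, the `lookback_days` window, and the sample data are all hypothetical and chosen only to show the idea of keeping articles that precede the known outcome.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    company_id: str
    published: date
    text: str


@dataclass
class ChurnRecord:
    company_id: str
    outcome_date: date  # when the customer churned or renewed
    churned: bool


def build_training_set(records, articles, lookback_days=365):
    """Pair each historical outcome with the news text that preceded it."""
    # Group articles by company so each record only scans its own articles.
    by_company = {}
    for article in articles:
        by_company.setdefault(article.company_id, []).append(article)

    examples = []
    for record in records:
        # Keep only articles published strictly before the outcome and
        # within the lookback window, so nothing written after the
        # outcome can leak into the training data.
        texts = [
            a.text
            for a in by_company.get(record.company_id, [])
            if 0 < (record.outcome_date - a.published).days <= lookback_days
        ]
        if texts:
            examples.append((" ".join(texts), record.churned))
    return examples


# Hypothetical sample data, for illustration only.
sample_articles = [
    Article("acme", date(2020, 1, 10), "Acme announces layoffs"),
    Article("acme", date(2020, 9, 1), "Acme signs new vendor"),
    Article("globex", date(2020, 3, 5), "Globex opens new office"),
]
sample_records = [
    ChurnRecord("acme", date(2020, 6, 1), True),
    ChurnRecord("globex", date(2020, 6, 1), False),
]
sample_training_set = build_training_set(sample_records, sample_articles)
```

In this sketch the second Acme article is excluded because it was published after the recorded outcome, which is the property that matters most when assembling such a set.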
In churn prediction specifically, we use textual analysis and NLP to break down the content of each news article published before the point of prediction. We've seen a correlation between the words in those texts and a customer leaving their existing vendor, and the results have been strong:
Across multiple clients, Vainu's machine learning approach has consistently achieved validation and test accuracy of over 80% in churn prediction.
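To make the idea of correlating words with churn outcomes concrete, here is a small sketch of a bag-of-words Naive Bayes classifier evaluated on held-out examples. It is an assumption-laden stand-in, not the model Vainu uses: the post does not name the algorithm, the `NaiveBayesChurn` class and `accuracy` helper are hypothetical, and the toy sentences are invented for illustration, so the resulting score says nothing about the figures reported above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesChurn:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def fit(self, examples):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter()
        for text, churned in examples:
            self.doc_counts[churned] += 1
            self.word_counts[churned].update(tokenize(text))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in (True, False):
            # Smoothed log prior for the class.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        return scores[True] > scores[False]


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(model.predict(t) == y for t, y in examples) / len(examples)


# Invented toy examples: (news text, churned?)
train = [
    ("layoffs announced and offices closed after merger", True),
    ("company acquired by rival, leadership replaced", True),
    ("record growth and new office opening", False),
    ("expansion continues with new hires and funding", False),
]
held_out = [
    ("merger leads to layoffs", True),
    ("new funding fuels growth", False),
]

model = NaiveBayesChurn().fit(train)
held_out_accuracy = accuracy(model, held_out)
```

In practice, accuracy would be measured on separate validation and test sets that the model never saw during training, which is what keeps a reported figure from simply reflecting memorized training examples.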
Vainu and its customers have shown that these models are a successful method for identifying at-risk customers and an effective tool for guiding an organization's customer success efforts.