Advanced NLP for Competitive Analysis and Product Intelligence

13 Jul 2024

Motivation

A couple of recent conferences, such as those from OpenAI and Google, piqued my interest because of the high expectations around applying AI to their product offerings. So when Apple, known for being late on the AI front, held its annual Worldwide Developers Conference (WWDC), I wanted to get a pulse of people’s reactions.

Given that I am writing this on a MacBook Pro and recently got an iPhone, I feel more inclined to stay up to date with Apple’s releases. I could have just tracked Reddit’s WWDC 2024 live megathread or watched The Verge’s condensed version of the conference, but that is not as fun as scraping the data and performing NLP analysis on people’s reactions. 😃

Also, after performing sentiment analysis on LLM chatbots in an earlier project, I wanted future iterations to generate better topics and to score sentiment on individual aspects rather than on whole comments. Lastly, I had been wanting to use the API of a free, open-source LLM to generate answers in my notebook.

Business Insights

My approach was to start at a high level and then get into the weeds. That is why I started with topic modeling and ended with Aspect-Based Sentiment Analysis.

When I extracted the most talked-about topics from the conference, it turned out that an Apple presenter using a competitor’s app, Samsung Notes, to present was widely joked about. Though noise, this topic did give some insight into what makes a great notes app, even if it was not made by the company he worked for 😂. Despite this top topic, I was still able to extract plenty about what people were buzzing about:

Screenshot 2024-07-14 at 9 39 19 AM

We see here that the most intriguing things were Siri’s upgrade, the iPad Calculator app and, of course, the AI (aka ‘Apple Intelligence’ 😂) integrations. At the same time, there was some underwhelming sentiment regarding upgrades applied to non-15 base models only, icon/widget/home screen personalization, and Apple’s brand in general when it came to solving real problems and being a top innovator.

Four different approaches to analyzing customer feedback were taken: overall comment sentiment (positive, neutral or negative); overall comment emotion (a more granular take on sentiment: sadness, fear, anger, surprise, joy or love); aspect-based sentiment analysis; and finally, on a smaller sample of comments, a fast Large Language Model (LLM) was used to determine overall comment sentiment (positive, neutral or negative).

All four performed differently when their scores were compared with my own human review. The LLM outperformed all the other sentiment predictions, but it was only run on a small sample. Overall, comments came from conference day, and emotion was positive and did not decay over time. General sentiment and emotion trends are depicted below. One caveat: sentiment model accuracy was 20% while emotion model accuracy was 40%, and the sample used to test model accuracy was only 10 comments.

Screenshot 2024-07-14 at 9 18 59 AM

Screenshot 2024-07-14 at 1 55 57 PM
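To put the 10-comment caveat in numbers, here is a minimal sketch that attaches a Wilson score interval to each audited accuracy. This is not part of the original notebook: the function name `wilson_interval` is my own, and the counts (2 and 4 correct out of 10) are simply derived from the 20% and 40% figures above.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for an accuracy of correct/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Counts derived from the accuracies reported in this post (10 audited comments).
audit = {"sentiment": 2, "emotion": 4}

for name, correct in audit.items():
    low, high = wilson_interval(correct, 10)
    print(f"{name}: {correct}/10 correct, 95% interval {low:.2f} to {high:.2f}")
```

With only 10 comments the intervals are very wide and overlap, so the audit is best read as a rough signal rather than a ranking of the models.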

I also compared YouTube and Reddit on engagement (number of likes), average emotion and average aspect sentiment.

Screenshot 2024-07-14 at 9 32 05 AM

YouTube drew more engagement, most likely because of MKBHD’s channel, but the other measurements need improvement.

These coarse views led me to pursue finer-grained sentiment analysis with more precise sentiment scores. Aspect-Based Sentiment Analysis pulls out the key aspects of a comment and the sentiment tied to each one.

Screenshot 2024-07-14 at 9 33 47 AM

Now this is what we were after: what was talked about most and the sentiments tied to key aspects.

At the center, we can see the most talked-about topics across YouTube and Reddit. The outer layer shows the different sentiments per topic. iOS and the competitor, Samsung, mostly had negative sentiment tied to them. However, things looked good for iPad alone (most likely due to the impressive Calculator app). There was positive sentiment for Apple, but it was not the dominant sentiment.
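The two-ring layout described above boils down to counting (aspect, sentiment) pairs. The sketch below shows one way to do that aggregation with the standard library. The rows are made-up placeholders, not results from this analysis, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical ABSA output: one (aspect, sentiment) pair per extracted aspect.
# These rows are illustrative only and are NOT data from the post.
aspect_rows = [
    ("iPad", "Positive"),
    ("iPad", "Positive"),
    ("iOS", "Negative"),
    ("iOS", "Neutral"),
    ("Samsung", "Negative"),
]


def sunburst_rows(pairs):
    """Count (aspect, sentiment) pairs; order rows by how often the aspect occurs."""
    pair_counts = Counter(pairs)
    aspect_totals = Counter(aspect for aspect, _ in pairs)
    rows = [
        {"aspect": aspect, "sentiment": sentiment, "count": n}
        for (aspect, sentiment), n in pair_counts.items()
    ]
    rows.sort(key=lambda r: (-aspect_totals[r["aspect"]], r["aspect"], -r["count"]))
    return rows


def dominant_sentiment(pairs):
    """Return the most frequent sentiment label for each aspect."""
    by_aspect = {}
    for aspect, sentiment in pairs:
        by_aspect.setdefault(aspect, Counter())[sentiment] += 1
    return {aspect: counts.most_common(1)[0][0] for aspect, counts in by_aspect.items()}


rows = sunburst_rows(aspect_rows)
dominant = dominant_sentiment(aspect_rows)
```

If Plotly is the charting library (an assumption, since the post does not say), rows like these can be loaded into a pandas DataFrame and drawn with `plotly.express.sunburst`, using `path=["aspect", "sentiment"]` and `values="count"`.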

From the WWDC reactions, iPhone, iPad, iOS, Android/Samsung (Notes), app and features were the most talked-about topics. It looks like Apple needs to work on its reputation even after catching up on the AI front.

Business Recommendations

In general:

  1. Apple needs to be more of a risk taker
  2. Be an innovator and not a copier
  3. Research problems people are actually having and solve them
  4. Build a watertight strategy and execute feature releases faster

More specifically:

Based on the sentiment analysis findings from the WWDC conference, here are some business recommendations for Apple to improve their reputation and address the feedback received:

1. Address Negative Sentiment Towards iOS and Competitors

2. Capitalize on Positive iPad Sentiment

3. Improve Overall Sentiment and Brand Perception

4. Leverage AI and Innovation

5. Improve Product and Feature Discussions

6. Enhance Customer Support and Service

Implementation Strategy

By taking these steps, Apple can improve its reputation, address negative sentiments, and capitalize on positive feedback to strengthen its position in the market.

Personal Growth as Data Professional/Fun Lessons

  1. LLM API and workarounds (data augmentation, sentiment classification). First off, LLMs were versatile winners when it came to augmenting data by replacing words with synonyms and performing sentiment analysis on overall comments. Then again, an LLM has a greater foundation of knowledge and a bigger context window to handle nuanced/domain-specific questions.

  2. BERTopic is a library that really piqued my interest. The last time I tried topic modeling, with LDA, the results were messy and not helpful. The difference is most likely because BERTopic is based on transformer embeddings, whereas LDA is a probabilistic model over word counts. I enjoyed the visualizations that helped with interpretation. The one below was the most fascinating for me because it shows the different topics, and by hovering and playing with the slider you can examine each topic and the words that make it up.

Screenshot 2024-07-14 at 9 54 12 PM

  3. When it came to determining customer feedback sentiment, the first pass was a general approach where a whole comment was scored as positive, neutral or negative. That sentiment seemed too broad and coarse, so I tried out an emotion classifier that sorted comments into 6 categories: sadness, fear, anger, surprise, joy and love. This was better in theory, but when I audited the models against my own sentiment scores, the overall sentiment classifier was 20% correct, the emotion classifier was 40% correct, PyABSA was 50% correct and Groq was 80% correct. 😀

  4. Last, but not least, I was so happy with using ChatGPT (GPT-4o, "omni") as a code helper, even though it had a couple more hiccups than I would have preferred. However, it redeemed itself. When I showed it an image of a table and asked for a non-standard visualization like a stacked bar chart, it produced code for a sunburst graph. I am quite fond of these graphs because I once used one to showcase user journeys to a business stakeholder, and I liked how it captured the answer to his business question in a concise and sophisticated way. 🙂
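For the LLM sentiment classification mentioned in the first lesson above, most of the work around the API call is building a constrained prompt and mapping the free-text reply back to a label. The sketch below covers only those two steps. The prompt wording and the function names are my own illustration, and the actual API call is left out because the post does not state which client or model was used.

```python
from typing import Optional

ALLOWED_LABELS = ("positive", "neutral", "negative")


def build_prompt(comment: str) -> str:
    """Build a single-comment classification prompt (wording is illustrative)."""
    return (
        "Classify the sentiment of the following comment as exactly one of: "
        + ", ".join(ALLOWED_LABELS)
        + ". Reply with the label only.\n\nComment: "
        + comment.strip()
    )


def parse_label(reply: str) -> Optional[str]:
    """Map a free-text model reply to one allowed label, or None if unclear."""
    text = reply.strip().lower()
    found = [label for label in ALLOWED_LABELS if label in text]
    # Accept the reply only when exactly one label is mentioned.
    return found[0] if len(found) == 1 else None


prompt = build_prompt("  The new Calculator app on iPad looks great  ")
```

Returning `None` for replies that mention zero or several labels keeps ambiguous answers out of the accuracy audit, so they can be reviewed by hand.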

Code and Comments Notebook

Next Iteration

  1. Data Augmentation with LLM/Groq
  2. Sentiment/Emotion Classification using HuggingFace models, but fine-tuned on YouTube/Reddit data for greater prediction accuracy
  3. Groq prediction at scale. It is fast and free, so a limit on what it can take in is understandable. I will see whether I can hit the API differently, e.g. >100 rows at one time
  4. Named Entity Recognition (NER) for deeper analysis on things such as:
    • Brand Monitoring: Track mentions of Apple, Samsung, and their products across various sources. Identify sentiment associated with each brand or product.
    • Compare the frequency and context of mentions between Apple and Samsung. Identify key people, locations, or events associated with each company.
    • Extract information about specific products (e.g., iPhone, Galaxy) and their features. Track new product launches and consumer reactions.
    • Identify emerging technologies or features that both companies are focusing on. Detect shifts in consumer preferences or market dynamics.
    • Automatically categorize customer comments or reviews by product, feature, or issue. Identify common pain points or praised features for each brand.
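For the "prediction at scale" item in the list above, one possible workaround is to split the comments into fixed-size batches and send one numbered prompt per batch. This is only a sketch of that idea under my own assumptions: the helper names, the batch size and the prompt wording are hypothetical, and the real request limits depend on the API being used.

```python
from typing import Iterator, List, Sequence


def batched(rows: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of rows with at most batch_size items each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


def numbered_prompt(batch: Sequence[str]) -> str:
    """Combine one batch of comments into a single numbered classification prompt."""
    # Collapse internal whitespace so every comment stays on one numbered line.
    lines = [f"{i}. {' '.join(c.split())}" for i, c in enumerate(batch, start=1)]
    header = (
        "Label each numbered comment as positive, neutral or negative. "
        "Return one label per line, in the same order."
    )
    return header + "\n" + "\n".join(lines)


comments = [f"comment {n}" for n in range(250)]  # placeholder data
prompts = [numbered_prompt(batch) for batch in batched(comments, 100)]
```

Here 250 placeholder comments become three prompts (100, 100 and 50 comments), each of which could be sent as a separate request.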