MKT 326 · Assignment 1

Social Listening & Sentiment Analysis

Course: MKT 326 — Marketing Analytics
Data: Sephora — L'Occitane & La Mer
Language: R
Tools: VADER · tidytext · ggplot2 · wordcloud · udpipe · sqldf · t-test · tm

Data Overview

The Sephora dataset consists of three linked tables: reviews (1,609 records), products (8,494 SKUs), and authors (1,525 reviewers). After joining reviews to products via product_id, I filtered to the two target brands — L'Occitane and La Mer — yielding 1,536 total reviews for analysis. The dataset spans multiple years of Sephora.com submission dates and captures ratings, review text, helpfulness signals, and reviewer attributes including skin type and eye color.
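The join-and-filter step can be sketched in base R. The toy tables below stand in for the real Sephora files, and the column names (`product_id`, `brand_name`) are assumptions about the schema:

```r
# Toy stand-ins for the reviews and products tables; column names are assumptions
reviews  <- data.frame(review_id  = 1:4,
                       product_id = c("p1", "p1", "p2", "p3"),
                       rating     = c(5, 1, 4, 3))
products <- data.frame(product_id = c("p1", "p2", "p3"),
                       brand_name = c("La Mer", "L'Occitane", "OtherBrand"))

# Join reviews to products via product_id, then keep the two target brands
joined        <- merge(reviews, products, by = "product_id")
brand_reviews <- subset(joined, brand_name %in% c("La Mer", "L'Occitane"))
nrow(brand_reviews)  # 3 of the 4 toy reviews survive the brand filter
```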

Total Reviews (Both Brands): 1,536
Overall Mean Rating: 4.04
Unique Reviewers: 1,525

Rating Distributions

Star Rating Distribution — L'Occitane vs. La Mer

Both brands skew heavily toward 5-star reviews, though the skew is more pronounced for La Mer. L'Occitane has a notably higher proportion of 1- and 2-star reviews relative to its total review count, suggesting a more polarized customer base. La Mer's mean rating (4.06) edges out L'Occitane's (3.88), but the difference is modest given that both operate in the prestige skincare tier.

Brand        1★    2★    3★    4★    5★    Total   Mean Rating
L'Occitane   11    23    22    21    82    159     3.88
La Mer       156   95    100   190   836   1,377   4.06
Managerial Implication

La Mer's higher review volume (8.7× more reviews) signals stronger brand salience and customer engagement on Sephora's platform. L'Occitane's lower volume may represent an opportunity to incentivize post-purchase reviews, particularly among satisfied customers who make up 64% of their base (4- and 5-star combined).


Do Longer Reviews Express More Positive Sentiments?

I first computed the character length of every review text, then segmented reviews as long (above the mean of 404 characters) or short (at or below the mean). This threshold captures a natural split: short reviews tend to be pithy one-liners or brief reactions, while long reviews are detailed narratives with nuanced sentiment.
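The split reduces to one `nchar()` call and a comparison against the mean. A minimal sketch with two toy reviews (in the real data the threshold is the corpus mean of 404 characters):

```r
# Character length per review, then segment at the mean length
texts   <- c("Great product!",
             "Rich texture, but it unfortunately broke me out after two weeks.")
lengths <- nchar(texts)
segment <- ifelse(lengths > mean(lengths), "long", "short")
segment  # "short" "long"
```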

Histogram of Review Length (Characters)

Review lengths are right-skewed, with most reviews falling under 500 characters. The mean length is 404 characters (approximately 65–80 words). The long right tail represents highly detailed reviews — some exceeding 2,000 characters. Short reviews: ≤404 chars. Long reviews: >404 chars.

Mean Review Length: 404 chars
Median Review Length: 327 chars
Interquartile Range: 220–506 chars

VADER Sentiment Analysis

VADER (Valence Aware Dictionary and sEntiment Reasoner) assigns a compound score between −1 (most negative) and +1 (most positive) to each review. I calculated VADER scores on the full review text for all 1,536 reviews, then compared compound scores across star rating groups and review length segments.
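Per-review scoring can be done with the CRAN `vader` package, whose `get_vader()` returns a named vector that includes the compound entry. A minimal sketch (the example sentence is illustrative, not from the dataset):

```r
library(vader)  # CRAN port of the VADER lexicon

scores   <- get_vader("Absolutely love this cream, my skin feels amazing!")
compound <- as.numeric(scores["compound"])  # between -1 and +1
compound > 0  # clearly positive text scores above zero
```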

Average VADER Compound Score by Star Rating

VADER compound scores increase monotonically with star rating — exactly what we'd expect from a well-calibrated sentiment tool. 1-star reviews average +0.18 (still mildly positive because reviewers often soften criticism with hedges like "nice packaging but..."), while 5-star reviews average +0.76. The gap between 1-star and 5-star validates VADER's signal on this corpus.

1-Star Reviews: Long vs. Short (t-test)

Group                  n     Mean VADER Compound   Std Dev
Long 1-star reviews    46    0.4039                0.5976
Short 1-star reviews   121   0.0902                0.5105

t(165) = 3.38, p = 0.0009 SIGNIFICANT

Among 1-star reviews, longer reviews are significantly more positive in VADER score than shorter ones (0.40 vs. 0.09). This is a counterintuitive but explainable finding: customers who write long negative reviews tend to include mitigating language ("the packaging is beautiful, but the formula broke me out"), nuanced critiques, and comparisons to other products — all of which push the compound score upward despite the low star rating. Short 1-star reviews are more likely to be blunt expressions of displeasure with little qualifying language.
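The comparison is a standard pooled two-sample t-test. The sketch below uses simulated scores with the reported group sizes (46 and 121), so only the degrees of freedom, not the t statistic itself, match the real result:

```r
# Simulated VADER compound scores for the two 1-star segments
set.seed(1)
long_scores  <- rnorm(46,  mean = 0.40, sd = 0.60)
short_scores <- rnorm(121, mean = 0.09, sd = 0.51)

# Pooled-variance two-sample t-test: df = 46 + 121 - 2 = 165
res <- t.test(long_scores, short_scores, var.equal = TRUE)
res$parameter  # df = 165
```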

5-Star Reviews: Long vs. Short (t-test)

Group                  n     Mean VADER Compound   Std Dev
Long 5-star reviews    345   0.8526                0.2552
Short 5-star reviews   573   0.7023                0.3573

t(916) = 6.83, p < 0.0001 SIGNIFICANT

Among 5-star reviews, longer reviews score significantly higher on VADER than shorter ones (0.85 vs. 0.70). Enthusiastic customers who write detailed positive reviews use more intensifiers ("absolutely love," "life-changing," "holy grail") that amplify the compound score. Short 5-star reviews may simply say "Great product!" — positive but lower-intensity language.

Key Finding — Review Length

Longer reviews express more extreme sentiment in both directions, but especially among positive reviews. The effect is statistically significant for both 1-star and 5-star segments. This suggests that engaged, invested customers — those who bother to write detailed reviews — are more linguistically expressive, which amplifies their VADER scores regardless of valence.


L'Occitane vs. La Mer — Brand Sentiment Comparison

L'Occitane
Mean VADER Compound: 0.621
n = 159 reviews · Avg rating 3.88
Pos/Neg ratio: 3.03×

La Mer
Mean VADER Compound: 0.658
n = 1,377 reviews · Avg rating 4.06
Pos/Neg ratio: 4.09×

La Mer edges out L'Occitane on both measures. Its VADER compound score (0.658) is higher than L'Occitane's (0.621), and its pos/neg ratio (4.09×) substantially exceeds L'Occitane's (3.03×). La Mer reviewers are more consistently enthusiastic; L'Occitane shows more sentiment variance, consistent with its more polarized rating distribution.

Word Cloud Analysis — High vs. Low Ratings

To understand what customers praise vs. complain about, I generated word clouds from 5-star (high) and 1-star (low) reviews for each brand, after removing stopwords, brand names, and generic product terms ("skin," "cream," "product").

5-Star Reviews — Top Words
love moisturizing hydrating gentle soft amazing luxurious worth smell smooth
1-Star Reviews — Top Words
breakout allergic reaction return disappointed price waste greasy sensitive heavy
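The frequency table behind a word cloud can be built in base R before handing the counts to the `wordcloud` package. A sketch with two toy 1-star reviews and a deliberately truncated stopword list:

```r
# Tokenize, drop stopwords, count; the result feeds wordcloud::wordcloud()
reviews_1star <- c("Total waste of money, had an allergic reaction",
                   "Greasy and heavy, had to return it")
tokens     <- unlist(strsplit(tolower(reviews_1star), "[^a-z']+"))
stop_words <- c("of", "a", "an", "to", "it", "and", "had")  # truncated list
tokens     <- tokens[nchar(tokens) > 0 & !tokens %in% stop_words]
freq       <- sort(table(tokens), decreasing = TRUE)
# wordcloud::wordcloud(names(freq), freq)  # plotting step, needs {wordcloud}
```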
Managerial Implication — Word Clouds

Satisfied customers center on sensory experience ("moisturizing," "hydrating," "soft," "smell") and emotional resonance ("love," "amazing"). Dissatisfied customers focus on skin reactions ("breakout," "allergic," "reaction") and value perception ("price," "waste"). This suggests the brands' primary risk is adverse skin reactions — a clinical/dermatological communication gap — not product efficacy per se.

Part-of-Speech (POS) Analysis — Adjectives

Using the udpipe package, I extracted adjectives (UPOS tag = "ADJ") from 5-star and 1-star reviews. Adjectives carry the clearest signal about customer sentiment because they directly modify the product experience.
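The extraction follows udpipe's standard annotate-then-filter pattern. This sketch downloads the pre-trained English model, so it needs a network connection and is illustrative rather than a verified run:

```r
library(udpipe)

# Download and load the pre-trained English model (network required)
model <- udpipe_download_model(language = "english")
ud    <- udpipe_load_model(model$file_model)

# Annotate a review and keep tokens tagged with UPOS "ADJ"
ann        <- as.data.frame(udpipe_annotate(ud, x = "This luxurious cream feels greasy"))
adjectives <- subset(ann, upos == "ADJ")$token
```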

5-Star — Top Adjectives

Adjective      Frequency
great          high
moisturizing   high
soft           high
hydrating      high
luxurious      high
lightweight    moderate
smooth         moderate

1-Star — Top Adjectives

Adjective      Frequency
disappointed   high
greasy         high
heavy          high
sensitive      moderate
expensive      moderate
terrible       moderate
awful          moderate

Customer Attribute Analysis — Skin Type & Eye Color

Average Rating by Skin Type

Dry-skin customers rate these brands highest (4.07), while oily-skin customers give the lowest ratings (3.89). This is consistent with prestige moisturizer formulations — these products are typically richer and more occlusive, which benefits dry skin but may feel heavy or pore-clogging for oily skin types. Brand managers should consider targeted messaging to oily-skin segments.
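The segment averages come from a simple group-by. A base-R sketch with toy data and an assumed `skin_type` column (in the real data this attribute lives in the authors table):

```r
# Toy reviewer attributes; skin_type and rating are assumed column names
df <- data.frame(skin_type = c("dry", "dry", "oily", "combination"),
                 rating    = c(5, 4, 3, 4))
aggregate(rating ~ skin_type, data = df, FUN = mean)  # dry averages 4.5
```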

Average Rating by Eye Color

Eye color shows minimal variation in average ratings (range: 3.87–4.08), suggesting it is not a meaningful predictor of product satisfaction for this brand tier. Eye color may correlate with other demographic variables (skin tone, geographic origin), but the effect size here is negligible. Blue-eyed customers rate slightly lower (3.87), though this difference is not statistically meaningful.

VADER vs. Positive/Negative Ratio — Which Method Is Better?

Brand        Pos/Neg Ratio   VADER Compound   Star Rating Mean
L'Occitane   3.03×           0.621            3.88
La Mer       4.09×           0.658            4.06

Both methods agree on directionality — La Mer scores higher than L'Occitane on both measures. However, VADER is the superior tool for three reasons: (1) The pos/neg ratio is a blunt instrument that discards 3-star reviews entirely and ignores intra-review sentiment complexity. (2) VADER captures nuance within individual reviews — a 1-star review can contain genuinely positive language about certain product attributes, which the ratio method misses. (3) VADER's compound score is a continuous per-review measure, making it far more amenable to statistical testing (t-tests, regression) than a single brand-level ratio. The key caveat: VADER was trained on social media text, so it can misread domain-specific beauty vocabulary — the word "rich" (positive in beauty contexts) may not score as positive as intended.
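Both brand-level measures can be derived from the same vector of per-review compound scores. The sketch below assumes the pos/neg ratio counts reviews above VADER's conventional +0.05 cutoff against those below −0.05 (an assumption about how the metric was defined):

```r
# Per-review compound scores for one brand (toy values)
compound <- c(0.8, 0.6, -0.4, 0.9, -0.2, 0.7)

mean_compound <- mean(compound)                                # brand VADER measure: 0.4
pos_neg_ratio <- sum(compound > 0.05) / sum(compound < -0.05)  # 4 positive / 2 negative = 2x
```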


Additional Data Visualizations

Review Volume by Month

Review volume peaks in January, consistent with post-holiday gift unboxing behavior — customers receiving luxury skincare as gifts rush to write reviews. A secondary peak in August/September aligns with back-to-school and seasonal-transition shopping. Managerial implication: schedule social media campaigns in December–January and August to intercept peak engagement windows, when review activity is highest and customers are most receptive.
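The monthly counts behind the volume chart reduce to a one-line tabulation of submission dates (toy dates below); ggplot2 can then plot the resulting table as a bar chart:

```r
# Count reviews per calendar month from submission dates
dates  <- as.Date(c("2023-01-05", "2023-01-20", "2023-08-14", "2023-12-30"))
counts <- table(format(dates, "%m"))  # month number -> review count
counts  # "01" = 2, "08" = 1, "12" = 1
```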

Helpfulness Score Distribution

Helpfulness scores cluster at the extremes (0 or 1), suggesting that Sephora reviewers tend to find reviews either very helpful or not helpful at all — a clear polarization. High-helpfulness reviews likely contain more detailed product descriptions and specific skin type callouts. Managerial implication: prompt customers to include skin type and use-case context in reviews to increase perceived helpfulness and drive conversion among potential buyers with similar profiles.


AI-Augmented Workflow Reflection

I replicated the core R analyses (VADER scoring, word cloud generation, POS tagging) using Claude as an AI assistant. The workflow involved providing the professor's demo R code as context and asking the model to explain outputs, suggest interpretation frameworks, and identify analyses I hadn't considered — specifically the VADER vs. pos/neg ratio comparison and the skin type segmentation.

What I liked: The AI dramatically accelerated the interpretation phase. Writing the "so what" — translating raw output numbers into managerial implications — typically takes longer than the analysis itself. The model could draft first-pass interpretations in seconds, which I then refined with domain judgment. It was also useful for catching edge cases (e.g., the counterintuitive direction of the 1-star long vs. short VADER result) and suggesting the t-test framing.

What I disliked / caveats: The model occasionally hallucinated specific word frequencies and suggested visualizations that don't quite match what the actual R packages can produce. It also has a tendency to over-interpret small differences — I had to actively push back when it wanted to draw strong conclusions from the eye color analysis where the effect size was trivially small. The AI is a powerful accelerant for competent analysts, but it requires a human with domain knowledge to catch where it overreaches.

Bottom Line

AI tools work best as a co-pilot for interpretation and ideation, not as a substitute for running the analysis correctly. The biggest risk is that a non-expert might accept AI-generated interpretations at face value without stress-testing the statistical reasoning.


Summary & Social Media Strategy

How is the Brand? Key Findings

Both brands occupy a strong position in the prestige skincare category. La Mer leads on review volume (1,377 vs. 159), mean rating (4.06 vs. 3.88), VADER compound score (0.658 vs. 0.621), and pos/neg ratio (4.09× vs. 3.03×). The primary customer concern across both brands is skin reactions — "breakout," "allergic," and "reaction" dominate 1-star reviews, suggesting that adverse reactions, not poor efficacy, drive the most negative feedback. Dry-skin customers are the most satisfied segment. Review volume peaks in January (post-holiday) and August (seasonal transition).

Social Media Campaign Proposal

Content — Dragonfly Framework

Focus on emotion (sensory experience: how the product makes your skin feel) and story (the 30-day transformation narrative, dry skin to glowing skin). Feature UGC from dry-skin customers — the most satisfied segment — in short-form video. Avoid heavy clinical language; lean into the luxury sensory vocabulary customers themselves use: "soft," "hydrating," "luxurious."

Content — STEPPS Framework

Social Currency: "I use La Mer" is an identity signal. Triggers: Post-cleansing routine moments. Emotion: The joy of transforming dry, irritated skin. Public: Encourage visible before/after content. Practical Value: Skin-type matching guides (especially targeting oily skin, the lowest-rating segment). Stories: Long-term customer loyalty stories.

Channel & Timing

Primary channel: Instagram Reels and TikTok for short-form video (skin transformation content). Secondary: YouTube for long-form skincare routine integration. Launch campaigns in December–January (peak review/gift season) and August (seasonal skin transition). Post Tuesday–Thursday evenings EST for peak engagement.

Metrics & Caveats

Track: review volume growth, avg. star rating, VADER compound score trend, UGC share rate, skin-type segmented NPS. Caveats: High review volume ≠ positive sentiment (La Mer has 1-stars too). Customers who post reviews may be systematically more extreme in sentiment than the silent majority. The 5-star skew may partially reflect purchase rationalization — buyers of luxury products may unconsciously inflate ratings to justify the spend.


Starbucks × L'Occitane: Seasonal Beauty Subscription Box

Using the customer-centric framework and the sentiment data gathered above, I designed a seasonal fast-beauty subscription box in collaboration with Starbucks, where each box's theme maps to a seasonal Starbucks drink.

☕ Fall — Pumpkin Spice Latte Box

Theme: Warm & Nourishing. Products: rich facial oil, warm amber lip balm, spiced body scrub. Target: dry-skin customers entering winter (highest-rated segment). Campaign trigger: September launch aligned with PSL season drop. Color palette: burnt orange, cream, caramel.

❄️ Winter — Peppermint Mocha Box

Theme: Cooling & Refreshing. Products: cooling eye cream, hydrating mist, gentle exfoliant for post-holiday skin recovery. Targets the January review peak — high purchase intent season. Positions the brand as a self-care ritual anchor during high-stress gifting season.

🌸 Spring — Lavender Oat Milk Latte Box

Theme: Light & Calming. Products: lightweight SPF moisturizer (addresses oily-skin segment's "heavy" complaint), calming serum, floral mist. Ingredient-forward: oat, lavender, ceramides. This is the highest-opportunity season to convert oily-skin skeptics — lighter formulations align with what they've been requesting in 1-star reviews.

☀️ Summer — Strawberry Açaí Box

Theme: Bright & Protected. Products: vitamin C serum, SPF lip treatment, cooling gel moisturizer. Addresses the summer skin concern of UV protection and oil control. The Açaí association anchors antioxidant skincare benefits in consumer memory. Additional data to collect: purchase timing by season, SPF attachment rate, repeat purchase frequency by box theme.

Additional analyses I would run: Cross-tabulation of subscription renewal rates by box theme; CLV modeling by seasonal entry cohort to identify which "door" (season) acquires the most loyal customers; NLP on social media posts using the Starbucks seasonal hashtags to measure organic conversation overlap with skincare topics (identifying the Starbucks-to-skincare crossover audience); A/B test on box theme names (functional vs. emotional framing) to optimize open-rate and gifting conversion.
