Multimodal ML Analysis of Gender & Emotion Bias on Instagram

April 2025 - August 2025

Introduction

During the depths of COVID, I spent some time scrolling through Instagram news accounts across the political spectrum (yes, I enjoy high blood pressure). A pattern caught my eye: most images featured men, and certain demographics were used in strikingly deliberate ways.

For example, some accounts used images of women of color to convey positive issues, while others used similar demographics in negative contexts.

This pattern (whether intentional or not) shocked me because of how blatantly it associates certain topics or feelings with certain demographics. Surely this must influence the beliefs of the people consuming this content.

We already know that text can reinforce gender biases, and recent research suggests these effects are significantly stronger in images (source). However, modern social media platforms like Instagram and TikTok combine text and images into a single medium, enabling a more powerful and more nuanced form of bias reinforcement. For example, an ungendered caption such as "This is the pinnacle of humanity" conveys a wildly different gender message depending on whether it is paired with an image of a man or a woman.

This realization led me to two questions:

1) How are men and women presented differently on these news accounts?

2) How do these patterns vary across the political spectrum?

Overview

I quantify gender portrayal on Instagram by pairing who appears in images with what the caption feels like. Concretely: I scraped 150k+ posts from 14 major outlets, detected whether images contained men or women, classified caption emotions, and then measured how emotion mixes shift when men vs women are pictured—overall and per outlet.

The core question I test throughout: Does the caption emotion mix change depending on the pictured gender?

Methodology

Data

Scope: 150,000+ posts from 14 Instagram news accounts (e.g., @bbcnews, @cnn, @foxnews).
Time window: Earliest posts range from mid-2012 (e.g., HuffPost, Politico, US News) to 2021 (Fox News), with continuous data collected through March 2024.
Political position: External bias scores from Ad Fontes Media, used to order outlets (Left → Right).

Image Demographics (Vision)

Detector: OWL-ViT v2 base-patch16-ensemble, an open-vocabulary object detector that accepts free-text prompts. I queried it with phrases such as “man” and “woman” to identify people.
Counting rule: For each post, detections are aggregated across all images. A post is tagged Men-only when >=1 man and 0 women are detected, and Women-only when >=1 woman and 0 men are detected.
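
The counting rule can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: `tag_post` and the tag strings are hypothetical names, and it assumes the detector output has already been reduced to a flat list of "man" / "woman" labels per post. The "mixed" and "no_people" tags are also my assumption for posts that fall outside the two groups analyzed here.

```python
from collections import Counter

def tag_post(detected_labels):
    """Tag a post from the labels detected across its images.

    `detected_labels` is assumed to be a flat list such as
    ["man", "man", "woman"], with one entry per detection.
    """
    counts = Counter(detected_labels)
    men, women = counts["man"], counts["woman"]
    if men >= 1 and women == 0:
        return "men_only"
    if women >= 1 and men == 0:
        return "women_only"
    if men >= 1 and women >= 1:
        return "mixed"      # hypothetical tag: excluded from the two groups
    return "no_people"      # hypothetical tag: nothing detected
```

Only posts tagged "men_only" or "women_only" would feed the gender comparisons under this sketch.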

Caption Emotions (NLP)

Model: distilbert-base-multilingual-cased-sentiments-student (Hugging Face), a multilingual DistilBERT fine-tuned for 22 fine-grained emotions.
Ekman mapping: The model's 22 output labels were collapsed into the Ekman-7 set (happiness, anger, disgust, fear, sadness, surprise, contempt) plus a neutral category using a custom lookup table.
Decision rule: For each caption we take the highest-probability label.
Non-Neutral Normalization: When examining only non-neutral captions, we drop neutral cases and re-normalize probabilities across the seven Ekman emotions.
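
As a rough sketch of these steps: the snippet below collapses fine-grained probabilities into Ekman buckets, picks the top-1 label, and re-normalizes after dropping neutral. Everything named here is an assumption for illustration. `TO_EKMAN` is a made-up excerpt (the real lookup table covers all of the model's labels, and the fine-grained label names are invented), and summing probabilities within a bucket is one plausible way to collapse labels, not necessarily the one used in the project.

```python
EKMAN = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "contempt"]

# Hypothetical excerpt of the lookup table (fine-grained label -> bucket).
TO_EKMAN = {
    "joy": "happiness",
    "amusement": "happiness",
    "annoyance": "anger",
    "grief": "sadness",
    "neutral": "neutral",
}

def collapse(fine_probs):
    """Sum fine-grained probabilities into Ekman buckets plus neutral."""
    out = {label: 0.0 for label in EKMAN + ["neutral"]}
    for label, p in fine_probs.items():
        out[TO_EKMAN[label]] += p
    return out

def top1(probs):
    """Return the highest-probability label."""
    return max(probs, key=probs.get)

def renormalize_non_neutral(probs):
    """Drop neutral and rescale the remaining probabilities to sum to 1."""
    kept = {k: v for k, v in probs.items() if k != "neutral"}
    total = sum(kept.values())
    if total == 0:
        return kept
    return {k: v / total for k, v in kept.items()}
```

For example, a caption with 0.3 "joy", 0.2 "amusement", 0.1 "annoyance", and 0.4 "neutral" collapses to 0.5 happiness, which wins top-1 over neutral.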

Key Metrics

Representation ratio: total men detected ÷ total women detected (overall and per outlet).
Emotion mix: share of posts per emotion bucket (overall; by gender; by outlet).
Gender gap (pp): Women-only minus Men-only in non-neutral share for a given emotion at an outlet.
Bias trend: Weighted linear fit of gender gaps vs outlet bias order (weights = min(N_women, N_men)).

Quality & Limitations

Data Simplifications: only the first image in a "carousel" of images was processed. Video representation was not included.
Open-vocabulary drift: prompts like “man/woman” are culturally loaded and imperfect. The counts collected are approximate, not perfect.
Single-label captions: forcing a top-1 emotion flattens nuance. Tones like sarcasm or irony can be mislabeled.
Correlation != causation: outlet topic mix and current events may drive emotion patterns independent of pictured gender.

Executive Summary

More male faces: Images across 14 Instagram news accounts feature 0.00x as many men as women.
Neutral depends on who's pictured: Posts featuring men are more likely to be neutral, while posts featuring women more often express a clear emotion.

Outlet patterning: Left-leaning outlets more often pair women with happier captions and men with sadder ones, while right-leaning outlets show the reverse (more happiness for men and more sadness for women).

Findings

1) Who gets shown, and how often?

For each outlet, I summed the number of men and women detected across all posts, then computed a men-to-women ratio (men per 1 woman).
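
A minimal sketch of that aggregation, assuming each post has already been reduced to an (outlet, men detected, women detected) tuple. The function name and input shape are hypothetical, and the outlet names in the usage note are placeholders, not real data.

```python
def men_to_women_ratio(posts):
    """Return men-per-woman ratios keyed by outlet.

    `posts` is an iterable of (outlet, men_detected, women_detected).
    """
    totals = {}
    for outlet, men, women in posts:
        m, w = totals.get(outlet, (0, 0))
        totals[outlet] = (m + men, w + women)
    # An outlet with zero women detected has no finite ratio.
    return {
        outlet: (m / w if w else float("inf"))
        for outlet, (m, w) in totals.items()
    }
```

With toy input `[("outlet_a", 3, 1), ("outlet_a", 1, 1), ("outlet_b", 2, 2)]`, outlet_a sums to 4 men and 2 women (ratio 2.0) and outlet_b sits at parity (1.0).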

Across all posts, images feature 0.00x as many men as women.

Gender representation by outlet

Bars are ordered by external bias score (Left → Right). Solid line marks parity (1.0); dashed line marks the dataset mean of the ratios.

Political bias ratings from Ad Fontes Media.

Takeaway: Every outlet shows more men than women. The highest ratios are on right-leaning accounts.

2) What Is the Overall Emotion Mix of Captions?

To answer this question I classified each caption into the Ekman-7 emotions + neutral (Ekman-7: happiness, anger, disgust, fear, sadness, surprise, contempt) using a simple “top-1” rule. A neutral-margin was added to avoid borderline calls. The chart below shows the overall caption emotion distribution.

[Chart: overall caption emotion distribution]

Takeaway: Most captions are neutral. Happiness and sadness are the most common non-neutral emotions, but they still make up a minority of posts.
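
The neutral margin used for these labels could look something like the sketch below. This is one plausible reading, not the confirmed rule: here an emotion only wins if it beats neutral by at least `margin`, and both the function name and the 0.05 default are assumptions.

```python
def top1_with_neutral_margin(probs, margin=0.05):
    """Pick the top emotion unless it fails to clear neutral by `margin`.

    `probs` maps labels (the Ekman emotions plus "neutral") to probabilities.
    The margin value is a hypothetical default.
    """
    neutral_p = probs.get("neutral", 0.0)
    emotions = {k: v for k, v in probs.items() if k != "neutral"}
    best = max(emotions, key=emotions.get)
    if emotions[best] - neutral_p >= margin:
        return best
    return "neutral"
```

Under this rule a caption scored 0.42 happiness vs. 0.40 neutral is still called neutral, while 0.50 vs. 0.30 is called happiness. Changing the margin shifts how many posts land in the neutral bucket, which is one reason threshold choices can move the results.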

3) Does the Emotion Mix Change When Men vs. Women Are Pictured?

I split the dataset into two groups of posts: (a) Men-only (≥1 man, 0 women) and (b) Women-only (≥1 woman, 0 men). For each post I assign a single “top-1” emotion using the Ekman-7 + Neutral scheme.

Panel A keeps Neutral in the mix, so the bars show the share of posts where Neutral wins versus any non-neutral emotion.

[Chart: neutral vs. non-neutral share, Men-only vs. Women-only posts]

Takeaway: Posts featuring women are less likely to be neutral and more likely to carry a clear emotion than posts featuring men.

The graph below drops Neutral and re-normalizes over the non-neutral posts only. This highlights how the composition of emotions differs when an emotion is present.

[Chart: non-neutral emotion mix, Men-only vs. Women-only posts]

Takeaway: Women-featured posts with non-neutral captions lean more toward happiness, while men-featured posts with non-neutral captions show relatively more sadness (Note: differences vary significantly by outlet).
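
Both views in this section come from the same per-post labels. A minimal sketch, assuming each group is just a list of top-1 labels (the function names are hypothetical):

```python
from collections import Counter

def neutral_share(labels):
    """Share of posts whose top-1 label is neutral (the first view)."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == "neutral") / len(labels)

def non_neutral_mix(labels):
    """Emotion shares after dropping neutral posts (the second view)."""
    counts = Counter(label for label in labels if label != "neutral")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: n / total for emotion, n in counts.items()}
```

For a toy group labeled `["neutral", "happiness", "happiness", "sadness", "neutral"]`, the neutral share is 0.4, and the non-neutral mix is two-thirds happiness and one-third sadness.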

4) How Does the Gendered Emotion Mix Differ Across Individual Outlets?

For each outlet we show side-by-side bars of non-neutral emotions for Men-only vs Women-only posts.

[Chart: non-neutral emotion mix per outlet, Men-only vs. Women-only posts]

Takeaway: Happiness and sadness are the most common non-neutral emotions for all outlets, but each outlet has a distinct gendered emotion mix. For example, @foxnews leverages anger far more than @nytimes.

5) Where Are the Biggest Gender Gaps by Emotion and Outlet?

The heatmap shows the percentage-point difference in non-neutral caption share for each emotion and outlet, calculated as (%_women − %_men). Positive values mean the emotion appears more often when women are pictured, and negative values mean it appears more often when men are pictured.

Tip: hover a cell to see sample sizes (N) and the exact pp difference.

Per-outlet differences (Women – Men, non-neutral)

Takeaway: Gender gaps are highly outlet-specific. The largest absolute differences appear in happiness and sadness because those emotions make up the biggest share of captions. Less common emotions (e.g., disgust or surprise) show smaller percentage-point gaps, but their relative shifts can still be substantial because a few extra cases can represent a large proportional change.
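
Each heatmap cell is a simple subtraction. A minimal sketch for one outlet, assuming the inputs are non-neutral emotion shares (fractions summing to 1) for Women-only and Men-only posts; `gender_gap_pp` is a hypothetical name:

```python
def gender_gap_pp(women_mix, men_mix):
    """Women-minus-men gap per emotion, in percentage points.

    Positive: more common when women are pictured.
    Negative: more common when men are pictured.
    An emotion missing from one group is treated as a 0% share.
    """
    emotions = set(women_mix) | set(men_mix)
    return {
        e: 100 * (women_mix.get(e, 0.0) - men_mix.get(e, 0.0))
        for e in emotions
    }
```

With toy shares of 60% happiness for women and 50% for men, the cell reads +10 pp. The same absolute gap is a much larger relative shift for a rare emotion, which is the effect described in the takeaway above.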

6) How Strongly Do Gender Gaps Correlate with Political Bias?

This scatter plot compares each outlet’s political bias to the gender gap in non-neutral caption share for a chosen emotion. The gap is computed as (%_women − %_men) in percentage points.

X-axis: Outlets ordered from left- to right-leaning (bias rank).
Y-axis: Gender gap for the selected emotion (positive = more common when women are pictured, negative = more common when men are pictured).
The black trend line is a weighted linear fit, where weights = min(N_women, N_men), so outlets with larger balanced samples influence the slope more.

Tip: hover a point to see the outlet, its bias rank, the exact pp gap, and N used for the weight.

[Scatter plot: gender gap (pp) vs. outlet bias rank for the selected emotion]

Takeaway: Most emotions show weak alignment with political bias, but two stand out: sadness (R² ≈ 0.38) and happiness (R² ≈ 0.31). Given the noisy, large-scale data, this is a meaningful finding. Essentially, right-leaning outlets tend to pair more sadness with women and more happiness with men, while left-leaning outlets show the reverse pattern.
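
For reference, a weighted linear fit of this kind can be written without any libraries. The sketch below is a generic weighted least-squares line with a weighted R², assuming x is the bias rank, y is the gap in percentage points, and w = min(N_women, N_men) per outlet. Whether the R² values reported above were computed with weights is an assumption on my part, and `weighted_linear_fit` is a hypothetical name.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares line: returns (slope, intercept, weighted R^2)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Weighted residual and total sums of squares.
    ss_res = sum(
        wi * (yi - (intercept + slope * xi)) ** 2
        for wi, xi, yi in zip(w, x, y)
    )
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return slope, intercept, r2

def outlet_weights(n_women, n_men):
    """weights = min(N_women, N_men) for each outlet."""
    return [min(nw, nm) for nw, nm in zip(n_women, n_men)]
```

Using the minimum of the two group sizes means an outlet with many Men-only posts but only a handful of Women-only posts contributes little to the slope, since its gap estimate is limited by the smaller group.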

Next Steps

Add face attribution to distinguish public figures from crowds.
Move beyond a single top-1 label to multi-label emotion scores.
Model topic x gender interactions to capture nuanced content effects.

What I Learned

Building infrastructure for 150k+ Instagram posts is no joke. Building reliable scraping, storage, and image-processing pipelines was 90% of the battle.
Processing hundreds of thousands of images takes a long time. Multiprocessing and performance optimization are essential with any large dataset.
Moving data around takes forever... Making sure your storage and retrieval processes are efficient is crucial as your dataset grows.
Turning raw captions into usable sentiment metrics is hard. Defining thresholds and labels in different ways can change the results significantly.
Distilling massive datasets into a clear narrative was both the hardest and most rewarding part of the project.
Nivo charts are so beautiful!!

“Good Enough” lets you ship. It's not worth 100 more hours for a 1% improvement in product (especially when it's just a hobby project lol).