Interactive Lesson: Cosine Similarity for Comparing Meanings in Python

The idea

Suppose we represent a sentence as a vector like [0.8, 0.1, 0.4, 0.7]. Another sentence becomes another vector. Cosine similarity checks how much the two vectors point in the same direction.

Cosine similarity always falls between -1 and 1. If the vectors point in nearly the same direction, it is close to 1. If they are perpendicular (unrelated), it is close to 0. If they point in opposite directions, it is negative.

cosine_similarity(A, B) = (A · B) / (||A|| × ||B||)

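Why this is useful for meaning: two sentences may differ in length or in how often they repeat a word, which changes the magnitude of their vectors. Cosine similarity ignores magnitude, so if the vectors still point in a similar direction, their meanings may be close.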
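To make the point about magnitude concrete, here is a minimal sketch. The vectors short_text, long_text, and other are invented for illustration: long_text is simply short_text scaled by ten, as if the same content were repeated ten times.

```python
import math

def cosine_similarity(a, b):
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)

# Hypothetical feature counts, made up for this demonstration
short_text = [2, 1, 0]
long_text = [20, 10, 0]   # same proportions, ten times the magnitude
other = [0, 1, 2]         # different proportions

same_direction = cosine_similarity(short_text, long_text)
different_direction = cosine_similarity(short_text, other)

print("short vs long  =", round(same_direction, 4))
print("short vs other =", round(different_direction, 4))
```

Scaling a vector changes its length but not its direction, so the first score stays at 1 (up to floating-point rounding) while the second is much lower.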

Important: cosine similarity does not understand language by itself. It only compares the vectors you give it. Good vectors lead to better meaning comparison.

A simple mental model

Imagine arrows starting from the same point, and compare them two at a time.

Two arrows in almost the same direction → very similar meaning
Two arrows at 90° → mostly unrelated
Two arrows in opposite directions → opposite tendency
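The arrow picture can be checked numerically by converting a cosine score back into an angle. This is a small sketch with invented 2D arrows; angle_in_degrees is a helper name introduced here, not part of the lesson's starter code.

```python
import math

def cosine_similarity(a, b):
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)

def angle_in_degrees(a, b):
    # Clamp to [-1, 1] so floating-point rounding cannot push acos out of range
    cos = max(-1.0, min(1.0, cosine_similarity(a, b)))
    return math.degrees(math.acos(cos))

# Hypothetical arrows, chosen to match the three cases above
reference = [1, 0]
almost_same = [1, 0.1]
perpendicular = [0, 1]
opposite = [-1, 0]

angle_close = angle_in_degrees(reference, almost_same)
angle_right = angle_in_degrees(reference, perpendicular)
angle_opposite = angle_in_degrees(reference, opposite)

for name, angle in [("almost same", angle_close),
                    ("perpendicular", angle_right),
                    ("opposite", angle_opposite)]:
    print(f"{name:>13}: {angle:.1f} degrees")
```

A small angle corresponds to a score near 1, a right angle to 0, and a straight angle to -1.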

Checkpoint: cosine similarity depends only on the angle between the vectors, not on their magnitude.

Close to 1 → highly similar

Close to 0 → weak similarity

Below 0 → opposite direction

Starter code: cosine similarity by hand

import math

def cosine_similarity(a, b):
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")

    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")

    return dot_product / (norm_a * norm_b)

v1 = [1, 1, 0]
v2 = [1, 0.9, 0.1]
v3 = [0, 0, 1]

print("v1 vs v2 =", round(cosine_similarity(v1, v2), 4))
print("v1 vs v3 =", round(cosine_similarity(v1, v3), 4))

Try this: change v2 to [1, 1, 0]. What happens? Then try [-1, -1, 0].

What each line is doing

Step 1: Zip the vectors
Pair corresponding values like (x₁, y₁), (x₂, y₂), and so on.

Step 2: Dot product
Multiply each pair and add the results.

Step 3: Norms
Find the length of each vector using the square root of the sum of squares.

Step 4: Divide
The dot product divided by the product of the two lengths gives the cosine similarity.
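The four steps above can be traced on the starter vectors v1 and v2 by printing each intermediate value. This is only an unrolled view of the same function, written as a sketch for inspection.

```python
import math

v1 = [1, 1, 0]
v2 = [1, 0.9, 0.1]

# Step 1: pair corresponding values
pairs = list(zip(v1, v2))

# Step 2: multiply each pair and add the results
dot_product = sum(x * y for x, y in pairs)

# Step 3: length (norm) of each vector
norm_v1 = math.sqrt(sum(x * x for x in v1))
norm_v2 = math.sqrt(sum(y * y for y in v2))

# Step 4: divide the dot product by the product of the lengths
similarity = dot_product / (norm_v1 * norm_v2)

print("pairs       =", pairs)
print("dot product =", round(dot_product, 4))
print("norm of v1  =", round(norm_v1, 4))
print("norm of v2  =", round(norm_v2, 4))
print("similarity  =", round(similarity, 4))
```

The final value should match what the starter code prints for v1 vs v2.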

Mini meaning example

In real NLP systems, vectors often come from embeddings. For learning, we can invent small vectors representing features such as animal, pet, wild, and vehicle.

import math

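def cosine_similarity(a, b):
    # Compact version for this example: the length and zero-vector checks
    # from the starter code are omitted, so only pass valid vectors.
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)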

vectors = {
    "cat":   [0.9, 0.9, 0.1, 0.0],
    "dog":   [0.9, 0.8, 0.2, 0.0],
    "tiger": [0.9, 0.1, 0.95, 0.0],
    "car":   [0.0, 0.0, 0.0, 1.0],
}

query = vectors["cat"]
for word, vector in vectors.items():
    score = cosine_similarity(query, vector)
    print(word, round(score, 4))

Expected result: cat scores 1.0 against itself, is closest to dog (about 0.99), less close to tiger (about 0.59), and scores 0 against car.

Rank sentences by similarity

import math

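def cosine_similarity(a, b):
    # Compact version for this example: the length and zero-vector checks
    # from the starter code are omitted, so only pass valid vectors.
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)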

sentence_vectors = {
    "I love programming": [0.9, 0.8, 0.1, 0.0],
    "Coding is enjoyable": [0.88, 0.79, 0.12, 0.02],
    "The sky is blue": [0.05, 0.02, 0.1, 0.95],
    "Python is fun": [0.86, 0.82, 0.15, 0.03],
}

query = [0.9, 0.8, 0.1, 0.0]
ranked = []

for sentence, vector in sentence_vectors.items():
    score = cosine_similarity(query, vector)
    ranked.append((score, sentence))

for score, sentence in sorted(ranked, reverse=True):
    print(f"{score:.4f}  {sentence}")

This score-then-sort pattern is common in search, recommendation, semantic matching, question-answer retrieval, and clustering.
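When many vectors are compared against many queries, one common variation is to scale every vector to length 1 once, after which cosine similarity reduces to a plain dot product. The sketch below reuses the sentence vectors from this section; normalize and dot are helper names introduced here for illustration.

```python
import math

def normalize(vector):
    # Scale a vector to length 1 so that only its direction remains
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("Cannot normalize a zero vector")
    return [x / norm for x in vector]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

sentence_vectors = {
    "I love programming": [0.9, 0.8, 0.1, 0.0],
    "Coding is enjoyable": [0.88, 0.79, 0.12, 0.02],
    "The sky is blue": [0.05, 0.02, 0.1, 0.95],
    "Python is fun": [0.86, 0.82, 0.15, 0.03],
}

# Normalize once, up front
unit_vectors = {s: normalize(v) for s, v in sentence_vectors.items()}
unit_query = normalize([0.9, 0.8, 0.1, 0.0])

# For unit vectors, the dot product is the cosine similarity
ranked = sorted(
    ((dot(unit_query, v), s) for s, v in unit_vectors.items()),
    reverse=True,
)

for score, sentence in ranked:
    print(f"{score:.4f}  {sentence}")
```

The ranking should agree with the version above, since dividing by the norms before or after the dot product gives the same value up to rounding.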

Practice tasks

Task 1: Build your own vectors

Create vectors for apple, banana, mango, and bus. Which fruit pair becomes most similar?

Task 2: Similarity search

Store five sentences in a dictionary. Given one query vector, print the top 3 most similar sentences.

Task 3: Error handling

Modify the function so it safely handles different lengths and zero vectors with friendly messages.

Task 4: Interpret the score

Write a helper function that says “high”, “medium”, or “low” similarity based on the score.

Common mistakes

Comparing vectors of different lengths
Forgetting to handle a zero vector
Assuming cosine similarity alone gives perfect meaning
Using badly designed vectors and blaming the formula
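The first two mistakes are easy to miss because a function without checks fails quietly or confusingly. The sketch below uses a deliberately unvalidated function, named unchecked_cosine here for illustration, to show both behaviours.

```python
import math

def unchecked_cosine(a, b):
    # Deliberately has no length or zero-vector checks
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot_product / (norm_a * norm_b)

three_d = [1, 1, 0]
four_d = [1, 1, 0, 5]   # has an extra fourth feature

# zip stops at the shorter input, so the dot product ignores the 5 while
# norm_b still includes it. No error is raised: the missing feature is
# silently treated as 0, which can hide a real bug.
silent_result = unchecked_cosine(three_d, four_d)
print("mismatched lengths ->", round(silent_result, 4))

# A zero vector has norm 0, so the division fails with a generic error
zero_error = None
try:
    unchecked_cosine([0, 0, 0], three_d)
except ZeroDivisionError as exc:
    zero_error = exc
print("zero vector ->", type(zero_error).__name__)
```

Explicit checks, as in the starter code, turn both cases into clear ValueError messages instead.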

Final takeaway: cosine similarity is a clean and powerful way to compare vector direction. In NLP, that makes it a natural tool for comparing meanings once text has been converted into embeddings.

Extension idea

A natural next step is to apply the same concept to TF-IDF vectors or real sentence embeddings and compare search results using cosine similarity. That brings the lesson closer to how semantic search systems work.
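As a stepping stone toward that, here is a minimal sketch that builds vectors from actual text using raw word counts. This is a simpler stand-in, not TF-IDF and not an embedding model: it only measures word overlap. The sentences and the helper names build_vocabulary and count_vector are assumptions made for this example.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot_product = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot_product / (norm_a * norm_b)

def build_vocabulary(sentences):
    # One dimension per distinct lowercase word, in a fixed (sorted) order
    return sorted({word for s in sentences for word in s.lower().split()})

def count_vector(sentence, vocabulary):
    # Count how often each vocabulary word appears in the sentence
    counts = Counter(sentence.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical sentences, invented for this example
sentences = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply today",
]

vocabulary = build_vocabulary(sentences)
vectors = [count_vector(s, vocabulary) for s in sentences]

sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])

print("sentence 1 vs 2 =", round(sim_related, 4))
print("sentence 1 vs 3 =", round(sim_unrelated, 4))
```

Because count vectors only see shared words, two sentences with the same meaning but different wording would score low here, which is the gap that TF-IDF weighting and sentence embeddings try to narrow.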