Judge A Book By Its Cover

Although the idiom advises against it, is it actually impossible in the literal sense?

When choosing what to read, we often rely on first impressions: an intriguing title, an attractive cover, or the book’s popularity or reputation. However, these initial judgments can sometimes lead to disappointment. Are there ways to avoid such situations? If so, what elements might predict whether a book meets or falls short of our expectations?

To explore these questions, I collected a Goodreads dataset of books categorized as beloved, hated, or met with mixed feelings (the ones appearing on both the beloved and the hated side). The goal is to determine whether a book’s “cover” and associated features can predict how well readers will like it, and which factors matter most.

I tested three main approaches to feature selection and model design:

Numeric-nominal only (such as publish year and genre) + logistic regression
Textual info only (such as title and description) + N-gram logistic regression
Everything combined (including the book cover!) + GPT-4o with 50-shot in-context learning (ICL)
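To make the third approach more concrete, here is a minimal sketch of how a few-shot prompt could be assembled from labeled books. It is an illustration, not the prompt actually used: the field names (title, genre, year, description, label), the instruction wording, and the helper names are all assumptions, the cover image is left out, and no model call is made.

```python
from typing import Dict, List

LABELS = ("good", "bad", "mixed")  # the three classes in the dataset


def format_book(book: Dict[str, str]) -> str:
    """Render one book's features as a compact text block."""
    # Field names are hypothetical; substitute the dataset's real columns.
    return (
        f"Title: {book['title']}\n"
        f"Genre: {book['genre']}\n"
        f"Publish year: {book['year']}\n"
        f"Description: {book['description']}"
    )


def build_few_shot_prompt(examples: List[Dict[str, str]],
                          query: Dict[str, str]) -> str:
    """Build a k-shot prompt: k labeled books followed by the unlabeled query."""
    parts = ["Classify each book as one of: " + ", ".join(LABELS) + "."]
    for ex in examples:
        if ex["label"] not in LABELS:
            raise ValueError(f"unknown label: {ex['label']}")
        parts.append(format_book(ex) + f"\nLabel: {ex['label']}")
    # The query ends with an open "Label:" for the model to complete.
    parts.append(format_book(query) + "\nLabel:")
    return "\n\n".join(parts)
```

With 50 labeled books passed as examples, this yields a 50-shot prompt. The returned string would then be sent to the model, and the text it generates after the final "Label:" would be read as the prediction.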

The feature distributions show that the “good” books are easier to distinguish than the “bad” or “mixed feeling” ones, and that the feature distributions are not independent of class.

(1) Feature clustering result: the green (good) dots are farther away from the rest (2) Book cover color distribution (3) Book style distribution

The best accuracy achieved by simple logistic regression was 66%! But the “mixed feeling” class was really challenging and was almost always misclassified. What about LLMs?
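A single accuracy figure can hide this kind of per-class failure, so it helps to look at recall for each class and at how often each class is predicted. The sketch below shows one way to compute both. The labels in the example are made up purely for illustration and are not results from the experiment.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """For each true class, the share of its books that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def predicted_distribution(y_pred: Sequence[str]) -> Dict[str, int]:
    """How often each class is predicted, regardless of correctness."""
    return dict(Counter(y_pred))


# Toy labels, invented for illustration only.
toy_true = ["good", "good", "bad", "bad", "mixed", "mixed"]
toy_pred = ["good", "good", "bad", "good", "bad", "good"]

print(accuracy(toy_true, toy_pred))          # 0.5
print(per_class_recall(toy_true, toy_pred))  # {'good': 1.0, 'bad': 0.5, 'mixed': 0.0}
print(predicted_distribution(toy_pred))      # {'good': 4, 'bad': 2}
```

In the toy case, half of the predictions are correct overall, yet the recall for “mixed” is zero. The predicted distribution is the complementary check: it shows whether a model leans toward one class.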

Word cloud generated from GPT-4o’s reasoning for its predictions. Keywords related to numeric or nominal values are frequently mentioned!

Surprisingly, the performance was even worse, in fact the worst of the three, reaching only 40% accuracy. An interesting observation is that the “mixed” class is predicted more often, which might be because LLMs see this class as lying between “good” and “bad” and therefore choose it to be safe.

This behavior is similar to how we judge books: we notice details like ratings, genres, and awards, but we still give books a chance without making firm judgments. After all, the chance of a book being very bad is quite low. In this way, ICL reflects some of our decision-making. On the flip side, since ICL performs poorly, perhaps we really shouldn’t judge a book by its cover.

Until a more systematic and reliable method is discovered, let’s just give the book a chance and read it ourselves.