The five-star review system is pervasive across app stores, e-commerce platforms, and nearly every digital marketplace. It’s meant to give consumers a quick way to gauge quality, but increasingly, it’s failing both users and creators. The latest exhibit in this ongoing breakdown comes from Terry Godier, the developer behind a new RSS reader called Current. In a recent post, Godier pointed out a paradox that anyone who has ever left a four-star review will recognize: a four-star rating can simultaneously praise an app and drag down its average score.
Godier wrote that many four-star reviews contain phrases like “This is my favorite app!” or “Game changer!”, which are clear expressions of delight and endorsement. Yet because the scale tops out at five stars, a four-star rating pulls down any average that sits above 4.0, dragging a near-perfect score toward that mark. For many developers, a rating of 4.0 or higher is a key metric for visibility and credibility. Anything less can feel like a failure, even when the feedback is overwhelmingly positive.
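The arithmetic behind this is easy to check. The sketch below uses made-up numbers (a hypothetical app with nine five-star reviews), and the helper name `average_after` is an illustration, not something taken from Godier’s post or from any app store.

```python
def average_after(ratings, new_rating):
    """Return the mean star rating once new_rating is added to ratings."""
    updated = list(ratings) + [new_rating]
    return sum(updated) / len(updated)


# Hypothetical app with nine five-star reviews (made-up data).
existing = [5] * 9
before = sum(existing) / len(existing)
after = average_after(existing, 4)

print(f"before: {before:.2f}")                     # before: 5.00
print(f"after one four-star review: {after:.2f}")  # after one four-star review: 4.90
```

A single glowing four-star review costs this hypothetical app a tenth of a star, which is the kind of movement developers watch closely.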
This problem is not new. It has been discussed by developers, designers, and user experience researchers for years. The five-star system is a relic of an earlier era of online reviews, originally designed for product ratings where a perfect score was rare. But in today’s app economy, where millions of apps compete for attention, the pressure to maintain a perfect or near-perfect average has created a perverse incentive: users are reluctant to give anything less than five stars unless they have a major complaint, and developers become anxious about any review that isn’t a perfect score.
The Psychology of Star Ratings
To understand why the system is broken, it helps to look at the psychological mechanisms at play. Multiple studies have shown that consumers tend to gravitate toward extremes. In a five-star system, the middle options — two, three, and four stars — are used far less frequently than one or five stars. This creates a bimodal distribution that does not accurately reflect the spectrum of user experiences. A product that receives mostly four-star reviews, for example, may actually be excellent but is penalized because the scale implies that four is “good but not great.”
Furthermore, the interpretation of star ratings varies widely. What some users consider a three-star experience might be a four-star experience for others. Cultural differences also play a role: in some countries, four stars is considered excellent, while in others, anything less than five is a sign of significant fault. This inconsistency makes star ratings an unreliable measure of quality.
For app developers, the stakes are high. An app’s average rating directly influences download rates, featured placements, and even consumer trust. A drop from 4.5 to 4.4 can reduce conversions by double-digit percentages. As a result, developers often beg for five-star ratings and overlook the honest, constructive feedback that a four-star review can provide.
The History of the Five-Star System
The five-star rating system dates back to the early days of e-commerce. eBay and Amazon popularized it in the late 1990s as a way to build trust between buyers and sellers. The simplicity of the system made it easy to implement and understand. Over time, it became the default for nearly every review platform, from the App Store to Google Play to Yelp.
But as digital marketplaces scaled, the system’s flaws became more apparent. The rise of review bombing, fake reviews, and the “review culture” that pressures users to leave perfect scores or nothing at all has eroded its usefulness. Some platforms, like Netflix, abandoned star ratings in favor of a thumbs up/thumbs down system. Others, like Amazon, have experimented with more granular ratings and machine learning to surface helpful reviews.
Yet the app stores remain stubbornly attached to the five-star model. Apple and Google have made incremental changes — such as allowing developers to respond to reviews or asking for ratings after positive interactions — but the fundamental structure remains unchanged. The result is a system that rewards mediocrity and punishes high-quality but imperfect products.
The Current Case: An RSS Reader Caught in the Paradox
Terry Godier’s Current is a well-reviewed RSS reader that has garnered praise for its clean design and powerful features. Yet, as Godier noted, the app’s average rating is being held back by four-star reviews that are effectively five stars in sentiment. This is not an isolated incident. Many popular apps experience the same phenomenon. A four-star review that says “love this app, but it could be slightly better” is, in reality, a positive endorsement, but the system treats it as a demerit.
Godier’s frustration is shared by many developers who feel trapped by the system. Some argue that the solution lies in moving to a binary rating system, such as “like” or “dislike.” Others propose a continuous slider or a scale that allows for more nuance, like a 10-star system. But any change would require buy-in from the platform holders, who have shown little appetite for overhauling a core feature of their stores.
The problem also extends to consumer behavior. Users are often not aware that a four-star rating can hurt an app’s average. They may think they are being generous, not realizing that their rating is averaged together with every other rating and will lower any score that sits above 4.0. Education could help, but it has to be paired with a system redesign.
Possible Solutions and Alternatives
Several alternatives to the five-star system have been proposed. One is a “thumbs up” or “like” system, similar to YouTube or Netflix, which removes the pressure to assign a precise score. Another is a combination of binary thumbs up/down with a separate review text field for detailed feedback. Some platforms have introduced “helpful” votes to surface the most useful reviews, regardless of star count.
From a data perspective, app stores could calculate the displayed rating with a different formula, such as a Bayesian average, which weights an app’s score by how many reviews it has received. This would keep a single four-star review from moving the score of an app with only a handful of ratings very far. Another approach is to display the distribution of ratings, not just the average, so users can see how many people gave four stars versus five.
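As a rough illustration of both ideas, here is a minimal sketch of a Bayesian average and a rating distribution. Everything in it is an assumption made for the example: the function names, the prior of 4.0 stars, and the prior weight of 20 are hypothetical choices, not values any app store is known to use.

```python
from collections import Counter


def bayesian_average(ratings, prior_mean=4.0, prior_weight=20):
    """Blend the raw mean with a prior.

    prior_weight acts like that many imaginary reviews at prior_mean,
    so apps with few real reviews stay close to the prior.
    Both defaults are hypothetical values chosen for illustration.
    """
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)


def rating_distribution(ratings):
    """Count how many reviews gave each star value from 1 to 5."""
    counts = Counter(ratings)
    return {stars: counts.get(stars, 0) for stars in range(1, 6)}


# Made-up data: three five-star reviews, then one four-star review arrives.
reviews = [5, 5, 5]
with_four = reviews + [4]

raw_before = sum(reviews) / len(reviews)
raw_after = sum(with_four) / len(with_four)
print(f"raw mean:      {raw_before:.3f} -> {raw_after:.3f}")  # 5.000 -> 4.750
print(f"Bayesian mean: {bayesian_average(reviews):.3f} -> "
      f"{bayesian_average(with_four):.3f}")                   # 4.130 -> 4.125
print(rating_distribution(with_four))  # {1: 0, 2: 0, 3: 0, 4: 1, 5: 3}
```

With these assumed parameters, the four-star review moves the raw mean by a quarter of a star but the Bayesian figure by well under a hundredth. The trade-off is that a new app with only perfect reviews is shown near the prior, not at 5.0, until more ratings arrive.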
Ultimately, the goal should be to align the rating system with the intent of the reviewer and the needs of the developer. A system that encourages honest, accurate feedback will serve everyone better than one that forces users to choose between a perfect score and a harmful one.
Until then, developers like Terry Godier will continue to watch their brilliant apps get knocked down by the very people who love them most. The five-star review system is broken, and the evidence keeps piling up. Exhibit 472,304 is just the latest example.
Source: The Verge News