Research | Project Three Star

Every year since 1936, Michelin has updated its guidebooks, awarding or removing up to three stars for each restaurant it inspects. The inspection process is kept secret to avoid conflicts of interest, and stars are rarely awarded. They have since become one of the most recognized and respected honors a restaurant can achieve, with thousands of chefs from all corners of the globe devoting their lives to the pursuit of even a single star.

In fact, Michelin-starred cooks take the stars so seriously that renowned chefs, such as Gordon Ramsay, have wept over the loss of one or more stars, with some even taking their own lives.

“I started crying when I lost my stars. It's a very emotional thing for any chef. It's like losing a girlfriend. You want her back. I think every top chef in the world, from Alain Ducasse to Guy Savoy, when you lose a star it's like losing the Champions League. There's next year. So it's not done forever that you can't win those things back. I got asked the question literally two weeks ago on holiday. What would you do if you ever lost your third star? Honestly? I would win it back. It's nice to stay focused.”

Gordon Ramsay, reflecting on losing Michelin stars in the same year (quote from Eater.com)

And it’s no wonder. A single star can have customers flocking to that designated restaurant, bringing about both immense amounts of fame and profit. The late and great Joël Robuchon, who himself had an absolutely astounding record of 32 Michelin stars, once said that “with one Michelin star, you get about 20 percent more business. Two stars, you do about 40 percent more business, and with three stars, you’ll do about 100 percent more business.”

Customers and investors alike are therefore very keen to predict which restaurants will eventually earn a Michelin star before the competition gets fierce.

As of May 21, 2026, out of millions of restaurants in the world, only 3847 have at least one star: 3149 have one star, 541 have two, and just 157 have three.

That imbalance makes identifying starred restaurants difficult: because so few restaurants hold a star, a standard ML model may well score lower on accuracy than a single line of code that always predicts "no star".
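To make the imbalance concrete, here is a small Python sketch of that always-negative baseline. The star counts come from the figures above; the total number of restaurants is a placeholder assumption, since the text only says "millions".

```python
# Star counts are taken from the text; the restaurant total is hypothetical.
starred = 3149 + 541 + 157      # one, two and three stars
total_restaurants = 1_000_000   # assumption: a conservative stand-in for "millions"


def always_negative_accuracy(n_positive: int, n_total: int) -> float:
    """Accuracy of a classifier that predicts 'no star' for every restaurant."""
    return (n_total - n_positive) / n_total


baseline_accuracy = always_negative_accuracy(starred, total_restaurants)
baseline_recall = 0.0  # the baseline never flags a starred restaurant

print(f"starred restaurants: {starred}")
print(f"baseline accuracy:   {baseline_accuracy:.4%}")
print(f"baseline recall:     {baseline_recall:.0%}")
```

The baseline is almost always "right" yet finds no starred restaurants at all, which is why accuracy alone is a poor yardstick for this problem.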

Data Aggregation and Wrangling

I aggregated all the data with Chrome's Web Scraper extension, scraping TripAdvisor's New York City restaurant listings and the Michelin Guide NYC, and wrangled it in R.

The following variables were web scraped entirely from TripAdvisor: Cuisine, Overall.Rating, Dining.Type, sentiment, Review.count, Value.rating, Service.rating, Atmosphere.rating, years.with.tripadvisor.award and Food.rating.

I scraped Michelin.Stars directly from the Michelin Guide and derived the Have.Star variable with an if-else statement that outputs 1 if the restaurant has at least one Michelin star and 0 if not. This gives a binary response variable that is easier for logistic regression and random forest models to handle.
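The if-else rule for Have.Star can be sketched as follows. This is a Python illustration rather than the R code used in the project, and the star counts in the example are made up.

```python
def have_star(michelin_stars: int) -> int:
    """Return 1 if the restaurant holds at least one Michelin star, else 0."""
    return 1 if michelin_stars >= 1 else 0


# Hypothetical Michelin.Stars values for four restaurants.
stars = [0, 1, 3, 0]
labels = [have_star(s) for s in stars]
print(labels)
```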

I split Dining.Type into two categories: fine and casual dining. If the restaurant listed $$$ or above on TripAdvisor, then I categorized it under fine dining. Anything $$ or below would be considered casual dining.
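A minimal Python sketch of this price rule, assuming each listing carries a single price tier written as a run of dollar signs (the exact format of the scraped field is not given here, and the project itself used R):

```python
def dining_type(price_tier: str) -> str:
    """Classify a price tier such as '$$$' as 'fine' or 'casual' dining."""
    dollars = price_tier.strip().count("$")
    # Three or more dollar signs count as fine dining, anything less as casual.
    return "fine" if dollars >= 3 else "casual"


# Hypothetical price tiers.
tiers = ["$", "$$", "$$$", "$$$$"]
print([dining_type(t) for t in tiers])
```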

To obtain the sentiment variable, I ran a sentiment analysis in R on 30 randomly selected TripAdvisor reviews from every restaurant that has at least 1 review. Restricting the analysis to reviewed restaurants both simplifies the dataset and is reasonable for the models' predictions, under the assumption that Michelin doesn't normally review unpopular restaurants.
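The sampling step might look like the Python sketch below. The scoring function is a toy word list invented for illustration; the actual lexicon or R package behind the sentiment variable is not stated, and taking all reviews when fewer than 30 exist is also an assumption.

```python
import random
from statistics import mean

# Toy lexicon for illustration only; not the lexicon used in the project.
TOY_LEXICON = {"delicious": 1, "amazing": 1, "great": 1,
               "bland": -1, "rude": -1, "slow": -1}


def score_review(text: str) -> int:
    """Sum the lexicon values of the words in one review."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(TOY_LEXICON.get(w, 0) for w in words)


def restaurant_sentiment(reviews, sample_size=30, seed=0):
    """Average score over up to `sample_size` randomly chosen reviews."""
    if not reviews:
        return None  # restaurants without reviews get no sentiment value
    rng = random.Random(seed)
    sample = rng.sample(reviews, min(sample_size, len(reviews)))
    return mean(score_review(r) for r in sample)


# Hypothetical reviews for one restaurant.
reviews = ["Amazing food, great service!", "Slow and rude staff.", "Delicious."]
sentiment = restaurant_sentiment(reviews)
print(sentiment)
```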

The years.with.tripadvisor.award variable is continuous and records the number of years in which a restaurant has been recognized with a TripAdvisor award.

Since TripAdvisor’s overall rating is not an entirely accurate representation of people’s ratings, I created a new weighted-average approximation, Overall.Rating, using reviews separated by the number of stars awarded.
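One plausible reading of this weighted average, sketched in Python with a made-up review histogram (the exact formula used for Overall.Rating is not spelled out, so treat this as an assumption):

```python
def weighted_rating(counts_by_star: dict) -> float:
    """Average star value, weighted by how many reviews gave each value."""
    total_reviews = sum(counts_by_star.values())
    if total_reviews == 0:
        raise ValueError("restaurant has no reviews")
    weighted_sum = sum(star * n for star, n in counts_by_star.items())
    return weighted_sum / total_reviews


# Hypothetical histogram: {stars given: number of reviews}
counts = {5: 120, 4: 40, 3: 20, 2: 10, 1: 10}
overall_rating = weighted_rating(counts)
print(overall_rating)
```

Unlike a rating rounded for display, this value keeps the full precision of the underlying review counts.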

I built two models to predict whether a restaurant will receive (or deserves) at least one Michelin star. I began by randomly selecting 75% of the original data, with all variables, as the training dataset. The remaining 25% was set aside as the holdout sample.
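A 75/25 random split can be sketched in a few lines of Python. The seed and the stand-in rows are arbitrary choices for the example, not values from the project, which was carried out in R.

```python
import random


def train_holdout_split(rows, train_fraction=0.75, seed=42):
    """Shuffle the rows and cut them into a training set and a holdout set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(200))  # stand-in for 200 restaurant records
train, holdout = train_holdout_split(rows)
print(len(train), len(holdout))
```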

I fit a random forest model on the training data and then applied it to the holdout sample. For both the random forest and the logistic regression models, I set Have.Star as the response variable, since I wanted the models to identify which restaurants would have at least one star, and a binary response is easier to fit.

By tuning the model, I found the optimal number of variables randomly sampled at each split to be around 50. The output shows nothing unusual, and the out-of-bag error is reasonably low, which suggests the model is fairly solid.

The random forest model produces an AUC of 0.84, resulting in a model that is very good at discerning between restaurants that have and don’t have Michelin stars.
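For readers unfamiliar with AUC: it is the probability that a randomly chosen starred restaurant receives a higher score than a randomly chosen unstarred one. The Python sketch below computes it directly from that definition on made-up labels and scores; it is not the routine that produced the 0.84 figure.

```python
def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical Have.Star labels and predicted probabilities.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking.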

Random Forest Variable Importance Plot

Furthermore, the variable importance plot reveals that Cuisine is far and away the most important variable in the dataset, followed by Overall.Rating and Dining.Type.

Next, I built a logistic regression model, again with Have.Star set as the response variable. I used LASSO with 10-fold cross-validation to determine the desired sparsity of the model.

LASSO graph

Using the regularization output's lambda minimum (the value that minimizes the cross-validated error), I included 4 statistically significant variables in the predictive model: Cuisine, Dining.Type, Value.rating, and Review.count. The lowest AIC obtained was 237.56, implying that this model lost the least information among the candidates considered.

ROC curve on the training data

The AUC of 0.98 on the training data indicates a model that is extremely good with its predictions on the training data.

ROC curve on the testing data

On the other hand, the AUC of 0.74 on the testing data shows a much weaker model than anticipated. The AUC is still good, but not excellent as on the training data, and the drop from 0.98 suggests that the model overfits the training set.

From left to right starting from the top: precision-focused confusion matrix on training and testing data, recall-focused confusion matrix on training and testing data

The training data’s confusion matrix is built on a conservative threshold of 0.65, which results in a precision of 0.91.

Applying the conservative threshold on the testing dataset reveals high precision but extremely poor recall. This is not necessarily a bad thing, as a model with high precision would result in fewer false positives, which, in this case, would be beneficial to low-risk investors.

To build a model with high recall, I set the threshold to 0.17, which, through weighted classification, is equivalent to choosing the threshold as if a false negative prediction were 5 times more costly than a false positive. A model with high recall would best serve those who are searching for the next restaurant to earn a Michelin star.
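The link between the 5-to-1 cost ratio and the 0.17 threshold follows from comparing expected costs: predicting a star is cheaper whenever p × cost_FN > (1 − p) × cost_FP, which rearranges to p > cost_FP / (cost_FP + cost_FN). A short Python check, with function and variable names chosen for this example:

```python
def cost_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability above which predicting 'star' has the lower expected cost."""
    return cost_fp / (cost_fp + cost_fn)


# A false negative is treated as 5 times as costly as a false positive.
threshold = cost_threshold(cost_fp=1.0, cost_fn=5.0)
print(round(threshold, 2))
```

With equal costs the same formula returns 0.5, the usual default cut-off.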

The recall-based model does okay on the testing dataset, with a recall of around 0.455. Also note that the misclassification rate, while often relevant, is not extremely useful here since correctly classifying Michelin-starred restaurants is more important.
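The trade-off between the two thresholds can be seen on a toy example. The Python sketch below uses invented labels and probabilities, so the numbers it prints are illustrative and unrelated to the results reported here.

```python
def precision_recall(labels, probs, threshold):
    """Precision and recall when a star is predicted for probs >= threshold."""
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(labels, probs) if p < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Hypothetical Have.Star labels and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.90, 0.50, 0.20, 0.60, 0.30, 0.10, 0.05, 0.02]

conservative = precision_recall(labels, probs, 0.65)  # precision-focused
relaxed = precision_recall(labels, probs, 0.17)       # recall-focused
print(conservative, relaxed)
```

On this toy data the conservative threshold makes no false-positive calls but misses two of the three starred restaurants, while the relaxed threshold finds all three at the price of two false positives.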

Logistic Regression vs. Random Forest ROC curves

Overall, based on the AUCs, it’s quite apparent that random forest generated a model that’s much better than the one the logistic regression built, so it’s worth placing more weight on the former’s predictions when looking for a lower misclassification rate.

Which restaurants are on the verge of gaining a Michelin star?

Precision-focused Logistic Regression Model Result:

According to the logistic model built around precision optimization, the following restaurants are extremely deserving of at least one Michelin star:

This type of prediction would be useful for risk-averse investors looking for a safe expected return on investment when looking for the next Michelin-starred restaurant. This particular model strongly suggests that Gran Tivoli, Perrine, and Sushi Ishikawa each deserves at least one Michelin star despite not having one already.

TripAdvisor reviews about Gran Tivoli

As expected, the three aforementioned restaurants all receive highly positive reviews and are categorized under fine dining. I would not be surprised if any of these restaurants are awarded a Michelin star in the coming years.

Recall-focused Logistic Regression Model Result:

The recall-focused model captures most of the restaurants with at least one Michelin star and can be interpreted as the model that represents Michelin’s standards the most. In other words, the following restaurants don’t have Michelin stars but are most similar to those that have one or more.

Since the main focus of this project is on predicting which restaurants will potentially earn a Michelin star, I’m more interested in the model’s false positives. As such, the table below is NOT representative of the model’s predictive abilities as all the true positives have been filtered out. The same approach applies to the random forest model.

Note how Gran Tivoli, Perrine, and Sushi Ishikawa show up here as well. This is expected since relaxing the threshold would include the restaurants from the conservative model and more. The list shown above would be most suited to foodies looking for a Michelin star experience without having to deal with as much demand.

Random Forest Model Result

The random forest model can be interpreted as a more accurate counterpart to the recall-focused logistic regression model. In this case, fewer false positives are shown, giving a stronger sense of which restaurants offer a Michelin-starred experience without actually holding a star.

This time, only Gran Tivoli, which appears in both the precision-focused and recall-focused models, is predicted to earn a star; Perrine and Sushi Ishikawa are not. Gran Tivoli is therefore the only restaurant to appear in all three models, making it a strong contender to eventually obtain a Michelin star.
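Finding the restaurants shared by all three shortlists is a set intersection. In the Python sketch below, each set is limited to the restaurants named in the text; the full result tables contain further entries that are not reproduced here.

```python
# Only restaurants named in the text are listed; the real tables are longer.
precision_model = {"Gran Tivoli", "Perrine", "Sushi Ishikawa"}
recall_model = {"Gran Tivoli", "Perrine", "Sushi Ishikawa"}
random_forest = {"Gran Tivoli"}

# Restaurants flagged by every model.
in_all_three = precision_model & recall_model & random_forest
print(sorted(in_all_three))
```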

It would be an extrapolation to apply these models to cities dissimilar to New York, as the models were trained on parameters and potential confounding variables specific to the city. For example, New York City's Japanese restaurants might be much better than those in some other places, so if the models were applied to a city such as Detroit, they would likely not fare as well, given their heavy dependence on the Cuisine patterns they picked up in NYC.

Nevertheless, the models can still likely be applied to cities with a restaurant ecosystem similar to that of New York.
