Istanbul and Food Delivery: An Inflated Affair

Insights from food delivery data: 100% inflation rate in 18 months

Efe Baslar

Humble beginnings

As with my other major data analysis/science projects, this one has grown over the last couple of years. As my approach changed, the product took on different and more detailed forms (for the better, I would say). But the main reason I am interested in food data is still the same: there are fundamental differences in how people from different backgrounds make eating decisions on a daily basis. After all, we are what we eat.

Or, is it the other way around?

I am not only talking about religion-driven or simple income-induced differences. Even eating can get political under particular conditions: certain edible items can and will be perceived as signals of association with a social group, and these signals usually become even more salient in polarized social settings (DellaPosta, Shi, and Macy 2015). As I continue exploring food industries in various locations (next up: Berlin), I took a detour to revisit Turkey’s leading but increasingly infamous1 food2 delivery leviathan: yemeksepeti.

A year and a half ago, for a tweet of mine that became somewhat of a hit (linked just below), I tapped into the aforementioned delivery service for the first time. Scraping the data off it was rather straightforward: crawling JavaScript-based content was not strictly necessary unless you wanted access to restaurants that were not open for delivery at that time of day. That required (and still does) ticking a simple check-box that triggers a JavaScript event listing the restaurants regardless of their state. I did not particularly like the possibility that some restaurants would not get picked up by the rvest script, but I still went along with scraping the data during peak hours because I did not have much free time then. Fast forward a couple of days, and I had curated a data set featuring the menus of a sizable portion of the Istanbul restaurants that partnered with yemeksepeti, along with the list of neighborhoods these restaurants served.

Speaking of web scraping, here is the GitHub repository for the most recent version of the project. I will try to make all my projects available unless I suspect that publishing the code and/or data might violate terms of service. Anyway, back to the task at hand: this time around, I used Selenium in Python to collect the links to each restaurant and rvest in R to collect the menus.

Alright, back to September 2020. If you are interested (which you probably are, if you are here reading this post), just take a look at the tweet. The main idea is that there are considerable differences in average Lahmacun prices (a popular dough, spice and meat-based food, sometimes woefully referred to as the “Turkish Pizza”) across different districts in Istanbul. Naturally, the richer and more attractive the district, the higher the prices… ostensibly. But is it that simple? There are numerous deviations from what you would expect based on that intuition, and we might want to investigate this further. So, one of my motivations was to improve the script and delve further into the dynamics that surround the pricing of Lahmacun and other practical foods.

In addition to my usual rationale for getting my hands dirty around the inspect functionality of my Firefox browser, I had to satiate my curiosity in one additional aspect. If you keep track of the news, you might have heard of inflation soaring high3 in Turkey, and if you are a little perceptive, you might also have noticed that the official figures for annual inflation4 are met with stark suspicion5, well, for a plethora of reasons. So, for a peek at what “true” inflation looks like, there is hardly a better way than using a service that Istanbulites have come to rely upon even more heavily in the Covid-19 world. Alright, without further ado, let us move on to the punchline.

The Punchline

Figure 1: Average lahmacun prices in each Istanbul district: Use the plotly controls on the upper right corner to navigate the plot!

OK. Take a look at the map above. Move your mouse cursor over the districts for more information, if you’d like. Then take a look at the previous iteration of the same map. Yup. It is correct. Prices have increased by around 100%, and that happened within a span of 18 months. Let that sink in. And no, I have not changed my methodology greatly, barring some improvements for more accurate estimation. I’ll be detailing these improvements below, but overall it is pretty similar to the one I had then. Inflation seems to be running rampant in the food industry, and prices seem to have inflated roughly equally across districts: around a whopping 100%! The figures are based on 2198 yemeksepeti partner restaurants with Lahmacun on their menu (those that satisfy some qualifications I lay out below).
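To put the headline number on an annual footing, a cumulative increase over 18 months can be converted into a compound yearly rate. The sketch below is a back-of-the-envelope calculation rather than part of the original analysis; the function name `annualized_rate` is made up for illustration, and it assumes prices compounded at a steady pace over the whole period.

```python
def annualized_rate(total_increase: float, months: float) -> float:
    """Convert a cumulative price increase over `months` into a compound annual rate.

    total_increase is a fraction: 1.0 means prices rose by 100% (they doubled).
    """
    growth_factor = 1.0 + total_increase
    return growth_factor ** (12.0 / months) - 1.0


# 100% over 18 months, as observed for lahmacun prices in this post.
rate = annualized_rate(1.0, 18)
print(f"{rate:.1%}")  # prints 58.7%
```

Under that steady-compounding assumption, a doubling in 18 months corresponds to roughly 59% per year.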

The districts with a sizable portion of secular, upper-middle-class residents (read: economically better off) seem to be the ones with the priciest lahmacuns. But the number of restaurants serving each district is not necessarily proportional to the population residing within that district. It shouldn’t come as a surprise when I tell you that it is mostly the younger inhabitants, especially university students, who use food delivery services.

In need of a representative statistic

Now, let us dive into some details about how this plot was created. Creating the plots was definitely not as straightforward as the maps themselves make it seem. I mentioned just above that I had made some improvements in my approach. Improvements seldom come without further complexity, and this case is no exception.

yemeksepeti lists 961 precinct-equivalent divisions for its partner network in Istanbul. This is deceptively similar to the true number of precincts in Istanbul, which stands at 964. However, some precincts are arbitrarily divided by yemeksepeti for operational reasons, as precincts are hardly homogeneous across different districts in Istanbul and not necessarily equally reachable: they range from a few hundred inhabitants to a hundred thousand.

Precincts constitute the smallest possible Turkish administrative unit, as per the latest regulations. Districts, depicted on the maps in this post, house a number of precincts. Basic math tells us that for each of the 39 districts in Istanbul we can expect to observe around 25 precincts. The way yemeksepeti stores its data doesn’t tell much about where a restaurant is located. It would have been amazing to have access to geospatial data, but a web-scraper must live with what he can get his hands on.

Even knowing the precise location of each restaurant would not change a simple fact: restaurant deliveries are trans-precinct and, most of the time, trans-district. Each precinct is (literally) fed by a number of restaurants located in and around itself, and some precincts are served by more restaurants than others. Even though I was aware of this when I first delved into yemeksepeti data, I had only taken a simple average for each district, without making any effort to create a more representative statistic under this particular structure of the data.

Figure 2: A sample listing of restaurants, the restaurant at the bottom is around 15 kilometers away from the focal precinct and has a huge minimum delivery amount for a lahmacun bakery.

In addition, programmatically searching each restaurant’s menu for the product of interest (“Lahmacun”, in this particular case) has its own problems. If your goal is to filter everything with “Lahmacun” in its name, then you are in for a treat, because you are going to get every form of seemingly relevant item in the results. The solution is to exclude items that contain keywords or other elements you do not want to see. Since I wanted to focus as much as possible on the singular lahmacun, I employed that solution. You can refer to my GitHub repository if you are interested in how I filtered out the “lahmacuns of interest”.
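The keyword-exclusion idea can be sketched in a few lines of Python. This is only an illustration: the exclusion list and the sample menu items below are assumptions made up for the example, and the actual filter lives in the repository mentioned above.

```python
# Hypothetical exclusion keywords (menu deals, multi-piece offers, drink bundles).
# The real list used for the post may differ.
EXCLUDED_KEYWORDS = ("menü", "adet", "ayran")


def is_plain_lahmacun(item_name: str) -> bool:
    """Return True if the item looks like a single lahmacun rather than a bundle."""
    name = item_name.casefold()
    if "lahmacun" not in name:
        return False
    return not any(keyword in name for keyword in EXCLUDED_KEYWORDS)


# Made-up menu items for illustration.
menu = [
    "Lahmacun",
    "Acılı Lahmacun",
    "Lahmacun Menü",
    "3 Adet Lahmacun + Ayran",
    "Adana Dürüm",
]
plain = [item for item in menu if is_plain_lahmacun(item)]
print(plain)
```

With this made-up menu, only the two single-lahmacun items survive the filter. Plain substring matching is deliberately crude; upper-case Turkish text containing the dotted İ may need extra normalization before matching.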

Furthermore, most of the partner restaurants have set a minimum threshold on the total amount of the order in TL, conditional on the precinct they serve. The customer therefore must surpass that threshold in order to guarantee delivery for the order. Although I sincerely doubt that these values are issued with any meticulous calculation, the underlying rationale probably draws from a simple and rational mechanism: cost-benefit trade-off.

If a restaurant sets too high a threshold, it could mean that they regard that particular customer as too far away to be worth sending out a rider, or that the restaurant is too fancy for a small delivery. There are some ridiculous combinations, e.g. a Lahmacun restaurant requiring a minimum delivery amount of 1000 TL for some precincts (since you have studied the map above, you know you would need to order around 50 Lahmacuns on average to qualify for such a delivery), or a fancy restaurant offering its gourmet Lahmacun for around 90 TL.

It is obvious that a person looking for a regular Lahmacun would hardly consider those kinds of restaurants to be viable alternatives for his lovely lahmacun-to-be. But, in theory, that particular restaurant is still one of the restaurants that serve that specific precinct and should somehow be incorporated into the statistic. My solution is to apply a weighted average based on this delivery threshold (you can see a sample of restaurants with different thresholds in the screenshot above).

Let \(p_i\) denote the weighted average price across the restaurants serving precinct \(i\), let \(t_{ij}\) be the minimum order threshold that restaurant \(j\) sets for precinct \(i\), and let \(p_{j}\) be the price of the product at restaurant \(j\). To my knowledge, the restaurants do not engage in variable pricing, so the prices are invariant to \(i\). Writing \(J_i\) for the set of restaurants serving precinct \(i\), the price at each restaurant is weighted by the inverse of the threshold \(t_{ij}\). Some restaurants have a threshold of 0, so I just replaced those values with 10. In addition, because the number of restaurants that serve each precinct can differ, it is worthwhile to keep track of that number. I used the restaurant counts as weights for each precinct when calculating the weighted average for each district.

\[p_i = \frac{\sum_{j \in J_i} p_{j}/t_{ij}}{\sum_{j \in J_i} 1/t_{ij}}\]
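As a rough illustration of the weighting scheme described above, here is a minimal Python sketch. The function names and the sample numbers are made up for the example; only the inverse-threshold weights, the replacement of zero thresholds with 10, and the use of restaurant counts as district-level weights come from the description.

```python
ZERO_THRESHOLD_REPLACEMENT = 10.0  # thresholds of 0 are replaced with 10, as described above


def precinct_price(offers):
    """Inverse-threshold weighted average price for one precinct.

    offers: list of (price, threshold) pairs, one per restaurant serving the precinct.
    """
    weighted_prices = 0.0
    total_weight = 0.0
    for price, threshold in offers:
        t = threshold if threshold > 0 else ZERO_THRESHOLD_REPLACEMENT
        weighted_prices += price / t
        total_weight += 1.0 / t
    return weighted_prices / total_weight


def district_price(precincts):
    """Average of precinct prices, weighting each precinct by its restaurant count.

    precincts: list of offer lists, one per precinct in the district.
    """
    total_restaurants = sum(len(offers) for offers in precincts)
    weighted_sum = sum(precinct_price(offers) * len(offers) for offers in precincts)
    return weighted_sum / total_restaurants


# Made-up example: a regular bakery, a gourmet place with a 1000 TL minimum,
# and a restaurant with no minimum at all (threshold 0).
offers = [(15.0, 20.0), (90.0, 1000.0), (18.0, 0.0)]
print(round(precinct_price(offers), 2))
```

The gourmet outlier with the huge minimum barely moves the result, whereas a simple mean of the same three prices would be 41 TL.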

Assuming that everything went according to plan, we get a list of all the precinct-equivalent units at yemeksepeti, with all the desired statistics. Thankfully, the URL of each precinct-equivalent unit contains the name of the district it belongs to. With some regular-expression work, this lets us extract the name of each Istanbul district. After doing all the processing, we get a list of the 39 districts and the corresponding weighted-average lahmacun prices. Yep. It was that simple.
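The district extraction could look something like the sketch below. The URL layout here is purely hypothetical (a placeholder host followed by `/istanbul/<district>-<precinct-slug>`); the real yemeksepeti URLs may be structured differently, so the pattern would have to be adapted to whatever the site actually serves.

```python
import re

# Hypothetical URL layout: ".../istanbul/<district>-<precinct-slug>".
# This is an assumption for illustration, not the site's documented format.
AREA_URL_PATTERN = re.compile(r"/istanbul/(?P<district>[a-z]+)-(?P<precinct>[a-z0-9-]+)$")


def district_from_url(url):
    """Return the district part of a precinct URL, or None if the URL does not match."""
    match = AREA_URL_PATTERN.search(url)
    return match.group("district") if match else None


# Made-up URLs on a placeholder domain.
print(district_from_url("https://example.com/istanbul/kadikoy-caferaga-mah"))
print(district_from_url("https://example.com/about"))
```

Returning `None` for non-matching URLs makes it easy to spot precinct pages that do not follow the assumed pattern instead of silently mislabeling them.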

Next up, even though I cannot provide any benchmark for the severity of consumer inflation for other food items, you can find maps and charts for Hamburgers and Beef/Chicken Döners below. In the section that follows, I’ll talk a bit about a couple of regression models to dig deeper into precinct-based, rather than district-based, analyses.

While you are still here, take a look at the histogram below of Lahmacun restaurant prices in the entire data set. The average price is around 16.30 TL, which is slightly over 1 euro.

Figure 3: Histogram of average restaurant prices for lahmacun

Moving Beyond Lahmacun

I’ll refrain from commenting on the plots below in too much detail, as they follow the same principles I laid out above and I believe they can speak for themselves. These may not make much sense without a benchmark, but the price patterns observed for Lahmacun seem to persist across other deliverable foods. The table below summarizes the number of restaurants that I used for the calculation of the weighted average prices.

Table 1: The number of restaurants offering each product
Product n
Lahmacun 2198
Burger 2799
Döner 1805
Chicken Döner (Tavuk Döner) 1569