How Different are Air Carriers from One Another?

An Analysis of Non-Stop Flights To and From the United States

Akshay Natteri
11 min read · Sep 11, 2022

Introduction

The arrival of jet-powered passenger aircraft was one of the hallmarks of the twentieth century, enabling people to travel from one continent to another quickly and conveniently. It led to the flourishing of the airline industry, which has grown by leaps and bounds ever since. Today, the airline industry is one of the most competitive, with low profit margins and high risks.

Apart from day-to-day business risks such as volatile jet fuel costs and increasing competition, the industry is also severely exposed to global shocks such as pandemics (think COVID-19) and international military conflicts (think the Russia-Ukraine war). Hence, airlines have to walk a tightrope with little margin for error.

PAN AM is a prominent example of how exposed airlines are to global shocks. PAN AM, the pioneer of modern commercial passenger jet travel, was brought to its knees by the OPEC oil embargo in the 1970s. The embargo played a key role in driving the airline to bankruptcy.

An interesting piece about the rise and fall of PAN AM from Business Insider

Because of this high exposure to risk, airlines have to operate efficiently and consistently produce strong operational and financial results, so that they can absorb losses during the inevitable rainy days.

This need for optimal performance makes it interesting to analyze how the performance (from an operational standpoint) varies from one carrier to another, and try to understand what explains the difference (if any).

To this end, I compare carriers on two operational metrics: (a) the number of passengers flown, and (b) the load factor, defined as the number of passengers flown as a share of the number of seats flown. The load factor measures an airline’s capacity utilization.
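
To make the second metric concrete, here is a minimal sketch of the load factor calculation. The function name and the numbers are illustrative assumptions, not values taken from the dataset.

```python
def load_factor(passengers: int, seats: int) -> float:
    """Return passengers flown as a percentage of seats flown."""
    if seats <= 0:
        raise ValueError("seats must be positive")
    return 100.0 * passengers / seats


# Hypothetical month on one route: 8,200 passengers on 10,000 seats.
example = load_factor(8_200, 10_000)
print(f"{example:.1f}%")  # prints 82.0%
```

A carrier can therefore fly many passengers and still post a low load factor if it offers even more seats.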

The Data

The United States Department of Transportation maintains a public dataset that reports key statistics on the non-stop international flights to and from the United States, between 1990 and 2021. This is one of the few public-access databases available.

The dataset reports the aggregate number of passengers flown by each airline, on each route, for each month between 1990 and 2021. Data on the number of seats flown (i.e. capacity) is also available. While the dataset is restricted to non-stop international flights to and from the United States, it can still provide us with some insights into the performance of various carriers (I use airline and carrier interchangeably in this article).

Some Initial Analysis

Before jumping into a detailed analysis of carrier performance, it is essential to understand some of the basic trends in the data. I start with the simplest analysis: I rank the carriers by the aggregate number of seats flown in each year and track the top three carriers for each year between 1990 and 2019.

Let’s begin with the decade from 1990 to 1999. American Airlines was the undisputed leader in terms of seats flown across the entire period. PAN AM ranked second in 1990, but sadly this was its last full year of operation. Northwest Airlines ranked third in 1990 and occasionally featured among the top three carriers during this period. Apart from American Airlines, United and Delta were the more prominent names in this decade.
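
The ranking step can be sketched as follows. This is only an illustration under stated assumptions: the row layout `(year, carrier, seats)`, the function name, and all numbers are made up for the example and do not come from the dataset.

```python
from collections import defaultdict

# Made-up rows: (year, carrier, seats flown on one route in one month).
rows = [
    (1990, "AA", 500), (1990, "AA", 300), (1990, "PA", 600),
    (1990, "NW", 400), (1990, "UA", 350),
    (1991, "AA", 900), (1991, "UA", 700), (1991, "DL", 650), (1991, "NW", 200),
]


def top_carriers_by_year(rows, n=3):
    """Sum seats per (year, carrier) and return the n largest carriers per year."""
    totals = defaultdict(lambda: defaultdict(int))
    for year, carrier, seats in rows:
        totals[year][carrier] += seats
    ranking = {}
    for year, by_carrier in totals.items():
        ordered = sorted(by_carrier.items(), key=lambda kv: kv[1], reverse=True)
        ranking[year] = [carrier for carrier, _ in ordered[:n]]
    return ranking


top3 = top_carriers_by_year(rows)
print(top3)
```

Seats are first aggregated over routes and months, so a carrier's rank reflects its total yearly capacity rather than any single route.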

Top Carriers — According to Number of Seats Flown 1990–1999

In the next decade from 2000 to 2009, apart from American Airlines, Delta, and United, Continental Airlines also featured regularly among the top three.

Top Carriers — According to Number of Seats Flown 2000–2009

In 2010, Continental and United entered into a merger, with United Airlines keeping its name and Continental keeping its logo. This formed what was, at the time, ‘the world’s biggest airline’. The effect of the merger shows in our data as well: after 2011, the merged United Airlines topped the charts until 2015. It is important to note that 2012 saw the consolidation of the current ‘big three’ US carriers: American, Delta, and United. From then until 2021, the top three ranks are held by the big three.

Top Carriers — According to Number of Seats Flown 2010–2019

Choosing a Suitable Timeframe for the Study

Given the sheer dynamism in the sector, it is essential to choose a timeframe in which there were no major mergers among the carriers. It would be a mistake to treat United Airlines in 2009, prior to its merger with Continental, as the same carrier as the United Airlines of today.

The timeframe has to start in 2012 or later, so that it falls after the merger of United and Continental. Similarly, the timeframe should avoid 2020 and 2021, as the aviation industry underwent a massive shock due to the COVID-19 pandemic. This implies that the best and most recent timeframe we could use to compare airlines is 2012 to 2019, when the market was relatively stable.

Choosing a Suitable Set of Carriers for the Study

Having selected the timeframe for the study, I move to the next issue: which carriers to include. A large number of airlines operate or have operated non-stop routes to the US. Keeping all of them in the analysis would involve too many variables and outliers, which may impair the quality of the results.

I hence impose two conditions for a carrier to be selected. The first is that it should have consistently flown non-stop international routes to and from the US between 2012 and 2019. In the airline industry, many airlines enter and leave markets; we want to stick to those that have consistently stayed in the market. Only 57% of the carriers in the database meet this condition. I will refer to this 57% of carriers as the ‘consistent carriers’.
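
One way to express the first condition in code is sketched below. It assumes that "consistently" means appearing in every calendar year of the window; the article does not spell out the exact rule, so treat that as an assumption. The carrier codes and records are made up.

```python
from collections import defaultdict

STUDY_YEARS = set(range(2012, 2020))  # 2012 through 2019 inclusive


def consistent_carriers(rows, years=STUDY_YEARS):
    """Return carriers that appear in every year of the study window.

    rows: iterable of (year, carrier) pairs, one per reported record.
    """
    seen = defaultdict(set)
    for year, carrier in rows:
        if year in years:
            seen[carrier].add(year)
    return {carrier for carrier, active in seen.items() if active == set(years)}


# Made-up records: "XX" flies every year, "YY" leaves the market after 2015.
rows = [(y, "XX") for y in range(2012, 2020)] + [(y, "YY") for y in range(2012, 2016)]
kept = consistent_carriers(rows)
share = len(kept) / len({c for _, c in rows})
print(kept, f"{share:.0%}")
```

A stricter rule, such as requiring a record in every month, would only change the set comparison at the end.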

The second condition is based on the airline’s share of the total capacity. The big three airlines (American, Delta, and United) accounted for a staggering 38.4% of the total seats flown by consistent airlines. The shares of the other airlines are substantially lower, as shown in the graph below.

Share of Total Seats Flown by the Top 50 Consistent Carriers

I choose only the top 52 carriers, which account for around 90% of the seats flown. This is a very substantial share of the market and captures most of the major players. I will refer to these 52 carriers as the ‘selected carriers’.
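
The cut-off can be thought of as a cumulative share calculation: sort carriers by seats flown and keep adding them until the target share is reached. The sketch below illustrates the idea with made-up totals; the function name and the exact stopping rule are assumptions.

```python
def carriers_covering(seats_by_carrier, target=0.90):
    """Pick the largest carriers until their combined seat share reaches target."""
    total = sum(seats_by_carrier.values())
    ordered = sorted(seats_by_carrier.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0
    for carrier, seats in ordered:
        if running / total >= target:
            break  # target share already covered by the carriers picked so far
        selected.append(carrier)
        running += seats
    return selected, running / total


# Made-up seat totals for five consistent carriers.
seats = {"A": 50, "B": 25, "C": 15, "D": 6, "E": 4}
selected, coverage = carriers_covering(seats)
print(selected, coverage)
```

In this toy example three of the five carriers already cover 90% of the seats, which mirrors how concentrated the real market is.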

An analysis of the selected carriers in terms of their capacity and growth over the years between 2012 and 2019 is provided below. The chasm between the big three and the rest in terms of capacity is quite clear here.

Share of Total Seats Flown by the Top 50 Consistent Carriers

Choosing a Suitable Set of Routes for the Study

I restrict the analysis to routes that were consistently served by at least two carriers. The logic behind this is simple: if we want to compare carriers, we should look at routes where they are in competition, not routes where one carrier is the only service provider.

For example, between 2012 and 2019, the service between New York’s John F. Kennedy (JFK) Airport and Delhi’s Indira Gandhi International Airport was provided only by Air India. Hence, it is better to drop this route from the analysis, as there is no competitor on the route to compare Air India with.

After dropping these single-provider routes, we end up with 106 routes.
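
A sketch of the route filter is given below. It assumes that a route qualifies when at least two carriers serve it in every period of the window, not necessarily the same two; the article does not state the precise rule. The rows and periods are made up.

```python
from collections import defaultdict


def competitive_routes(rows, periods, min_carriers=2):
    """Return routes served by at least min_carriers carriers in every period.

    rows: iterable of (period, route, carrier); periods: the full study window.
    """
    carriers_on = defaultdict(set)  # (route, period) -> carriers seen
    routes = set()
    for period, route, carrier in rows:
        carriers_on[(route, period)].add(carrier)
        routes.add(route)
    return {
        route
        for route in routes
        if all(len(carriers_on[(route, p)]) >= min_carriers for p in periods)
    }


# Made-up data: JFK-LHR has two carriers each year, JFK-DEL only one.
periods = [2012, 2013]
rows = [
    (2012, "JFK-LHR", "BA"), (2012, "JFK-LHR", "VS"),
    (2013, "JFK-LHR", "BA"), (2013, "JFK-LHR", "AA"),
    (2012, "JFK-DEL", "AI"), (2013, "JFK-DEL", "AI"),
]
print(competitive_routes(rows, periods))
```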

Before jumping into a detailed comparison of carriers, I did a quick ranking of the top carriers serving the top 25 routes (by number of seats flown).

*Selected carriers only. Matrix showing the Top Carriers for the Top 25 Routes. International Air Transport Association (IATA) Airline and Airport Codes are used. Unique color codes are used for each airline.

The top route is undoubtedly the golden route between London’s Heathrow Airport and New York’s JFK Airport. British Airways and Virgin Atlantic fly more seats than American Airlines, Delta, or United on this route. This may seem surprising, but it is not: London Heathrow is a major base for British Airways and acts as a hub for connecting traffic from the rest of the globe (primarily Europe and Asia) to the United States. The same pattern can be seen on a number of other routes. Japan Airlines has the highest capacity on the Honolulu (Daniel K. Inouye) to Tokyo (Narita) route, Air France on the JFK to Paris (Charles de Gaulle) route, Korean Airlines on the Los Angeles to Seoul (Incheon) route, and so on. This is an important finding to keep in mind while going over the rest of the analysis.

Now that we have chosen a timeframe and a comparable set of airlines and routes, it’s time to start the analysis.

The Analysis

Before we begin the analysis, I would like to quickly recall the two operational metrics that are tracked: (a) the number of passengers flown, and (b) the load factor (the number of passengers flown as a share of the number of seats flown), which measures an airline’s capacity utilization.

It is important to note that both metrics are affected by the time of year (thanks to the seasonal nature of air travel demand) and by the route. Clearly, a route such as JFK-LHR is going to have substantially higher traffic on average than, say, BOS-DXB (Boston to Dubai). Hence, we need to remove the impact of time and route from the metrics before comparing them across carriers.

I achieve this by ‘de-meaning’ the data at the route-time level. To see what this means, take the example of a particular route, JFK-LHR, at a particular time, January 2013. De-meaning means that I subtract the average load factor for the JFK-LHR route in January 2013 (across carriers) from each carrier’s load factor for the JFK-LHR route in January 2013. This compares apples with apples: if a carrier records a load factor above the average for the given time and route, the carrier is performing better than its competitors on that route at that time.

Hence, for the remainder of this study we will focus on the deviation in load factor or deviation in number of passengers flown from the corresponding average for the route and time.
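
The de-meaning step can be sketched in a few lines. The sketch assumes a simple unweighted average across carriers within each route-month; the tuple layout, the function name, and the load factors are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def demean_by_route_month(rows):
    """Subtract the route-month average (across carriers) from each observation.

    rows: list of (route, month, carrier, value) tuples.
    Returns a list of (route, month, carrier, deviation) tuples.
    """
    groups = defaultdict(list)
    for route, month, _, value in rows:
        groups[(route, month)].append(value)
    averages = {key: mean(values) for key, values in groups.items()}
    return [
        (route, month, carrier, value - averages[(route, month)])
        for route, month, carrier, value in rows
    ]


# Made-up load factors (in percent) for JFK-LHR in January 2013.
rows = [
    ("JFK-LHR", "2013-01", "BA", 84.0),
    ("JFK-LHR", "2013-01", "VS", 80.0),
    ("JFK-LHR", "2013-01", "AA", 76.0),
]
for row in demean_by_route_month(rows):
    print(row)
```

Here the route-month average is 80, so the three carriers end up with deviations of +4, 0, and -4 percentage points. The same function applies unchanged to the number of passengers flown.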

The distributions of the deviation in number of passengers flown and load factor are shown below.

The distribution of the deviation term for number of passengers flown. Reported in 10,000 passengers.
The distribution of the deviation term for load factor. Reported in percentage points.

I use a simple Bayesian hierarchical model to estimate the mean deviation for each carrier. Details of the model and the usual checks are provided in the appendix.

Results — Who’s flying more passengers?

First, let us analyze the results for the deviation in passengers flown. Cathay Pacific on average flies around 12,500 more passengers than its competitors. A similar positive deviation is also seen for Copa, Air Canada, Emirates, Air France, Aeromexico, British Airways, Lufthansa, and Korean Airlines. This is not surprising: as mentioned earlier, foreign carriers are bound to fly more passengers than US carriers, as their home airports serve as hubs for passengers from across the globe.

Results for Average Deviation in Passengers Flown

The one interesting finding is that jetBlue features among these international carriers. jetBlue is a US-based carrier that flies to London and several destinations in Latin America. Its low-cost, quality service might be a reason for its better performance relative to its competitors.

Among the big three, United and American seem to perform better than Delta.

Results — Who has better capacity utilization?

When looking at the deviation in load factor, Cathay Pacific slips into the negative. This means that while Cathay is carrying more passengers, it still has poorer load factors than its competitors, as it is not adequately filling its capacity. Emirates, Korean Airlines, and Air Canada are other examples of the same phenomenon. However, Copa, British Airways, Lufthansa, and Air France perform well in terms of both passengers and load factors, indicating scale and efficiency.

Results for Average Deviation in Load Factor

Among the American carriers, Delta posts a higher load factor than its competitors, while American and United slip into the negative. Again, while Delta flies fewer people, it utilizes its capacity better than United or American. jetBlue, on the other hand, has higher load factors than its competitors while also flying more people.

Conclusion

The tradeoff faced by airlines is very evident from the data. They can choose to fly more people, which would mean flying larger aircraft at lower utilization, or they can fly fewer people but achieve better capacity utilization.

Very few airlines, such as jetBlue, Copa, Air France, Lufthansa, and British Airways, are able to perform well on both metrics. While these results are restricted to a narrow market (non-stop flights to and from the United States), they provide us with a small glimpse into the world of aviation.

Appendix — The Bayesian Model

The deviation in passengers flown (or load factor) from the route-time average is the variable being modelled.

I assume a t-distribution as the likelihood model. The degrees of freedom parameter (nu) has a Gamma(2,0.1) prior. The error parameter (epsilon) has a Half-Cauchy(0,3) prior. The mean parameter (beta) varies for each carrier and has a normal prior. The hyperpriors for the prior distribution of beta are Normal(0,5) for mean of the prior distribution (mu) and Half-Cauchy(0,5) for standard deviation of the prior distribution (sigma). I use a non-centered parameterization of N(mu,sigma), to ensure proper sampling. The model is summarized in the figure below.

The Bayesian Hierarchical Model. Only 48 of the 52 carriers remained in the dataset after the route selection criteria were applied.
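
For readers who prefer notation, the model described above can be written out roughly as follows. The symbols y_i (the deviation for observation i), c[i] (the carrier of observation i), and z_c (the standardized carrier effect used in the non-centered parameterization) are labels introduced here for illustration. The prior parameters are copied as stated in the text, which does not say whether the second argument of Normal(0,5) is a standard deviation or a variance.

```latex
\begin{aligned}
y_{i} &\sim \text{Student-}t\left(\nu,\ \beta_{c[i]},\ \epsilon\right) \\
\beta_{c} &= \mu + \sigma z_{c}, \qquad z_{c} \sim \mathcal{N}(0, 1) \\
\mu &\sim \mathcal{N}(0, 5) \\
\sigma &\sim \text{Half-Cauchy}(0, 5) \\
\nu &\sim \text{Gamma}(2, 0.1) \\
\epsilon &\sim \text{Half-Cauchy}(0, 3)
\end{aligned}
```

Writing each carrier mean as mu plus sigma times z_c is what "non-centered" refers to: the sampler works with the standard normal z_c rather than with beta_c directly, which tends to help when sigma is small.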

In terms of fit, we have a very good fit for the load factor model.

Posterior-Predictive Checks for Load Factor

The fit for passengers flown is not as good, but still arguably acceptable.

Posterior-Predictive Checks for Passengers Flown

Author: Akshay Natteri Mangadu — All views are personal.
